Ouch! Looks like a logger was left behind in DEBUG mode. Can you manually turn that off?
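To turn that logger off, the usual route is to raise its level in Drill's conf/logback.xml (a sketch; the logger name is taken from the log line below, and the exact file layout depends on your installation):

```xml
<!-- In $DRILL_HOME/conf/logback.xml: raise this logger above DEBUG so the
     foreman stops emitting one "date values are definitely CORRECT" line
     per Parquet file during planning. -->
<logger name="org.apache.drill.exec.store.parquet.Metadata" level="INFO"/>
```

No restart should be needed if logback's configuration scanning is enabled; otherwise restart the drillbit after editing.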
More memory would help in this case, because it seems the foreman node is the one running out of heap space as it goes through the metadata for all the files. Is there a reason you are generating so many files to query? There is most likely a lower threshold for Parquet file size, below which you might be better off just using something like CSV format.

-----Original Message-----
From: François Méthot [mailto:fmetho...@gmail.com]
Sent: Tuesday, October 17, 2017 10:35 AM
To: dev@drill.apache.org
Subject: log flooded by "date values definitively CORRECT"

Hi again,

I am running into an issue with a query over 760 000 Parquet files stored in HDFS. We are using Drill 1.10, 8 GB heap, 20 GB direct memory. Drill runs with debug logging enabled all the time.

The query is a standard select on 8 fields from hdfs.`/path` where this = that ...

For about an hour I see this message on the foreman:

  [pool-9-thread-##] DEBUG o.a.d.exec.store.parquet.Metadata - It is determined from metadata that the date values are definitely CORRECT

Then:

  [some UUID:foreman] INFO o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 761659 out of 761659 using 16 threads. Time: 3022416ms

Then:

  java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf ...
    at java.io.PrintWriter.println(PrintWriter.java:757)
    at org.apache.calcite.rel.externalize.RelWriterImpl.explain(RelWriterImpl.java:118)
    at org.apache.calcite.rel.externalize.RelWriterImpl.done(RelWriterImpl.java:160)
    ...
    at org.apache.calcite.plan.RelOptUtil.toString(RelOptUtil.java:1927)
    at org.apache.drill.exec.planner.sql.handlers.DefaultSQLHandler.log(DefaultSQLHandler.java:138)
    ...
    at org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.getPlan(CreateTableHandler:102)
    at org.apache.drill.exec.planner.DrillSqlWorker.getQueryPlan(DrillSqlWorker:131)
    ...
    at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:1050)
    at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:281)

I think it might be caused by having too many files to query; chunking our select into smaller pieces actually helped. I also suspect that the DEBUG logging is taxing the poor node a bit much.

Do you think adding more memory would address the issue (I can't try this right now), or do you think it is caused by a bug?

Thanks in advance for any advice,
Francois
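One more thing worth trying alongside the memory and logging changes (a sketch, not a guaranteed fix): Drill can precompute a Parquet metadata cache file for a directory tree, so the foreman reads one cache instead of the footer of every individual file at plan time. REFRESH TABLE METADATA has been available since Drill 1.2; the path below mirrors the hypothetical hdfs.`/path` from the query above:

```sql
-- Build the metadata cache once; re-run after new files are added:
REFRESH TABLE METADATA hdfs.`/path`;

-- Subsequent queries plan against the cached metadata rather than
-- scanning 761659 individual Parquet footers:
SELECT ... FROM hdfs.`/path` WHERE this = that;
```

This shifts the per-file metadata cost to a one-time refresh, which may also shorten the hour-long planning phase you observed.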