[ https://issues.apache.org/jira/browse/DRILL-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007816#comment-15007816 ]
Khurram Faraaz commented on DRILL-3827:
---------------------------------------

{code}
Note that file is owned by user mapr.

[root@centos-01 ~]# hadoop fs -ls /tmp/CTAS_ONE_MILN_RWS_PER_GROUP
Found 99 items
-rwxr-xr-x   3 mapr mapr      68554 2015-11-17 01:44 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/.drill.parquet_metadata

Metadata cache file is removed.

[root@centos-01 ~]# hadoop fs -rmr /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/.drill.parquet_metadata
rmr: DEPRECATED: Please use 'rm -r' instead.
15/11/17 01:47:01 INFO Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
15/11/17 01:47:01 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/.drill.parquet_metadata

replace with an empty file

[root@centos-01 ~]# hadoop fs -put .drill.parquet_metadata /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/

Note that file is now owned by root

[root@centos-01 ~]# hadoop fs -ls /tmp/CTAS_ONE_MILN_RWS_PER_GROUP
Found 99 items
-rwxr-xr-x   3 root root          0 2015-11-17 01:48 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/.drill.parquet_metadata
{code}

{code}
[root@centos-01 bin]# ./sqlline -u "jdbc:drill:schema=dfs.tmp -n mapr -p mapr"
apache drill 1.4.0-SNAPSHOT
"got drill?"
0: jdbc:drill:schema=dfs.tmp> select count(*) from CTAS_ONE_MILN_RWS_PER_GROUP;
+-----------+
|  EXPR$0   |
+-----------+
| 50000000  |
+-----------+
1 row selected (0.908 seconds)
0: jdbc:drill:schema=dfs.tmp> refresh table metadata CTAS_ONE_MILN_RWS_PER_GROUP;
+-------+-----------------------------------------------------------------------+
|  ok   |                                summary                                |
+-------+-----------------------------------------------------------------------+
| true  | Successfully updated metadata for table CTAS_ONE_MILN_RWS_PER_GROUP.  |
+-------+-----------------------------------------------------------------------+
1 row selected (0.607 seconds)
0: jdbc:drill:schema=dfs.tmp> select count(*) from CTAS_ONE_MILN_RWS_PER_GROUP;
+-----------+
|  EXPR$0   |
+-----------+
| 50000000  |
+-----------+
1 row selected (0.439 seconds)
...
0: jdbc:drill:schema=dfs.tmp> select count(*) from CTAS_ONE_MILN_RWS_PER_GROUP;
Error: SYSTEM ERROR: JsonMappingException: No content to map due to end-of-input
 at [Source: com.mapr.fs.MapRFsDataInputStream@5bf94307; line: 1, column: 1]

[Error Id: 5826fd6a-c99e-4cb0-940f-8180a663932b on centos-01.qa.lab:31010] (state=,code=0)
{code}

Here is the stack trace from drillbit.log

{code}
2015-11-17 01:48:34,519 [29b5788c-b51c-bba9-3328-998e82a3b462:foreman] ERROR o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: JsonMappingException: No content to map due to end-of-input
 at [Source: com.mapr.fs.MapRFsDataInputStream@5bf94307; line: 1, column: 1]

[Error Id: 5826fd6a-c99e-4cb0-940f-8180a663932b on centos-01.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: JsonMappingException: No content to map due to end-of-input
 at [Source: com.mapr.fs.MapRFsDataInputStream@5bf94307; line: 1, column: 1]

[Error Id: 5826fd6a-c99e-4cb0-940f-8180a663932b on centos-01.qa.lab:31010]
        at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534) ~[drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:742) [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:841) [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:786) [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) [drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:788) [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:894) [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:255) [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
        at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization: Internal error: Error while applying rule DrillPushProjIntoScan, args [rel#7343700:LogicalProject.NONE.ANY([]).[](input=rel#7343699:Subset#0.ENUMERABLE.ANY([]).[],$f0=0), rel#7343691:EnumerableTableScan.ENUMERABLE.ANY([]).[](table=[dfs, tmp, CTAS_ONE_MILN_RWS_PER_GROUP])]
        ... 4 common frames omitted
Caused by: java.lang.AssertionError: Internal error: Error while applying rule DrillPushProjIntoScan, args [rel#7343700:LogicalProject.NONE.ANY([]).[](input=rel#7343699:Subset#0.ENUMERABLE.ANY([]).[],$f0=0), rel#7343691:EnumerableTableScan.ENUMERABLE.ANY([]).[](table=[dfs, tmp, CTAS_ONE_MILN_RWS_PER_GROUP])]
        at org.apache.calcite.util.Util.newInternal(Util.java:792) ~[calcite-core-1.4.0-drill-r8.jar:1.4.0-drill-r8]
        at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:251) ~[calcite-core-1.4.0-drill-r8.jar:1.4.0-drill-r8]
        at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808) ~[calcite-core-1.4.0-drill-r8.jar:1.4.0-drill-r8]
        at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) ~[calcite-core-1.4.0-drill-r8.jar:1.4.0-drill-r8]
        at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:303) ~[calcite-core-1.4.0-drill-r8.jar:1.4.0-drill-r8]
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.logicalPlanningVolcanoAndLopt(DefaultSqlHandler.java:545) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:213) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:248) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:164) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:184) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:905) [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:244) [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        ... 3 common frames omitted
Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: com.fasterxml.jackson.databind.JsonMappingException: No content to map due to end-of-input
 at [Source: com.mapr.fs.MapRFsDataInputStream@5bf94307; line: 1, column: 1]
        at org.apache.drill.exec.planner.logical.DrillPushProjIntoScan.onMatch(DrillPushProjIntoScan.java:89) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228) ~[calcite-core-1.4.0-drill-r8.jar:1.4.0-drill-r8]
        ... 13 common frames omitted
Caused by: com.fasterxml.jackson.databind.JsonMappingException: No content to map due to end-of-input
 at [Source: com.mapr.fs.MapRFsDataInputStream@5bf94307; line: 1, column: 1]
        at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148) ~[jackson-databind-2.4.3.jar:2.4.3]
        at com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:3095) ~[jackson-databind-2.4.3.jar:2.4.3]
        at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3036) ~[jackson-databind-2.4.3.jar:2.4.3]
        at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2192) ~[jackson-databind-2.4.3.jar:2.4.3]
        at org.apache.drill.exec.store.parquet.Metadata.readBlockMeta(Metadata.java:363) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.store.parquet.Metadata.readBlockMeta(Metadata.java:113) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.store.parquet.ParquetGroupScan.init(ParquetGroupScan.java:555) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.store.parquet.ParquetGroupScan.<init>(ParquetGroupScan.java:189) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:169) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:67) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.store.dfs.FileSystemPlugin.getPhysicalScan(FileSystemPlugin.java:126) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.store.AbstractStoragePlugin.getPhysicalScan(AbstractStoragePlugin.java:58) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.planner.logical.DrillTable.getGroupScan(DrillTable.java:72) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        at org.apache.drill.exec.planner.logical.DrillPushProjIntoScan.onMatch(DrillPushProjIntoScan.java:57) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
        ... 14 common frames omitted
{code}

> Empty metadata file causes queries on the table to fail
> -------------------------------------------------------
>
>                 Key: DRILL-3827
>                 URL: https://issues.apache.org/jira/browse/DRILL-3827
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.2.0
>            Reporter: Victoria Markman
>            Assignee: Parth Chandra
>            Priority: Critical
>             Fix For: 1.4.0
>
>
> I ran into a situation where Drill created an empty metadata file (which is a
> separate issue that I will try to narrow down; the suspicion is that this
> happens when "refresh table metadata x" fails with a "permission denied" error).
> However, we need to guard against the situation where the metadata file is empty
> or corrupted. We should probably skip reading it if we encounter an unexpected
> result and continue with query planning without that information, in the same
> fashion as a partition pruning failure. It's also important to log this
> information somewhere, drillbit.log as a start. It would be really nice to
> have a flag in the query profile that tells a user whether we used the metadata
> file for planning or not. That would help in debugging performance issues.
> Very confusing exception is thrown if you have zero length meta data file in
> the directory:
> {code}
> [Wed Sep 23 07:45:28] # ls -la
> total 2
> drwxr-xr-x  2 root root   2 Sep 10 14:55 .
> drwxr-xr-x 16 root root  35 Sep 15 12:54 ..
> -rwxr-xr-x  1 root root 483 Jul  1 11:29 0_0_0.parquet
> -rwxr-xr-x  1 root root   0 Sep 10 14:55 .drill.parquet_metadata
> 0: jdbc:drill:schema=dfs> select * from t1;
> Error: SYSTEM ERROR: JsonMappingException: No content to map due to
> end-of-input
>  at [Source: com.mapr.fs.MapRFsDataInputStream@342bd88d; line: 1, column: 1]
> [Error Id: c97574f6-b3e8-4183-8557-c30df6ca675f on atsqa4-133.qa.lab:31010]
> (state=,code=0)
> {code}
> Workaround is trivial, remove the file. Marking it as critical, since we
> don't have any concurrency control in place and this file can get corrupted
> as well.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
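
The guard proposed in the issue description (skip an empty or unreadable metadata file, log the fact, and plan without it) could be sketched roughly as follows. This is only an illustration under stated assumptions, not Drill's code or the actual fix: `MetadataCacheGuard` and `readIfUsable` are hypothetical names, it reads from the local file system through `java.nio` rather than through Hadoop's `FileSystem`, and a crude "looks like a JSON object" check stands in for real Jackson parsing.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.logging.Logger;

// Hypothetical helper (not part of Drill): decides whether a metadata cache
// file is worth handing to the JSON parser at all.
final class MetadataCacheGuard {
    private static final Logger LOG =
        Logger.getLogger(MetadataCacheGuard.class.getName());

    private MetadataCacheGuard() {
    }

    // Returns the file content when it looks usable, or Optional.empty() so
    // the caller can fall back to planning without the metadata cache.
    static Optional<String> readIfUsable(Path metadataFile) {
        try {
            if (!Files.isRegularFile(metadataFile)) {
                return Optional.empty(); // no cache file: nothing to skip
            }
            if (Files.size(metadataFile) == 0) {
                LOG.warning("Ignoring zero-length metadata file: " + metadataFile);
                return Optional.empty();
            }
            String content = new String(
                Files.readAllBytes(metadataFile), StandardCharsets.UTF_8).trim();
            // Assumption: a cheap shape check stands in for real JSON parsing.
            if (!content.startsWith("{") || !content.endsWith("}")) {
                LOG.warning("Ignoring malformed metadata file: " + metadataFile);
                return Optional.empty();
            }
            return Optional.of(content);
        } catch (IOException e) {
            // Unreadable file (for example permission denied): log and move on.
            LOG.warning("Could not read metadata file " + metadataFile + ": " + e);
            return Optional.empty();
        }
    }
}
```

A caller would treat an empty result the same way as a missing cache file and read the Parquet footers directly, which costs planning time but avoids failing the query.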