[ 
https://issues.apache.org/jira/browse/DRILL-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972984#comment-14972984
 ] 

Steven Phillips commented on DRILL-3975:
----------------------------------------

My approach has been to remove the scheme and authority from the paths any time 
I encounter code that uses the path as a key, or does any sort of string 
comparison. This is an area where I think we need to clean up. I don't think we 
are very consistent throughout the code base in how we handle paths.

The usual trick I use to strip away the scheme and authority is the method 
Path.getPathWithoutSchemeAndAuthority(Path p). If I have String objects and not 
Path objects, I will convert the String to a Path, use the utility method to 
remove scheme and authority, and then call toString().
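
A rough sketch of that String round trip, for illustration only. To keep it 
runnable without Hadoop on the classpath, it uses java.net.URI as a stand-in 
for Path.getPathWithoutSchemeAndAuthority; the class name PathKeys, the helper 
stripSchemeAndAuthority, and the shortened sample path are made up for this 
example and are not Drill code. It also assumes the input is a well-formed URI 
string (no unescaped spaces).

```java
import java.net.URI;

public class PathKeys {

    /**
     * Hypothetical helper: returns only the path component of a path string,
     * dropping any scheme ("hdfs") and authority ("host:port").
     * Stand-in for: new Path(s) -> Path.getPathWithoutSchemeAndAuthority(p) -> toString().
     */
    static String stripSchemeAndAuthority(String pathString) {
        URI uri = URI.create(pathString);
        String path = uri.getPath();
        // getPath() is null for opaque URIs; fall back to the input unchanged.
        return path == null ? pathString : path;
    }

    public static void main(String[] args) {
        String qualified = "hdfs://ip-172-31-30-32:54310/drill/testdata/t1";
        String bare = "/drill/testdata/t1";

        // Raw string comparison treats the same location as two different keys.
        System.out.println(qualified.equals(bare)); // false

        // After normalizing both sides, the comparison is consistent.
        System.out.println(
            stripSchemeAndAuthority(qualified).equals(stripSchemeAndAuthority(bare))); // true
    }
}
```

The point is to normalize both sides before any comparison, substring, or map 
lookup, so that a fully qualified HDFS path and a bare path refer to the same 
key.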

> Partition Planning rule causes query failure due to IndexOutOfBoundsException 
> on HDFS
> -------------------------------------------------------------------------------------
>
>                 Key: DRILL-3975
>                 URL: https://issues.apache.org/jira/browse/DRILL-3975
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>            Reporter: Jacques Nadeau
>
> In attempting to run the extended test suite provided by MapR, there are a 
> large number of queries that fail due to issues in the PruneScanRule and 
> specifically the DFSPartitionLocation constructor line 31. It is likely due 
> to issues with the code that are related to running on HDFS where this code 
> path has apparently not been tested.
> An example test query where this type of failure occurred: 
> /src/drill-test-framework/resources/Functional/ctas/ctas_auto_partition/tpch0.01_multiple_partitions/data/q11.q
> Example stack trace below:
> {code}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> StringIndexOutOfBoundsException: String index out of range: -12
> [Error Id: f2941267-49b1-4f67-a17f-610ffb13fcb7 on 
> ip-172-31-30-32.us-west-2.compute.internal:31010]
>         at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534)
>  ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:742)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:841)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:786)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at 
> org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) 
> [drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:788)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:894) 
> [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:255) 
> [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_85]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_85]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_85]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: Internal error: Error while 
> applying rule PruneScanRule:Filter_On_Scan_Parquet, args 
> [rel#43148:DrillFilterRel.LOGICAL.ANY([]).[](input=rel#43147:Subset#4.LOGICAL.ANY([]).[],condition==($0,
>  1)), rel#43241:DrillScanRel.LOGICAL.ANY([]).[](table=[dfs, 
> ctasAutoPartition, 
> tpch_multiple_partitions/lineitem_twopart_ordered2],groupscan=ParquetGroupScan
>  [entries=[ReadEntryWithPath 
> [path=hdfs://ip-172-31-30-32:54310/drill/testdata/ctas_auto_partition/tpch_multiple_partitions/lineitem_twopart_ordered2]],
>  
> selectionRoot=hdfs://ip-172-31-30-32:54310/drill/testdata/ctas_auto_partition/tpch_multiple_partitions/lineitem_twopart_ordered2,
>  numFiles=1, usedMetadataFile=false, columns=[`l_modline`, `l_moddate`]])]
>         ... 4 common frames omitted
> Caused by: java.lang.AssertionError: Internal error: Error while applying 
> rule PruneScanRule:Filter_On_Scan_Parquet, args 
> [rel#43148:DrillFilterRel.LOGICAL.ANY([]).[](input=rel#43147:Subset#4.LOGICAL.ANY([]).[],condition==($0,
>  1)), rel#43241:DrillScanRel.LOGICAL.ANY([]).[](table=[dfs, 
> ctasAutoPartition, 
> tpch_multiple_partitions/lineitem_twopart_ordered2],groupscan=ParquetGroupScan
>  [entries=[ReadEntryWithPath 
> [path=hdfs://ip-172-31-30-32:54310/drill/testdata/ctas_auto_partition/tpch_multiple_partitions/lineitem_twopart_ordered2]],
>  
> selectionRoot=hdfs://ip-172-31-30-32:54310/drill/testdata/ctas_auto_partition/tpch_multiple_partitions/lineitem_twopart_ordered2,
>  numFiles=1, usedMetadataFile=false, columns=[`l_modline`, `l_moddate`]])]
>         at org.apache.calcite.util.Util.newInternal(Util.java:792) 
> ~[calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>         at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:251)
>  ~[calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>         at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808)
>  ~[calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>         at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) 
> ~[calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>         at 
> org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:303) 
> ~[calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>         at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.logicalPlanningVolcanoAndLopt(DefaultSqlHandler.java:545)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:213)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:248)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:164)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:905) 
> [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:244) 
> [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         ... 3 common frames omitted
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of 
> range: -12
>         at java.lang.String.substring(String.java:1875) ~[na:1.7.0_85]
>         at 
> org.apache.drill.exec.planner.DFSPartitionLocation.<init>(DFSPartitionLocation.java:31)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.planner.ParquetPartitionDescriptor.createPartitionSublists(ParquetPartitionDescriptor.java:126)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.planner.AbstractPartitionDescriptor.iterator(AbstractPartitionDescriptor.java:53)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:190)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
>  ~[calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>         ... 13 common frames omitted
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
