[ https://issues.apache.org/jira/browse/DRILL-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014410#comment-14014410 ]
Ramana Inukonda Nagaraj commented on DRILL-882:
-----------------------------------------------
Complete error:
Root: rel#2666:Subset#26.PHYSICAL.SINGLETON([]).[]
Original rel:
AbstractConverter(subset=[rel#2666:Subset#26.PHYSICAL.SINGLETON([]).[]],
convention=[PHYSICAL], DrillDistributionTraitDef=[SINGLETON([])], sort=[[]]):
rowcount = 1800.0, cumulative cost = {inf}, id = 2668
DrillScreenRel(subset=[rel#2665:Subset#26.LOGICAL.ANY([]).[]]): rowcount =
1800.0, cumulative cost = {180.0 rows, 180.0 cpu, 0.0 io, 0.0 network}, id =
2664
DrillProjectRel(subset=[rel#2663:Subset#25.LOGICAL.ANY([]).[]],
p_partkey=[$0]): rowcount = 1800.0, cumulative cost = {1800.0 rows, 4.0 cpu,
0.0 io, 0.0 network}, id = 2662
DrillFilterRel(subset=[rel#2661:Subset#24.LOGICAL.ANY([]).[]],
condition=[AND(=(CAST($0):ANY NOT NULL, $10), =($5, 41))]): rowcount = 1800.0,
cumulative cost = {80000.0 rows, 640000.0 cpu, 0.0 io, 0.0 network}, id = 2660
DrillJoinRel(subset=[rel#2659:Subset#23.LOGICAL.ANY([]).[]],
condition=[true], joinType=[inner]): rowcount = 80000.0, cumulative cost =
{80000.0 rows, 0.0 cpu, 0.0 io, 0.0 network}, id = 2658
DrillScanRel(subset=[rel#2656:Subset#21.LOGICAL.ANY([]).[]],
table=[[hive, part]]): rowcount = 10.0, cumulative cost = {10.0 rows, 90.0 cpu,
0.0 io, 0.0 network}, id = 2533
DrillScanRel(subset=[rel#2657:Subset#22.LOGICAL.ANY([]).[]],
table=[[dfs, drillTestDir, tpch-multi/partsupp]]): rowcount = 8000.0,
cumulative cost = {8000.0 rows, 16000.0 cpu, 0.0 io, 0.0 network}, id = 2446
Sets:
Set#21, type: RecordType(INTEGER p_partkey, VARCHAR(1) p_name, VARCHAR(1)
p_mfgr, VARCHAR(1) p_brand, VARCHAR(1) p_type, INTEGER p_size, VARCHAR(1)
p_container, FLOAT p_retailprice, VARCHAR(1) p_comment)
rel#2656:Subset#21.LOGICAL.ANY([]).[], best=rel#2533,
importance=0.5904900000000001
rel#2533:DrillScanRel.LOGICAL.ANY([]).[](table=[hive, part]),
rowcount=10.0, cumulative cost={10.0 rows, 90.0 cpu, 0.0 io, 0.0 network}
rel#2682:AbstractConverter.LOGICAL.ANY([]).[](child=rel#2681:Subset#21.PHYSICAL.SINGLETON([]).[],convention=LOGICAL,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=10.0, cumulative cost={inf}
rel#2681:Subset#21.PHYSICAL.SINGLETON([]).[], best=rel#2680,
importance=0.531441
rel#2683:AbstractConverter.PHYSICAL.SINGLETON([]).[](child=rel#2656:Subset#21.LOGICAL.ANY([]).[],convention=PHYSICAL,DrillDistributionTraitDef=SINGLETON([]),sort=[]),
rowcount=10.0, cumulative cost={inf}
rel#2680:ScanPrel.PHYSICAL.SINGLETON([]).[](groupscan=HiveScan
[table=Table(tableName:part, dbName:default, owner:root, createTime:1401494656,
lastAccessTime:0, retention:0,
sd:StorageDescriptor(cols:[FieldSchema(name:p_partkey, type:int, comment:null),
FieldSchema(name:p_name, type:string, comment:null), FieldSchema(name:p_mfgr,
type:string, comment:null), FieldSchema(name:p_brand, type:string,
comment:null), FieldSchema(name:p_type, type:string, comment:null),
FieldSchema(name:p_size, type:int, comment:null), FieldSchema(name:p_container,
type:string, comment:null), FieldSchema(name:p_retailprice, type:float,
comment:null), FieldSchema(name:p_comment, type:string, comment:null)],
location:maprfs:/drill/testdata/hive_storage/part,
inputFormat:org.apache.hadoop.mapred.TextInputFormat,
outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat,
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null,
serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,
parameters:{serialization.format=|, field.delim=|}), bucketCols:[],
sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[],
skewedColValues:[], skewedColValueLocationMaps:{}),
storedAsSubDirectories:false), partitionKeys:[], parameters:{EXTERNAL=TRUE,
transient_lastDdlTime=1401494656}, viewOriginalText:null,
viewExpandedText:null, tableType:EXTERNAL_TABLE),
inputSplits=[maprfs:/drill/testdata/hive_storage/part/part.tbl:0+236074],
columns=null]), rowcount=10.0, cumulative cost={10.0 rows, 90.0 cpu, 0.0 io,
0.0 network}
Set#22, type: (DrillRecordRow[*, ps_partkey])
rel#2657:Subset#22.LOGICAL.ANY([]).[], best=rel#2446,
importance=0.5904900000000001
rel#2446:DrillScanRel.LOGICAL.ANY([]).[](table=[dfs,
drillTestDir, tpch-multi/partsupp]), rowcount=8000.0, cumulative cost={8000.0
rows, 16000.0 cpu, 0.0 io, 0.0 network}
rel#2686:AbstractConverter.LOGICAL.ANY([]).[](child=rel#2685:Subset#22.PHYSICAL.RANDOM_DISTRIBUTED([]).[],convention=LOGICAL,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=8000.0, cumulative cost={inf}
rel#2685:Subset#22.PHYSICAL.RANDOM_DISTRIBUTED([]).[], best=rel#2684,
importance=0.531441
rel#2687:AbstractConverter.PHYSICAL.RANDOM_DISTRIBUTED([]).[](child=rel#2657:Subset#22.LOGICAL.ANY([]).[],convention=PHYSICAL,DrillDistributionTraitDef=RANDOM_DISTRIBUTED([]),sort=[]),
rowcount=8000.0, cumulative cost={inf}
rel#2684:ScanPrel.PHYSICAL.RANDOM_DISTRIBUTED([]).[](groupscan=ParquetGroupScan
[entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/tpch-multi/partsupp]],
selectionRoot=/drill/testdata/tpch-multi/partsupp, columns=[SchemaPath
[`ps_partkey`]]]), rowcount=8000.0, cumulative cost={8000.0 rows, 16000.0 cpu,
0.0 io, 0.0 network}
Set#23, type: RecordType(INTEGER p_partkey, VARCHAR(1) p_name, VARCHAR(1)
p_mfgr, VARCHAR(1) p_brand, VARCHAR(1) p_type, INTEGER p_size, VARCHAR(1)
p_container, FLOAT p_retailprice, VARCHAR(1) p_comment, ANY *, ANY ps_partkey)
rel#2659:Subset#23.LOGICAL.ANY([]).[], best=rel#2658, importance=0.6561
rel#2658:DrillJoinRel.LOGICAL.ANY([]).[](left=rel#2656:Subset#21.LOGICAL.ANY([]).[],right=rel#2657:Subset#22.LOGICAL.ANY([]).[],condition=true,joinType=inner),
rowcount=80000.0, cumulative cost={8011.0 rows, 16091.0 cpu, 0.0 io, 0.0
network}
rel#2678:AbstractConverter.LOGICAL.ANY([]).[](child=rel#2677:Subset#23.PHYSICAL.ANY([]).[],convention=LOGICAL,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=1.7976931348623157E308, cumulative cost={inf}
rel#2677:Subset#23.PHYSICAL.ANY([]).[], best=null,
importance=0.5904900000000001
rel#2679:AbstractConverter.PHYSICAL.ANY([]).[](child=rel#2659:Subset#23.LOGICAL.ANY([]).[],convention=PHYSICAL,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=80000.0, cumulative cost={inf}
Set#24, type: RecordType(INTEGER p_partkey, VARCHAR(1) p_name, VARCHAR(1)
p_mfgr, VARCHAR(1) p_brand, VARCHAR(1) p_type, INTEGER p_size, VARCHAR(1)
p_container, FLOAT p_retailprice, VARCHAR(1) p_comment, ANY *, ANY ps_partkey)
rel#2661:Subset#24.LOGICAL.ANY([]).[], best=rel#2660,
importance=0.7290000000000001
rel#2660:DrillFilterRel.LOGICAL.ANY([]).[](child=rel#2659:Subset#23.LOGICAL.ANY([]).[],condition=AND(=(CAST($0):ANY
NOT NULL, $10), =($5, 41))), rowcount=1800.0, cumulative cost={88011.0 rows,
656091.0 cpu, 0.0 io, 0.0 network}
rel#2675:AbstractConverter.LOGICAL.ANY([]).[](child=rel#2674:Subset#24.PHYSICAL.ANY([]).[],convention=LOGICAL,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=1.7976931348623157E308, cumulative cost={inf}
rel#2674:Subset#24.PHYSICAL.ANY([]).[], best=null, importance=0.6561
rel#2676:AbstractConverter.PHYSICAL.ANY([]).[](child=rel#2661:Subset#24.LOGICAL.ANY([]).[],convention=PHYSICAL,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=1800.0, cumulative cost={inf}
Set#25, type: RecordType(INTEGER p_partkey)
rel#2663:Subset#25.LOGICAL.ANY([]).[], best=rel#2662, importance=0.81
rel#2662:DrillProjectRel.LOGICAL.ANY([]).[](child=rel#2661:Subset#24.LOGICAL.ANY([]).[],p_partkey=$0),
rowcount=1800.0, cumulative cost={89811.0 rows, 656095.0 cpu, 0.0 io, 0.0
network}
rel#2670:AbstractConverter.LOGICAL.ANY([]).[](child=rel#2669:Subset#25.PHYSICAL.SINGLETON([]).[],convention=LOGICAL,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=1.7976931348623157E308, cumulative cost={inf}
rel#2669:Subset#25.PHYSICAL.SINGLETON([]).[], best=null,
importance=0.7290000000000001
rel#2671:AbstractConverter.PHYSICAL.SINGLETON([]).[](child=rel#2663:Subset#25.LOGICAL.ANY([]).[],convention=PHYSICAL,DrillDistributionTraitDef=SINGLETON([]),sort=[]),
rowcount=1800.0, cumulative cost={inf}
Set#26, type: RecordType(INTEGER p_partkey)
rel#2665:Subset#26.LOGICAL.ANY([]).[], best=rel#2664, importance=0.9
rel#2664:DrillScreenRel.LOGICAL.ANY([]).[](child=rel#2663:Subset#25.LOGICAL.ANY([]).[]),
rowcount=1800.0, cumulative cost={89991.0 rows, 656275.0 cpu, 0.0 io, 0.0
network}
rel#2667:AbstractConverter.LOGICAL.ANY([]).[](child=rel#2666:Subset#26.PHYSICAL.SINGLETON([]).[],convention=LOGICAL,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=1.7976931348623157E308, cumulative cost={inf}
rel#2666:Subset#26.PHYSICAL.SINGLETON([]).[], best=null, importance=1.0
rel#2668:AbstractConverter.PHYSICAL.SINGLETON([]).[](child=rel#2665:Subset#26.LOGICAL.ANY([]).[],convention=PHYSICAL,DrillDistributionTraitDef=SINGLETON([]),sort=[]),
rowcount=1800.0, cumulative cost={inf}
rel#2672:ScreenPrel.PHYSICAL.SINGLETON([]).[](child=rel#2669:Subset#25.PHYSICAL.SINGLETON([]).[]),
rowcount=1.7976931348623157E308, cumulative cost={inf}
]"
]
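
For anyone triaging a dump like the one above: as far as I understand the planner output, a subset printed with best=null is one for which no finite-cost implementation was found. Below is a minimal Python sketch that pulls those subsets out of the pasted text. The function name and the regular expression are my own illustration, not part of Drill or its planner, and the pattern assumes the "rel#N:Subset#M.CONVENTION.DISTRIBUTION([]).[], best=..." layout seen in this dump.

```python
import re

# Matches subset header lines such as
#   rel#2677:Subset#23.PHYSICAL.ANY([]).[], best=null,
# The layout is assumed from the dump in this comment, not from a spec.
SUBSET_RE = re.compile(
    r"rel#\d+:(Subset#\d+\.[A-Z_]+\.[A-Z_]+)"   # subset id, convention, distribution
    r"\(\[[^\]]*\]\)\.\[[^\]]*\]"               # distribution keys and collation
    r",\s*best=(null|rel#\d+)"                  # best plan found so far, if any
)


def unimplemented_subsets(dump: str) -> list[str]:
    """Return the subsets whose best plan is null, in order of appearance."""
    return [m.group(1) for m in SUBSET_RE.finditer(dump) if m.group(2) == "null"]


if __name__ == "__main__":
    sample = (
        "rel#2659:Subset#23.LOGICAL.ANY([]).[], best=rel#2658, importance=0.6561\n"
        "rel#2677:Subset#23.PHYSICAL.ANY([]).[], best=null,\n"
        "importance=0.5904900000000001\n"
        "rel#2666:Subset#26.PHYSICAL.SINGLETON([]).[], best=null, importance=1.0\n"
    )
    for name in unimplemented_subsets(sample):
        print(name)
```

Run against the full dump, this should list the PHYSICAL subsets of Set#23 through Set#26, i.e. everything from the join upward, while the two scans (Set#21 and Set#22) do have physical plans.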
> Join between hive table and parquet file fail
> ---------------------------------------------
>
> Key: DRILL-882
> URL: https://issues.apache.org/jira/browse/DRILL-882
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Reporter: Ramana Inukonda Nagaraj
>
> The following query fails with a "cannot plan" error:
> select p.p_partkey
> from hive.part p, `tpch-multi/partsupp` ps
> where p.p_partkey = ps.ps_partkey
> and p.p_size = 41
> order by p.p_partkey
> limit 20;
> The queries below work fine, implying nothing is wrong with either source:
> select p.p_partkey
> from hive.part p;
>
> select ps.ps_partkey from `tpch-multi/partsupp` ps;
> The same query also works when both sides of the join are from Parquet or
> both are from Hive. It's only when they differ that I get the cannot plan
> error below.
> message: "Failure while parsing sql. < CannotPlanException:[ Node
> [rel#2666:Subset#26.PHYSICAL.SINGLETON([]).[]] could not be implemented;
> planner state: