[
https://issues.apache.org/jira/browse/DRILL-862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Krystal updated DRILL-862:
--------------------------
Comment: was deleted
(was: {code}
Removing with clause and substituting sub queries failed to plan
message: "Failure while parsing sql. < CannotPlanException:[ Node
[rel#4740:Subset#143.PHYSICAL.SINGLETON([]).[0]] could not be implemented;
planner state:
Root: rel#4740:Subset#143.PHYSICAL.SINGLETON([]).[0]
Original rel:
AbstractConverter(subset=[rel#4740:Subset#143.PHYSICAL.SINGLETON([]).[0]],
convention=[PHYSICAL], DrillDistributionTraitDef=[SINGLETON([])], sort=[[0]]):
rowcount = 5.769738915947539E28, cumulative cost = {inf}, id = 4742
DrillScreenRel(subset=[rel#4739:Subset#143.LOGICAL.ANY([]).[0]]): rowcount =
5.769738915947539E28, cumulative cost = {5.76973891594754E27 rows,
5.76973891594754E27 cpu, 0.0 io, 0.0 network}, id = 4738
DrillLimitRel(subset=[rel#4737:Subset#142.LOGICAL.ANY([]).[0]],
fetch=[100]): rowcount = 5.769738915947539E28, cumulative cost = {100.0 rows,
400.0 cpu, 0.0 io, 0.0 network}, id = 4736
DrillSortRel(subset=[rel#4735:Subset#141.LOGICAL.ANY([]).[0]],
sort0=[$0], dir0=[ASC]): rowcount = 5.769738915947539E28, cumulative cost =
{1.528404056579686E31 rows, 5.769738915947539E28 cpu, 0.0 io, 0.0 network}, id
= 4734
DrillProjectRel(subset=[rel#4733:Subset#140.LOGICAL.ANY([]).[]],
c_customer_id=[$2]): rowcount = 5.769738915947539E28, cumulative cost =
{5.769738915947539E28 rows, 4.0 cpu, 0.0 io, 0.0 network}, id = 4732
DrillFilterRel(subset=[rel#4731:Subset#139.LOGICAL.ANY([]).[]],
condition=[>($1, CAST(*($4, 1.2)):DECIMAL(2, 1))]): rowcount =
5.769738915947539E28, cumulative cost = {1.1539477831895079E29 rows,
4.6157911327580315E29 cpu, 0.0 io, 0.0 network}, id = 4730
DrillJoinRel(subset=[rel#4729:Subset#138.LOGICAL.ANY([]).[]],
condition=[=($0, $3)], joinType=[left]): rowcount = 1.1539477831895079E29,
cumulative cost = {1.1539477831895079E29 rows, 0.0 cpu, 0.0 io, 0.0 network},
id = 4728
DrillProjectRel(subset=[rel#4705:Subset#130.LOGICAL.ANY([]).[]],
ctr_store_sk=[$1], ctr_total_return=[$2], c_customer_id=[$5]): rowcount =
1.9138628531992496E11, cumulative cost = {1.9138628531992496E11 rows, 12.0 cpu,
0.0 io, 0.0 network}, id = 4704
DrillJoinRel(subset=[rel#4703:Subset#129.LOGICAL.ANY([]).[]],
condition=[=($7, $1)], joinType=[inner]): rowcount = 1.9138628531992496E11,
cumulative cost = {1.9138628531992496E11 rows, 0.0 cpu, 0.0 io, 0.0 network},
id = 4702
DrillJoinRel(subset=[rel#4698:Subset#126.LOGICAL.ANY([]).[]],
condition=[=($0, $4)], joinType=[inner]): rowcount = 7.088380937775E11,
cumulative cost = {7.088380937775E11 rows, 0.0 cpu, 0.0 io, 0.0 network}, id =
4697
DrillAggregateRel(subset=[rel#4695:Subset#124.LOGICAL.ANY([]).[]], group=[{0,
1}], ctr_total_return=[SUM($2)]): rowcount = 4.72558729185E7, cumulative cost =
{4.72558729185E7 rows, 0.0 cpu, 0.0 io, 0.0 network}, id = 4694
DrillProjectRel(subset=[rel#4693:Subset#123.LOGICAL.ANY([]).[]],
ctr_customer_sk=[$1], ctr_store_sk=[$2], SR_REVERSED_CHARGE=[$4]): rowcount =
4.72558729185E8, cumulative cost = {4.72558729185E8 rows, 12.0 cpu, 0.0 io, 0.0
network}, id = 4692
DrillJoinRel(subset=[rel#4691:Subset#122.LOGICAL.ANY([]).[]], condition=[=($3,
$7)], joinType=[inner]): rowcount = 4.72558729185E8, cumulative cost =
{4.72558729185E8 rows, 0.0 cpu, 0.0 io, 0.0 network}, id = 4690
DrillScanRel(subset=[rel#4686:Subset#119.LOGICAL.ANY([]).[]], table=[[dfs,
tpcds, store_returns]]): rowcount = 287514.0, cumulative cost = {287514.0 rows,
1437570.0 cpu, 0.0 io, 0.0 network}, id = 2472
DrillFilterRel(subset=[rel#4689:Subset#121.LOGICAL.ANY([]).[]],
condition=[=(CAST($1):INTEGER, 1998)]): rowcount = 10957.35, cumulative cost =
{73049.0 rows, 292196.0 cpu, 0.0 io, 0.0 network}, id = 4688
DrillScanRel(subset=[rel#4687:Subset#120.LOGICAL.ANY([]).[]], table=[[dfs,
tpcds, date_dim]]): rowcount = 73049.0, cumulative cost = {73049.0 rows,
219147.0 cpu, 0.0 io, 0.0 network}, id = 2568
DrillScanRel(subset=[rel#4696:Subset#125.LOGICAL.ANY([]).[]], table=[[dfs,
tpcds, customer]]): rowcount = 100000.0, cumulative cost = {100000.0 rows,
300000.0 cpu, 0.0 io, 0.0 network}, id = 2649
DrillFilterRel(subset=[rel#4701:Subset#128.LOGICAL.ANY([]).[]],
condition=[=(CAST($2):CHAR(2) CHARACTER SET "ISO-8859-1" COLLATE
"ISO-8859-1$en_US$primary", 'TN')]): rowcount = 1.7999999999999998, cumulative
cost = {12.0 rows, 48.0 cpu, 0.0 io, 0.0 network}, id = 4700
DrillScanRel(subset=[rel#4699:Subset#127.LOGICAL.ANY([]).[]], table=[[dfs,
tpcds, store]]): rowcount = 12.0, cumulative cost = {12.0 rows, 36.0 cpu, 0.0
io, 0.0 network}, id = 2783
DrillAggregateRel(subset=[rel#4727:Subset#137.LOGICAL.ANY([]).[]], group=[{0}],
agg#0=[AVG($1)]): rowcount = 4.019611545520958E18, cumulative cost =
{4.019611545520958E18 rows, 0.0 cpu, 0.0 io, 0.0 network}, id = 4726
DrillProjectRel(subset=[rel#4725:Subset#136.LOGICAL.ANY([]).[]], $f0=[$3],
ctr_total_return=[$2]): rowcount = 4.019611545520958E19, cumulative cost =
{4.019611545520958E19 rows, 8.0 cpu, 0.0 io, 0.0 network}, id = 4724
DrillJoinRel(subset=[rel#4723:Subset#135.LOGICAL.ANY([]).[]],
condition=[=($3, $1)], joinType=[inner]): rowcount = 4.019611545520958E19,
cumulative cost = {4.019611545520958E19 rows, 0.0 cpu, 0.0 io, 0.0 network}, id
= 4722
DrillAggregateRel(subset=[rel#4695:Subset#124.LOGICAL.ANY([]).[]], group=[{0,
1}], ctr_total_return=[SUM($2)]): rowcount = 4.72558729185E7, cumulative cost =
{4.72558729185E7 rows, 0.0 cpu, 0.0 io, 0.0 network}, id = 4694
DrillProjectRel(subset=[rel#4693:Subset#123.LOGICAL.ANY([]).[]],
ctr_customer_sk=[$1], ctr_store_sk=[$2], SR_REVERSED_CHARGE=[$4]): rowcount =
4.72558729185E8, cumulative cost = {4.72558729185E8 rows, 12.0 cpu, 0.0 io, 0.0
network}, id = 4692
DrillJoinRel(subset=[rel#4691:Subset#122.LOGICAL.ANY([]).[]], condition=[=($3,
$7)], joinType=[inner]): rowcount = 4.72558729185E8, cumulative cost =
{4.72558729185E8 rows, 0.0 cpu, 0.0 io, 0.0 network}, id = 4690
DrillScanRel(subset=[rel#4686:Subset#119.LOGICAL.ANY([]).[]], table=[[dfs,
tpcds, store_returns]]): rowcount = 287514.0, cumulative cost = {287514.0 rows,
1437570.0 cpu, 0.0 io, 0.0 network}, id = 2472
DrillFilterRel(subset=[rel#4689:Subset#121.LOGICAL.ANY([]).[]],
condition=[=(CAST($1):INTEGER, 1998)]): rowcount = 10957.35, cumulative cost =
{73049.0 rows, 292196.0 cpu, 0.0 io, 0.0 network}, id = 4688
DrillScanRel(subset=[rel#4687:Subset#120.LOGICAL.ANY([]).[]], table=[[dfs,
tpcds, date_dim]]): rowcount = 73049.0, cumulative cost = {73049.0 rows,
219147.0 cpu, 0.0 io, 0.0 network}, id = 2568
DrillAggregateRel(subset=[rel#4721:Subset#134.LOGICAL.ANY([]).[]],
group=[{0}]): rowcount = 5.670704750219999E12, cumulative cost =
{5.670704750219999E12 rows, 0.0 cpu, 0.0 io, 0.0 network}, id = 4720
DrillProjectRel(subset=[rel#4719:Subset#133.LOGICAL.ANY([]).[]], $f0=[$1]):
rowcount = 5.670704750219999E13, cumulative cost = {5.670704750219999E13 rows,
4.0 cpu, 0.0 io, 0.0 network}, id = 4718
DrillJoinRel(subset=[rel#4717:Subset#132.LOGICAL.ANY([]).[]], condition=[true],
joinType=[inner]): rowcount = 5.670704750219999E13, cumulative cost =
{5.670704750219999E13 rows, 0.0 cpu, 0.0 io, 0.0 network}, id = 4716
DrillJoinRel(subset=[rel#4715:Subset#131.LOGICAL.ANY([]).[]], condition=[true],
joinType=[inner]): rowcount = 5.67070475022E8, cumulative cost =
{5.67070475022E8 rows, 0.0 cpu, 0.0 io, 0.0 network}, id = 4714
DrillAggregateRel(subset=[rel#4695:Subset#124.LOGICAL.ANY([]).[]], group=[{0,
1}], ctr_total_return=[SUM($2)]): rowcount = 4.72558729185E7, cumulative cost =
{4.72558729185E7 rows, 0.0 cpu, 0.0 io, 0.0 network}, id = 4694
DrillProjectRel(subset=[rel#4693:Subset#123.LOGICAL.ANY([]).[]],
ctr_customer_sk=[$1], ctr_store_sk=[$2], SR_REVERSED_CHARGE=[$4]): rowcount =
4.72558729185E8, cumulative cost = {4.72558729185E8 rows, 12.0 cpu, 0.0 io, 0.0
network}, id = 4692
DrillJoinRel(subset=[rel#4691:Subset#122.LOGICAL.ANY([]).[]], condition=[=($3,
$7)], joinType=[inner]): rowcount = 4.72558729185E8, cumulative cost =
{4.72558729185E8 rows, 0.0 cpu, 0.0 io, 0.0 network}, id = 4690
DrillScanRel(subset=[rel#4686:Subset#119.LOGICAL.ANY([]).[]], table=[[dfs,
tpcds, store_returns]]): rowcount = 287514.0, cumulative cost = {287514.0 rows,
1437570.0 cpu, 0.0 io, 0.0 network}, id = 2472
DrillFilterRel(subset=[rel#4689:Subset#121.LOGICAL.ANY([]).[]],
condition=[=(CAST($1):INTEGER, 1998)]): rowcount = 10957.35, cumulative cost =
{73049.0 rows, 292196.0 cpu, 0.0 io, 0.0 network}, id = 4688
DrillScanRel(subset=[rel#4687:Subset#120.LOGICAL.ANY([]).[]], table=[[dfs,
tpcds, date_dim]]): rowcount = 73049.0, cumulative cost = {73049.0 rows,
219147.0 cpu, 0.0 io, 0.0 network}, id = 2568
DrillScanRel(subset=[rel#4699:Subset#127.LOGICAL.ANY([]).[]], table=[[dfs,
tpcds, store]]): rowcount = 12.0, cumulative cost = {12.0 rows, 36.0 cpu, 0.0
io, 0.0 network}, id = 2783
DrillScanRel(subset=[rel#4696:Subset#125.LOGICAL.ANY([]).[]], table=[[dfs,
tpcds, customer]]): rowcount = 100000.0, cumulative cost = {100000.0 rows,
300000.0 cpu, 0.0 io, 0.0 network}, id = 2649
{code})
> Select against hbase tables with empty number values fails
> -----------------------------------------------------------
>
> Key: DRILL-862
> URL: https://issues.apache.org/jira/browse/DRILL-862
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - HBase
> Reporter: Krystal
> Assignee: Suresh Ollala
> Fix For: 0.5.0
>
>
> git.commit.id.abbrev=01bf849
> I have a hbase table that contains empty values for some columns/rows. This
> was due to importing data from a file into hbase. For example a get of a row
> containing empty age value in hbase:
> hbase(main):003:0> get 'voterspaces', '17'
> COLUMN CELL
>
> fourcf:create_date timestamp=1401380832939, value=2014-06-04
> 06:49:06
> onecf:name timestamp=1401380832939, value=alice garcia
>
> threecf:contributions timestamp=1401380832939, value=468.51
>
> threecf:voterzone timestamp=1401380832939, value=18555
>
> twocf:age timestamp=1401380832939, value=
>
> twocf:registration timestamp=1401380832939, value=republican
> The following query from drill fails due to empty values instead of null:
> select cast(row_key as integer) voter_id, convert_from(onecf['name'], 'UTF8')
> name, cast(twocf['age'] as integer) age, cast(twocf['registration'] as
> varchar(20)) registration, cast(threecf['contributions'] as decimal(6,2))
> contributions, cast(threecf['voterzone'] as integer)
> voterzone,cast(fourcf['create_date'] as timestamp) create_date from
> voterspaces where row_key < 20;
> message: "Failure while running fragment. < NumberFormatException:[ ]"
> From hive, running the same query against a hive table that is mapped to the
> same hbase table succeeded. If this how it expected to work in drill, then
> we should to document it.
--
This message was sent by Atlassian JIRA
(v6.2#6252)