[ https://issues.apache.org/jira/browse/HIVE-10116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14389422#comment-14389422 ]
Mostafa Mokhtar commented on HIVE-10116:
----------------------------------------
[~jcamachorodriguez]
Is this the same issue?
{code}
explain
select ca_zip, ca_county, sum(ws_sales_price)
from web_sales
  JOIN customer ON web_sales.ws_bill_customer_sk = customer.c_customer_sk
  JOIN customer_address ON customer.c_current_addr_sk = customer_address.ca_address_sk
  JOIN date_dim ON web_sales.ws_sold_date_sk = date_dim.d_date_sk
  JOIN item ON web_sales.ws_item_sk = item.i_item_sk
where
  ( item.i_item_id in (select i_item_id
                       from item i2
                       where i2.i_item_sk in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)
                      )
  )
  and d_qoy = 2 and d_year = 2000
group by ca_zip, ca_county
order by ca_zip, ca_county
limit 100
15/03/27 12:16:48 [main]: ERROR parse.CalcitePlanner: CBO failed, skipping CBO.
java.lang.ArrayIndexOutOfBoundsException: 2
at org.apache.calcite.rel.metadata.RelMdSize.averageColumnSizes(RelMdSize.java:193)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider$1$1.invoke(ReflectiveRelMetadataProvider.java:182)
at com.sun.proxy.$Proxy51.averageColumnSizes(Unknown Source)
at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.calcite.rel.metadata.ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(ChainedRelMetadataProvider.java:109)
at com.sun.proxy.$Proxy51.averageColumnSizes(Unknown Source)
at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.calcite.rel.metadata.ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(ChainedRelMetadataProvider.java:109)
at com.sun.proxy.$Proxy51.averageColumnSizes(Unknown Source)
at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.calcite.rel.metadata.CachingRelMetadataProvider$CachingInvocationHandler.invoke(CachingRelMetadataProvider.java:131)
at com.sun.proxy.$Proxy51.averageColumnSizes(Unknown Source)
at org.apache.calcite.rel.metadata.RelMetadataQuery.getAverageColumnSizes(RelMetadataQuery.java:360)
at org.apache.calcite.rel.metadata.RelMdSize.averageRowSize(RelMdSize.java:82)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider$1$1.invoke(ReflectiveRelMetadataProvider.java:182)
at com.sun.proxy.$Proxy51.averageRowSize(Unknown Source)
at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.calcite.rel.metadata.ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(ChainedRelMetadataProvider.java:109)
at com.sun.proxy.$Proxy51.averageRowSize(Unknown Source)
at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.calcite.rel.metadata.ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(ChainedRelMetadataProvider.java:109)
at com.sun.proxy.$Proxy51.averageRowSize(Unknown Source)
at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.calcite.rel.metadata.CachingRelMetadataProvider$CachingInvocationHandler.invoke(CachingRelMetadataProvider.java:131)
at com.sun.proxy.$Proxy51.averageRowSize(Unknown Source)
at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.calcite.rel.metadata.ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(ChainedRelMetadataProvider.java:109)
at com.sun.proxy.$Proxy51.averageRowSize(Unknown Source)
at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.calcite.rel.metadata.CachingRelMetadataProvider$CachingInvocationHandler.invoke(CachingRelMetadataProvider.java:131)
at com.sun.proxy.$Proxy51.averageRowSize(Unknown Source)
at org.apache.calcite.rel.metadata.RelMetadataQuery.getAverageRowSize(RelMetadataQuery.java:344)
at org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveJoin.computeSelfCostCommonJoin(HiveJoin.java:360)
at org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveJoin.chooseJoinAlgorithmAndGetCost(HiveJoin.java:161)
at org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveJoin.computeSelfCost(HiveJoin.java:145)
at org.apache.calcite.rel.metadata.RelMdPercentageOriginalRows.getNonCumulativeCost(RelMdPercentageOriginalRows.java:165)
at sun.reflect.GeneratedMethodAccessor26.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider$1$1.invoke(ReflectiveRelMetadataProvider.java:182)
at com.sun.proxy.$Proxy41.getNonCumulativeCost(Unknown Source)
at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.calcite.rel.metadata.ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(ChainedRelMetadataProvider.java:109)
at com.sun.proxy.$Proxy41.getNonCumulativeCost(Unknown Source)
at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.calcite.rel.metadata.CachingRelMetadataProvider$CachingInvocationHandler.invoke(CachingRelMetadataProvider.java:131)
at com.sun.proxy.$Proxy41.getNonCumulativeCost(Unknown Source)
at org.apache.calcite.rel.metadata.RelMetadataQuery.getNonCumulativeCost(RelMetadataQuery.java:115)
at org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getCumulativeCost(HiveRelMdDistinctRowCount.java:114)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider$1$1.invoke(ReflectiveRelMetadataProvider.java:182)
at com.sun.proxy.$Proxy40.getCumulativeCost(Unknown Source)
at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.calcite.rel.metadata.ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(ChainedRelMetadataProvider.java:109)
at com.sun.proxy.$Proxy40.getCumulativeCost(Unknown Source)
at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.calcite.rel.metadata.ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(ChainedRelMetadataProvider.java:109)
at com.sun.proxy.$Proxy40.getCumulativeCost(Unknown Source)
at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.calcite.rel.metadata.CachingRelMetadataProvider$CachingInvocationHandler.invoke(CachingRelMetadataProvider.java:131)
at com.sun.proxy.$Proxy40.getCumulativeCost(Unknown Source)
at org.apache.calcite.rel.metadata.RelMetadataQuery.getCumulativeCost(RelMetadataQuery.java:101)
at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.addFactorToTree(LoptOptimizeJoinRule.java:944)
at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.pushDownFactor(LoptOptimizeJoinRule.java:1082)
at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.addFactorToTree(LoptOptimizeJoinRule.java:924)
at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.pushDownFactor(LoptOptimizeJoinRule.java:1082)
at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.addFactorToTree(LoptOptimizeJoinRule.java:924)
at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.pushDownFactor(LoptOptimizeJoinRule.java:1082)
at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.addFactorToTree(LoptOptimizeJoinRule.java:924)
at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.createOrdering(LoptOptimizeJoinRule.java:726)
at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.findBestOrderings(LoptOptimizeJoinRule.java:458)
at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.onMatch(LoptOptimizeJoinRule.java:128)
at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:326)
at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:515)
at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:392)
at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:255)
at org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:125)
at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:207)
at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:194)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:824)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:742)
at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:109)
at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:730)
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:145)
at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:105)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:583)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:238)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9998)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:201)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:224)
at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:224)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:425)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:309)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1114)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1162)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1051)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1041)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:419)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:708)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1
STAGE PLANS:
  Stage: Stage-1
    Tez
      Edges:
        Map 1 <- Map 4 (BROADCAST_EDGE), Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE), Map 7 (BROADCAST_EDGE), Map 8 (BROADCAST_EDGE)
        Reducer 2 <- Map 1 (SIMPLE_EDGE)
        Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
      DagName: mmokhtar_20150327121646_974fccea-33a3-4110-92fd-cb0f5d46fb4c:1
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: web_sales
                  filterExpr: ((ws_bill_customer_sk is not null and ws_sold_date_sk is not null) and ws_item_sk is not null) (type: boolean)
                  Statistics: Num rows: 143966864 Data size: 33110363004 Basic stats: COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: ((ws_bill_customer_sk is not null and ws_sold_date_sk is not null) and ws_item_sk is not null) (type: boolean)
                    Statistics: Num rows: 143948975 Data size: 2303040076 Basic stats: COMPLETE Column stats: COMPLETE
                    Map Join Operator
                      condition map:
                           Inner Join 0 to 1
                      keys:
                        0 ws_bill_customer_sk (type: int)
                        1 c_customer_sk (type: int)
                      outputColumnNames: _col0, _col3, _col21, _col42
                      input vertices:
                        1 Map 4
                      Statistics: Num rows: 143948976 Data size: 2303183616 Basic stats: COMPLETE Column stats: COMPLETE
                      Map Join Operator
                        condition map:
                             Inner Join 0 to 1
                        keys:
                          0 _col42 (type: int)
                          1 ca_address_sk (type: int)
                        outputColumnNames: _col0, _col3, _col21, _col66, _col68
                        input vertices:
                          1 Map 5
                        Statistics: Num rows: 143948976 Data size: 28645846224 Basic stats: COMPLETE Column stats: COMPLETE
                        Map Join Operator
                          condition map:
                               Inner Join 0 to 1
                          keys:
                            0 _col0 (type: int)
                            1 d_date_sk (type: int)
                          outputColumnNames: _col3, _col21, _col66, _col68
                          input vertices:
                            1 Map 6
                          Statistics: Num rows: 1251319 Data size: 244007205 Basic stats: COMPLETE Column stats: COMPLETE
                          Map Join Operator
                            condition map:
                                 Inner Join 0 to 1
                            keys:
                              0 _col3 (type: int)
                              1 i_item_sk (type: int)
                            outputColumnNames: _col21, _col66, _col68, _col107
                            input vertices:
                              1 Map 7
                            Statistics: Num rows: 1251319 Data size: 364133829 Basic stats: COMPLETE Column stats: COMPLETE
                            Map Join Operator
                              condition map:
                                   Left Semi Join 0 to 1
                              keys:
                                0 _col107 (type: string)
                                1 _col0 (type: string)
                              outputColumnNames: _col21, _col66, _col68
                              input vertices:
                                1 Map 8
                              Statistics: Num rows: 625687 Data size: 119506217 Basic stats: COMPLETE Column stats: COMPLETE
                              Select Operator
                                expressions: _col68 (type: string), _col66 (type: string), _col21 (type: float)
                                outputColumnNames: _col68, _col66, _col21
                                Statistics: Num rows: 625687 Data size: 119506217 Basic stats: COMPLETE Column stats: COMPLETE
                                Group By Operator
                                  aggregations: sum(_col21)
                                  keys: _col68 (type: string), _col66 (type: string)
                                  mode: hash
                                  outputColumnNames: _col0, _col1, _col2
                                  Statistics: Num rows: 512 Data size: 99840 Basic stats: COMPLETE Column stats: COMPLETE
                                  Reduce Output Operator
                                    key expressions: _col0 (type: string), _col1 (type: string)
                                    sort order: ++
                                    Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
                                    Statistics: Num rows: 512 Data size: 99840 Basic stats: COMPLETE Column stats: COMPLETE
                                    value expressions: _col2 (type: double)
            Execution mode: vectorized
        Map 4
            Map Operator Tree:
                TableScan
                  alias: customer
                  filterExpr: (c_customer_sk is not null and c_current_addr_sk is not null) (type: boolean)
                  Statistics: Num rows: 1600000 Data size: 1241633212 Basic stats: COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: (c_customer_sk is not null and c_current_addr_sk is not null) (type: boolean)
                    Statistics: Num rows: 1600000 Data size: 12800000 Basic stats: COMPLETE Column stats: COMPLETE
                    Reduce Output Operator
                      key expressions: c_customer_sk (type: int)
                      sort order: +
                      Map-reduce partition columns: c_customer_sk (type: int)
                      Statistics: Num rows: 1600000 Data size: 12800000 Basic stats: COMPLETE Column stats: COMPLETE
                      value expressions: c_current_addr_sk (type: int)
            Execution mode: vectorized
        Map 5
            Map Operator Tree:
                TableScan
                  alias: customer_address
                  filterExpr: ca_address_sk is not null (type: boolean)
                  Statistics: Num rows: 800000 Data size: 811903688 Basic stats: COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: ca_address_sk is not null (type: boolean)
                    Statistics: Num rows: 800000 Data size: 152800000 Basic stats: COMPLETE Column stats: COMPLETE
                    Reduce Output Operator
                      key expressions: ca_address_sk (type: int)
                      sort order: +
                      Map-reduce partition columns: ca_address_sk (type: int)
                      Statistics: Num rows: 800000 Data size: 152800000 Basic stats: COMPLETE Column stats: COMPLETE
                      value expressions: ca_county (type: string), ca_zip (type: string)
            Execution mode: vectorized
        Map 6
            Map Operator Tree:
                TableScan
                  alias: date_dim
                  filterExpr: ((d_date_sk is not null and (d_qoy = 2)) and (d_year = 2000)) (type: boolean)
                  Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: ((d_date_sk is not null and (d_qoy = 2)) and (d_year = 2000)) (type: boolean)
                    Statistics: Num rows: 635 Data size: 7620 Basic stats: COMPLETE Column stats: COMPLETE
                    Reduce Output Operator
                      key expressions: d_date_sk (type: int)
                      sort order: +
                      Map-reduce partition columns: d_date_sk (type: int)
                      Statistics: Num rows: 635 Data size: 7620 Basic stats: COMPLETE Column stats: COMPLETE
            Execution mode: vectorized
        Map 7
            Map Operator Tree:
                TableScan
                  alias: item
                  filterExpr: (i_item_sk is not null and i_item_id is not null) (type: boolean)
                  Statistics: Num rows: 48000 Data size: 68732712 Basic stats: COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: (i_item_sk is not null and i_item_id is not null) (type: boolean)
                    Statistics: Num rows: 48000 Data size: 4992000 Basic stats: COMPLETE Column stats: COMPLETE
                    Reduce Output Operator
                      key expressions: i_item_sk (type: int)
                      sort order: +
                      Map-reduce partition columns: i_item_sk (type: int)
                      Statistics: Num rows: 48000 Data size: 4992000 Basic stats: COMPLETE Column stats: COMPLETE
                      value expressions: i_item_id (type: string)
            Execution mode: vectorized
        Map 8
            Map Operator Tree:
                TableScan
                  alias: i2
                  filterExpr: ((i_item_sk) IN (2, 3, 5, 7, 11, 13, 17, 19, 23, 29) and i_item_id is not null) (type: boolean)
                  Statistics: Num rows: 48000 Data size: 68732712 Basic stats: COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: ((i_item_sk) IN (2, 3, 5, 7, 11, 13, 17, 19, 23, 29) and i_item_id is not null) (type: boolean)
                    Statistics: Num rows: 24000 Data size: 2496000 Basic stats: COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: i_item_id (type: string)
                      outputColumnNames: _col0
                      Statistics: Num rows: 24000 Data size: 2400000 Basic stats: COMPLETE Column stats: COMPLETE
                      Group By Operator
                        keys: _col0 (type: string)
                        mode: hash
                        outputColumnNames: _col0
                        Statistics: Num rows: 11060 Data size: 1106000 Basic stats: COMPLETE Column stats: COMPLETE
                        Reduce Output Operator
                          key expressions: _col0 (type: string)
                          sort order: +
                          Map-reduce partition columns: _col0 (type: string)
                          Statistics: Num rows: 11060 Data size: 1106000 Basic stats: COMPLETE Column stats: COMPLETE
            Execution mode: vectorized
        Reducer 2
            Reduce Operator Tree:
              Group By Operator
                aggregations: sum(VALUE._col0)
                keys: KEY._col0 (type: string), KEY._col1 (type: string)
                mode: mergepartial
                outputColumnNames: _col0, _col1, _col2
                Statistics: Num rows: 256 Data size: 49920 Basic stats: COMPLETE Column stats: COMPLETE
                Reduce Output Operator
                  key expressions: _col0 (type: string), _col1 (type: string)
                  sort order: ++
                  Statistics: Num rows: 256 Data size: 49920 Basic stats: COMPLETE Column stats: COMPLETE
                  TopN Hash Memory Usage: 0.04
                  value expressions: _col2 (type: double)
        Reducer 3
            Reduce Operator Tree:
              Select Operator
                expressions: KEY.reducesinkkey0 (type: string), KEY.reducesinkkey1 (type: string), VALUE._col0 (type: double)
                outputColumnNames: _col0, _col1, _col2
                Statistics: Num rows: 256 Data size: 49920 Basic stats: COMPLETE Column stats: COMPLETE
                Limit
                  Number of rows: 100
                  Statistics: Num rows: 100 Data size: 19500 Basic stats: COMPLETE Column stats: COMPLETE
                  File Output Operator
                    compressed: false
                    Statistics: Num rows: 100 Data size: 19500 Basic stats: COMPLETE Column stats: COMPLETE
                    table:
                        input format: org.apache.hadoop.mapred.TextInputFormat
                        output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                        serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
  Stage: Stage-0
    Fetch Operator
      limit: 100
      Processor Tree:
        ListSink
Time taken: 4.151 seconds, Fetched: 213 row(s)
{code}
> CBO (Calcite Return Path): RelMdSize throws an Exception when Join is
> actually a Semijoin [CBO branch]
> ------------------------------------------------------------------------------------------------------
>
> Key: HIVE-10116
> URL: https://issues.apache.org/jira/browse/HIVE-10116
> Project: Hive
> Issue Type: Sub-task
> Components: CBO
> Affects Versions: cbo-branch
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Fix For: cbo-branch
>
> Attachments: HIVE-10116.cbo.patch
>
>
> {{cbo_semijoin.q}} reproduces the error.
> Stacktrace:
> {noformat}
> 2015-03-26 09:55:20,652 ERROR [main]: parse.CalcitePlanner (CalcitePlanner.java:genOPTree(269)) - CBO failed, skipping CBO.
> java.lang.ArrayIndexOutOfBoundsException: 3
> at org.apache.calcite.rel.metadata.RelMdSize.averageColumnSizes(RelMdSize.java:193)
> at sun.reflect.GeneratedMethodAccessor134.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider$2$1.invoke(ReflectiveRelMetadataProvider.java:194)
> at com.sun.proxy.$Proxy30.averageColumnSizes(Unknown Source)
> at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> {noformat}