[
https://issues.apache.org/jira/browse/HIVE-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548058#comment-14548058
]
Hive QA commented on HIVE-9069:
-------------------------------
{color:red}Overall{color}: -1 at least one tests failed
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12733509/HIVE-9069.03.patch
{color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 8736 tests
executed
*Failed tests:*
{noformat}
TestCliDriver-authorization_role_grant2.q-show_create_table_partitioned.q-create_alter_list_bucketing_table1.q-and-12-more
- did not produce a TEST-*.xml file
TestCliDriver-bucketmapjoin3.q-udf_between.q-auto_join10.q-and-12-more - did
not produce a TEST-*.xml file
TestCliDriver-bucketsortoptimize_insert_7.q-index_bitmap_rc.q-load_nonpart_authsuccess.q-and-12-more
- did not produce a TEST-*.xml file
TestCliDriver-groupby_grouping_sets5.q-temp_table_display_colstats_tbllvl.q-nonreserved_keywords_input37.q-and-12-more
- did not produce a TEST-*.xml file
TestCliDriver-input17.q-bucket_map_join_tez1.q-ppd_random.q-and-12-more - did
not produce a TEST-*.xml file
TestCliDriver-join_filters.q-bucketmapjoin12.q-alter_view_as_select.q-and-12-more
- did not produce a TEST-*.xml file
TestCliDriver-orc_createas1.q-encryption_join_with_different_encryption_keys.q-nestedvirtual.q-and-12-more
- did not produce a TEST-*.xml file
TestCliDriver-orc_ppd_decimal.q-vector_decimal_round.q-metadata_export_drop.q-and-12-more
- did not produce a TEST-*.xml file
TestCliDriver-script_pipe.q-orc_split_elimination.q-udf_tan.q-and-12-more - did
not produce a TEST-*.xml file
TestCliDriver-udf_bitwise_shiftleft.q-binarysortable_1.q-orc_merge5.q-and-12-more
- did not produce a TEST-*.xml file
TestCliDriver-udf_repeat.q-list_bucket_query_multiskew_2.q-rcfile_merge2.q-and-12-more
- did not produce a TEST-*.xml file
TestHBaseCliDriver-hbase_binary_storage_queries.q - did not produce a
TEST-*.xml file
TestMiniTezCliDriver-auto_sortmerge_join_13.q-orc_vectorization_ppd.q-cbo_subq_not_in.q-and-12-more
- did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_decimal_round.q-cbo_windowing.q-tez_schema_evolution.q-and-12-more
- did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_distinct_2.q-vector_interval_2.q-bucket3.q-and-12-more
- did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-parallel_orderby.q-smb_mapjoin_8.q -
did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_cond_pushdown
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_context_ngrams
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
{noformat}
Test results:
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3928/testReport
Console output:
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3928/console
Test logs:
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3928/
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 20 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12733509 - PreCommit-HIVE-TRUNK-Build
> Simplify filter predicates for CBO
> ----------------------------------
>
> Key: HIVE-9069
> URL: https://issues.apache.org/jira/browse/HIVE-9069
> Project: Hive
> Issue Type: Bug
> Components: CBO
> Affects Versions: 0.14.0
> Reporter: Mostafa Mokhtar
> Assignee: Jesus Camacho Rodriguez
> Fix For: 0.14.1
>
> Attachments: HIVE-9069.01.patch, HIVE-9069.02.patch,
> HIVE-9069.03.patch, HIVE-9069.patch
>
>
> Simplify predicates for disjunctive predicates so that can get pushed down to
> the scan.
> Looks like this is still an issue, some of the filters can be pushed down to
> the scan.
> {code}
> set hive.cbo.enable=true
> set hive.stats.fetch.column.stats=true
> set hive.exec.dynamic.partition.mode=nonstrict
> set hive.tez.auto.reducer.parallelism=true
> set hive.auto.convert.join.noconditionaltask.size=320000000
> set hive.exec.reducers.bytes.per.reducer=100000000
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager
> set hive.support.concurrency=false
> set hive.tez.exec.print.summary=true
> explain
> select substr(r_reason_desc,1,20) as r
> ,avg(ws_quantity) wq
> ,avg(wr_refunded_cash) ref
> ,avg(wr_fee) fee
> from web_sales, web_returns, web_page, customer_demographics cd1,
> customer_demographics cd2, customer_address, date_dim, reason
> where web_sales.ws_web_page_sk = web_page.wp_web_page_sk
> and web_sales.ws_item_sk = web_returns.wr_item_sk
> and web_sales.ws_order_number = web_returns.wr_order_number
> and web_sales.ws_sold_date_sk = date_dim.d_date_sk and d_year = 1998
> and cd1.cd_demo_sk = web_returns.wr_refunded_cdemo_sk
> and cd2.cd_demo_sk = web_returns.wr_returning_cdemo_sk
> and customer_address.ca_address_sk = web_returns.wr_refunded_addr_sk
> and reason.r_reason_sk = web_returns.wr_reason_sk
> and
> (
> (
> cd1.cd_marital_status = 'M'
> and
> cd1.cd_marital_status = cd2.cd_marital_status
> and
> cd1.cd_education_status = '4 yr Degree'
> and
> cd1.cd_education_status = cd2.cd_education_status
> and
> ws_sales_price between 100.00 and 150.00
> )
> or
> (
> cd1.cd_marital_status = 'D'
> and
> cd1.cd_marital_status = cd2.cd_marital_status
> and
> cd1.cd_education_status = 'Primary'
> and
> cd1.cd_education_status = cd2.cd_education_status
> and
> ws_sales_price between 50.00 and 100.00
> )
> or
> (
> cd1.cd_marital_status = 'U'
> and
> cd1.cd_marital_status = cd2.cd_marital_status
> and
> cd1.cd_education_status = 'Advanced Degree'
> and
> cd1.cd_education_status = cd2.cd_education_status
> and
> ws_sales_price between 150.00 and 200.00
> )
> )
> and
> (
> (
> ca_country = 'United States'
> and
> ca_state in ('KY', 'GA', 'NM')
> and ws_net_profit between 100 and 200
> )
> or
> (
> ca_country = 'United States'
> and
> ca_state in ('MT', 'OR', 'IN')
> and ws_net_profit between 150 and 300
> )
> or
> (
> ca_country = 'United States'
> and
> ca_state in ('WI', 'MO', 'WV')
> and ws_net_profit between 50 and 250
> )
> )
> group by r_reason_desc
> order by r, wq, ref, fee
> limit 100
> OK
> STAGE DEPENDENCIES:
> Stage-1 is a root stage
> Stage-0 depends on stages: Stage-1
> STAGE PLANS:
> Stage: Stage-1
> Tez
> Edges:
> Map 9 <- Map 1 (BROADCAST_EDGE)
> Reducer 3 <- Map 13 (SIMPLE_EDGE), Map 2 (SIMPLE_EDGE)
> Reducer 4 <- Map 9 (SIMPLE_EDGE), Reducer 3 (SIMPLE_EDGE)
> Reducer 5 <- Map 14 (SIMPLE_EDGE), Reducer 4 (SIMPLE_EDGE)
> Reducer 6 <- Map 10 (SIMPLE_EDGE), Map 11 (BROADCAST_EDGE), Map 12
> (BROADCAST_EDGE), Reducer 5 (SIMPLE_EDGE)
> Reducer 7 <- Reducer 6 (SIMPLE_EDGE)
> Reducer 8 <- Reducer 7 (SIMPLE_EDGE)
> DagName: mmokhtar_20141111161818_f5fd23ba-d783-4b13-8507-7faa65851798:1
> Vertices:
> Map 1
> Map Operator Tree:
> TableScan
> alias: web_page
> filterExpr: wp_web_page_sk is not null (type: boolean)
> Statistics: Num rows: 4602 Data size: 2696178 Basic stats:
> COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: wp_web_page_sk is not null (type: boolean)
> Statistics: Num rows: 4602 Data size: 18408 Basic stats:
> COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: wp_web_page_sk (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 4602 Data size: 18408 Basic
> stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 4602 Data size: 18408 Basic
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized
> Map 10
> Map Operator Tree:
> TableScan
> alias: customer_address
> filterExpr: ((ca_country = 'United States') and
> ca_address_sk is not null) (type: boolean)
> Statistics: Num rows: 40000000 Data size: 40595195284 Basic
> stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: ((ca_country = 'United States') and
> ca_address_sk is not null) (type: boolean)
> Statistics: Num rows: 20000000 Data size: 3740000000
> Basic stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: ca_address_sk (type: int), ca_state (type:
> string)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 20000000 Data size: 1800000000
> Basic stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 20000000 Data size: 1800000000
> Basic stats: COMPLETE Column stats: COMPLETE
> value expressions: _col1 (type: string)
> Execution mode: vectorized
> Map 11
> Map Operator Tree:
> TableScan
> alias: date_dim
> filterExpr: ((d_year = 1998) and d_date_sk is not null)
> (type: boolean)
> Statistics: Num rows: 73049 Data size: 81741831 Basic
> stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: ((d_year = 1998) and d_date_sk is not null)
> (type: boolean)
> Statistics: Num rows: 652 Data size: 5216 Basic stats:
> COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: d_date_sk (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 652 Data size: 2608 Basic stats:
> COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 652 Data size: 2608 Basic
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: _col0 (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 652 Data size: 2608 Basic
> stats: COMPLETE Column stats: COMPLETE
> Group By Operator
> keys: _col0 (type: int)
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 326 Data size: 1304 Basic
> stats: COMPLETE Column stats: COMPLETE
> Dynamic Partitioning Event Operator
> Target Input: web_sales
> Partition key expr: ws_sold_date_sk
> Statistics: Num rows: 326 Data size: 1304 Basic
> stats: COMPLETE Column stats: COMPLETE
> Target column: ws_sold_date_sk
> Target Vertex: Map 9
> Execution mode: vectorized
> Map 12
> Map Operator Tree:
> TableScan
> alias: reason
> filterExpr: r_reason_sk is not null (type: boolean)
> Statistics: Num rows: 72 Data size: 14400 Basic stats:
> COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: r_reason_sk is not null (type: boolean)
> Statistics: Num rows: 72 Data size: 7272 Basic stats:
> COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: r_reason_sk (type: int), r_reason_desc
> (type: string)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 72 Data size: 7272 Basic stats:
> COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 72 Data size: 7272 Basic stats:
> COMPLETE Column stats: COMPLETE
> value expressions: _col1 (type: string)
> Execution mode: vectorized
> Map 13
> Map Operator Tree:
> TableScan
> alias: web_returns
> filterExpr: (((((wr_refunded_cdemo_sk is not null and
> wr_item_sk is not null) and wr_order_number is not null) and
> wr_returning_cdemo_sk is not null) and wr_refunded_addr_sk is not null) and
> wr_reason_sk is not null) (type: boolean)
> Statistics: Num rows: 2062802370 Data size: 185695406284
> Basic stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: (((((wr_refunded_cdemo_sk is not null and
> wr_item_sk is not null) and wr_order_number is not null) and
> wr_returning_cdemo_sk is not null) and wr_refunded_addr_sk is not null) and
> wr_reason_sk is not null) (type: boolean)
> Statistics: Num rows: 1875154722 Data size: 58944640412
> Basic stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: wr_item_sk (type: int),
> wr_refunded_cdemo_sk (type: int), wr_refunded_addr_sk (type: int),
> wr_returning_cdemo_sk (type: int), wr_reason_sk (type: int), wr_order_number
> (type: int), wr_fee (type: float), wr_refunded_cash (type: float)
> outputColumnNames: _col0, _col1, _col2, _col3, _col4,
> _col5, _col6, _col7
> Statistics: Num rows: 1875154722 Data size: 58944640412
> Basic stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col1 (type: int)
> sort order: +
> Map-reduce partition columns: _col1 (type: int)
> Statistics: Num rows: 1875154722 Data size:
> 58944640412 Basic stats: COMPLETE Column stats: COMPLETE
> value expressions: _col0 (type: int), _col2 (type:
> int), _col3 (type: int), _col4 (type: int), _col5 (type: int), _col6 (type:
> float), _col7 (type: float)
> Execution mode: vectorized
> Map 14
> Map Operator Tree:
> TableScan
> alias: cd1
> filterExpr: ((cd_demo_sk is not null and cd_marital_status
> is not null) and cd_education_status is not null) (type: boolean)
> Statistics: Num rows: 1920800 Data size: 718379200 Basic
> stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: ((cd_demo_sk is not null and cd_marital_status
> is not null) and cd_education_status is not null) (type: boolean)
> Statistics: Num rows: 1920800 Data size: 351506400 Basic
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: cd_demo_sk (type: int), cd_marital_status
> (type: string), cd_education_status (type: string)
> outputColumnNames: _col0, _col1, _col2
> Statistics: Num rows: 1920800 Data size: 351506400
> Basic stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int), _col1 (type:
> string), _col2 (type: string)
> sort order: +++
> Map-reduce partition columns: _col0 (type: int),
> _col1 (type: string), _col2 (type: string)
> Statistics: Num rows: 1920800 Data size: 351506400
> Basic stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized
> Map 2
> Map Operator Tree:
> TableScan
> alias: cd1
> filterExpr: ((cd_demo_sk is not null and cd_marital_status
> is not null) and cd_education_status is not null) (type: boolean)
> Statistics: Num rows: 1920800 Data size: 718379200 Basic
> stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: ((cd_demo_sk is not null and cd_marital_status
> is not null) and cd_education_status is not null) (type: boolean)
> Statistics: Num rows: 1920800 Data size: 351506400 Basic
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: cd_demo_sk (type: int), cd_marital_status
> (type: string), cd_education_status (type: string)
> outputColumnNames: _col0, _col1, _col2
> Statistics: Num rows: 1920800 Data size: 351506400
> Basic stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 1920800 Data size: 351506400
> Basic stats: COMPLETE Column stats: COMPLETE
> value expressions: _col1 (type: string), _col2 (type:
> string)
> Execution mode: vectorized
> Map 9
> Map Operator Tree:
> TableScan
> alias: web_sales
> filterExpr: ((ws_web_page_sk is not null and ws_item_sk is
> not null) and ws_order_number is not null) (type: boolean)
> Statistics: Num rows: 21594638446 Data size: 2850189889652
> Basic stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: ((ws_web_page_sk is not null and ws_item_sk is
> not null) and ws_order_number is not null) (type: boolean)
> Statistics: Num rows: 21591939929 Data size: 604541956128
> Basic stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: ws_item_sk (type: int), ws_web_page_sk
> (type: int), ws_order_number (type: int), ws_quantity (type: int),
> ws_sales_price (type: float), ws_net_profit (type: float), ws_sold_date_sk
> (type: int)
> outputColumnNames: _col0, _col1, _col2, _col3, _col4,
> _col5, _col6
> Statistics: Num rows: 21591939929 Data size:
> 604541956128 Basic stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {_col0} {_col2} {_col3} {_col4} {_col5} {_col6}
> 1
> keys:
> 0 _col1 (type: int)
> 1 _col0 (type: int)
> outputColumnNames: _col0, _col2, _col3, _col4, _col5,
> _col6
> input vertices:
> 1 Map 1
> Statistics: Num rows: 21591939072 Data size:
> 518206537728 Basic stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int), _col2 (type:
> int)
> sort order: ++
> Map-reduce partition columns: _col0 (type: int),
> _col2 (type: int)
> Statistics: Num rows: 21591939072 Data size:
> 518206537728 Basic stats: COMPLETE Column stats: COMPLETE
> value expressions: _col3 (type: int), _col4 (type:
> float), _col5 (type: float), _col6 (type: int)
> Execution mode: vectorized
> Reducer 3
> Reduce Operator Tree:
> Merge Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1} {VALUE._col2} {VALUE._col3}
> {VALUE._col4} {VALUE._col5} {VALUE._col6}
> 1 {VALUE._col0} {VALUE._col1}
> outputColumnNames: _col0, _col2, _col3, _col4, _col5, _col6,
> _col7, _col9, _col10
> Statistics: Num rows: 1875154688 Data size: 373155782912
> Basic stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: _col0 (type: int), _col10 (type: string),
> _col2 (type: int), _col3 (type: int), _col4 (type: int), _col5 (type: int),
> _col6 (type: float), _col7 (type: float), _col9 (type: string)
> outputColumnNames: _col0, _col10, _col2, _col3, _col4,
> _col5, _col6, _col7, _col9
> Statistics: Num rows: 1875154688 Data size: 373155782912
> Basic stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int), _col5 (type: int)
> sort order: ++
> Map-reduce partition columns: _col0 (type: int), _col5
> (type: int)
> Statistics: Num rows: 1875154688 Data size: 373155782912
> Basic stats: COMPLETE Column stats: COMPLETE
> value expressions: _col2 (type: int), _col3 (type: int),
> _col4 (type: int), _col6 (type: float), _col7 (type: float), _col9 (type:
> string), _col10 (type: string)
> Reducer 4
> Reduce Operator Tree:
> Merge Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {VALUE._col1} {VALUE._col2} {VALUE._col3} {VALUE._col4}
> 1 {VALUE._col1} {VALUE._col2} {VALUE._col3} {VALUE._col4}
> {VALUE._col5} {VALUE._col7} {VALUE._col8}
> outputColumnNames: _col3, _col4, _col5, _col6, _col10,
> _col11, _col12, _col14, _col15, _col17, _col18
> Statistics: Num rows: 57653145 Data size: 11472975855 Basic
> stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: (((_col17 = 'M') and ((_col18 = '4 yr Degree')
> and _col4 BETWEEN 100.0 AND 150.0)) or (((_col17 = 'D') and ((_col18 =
> 'Primary') and _col4 BETWEEN 50.0 AND 100.0)) or ((_col17 = 'U') and ((_col18
> = 'Advanced Degree') and _col4 BETWEEN 150.0 AND 200.0)))) (type: boolean)
> Statistics: Num rows: 57653145 Data size: 11472975855 Basic
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: _col11 (type: int), _col12 (type: int),
> _col14 (type: float), _col15 (type: float), _col17 (type: string), _col18
> (type: string), _col3 (type: int), _col5 (type: float), _col6 (type: int),
> _col10 (type: int)
> outputColumnNames: _col10, _col11, _col13, _col14,
> _col17, _col18, _col3, _col5, _col6, _col9
> Statistics: Num rows: 57653145 Data size: 11472975855
> Basic stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col10 (type: int), _col17 (type:
> string), _col18 (type: string)
> sort order: +++
> Map-reduce partition columns: _col10 (type: int),
> _col17 (type: string), _col18 (type: string)
> Statistics: Num rows: 57653145 Data size: 11472975855
> Basic stats: COMPLETE Column stats: COMPLETE
> value expressions: _col3 (type: int), _col5 (type:
> float), _col6 (type: int), _col9 (type: int), _col11 (type: int), _col13
> (type: float), _col14 (type: float)
> Reducer 5
> Reduce Operator Tree:
> Merge Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0
> 1 {VALUE._col3} {VALUE._col5} {VALUE._col6} {VALUE._col9}
> {VALUE._col10} {VALUE._col12} {VALUE._col13}
> outputColumnNames: _col6, _col8, _col9, _col12, _col14,
> _col16, _col17
> Statistics: Num rows: 3187317548 Data size: 50997080768 Basic
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: _col12 (type: int), _col14 (type: int), _col16
> (type: float), _col17 (type: float), _col6 (type: int), _col8 (type: float),
> _col9 (type: int)
> outputColumnNames: _col12, _col14, _col16, _col17, _col6,
> _col8, _col9
> Statistics: Num rows: 3187317548 Data size: 50997080768
> Basic stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col12 (type: int)
> sort order: +
> Map-reduce partition columns: _col12 (type: int)
> Statistics: Num rows: 3187317548 Data size: 50997080768
> Basic stats: COMPLETE Column stats: COMPLETE
> value expressions: _col6 (type: int), _col8 (type:
> float), _col9 (type: int), _col14 (type: int), _col16 (type: float), _col17
> (type: float)
> Reducer 6
> Reduce Operator Tree:
> Merge Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {VALUE._col0}
> 1 {VALUE._col6} {VALUE._col8} {VALUE._col9} {VALUE._col13}
> {VALUE._col15} {VALUE._col16}
> outputColumnNames: _col1, _col9, _col11, _col12, _col17,
> _col19, _col20
> Statistics: Num rows: 1593658752 Data size: 156178557696
> Basic stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: (((_col1) IN ('KY', 'GA', 'NM') and _col11
> BETWEEN 100 AND 200) or (((_col1) IN ('MT', 'OR', 'IN') and _col11 BETWEEN
> 150 AND 300) or ((_col1) IN ('WI', 'MO', 'WV') and _col11 BETWEEN 50 AND
> 250))) (type: boolean)
> Statistics: Num rows: 1195244064 Data size: 117133918272
> Basic stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: _col17 (type: int), _col19 (type: float),
> _col20 (type: float), _col9 (type: int), _col12 (type: int)
> outputColumnNames: _col11, _col13, _col14, _col3, _col6
> Statistics: Num rows: 1195244064 Data size: 14342928768
> Basic stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0
> 1 {_col3} {_col11} {_col13} {_col14}
> keys:
> 0 _col0 (type: int)
> 1 _col6 (type: int)
> outputColumnNames: _col5, _col13, _col15, _col16
> input vertices:
> 0 Map 11
> Statistics: Num rows: 1334416318 Data size: 16012995816
> Basic stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: _col13 (type: int), _col15 (type:
> float), _col16 (type: float), _col5 (type: int)
> outputColumnNames: _col13, _col15, _col16, _col5
> Statistics: Num rows: 1334416318 Data size:
> 16012995816 Basic stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {_col1}
> 1 {_col5} {_col15} {_col16}
> keys:
> 0 _col0 (type: int)
> 1 _col13 (type: int)
> outputColumnNames: _col1, _col7, _col17, _col18
> input vertices:
> 0 Map 12
> Statistics: Num rows: 1334416256 Data size:
> 140113706880 Basic stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: _col1 (type: string), _col7 (type:
> int), _col18 (type: float), _col17 (type: float)
> outputColumnNames: _col0, _col1, _col2, _col3
> Statistics: Num rows: 1334416256 Data size:
> 140113706880 Basic stats: COMPLETE Column stats: COMPLETE
> Group By Operator
> aggregations: avg(_col1), avg(_col2), avg(_col3)
> keys: _col0 (type: string)
> mode: hash
> outputColumnNames: _col0, _col1, _col2, _col3
> Statistics: Num rows: 157024 Data size:
> 15231328 Basic stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: string)
> sort order: +
> Map-reduce partition columns: _col0 (type:
> string)
> Statistics: Num rows: 157024 Data size:
> 15231328 Basic stats: COMPLETE Column stats: COMPLETE
> value expressions: _col1 (type:
> struct<count:bigint,sum:double,input:int>), _col2 (type:
> struct<count:bigint,sum:double,input:float>), _col3 (type:
> struct<count:bigint,sum:double,input:float>)
> Reducer 7
> Reduce Operator Tree:
> Group By Operator
> aggregations: avg(VALUE._col0), avg(VALUE._col1),
> avg(VALUE._col2)
> keys: KEY._col0 (type: string)
> mode: mergepartial
> outputColumnNames: _col0, _col1, _col2, _col3
> Statistics: Num rows: 112 Data size: 13552 Basic stats:
> COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: substr(_col0, 1, 20) (type: string), _col1
> (type: double), _col2 (type: double), _col3 (type: double)
> outputColumnNames: _col0, _col1, _col2, _col3
> Statistics: Num rows: 112 Data size: 23296 Basic stats:
> COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: string), _col1 (type:
> double), _col2 (type: double), _col3 (type: double)
> sort order: ++++
> Statistics: Num rows: 112 Data size: 23296 Basic stats:
> COMPLETE Column stats: COMPLETE
> TopN Hash Memory Usage: 0.04
> Reducer 8
> Reduce Operator Tree:
> Select Operator
> expressions: KEY.reducesinkkey0 (type: string),
> KEY.reducesinkkey1 (type: double), KEY.reducesinkkey2 (type: double),
> KEY.reducesinkkey3 (type: double)
> outputColumnNames: _col0, _col1, _col2, _col3
> Statistics: Num rows: 112 Data size: 23296 Basic stats:
> COMPLETE Column stats: COMPLETE
> Limit
> Number of rows: 100
> Statistics: Num rows: 100 Data size: 20800 Basic stats:
> COMPLETE Column stats: COMPLETE
> File Output Operator
> compressed: false
> Statistics: Num rows: 100 Data size: 20800 Basic stats:
> COMPLETE Column stats: COMPLETE
> table:
> input format: org.apache.hadoop.mapred.TextInputFormat
> output format:
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> serde:
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> Execution mode: vectorized
> Stage: Stage-0
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)