[
https://issues.apache.org/jira/browse/HIVE-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14016972#comment-14016972
]
Hive QA commented on HIVE-4867:
-------------------------------
{color:red}Overall{color}: -1 at least one tests failed
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12648073/HIVE-4867.3.patch.txt
{color:red}ERROR:{color} -1 due to 58 failed/errored test(s), 5510 tests
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_like_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_display_colstats_tbllvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32_lessSize
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join33
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters_overlap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_fs_overwrite
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_louter_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_outer_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_vc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_push_or
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_regexp_extract
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_router_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_create_table_serde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin_union_remove_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin_union_remove_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_transform_ppr1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_transform_ppr2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udtf_explode
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union24
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_25
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket5
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketizedhiveinputformat
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_disable_merge_for_bucketing
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_reduce_deduplicate
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
org.apache.hadoop.hive.ql.exec.tez.TestTezTask.testSubmit
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input20
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input4
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input5
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimal
org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimalX
org.apache.hive.hcatalog.pig.TestOrcHCatPigStorer.testWriteDecimalXY
{noformat}
Test results:
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/379/testReport
Console output:
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/379/console
Test logs:
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-379/
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 58 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12648073
> Deduplicate columns appearing in both the key list and value list of
> ReduceSinkOperator
> ---------------------------------------------------------------------------------------
>
> Key: HIVE-4867
> URL: https://issues.apache.org/jira/browse/HIVE-4867
> Project: Hive
> Issue Type: Improvement
> Reporter: Yin Huai
> Assignee: Navis
> Attachments: HIVE-4867.1.patch.txt, HIVE-4867.2.patch.txt,
> HIVE-4867.3.patch.txt, source_only.txt
>
>
> A ReduceSinkOperator emits data in the format of keys and values. Right now,
> a column may appear in both the key list and value list, which result in
> unnecessary overhead for shuffling.
> Example:
> We have a query shown below ...
> {code:sql}
> explain select ss_ticket_number from store_sales cluster by ss_ticket_number;
> {\code}
> The plan is ...
> {code}
> STAGE DEPENDENCIES:
> Stage-1 is a root stage
> Stage-0 is a root stage
> STAGE PLANS:
> Stage: Stage-1
> Map Reduce
> Alias -> Map Operator Tree:
> store_sales
> TableScan
> alias: store_sales
> Select Operator
> expressions:
> expr: ss_ticket_number
> type: int
> outputColumnNames: _col0
> Reduce Output Operator
> key expressions:
> expr: _col0
> type: int
> sort order: +
> Map-reduce partition columns:
> expr: _col0
> type: int
> tag: -1
> value expressions:
> expr: _col0
> type: int
> Reduce Operator Tree:
> Extract
> File Output Operator
> compressed: false
> GlobalTableId: 0
> table:
> input format: org.apache.hadoop.mapred.TextInputFormat
> output format:
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Stage: Stage-0
> Fetch Operator
> limit: -1
> {\code}
> The column 'ss_ticket_number' is in both the key list and value list of the
> ReduceSinkOperator. The type of ss_ticket_number is int. For this case,
> BinarySortableSerDe will introduce 1 byte more for every int in the key.
> LazyBinarySerDe will also introduce overhead when recording the length of a
> int. For every int, 10 bytes should be a rough estimation of the size of data
> emitted from the Map phase.
--
This message was sent by Atlassian JIRA
(v6.2#6252)