[jira] [Commented] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95

Hive QA (JIRA) Mon, 26 Nov 2018 09:51:31 -0800


    [ 
https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16699361#comment-16699361
 ]


Hive QA commented on HIVE-20954:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949508/HIVE-20954.3.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 42 failed/errored test(s), 15542 tests executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=197)
        [druidmini_masking.q,druidmini_joins.q,druid_timestamptz.q]
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testACIDwithSchemaEvolutionAndCompaction (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testAcidOrcWritePreservesFieldNames (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testAcidWithSchemaEvolution (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testAlterTable (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testBucketCodec (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testBucketizedInputFormat (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testCleanerForTxnToWriteId (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testCompactWithDelete (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDeleteIn (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDynamicPartitionsMerge (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDynamicPartitionsMerge2 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testETLSplitStrategyForACID (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testEmptyInTblproperties (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testFailHeartbeater (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testFileSystemUnCaching (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testInitiatorWithMultipleFailedCompactions (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testInsertOverwrite1 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testInsertOverwrite2 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testInsertOverwriteWithSelfJoin (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge2 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge3 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMergeWithPredicate (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMmTableCompaction (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMultiInsert (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMultiInsertStatement (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNoHistory (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidInsert (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidToAcidConversion02 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidToAcidConversion1 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidToAcidConversion2 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidToAcidConversion3 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testOpenTxnsCounter (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testOrcNoPPD (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testOrcPPD (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testOriginalFileReaderWhenNonAcidConvertedToAcid (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testUpdateMixedCase (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testValidTxnsBookkeeping (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.updateDeletePartitioned (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.writeBetweenWorkerAndCleaner (batchId=320)
org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testTokenAuth (batchId=274)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15058/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15058/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15058/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 42 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949508 - PreCommit-HIVE-Build

> Vector RS operator is not using uniform hash function for TPC-DS query 95
> -------------------------------------------------------------------------
>
>                 Key: HIVE-20954
>                 URL: https://issues.apache.org/jira/browse/HIVE-20954
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Teddy Choi
>            Assignee: Teddy Choi
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-20954.1.patch, HIVE-20954.2.patch, HIVE-20954.3.patch
>
>
> Row distribution is skewed in DHJ, causing a slowdown.
> The plan has two RS (Reduce Sink) branches that emit the same output, but one
> uses VectorReduceSinkObjectHashOperator and the other uses VectorReduceSinkLongOperator.
> {code}
> |                     Select Operator                |
> |                       expressions: ws_warehouse_sk (type: bigint), ws_order_number (type: bigint) |
> |                       outputColumnNames: _col0, _col1 |
> |                       Select Vectorization:        |
> |                           className: VectorSelectOperator |
> |                           native: true             |
> |                           projectedOutputColumnNums: [14, 16] |
> |                       Statistics: Num rows: 7199963324 Data size: 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> |                       Reduce Output Operator       |
> |                         key expressions: _col1 (type: bigint) |
> |                         sort order: +              |
> |                         Map-reduce partition columns: _col1 (type: bigint) |
> |                         Reduce Sink Vectorization: |
> |                             className: VectorReduceSinkObjectHashOperator |
> |                             keyColumnNums: [16]    |
> |                             native: true           |
> |                             nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true |
> |                             partitionColumnNums: [16] |
> |                             valueColumnNums: [14]  |
> |                         Statistics: Num rows: 7199963324 Data size: 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> |                         value expressions: _col0 (type: bigint) |
> |                       Reduce Output Operator       |
> |                         key expressions: _col1 (type: bigint) |
> |                         sort order: +              |
> |                         Map-reduce partition columns: _col1 (type: bigint) |
> |                         Reduce Sink Vectorization: |
> |                             className: VectorReduceSinkLongOperator |
> |                             keyColumnNums: [16]    |
> |                             native: true           |
> |                             nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true |
> |                             valueColumnNums: [14]  |
> |                         Statistics: Num rows: 7199963324 Data size: 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> |                         value expressions: _col0 (type: bigint) |
> |             Execution mode: vectorized, llap       |
> {code}
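
As a rough illustration of why the hash function used for partitioning matters for reducer load, the Java sketch below partitions strided bigint keys with two different hash functions and counts the rows per partition. It is not the Hive code path. All names (PartitionSkewSketch, plainHash, mixedHash, stridedKeys, histogram) are hypothetical, and the mixing step borrows the MurmurHash3 64-bit finalizer only as a stand-in for a well-distributed hash. Neither function is claimed to be what VectorReduceSinkObjectHashOperator or VectorReduceSinkLongOperator actually computes.

```java
import java.util.Arrays;
import java.util.function.LongToIntFunction;

// Hypothetical sketch; none of these names come from Hive.
final class PartitionSkewSketch {

    // Non-mixing hash: folds the 64-bit key into 32 bits (same as Long.hashCode).
    static int plainHash(long key) {
        return Long.hashCode(key);
    }

    // Mixing hash: the 64-bit finalizer from MurmurHash3, truncated to 32 bits.
    // Assumption: used here only as an example of a well-distributed hash.
    static int mixedHash(long key) {
        long h = key;
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return (int) h;
    }

    // Maps a hash to a partition index in [0, numPartitions).
    static int partition(int hash, int numPartitions) {
        return (hash & Integer.MAX_VALUE) % numPartitions;
    }

    // Builds keys 0, stride, 2 * stride, ... to mimic a patterned key column.
    static long[] stridedKeys(int count, int stride) {
        long[] keys = new long[count];
        for (int i = 0; i < count; i++) {
            keys[i] = (long) i * stride;
        }
        return keys;
    }

    // Counts how many keys land in each partition under the given hash.
    static int[] histogram(long[] keys, int numPartitions, LongToIntFunction hash) {
        int[] counts = new int[numPartitions];
        for (long key : keys) {
            counts[partition(hash.applyAsInt(key), numPartitions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int numPartitions = 8;
        // Keys that are all multiples of the partition count.
        long[] keys = stridedKeys(1000, numPartitions);
        System.out.println("plain: " + Arrays.toString(
                histogram(keys, numPartitions, PartitionSkewSketch::plainHash)));
        System.out.println("mixed: " + Arrays.toString(
                histogram(keys, numPartitions, PartitionSkewSketch::mixedHash)));
    }
}
```

With keys that are all multiples of the partition count, the non-mixing hash sends every row to partition 0, while the mixing hash spreads the same keys over several partitions. Whether the skew reported in this issue has the same cause is an assumption of the sketch, not something the plan output confirms.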



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
