[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization

Hive QA (JIRA) Tue, 04 Oct 2016 04:44:35 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15545138#comment-15545138
 ]


Hive QA commented on HIVE-11394:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12831502/HIVE-11394.03.patch

{color:green}SUCCESS:{color} +1 due to 132 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 10655 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_complex_all]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_udf1]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_complex_all]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_cast_constant]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_count_distinct]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_decimal_aggregate]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_distinct_2]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_groupby_3]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_mapjoin_reduce]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_orderby_5]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_string_concat]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_13]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_14]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_15]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_16]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_9]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_short_regress]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_ptf]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_shufflejoin]
org.apache.hadoop.hive.ql.optimizer.physical.TestVectorizer.testValidateNestedExpressions
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1385/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1385/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1385/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 25 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12831502 - PreCommit-HIVE-Build

> Enhance EXPLAIN display for vectorization
> -----------------------------------------
>
>                 Key: HIVE-11394
>                 URL: https://issues.apache.org/jira/browse/HIVE-11394
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>         Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, 
> HIVE-11394.03.patch
>
>
> Add detail to the EXPLAIN output showing why a Map or Reduce task was not 
> vectorized.
> Add new VECTORIZATION option that displays 3 levels.  Here are some examples:
> (At the beginning)
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> {code}
> For Map and Reduce nodes:
> {code}
>             Map Vectorization:
>                 enabled: true
>                 enabledConditionsMet: 
> hive.vectorized.use.vectorized.input.format IS true
>                 groupByVectorOutput: false
>                 inputFileFormats: 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>                 allNative: false
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
> {code}
> {code}
>             Reduce Vectorization:
>                 enabled: true
>                 enableConditionsMet: hive.vectorized.execution.reduce.enabled 
> IS true, hive.execution.engine tez IN [tez, spark] IS true
>                 notVectorizedReason: Aggregation Function UDF avg parameter 
> expression for GROUPBY operator: Data type 
> struct<count:bigint,sum:decimal(38,18),input:decimal(38,18)> of 
> Column[VALUE._col3] not supported
>                 vectorized: false
> {code}
> And, for each vectorized operator:
> {code}
>                     Select Vectorization:
>                         className: VectorSelectOperator
>                         native: true
>                         nativeConditionsMet: Supported IS true
>                         selectExpressions: 
> IdentityExpression[6:decimal(38,18)]
>                         vectorized: true
> {code}
> {code}
>                       Map Join Vectorization:
>                           className: VectorMapJoinOperator
>                           native: false
>                           nativeConditionsMet: 
> hive.vectorized.execution.mapjoin.native.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, One MapJoin Condition IS 
> true, No nullsafe IS true, Supports Key Types IS true, When Fast Hash Table, 
> then requires no Hybrid Hash Join IS true, Small table vectorizes IS true
>                           nativeConditionsNotMet: Not empty key IS false
>                           vectorized: true
> {code}
> The standard @Explain Annotation Type is used.  A new 'vectorization' 
> annotation marks each new class and method.
> Works for FORMATTED, like other non-vectorization variations.
> Consider adding options to just show Vectorization information:
> EXPLAIN VECTORIZATION [ONLY] [SUMMARY|DETAIL]
> where current patch is equivalent to EXPLAIN VECTORIZATION DETAIL.
> SUMMARY would add PLAN VECTORIZATION and Map/Reduce Vectorization, but not 
> operator detail.
> ONLY would suppress most non-vectorization elements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization

Reply via email to