[
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matt McCline updated HIVE-11394:
--------------------------------
Status: Patch Available (was: In Progress)
> Enhance EXPLAIN display for vectorization
> -----------------------------------------
>
> Key: HIVE-11394
> URL: https://issues.apache.org/jira/browse/HIVE-11394
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Matt McCline
> Assignee: Matt McCline
> Priority: Critical
> Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch,
> HIVE-11394.03.patch
>
>
> Add detail to the EXPLAIN output showing why a Map or Reduce task was not
> vectorized.
> Add a new VECTORIZATION option that displays information at three levels:
> the plan, each Map or Reduce node, and each operator. Here are some examples.
> At the beginning of the plan:
> {code}
> PLAN VECTORIZATION:
> enabled: true
> enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> {code}
> For Map and Reduce nodes:
> {code}
> Map Vectorization:
> enabled: true
> enabledConditionsMet:
> hive.vectorized.use.vectorized.input.format IS true
> groupByVectorOutput: false
> inputFileFormats:
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> allNative: false
> usesVectorUDFAdaptor: false
> vectorized: true
> {code}
> {code}
> Reduce Vectorization:
> enabled: true
> enableConditionsMet: hive.vectorized.execution.reduce.enabled
> IS true, hive.execution.engine tez IN [tez, spark] IS true
> notVectorizedReason: Aggregation Function UDF avg parameter
> expression for GROUPBY operator: Data type
> struct<count:bigint,sum:decimal(38,18),input:decimal(38,18)> of
> Column[VALUE._col3] not supported
> vectorized: false
> {code}
> And, for each vectorized operator:
> {code}
> Select Vectorization:
> className: VectorSelectOperator
> native: true
> nativeConditionsMet: Supported IS true
> selectExpressions:
> IdentityExpression[6:decimal(38,18)]
> vectorized: true
> {code}
> {code}
> Map Join Vectorization:
> className: VectorMapJoinOperator
> native: false
> nativeConditionsMet:
> hive.vectorized.execution.mapjoin.native.enabled IS true,
> hive.execution.engine tez IN [tez, spark] IS true, One MapJoin Condition IS
> true, No nullsafe IS true, Supports Key Types IS true, When Fast Hash Table,
> then requires no Hybrid Hash Join IS true, Small table vectorizes IS true
> nativeConditionsNotMet: Not empty key IS false
> vectorized: true
> {code}
> The standard @Explain Annotation Type is used. A new 'vectorization'
> annotation marks each new class and method.
> This also works with EXPLAIN FORMATTED, like the other non-vectorization
> variations.
> Consider adding options to show just the vectorization information:
> EXPLAIN VECTORIZATION [ONLY] [SUMMARY|DETAIL]
> where the current patch is equivalent to EXPLAIN VECTORIZATION DETAIL.
> SUMMARY would show the PLAN VECTORIZATION and Map/Reduce Vectorization
> sections, but not operator detail.
> ONLY would suppress most non-vectorization elements.
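To make the proposed option syntax concrete, here is a minimal Java sketch of how [ONLY] [SUMMARY|DETAIL] could map to the sections shown. It is not Hive's parser or the patch's code: the class VectorizationExplainSketch and its methods are hypothetical names invented for illustration. It also assumes that an omitted level means SUMMARY, which the issue does not state, and it treats ONLY as hiding all non-vectorization elements although the description says "most".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical model of the proposed option syntax; not Hive's parser. */
final class VectorizationExplainSketch {

    /** Detail levels named in the proposal. */
    enum Level { SUMMARY, DETAIL }

    /** Parsed form of the words that follow EXPLAIN VECTORIZATION. */
    static final class Options {
        final boolean only;
        final Level level;

        Options(boolean only, Level level) {
            this.only = only;
            this.level = level;
        }
    }

    /**
     * Parses text such as "ONLY DETAIL" using the grammar [ONLY] [SUMMARY|DETAIL].
     * ASSUMPTION: an omitted level means SUMMARY; the issue does not specify a default.
     */
    static Options parse(String text) {
        String trimmed = text.trim();
        String[] tokens = trimmed.isEmpty()
                ? new String[0]
                : trimmed.toUpperCase(Locale.ROOT).split("\\s+");
        int i = 0;
        boolean only = false;
        Level level = Level.SUMMARY; // assumed default, see Javadoc above
        if (i < tokens.length && tokens[i].equals("ONLY")) {
            only = true;
            i++;
        }
        if (i < tokens.length
                && (tokens[i].equals("SUMMARY") || tokens[i].equals("DETAIL"))) {
            level = Level.valueOf(tokens[i]);
            i++;
        }
        if (i < tokens.length) {
            // Anything left over does not fit the grammar (wrong word or wrong order).
            throw new IllegalArgumentException("Unexpected token: " + tokens[i]);
        }
        return new Options(only, level);
    }

    /** Lists the plan sections that would be displayed for the given options. */
    static List<String> sections(Options options) {
        List<String> shown = new ArrayList<>();
        if (!options.only) {
            // Simplification: the issue says ONLY suppresses "most" of these.
            shown.add("non-vectorization plan elements");
        }
        shown.add("PLAN VECTORIZATION");
        shown.add("Map/Reduce Vectorization");
        if (options.level == Level.DETAIL) {
            shown.add("operator Vectorization");
        }
        return shown;
    }

    public static void main(String[] args) {
        System.out.println(sections(parse("ONLY SUMMARY")));
        System.out.println(sections(parse("DETAIL")));
    }
}
```

Under these assumptions, "DETAIL" matches what the current patch prints, and "ONLY SUMMARY" keeps just the plan-level and Map/Reduce-level vectorization sections.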