[
https://issues.apache.org/jira/browse/HIVE-19629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488491#comment-16488491
]
Matt McCline edited comment on HIVE-19629 at 5/24/18 6:22 AM:
--------------------------------------------------------------
No.
The Vectorizer only plans DECIMAL_64 when two conditions are true:
1) The input format supports DECIMAL_64.
(Vectorizer.getVectorizedInputFormatSupports currently returns null; we need
to enhance it to ask the input format whether it supports DECIMAL_64.) The only
input format that currently supports DECIMAL_64 is TEXTFILE, and only in
VectorPartitionDesc.VECTOR_DESERIALIZE mode (i.e. when it uses the Fast Vector
deserialization).
2) The configuration variable hive.vectorized.input.format.supports.enabled
contains "decimal_64". We always have a configuration variable that can turn
off the feature.
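A minimal standalone sketch of the two-condition gate described above (the class and method names here are illustrative, not the actual Vectorizer code):

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of the two-condition DECIMAL_64 gate; the names
// below are hypothetical, not the real Hive Vectorizer API.
public class Decimal64Gate {

    // Condition 1: does the input format support DECIMAL_64?
    // In Hive this would come from asking the input format itself; here it
    // is modeled as TEXTFILE in VECTOR_DESERIALIZE mode only.
    public static boolean inputFormatSupportsDecimal64(String inputFormat,
                                                       boolean vectorDeserialize) {
        return "TEXTFILE".equals(inputFormat) && vectorDeserialize;
    }

    // Condition 2: is "decimal_64" listed in
    // hive.vectorized.input.format.supports.enabled?
    public static boolean decimal64Enabled(String supportsEnabled) {
        List<String> enabled = Arrays.asList(supportsEnabled.toLowerCase().split(","));
        return enabled.contains("decimal_64");
    }

    // The Vectorizer plans DECIMAL_64 only when both conditions hold.
    public static boolean plansDecimal64(String inputFormat,
                                         boolean vectorDeserialize,
                                         String supportsEnabled) {
        return inputFormatSupportsDecimal64(inputFormat, vectorDeserialize)
            && decimal64Enabled(supportsEnabled);
    }
}
```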
So, the quandary is that LLAP wants to encode and cache as
Decimal64ColumnVector, but the Vectorizer is not always going to request
DECIMAL_64. So, a column cast operation will be required somewhere in the
*pipeline* to convert from Decimal64ColumnVector to DecimalColumnVector. The
VectorizedRowBatchCtx from MapWork says what the Vectorizer wants; i.e.,
rowColumnTypeInfos and rowDataTypePhysicalVariations will have DecimalTypeInfo
and DataTypePhysicalVariation.DECIMAL_64.
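Conceptually, that cast is just a rescale: a Decimal64ColumnVector stores each value as a scaled long, while a DecimalColumnVector stores full decimal objects. A minimal standalone sketch, using java.math.BigDecimal in place of Hive's HiveDecimalWritable (the real column vector classes also carry isNull/noNulls bookkeeping):

```java
import java.math.BigDecimal;

// Standalone sketch of the Decimal64ColumnVector -> DecimalColumnVector
// cast described above. BigDecimal stands in for HiveDecimalWritable.
public class Decimal64ToDecimal {

    // A Decimal64ColumnVector holds unscaled longs plus a common scale,
    // e.g. 1234.56 with scale 2 is stored as 123456L.
    public static BigDecimal[] cast(long[] scaledLongs, int scale) {
        BigDecimal[] out = new BigDecimal[scaledLongs.length];
        for (int i = 0; i < scaledLongs.length; i++) {
            // BigDecimal.valueOf(unscaled, scale) == unscaled * 10^-scale
            out[i] = BigDecimal.valueOf(scaledLongs[i], scale);
        }
        return out;
    }
}
```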
> Enable Decimal64 reader after orc version upgrade
> -------------------------------------------------
>
> Key: HIVE-19629
> URL: https://issues.apache.org/jira/browse/HIVE-19629
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Priority: Major
> Attachments: HIVE-19629.1.patch, HIVE-19629.2.patch
>
>
> ORC 1.5.0 supports the new fast decimal 64 reader. A new VRB has to be created
> to make use of decimal 64 column vectors. Also, LLAP IO will need a new reader
> to read from the long stream to decimal 64.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)