[
https://issues.apache.org/jira/browse/HIVE-19629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488491#comment-16488491
]
Matt McCline edited comment on HIVE-19629 at 5/24/18 6:22 AM:
--------------------------------------------------------------
No.
The Vectorizer only plans DECIMAL_64 when two conditions are true:
1) The input format supports DECIMAL_64.
(Vectorizer.getVectorizedInputFormatSupports currently returns null; we need
to enhance it to ask the input format whether it supports DECIMAL_64.) The only
input format that currently supports DECIMAL_64 is TEXTFILE, and only in
VectorPartitionDesc.VECTOR_DESERIALIZE mode (i.e. when it uses the Fast Vector
deserialization).
2) The configuration variable hive.vectorized.input.format.supports.enabled
contains "decimal_64". We always have a configuration variable that can turn
off the feature.
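A minimal standalone sketch of the two-condition gate described above (the class and method names here are illustrative, not the actual Vectorizer code):

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of the two-condition DECIMAL_64 gate; the names
// below are hypothetical, not the real Hive Vectorizer API.
public class Decimal64Gate {

    // Condition 1: does the input format support DECIMAL_64?
    // In Hive this would come from asking the input format itself; here it
    // is modeled as TEXTFILE in VECTOR_DESERIALIZE mode only.
    public static boolean inputFormatSupportsDecimal64(String inputFormat,
                                                       boolean vectorDeserialize) {
        return "TEXTFILE".equals(inputFormat) && vectorDeserialize;
    }

    // Condition 2: is "decimal_64" listed in
    // hive.vectorized.input.format.supports.enabled?
    public static boolean decimal64Enabled(String supportsEnabled) {
        List<String> enabled = Arrays.asList(supportsEnabled.toLowerCase().split(","));
        return enabled.contains("decimal_64");
    }

    // The Vectorizer plans DECIMAL_64 only when both conditions hold.
    public static boolean plansDecimal64(String inputFormat,
                                         boolean vectorDeserialize,
                                         String supportsEnabled) {
        return inputFormatSupportsDecimal64(inputFormat, vectorDeserialize)
            && decimal64Enabled(supportsEnabled);
    }
}
```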
So, the quandary is that LLAP wants to encode and cache as
Decimal64ColumnVector, but the Vectorizer is not always going to request
DECIMAL_64. So, a column cast operation will be required somewhere in the
*pipeline* to convert from Decimal64ColumnVector to DecimalColumnVector. The
VectorizedRowBatchCtx from MapWork says what the Vectorizer wants; i.e.,
rowColumnTypeInfos and rowDataTypePhysicalVariations will have DecimalTypeInfo
and DataTypePhysicalVariation.DECIMAL_64.
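Conceptually, that cast is just a rescale: a Decimal64ColumnVector stores each value as a scaled long, while a DecimalColumnVector stores full decimal objects. A minimal standalone sketch, using java.math.BigDecimal in place of Hive's HiveDecimalWritable (the real column vector classes also carry isNull/noNulls bookkeeping):

```java
import java.math.BigDecimal;

// Standalone sketch of the Decimal64ColumnVector -> DecimalColumnVector
// cast described above. BigDecimal stands in for HiveDecimalWritable.
public class Decimal64ToDecimal {

    // A Decimal64ColumnVector holds unscaled longs plus a common scale,
    // e.g. 1234.56 with scale 2 is stored as 123456L.
    public static BigDecimal[] cast(long[] scaledLongs, int scale) {
        BigDecimal[] out = new BigDecimal[scaledLongs.length];
        for (int i = 0; i < scaledLongs.length; i++) {
            // BigDecimal.valueOf(unscaled, scale) == unscaled * 10^-scale
            out[i] = BigDecimal.valueOf(scaledLongs[i], scale);
        }
        return out;
    }
}
```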
> Enable Decimal64 reader after orc version upgrade
> -------------------------------------------------
>
> Key: HIVE-19629
> URL: https://issues.apache.org/jira/browse/HIVE-19629
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Priority: Major
> Attachments: HIVE-19629.1.patch, HIVE-19629.2.patch
>
>
> ORC 1.5.0 supports the new fast decimal 64 reader. A new VRB has to be created
> to make use of decimal 64 column vectors. Also, LLAP IO will need a new reader
> to read from the long stream to decimal 64.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)