[ 
https://issues.apache.org/jira/browse/IMPALA-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638285#comment-16638285
 ] 

Csaba Ringhofer edited comment on IMPALA-4994 at 10/4/18 2:28 PM:
------------------------------------------------------------------

I took this Jira because I am already working in this area for IMPALA-5050.
IMPALA-5050 adds support for int64 milli/micro timestamp columns. Because the
dictionary consists of TimestampValues in this case, the int64->TimestampValue
conversion has to be done during dictionary construction.


was (Author: csringhofer):
I took this Jira as I am already meddling with this topic in IMPALA-5050.

> Push conversion and validation into dictionary construction
> -----------------------------------------------------------
>
>                 Key: IMPALA-4994
>                 URL: https://issues.apache.org/jira/browse/IMPALA-4994
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 2.9.0
>            Reporter: Joe McDonnell
>            Assignee: Csaba Ringhofer
>            Priority: Major
>              Labels: ramp-up
>
> Certain data types require conversion and/or validation when read from a 
> Parquet file. For example, timestamps can require conversion to account for 
> different storage offsets. Char/varchar fields can require conversion to 
> handle lengths and space padding. Timestamps require validation, because not 
> all bit combinations are valid timestamps.
> Right now, this is done per element as it is read. For dictionary encoded 
> columns, it would save processing to do the conversion/validation once at 
> dictionary construction.
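
The idea in the description can be illustrated with a small sketch. This is
not Impala code: the function names, the call counter, and the accepted
timestamp range are assumptions made for illustration only. It models a
dictionary-encoded column as a list of raw int64 microsecond values plus a
list of indices, and compares converting every element as it is read with
converting each dictionary entry once.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
# Assumed valid range for this sketch only, not taken from the Impala source.
MIN_TS = datetime(1400, 1, 1)
MAX_TS = datetime(9999, 12, 31, 23, 59, 59, 999999)

# Counts how many times the conversion/validation routine runs.
CALLS = {"n": 0}


def convert_micros(raw):
    """Convert int64 microseconds since the epoch; return None if invalid."""
    CALLS["n"] += 1
    try:
        ts = EPOCH + timedelta(microseconds=raw)
    except OverflowError:
        # The value does not map to a representable timestamp.
        return None
    return ts if MIN_TS <= ts <= MAX_TS else None


def decode_per_element(dictionary, indices):
    """Convert and validate each value as it is read (one call per row)."""
    return [convert_micros(dictionary[i]) for i in indices]


def decode_with_converted_dictionary(dictionary, indices):
    """Convert and validate once per dictionary entry, then only look up."""
    converted = [convert_micros(raw) for raw in dictionary]
    return [converted[i] for i in indices]


if __name__ == "__main__":
    dictionary = [0, 1_000_000, 2**62]  # the last entry is out of range
    indices = [0, 1, 1, 0, 2, 1]

    CALLS["n"] = 0
    decode_per_element(dictionary, indices)
    print("per element calls:", CALLS["n"])

    CALLS["n"] = 0
    decode_with_converted_dictionary(dictionary, indices)
    print("per dictionary calls:", CALLS["n"])
```

With the second approach the number of conversions depends on the dictionary
size rather than the row count, which is where the saving comes from when a
column has many rows and few distinct values.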


