[ https://issues.apache.org/jira/browse/ORC-378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16514359#comment-16514359 ]
Sergey Shelukhin commented on ORC-378:
--------------------------------------
Some places in the code also blindly unset isRepeating. That used to work
because the reader populated data for all the values anyway. After this patch,
when the reader can tell from the ORC encoding that the values repeat before it
populates the data, it may set only the 0th element of an isRepeating
ColumnVector.
I'm going to fix the isRepeating usage for now, but I wonder how much other
code relies on redundant data like this. Maybe it's better to add an
Arrays.fill at the end of the optimized case... [~owen.omalley] [~mmccline]
[~t3rmin4t0r] any input? Especially in Hive, where CVs are manipulated in many
places in UDFs, etc.
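
To make the failure mode concrete, here is a minimal sketch. LongVec is a hypothetical stand-in that models only a value array and an isRepeating flag; it is not the real LongColumnVector, and every method name below is invented for illustration. It compares blindly unsetting the flag, filling before unsetting on the caller side, and the Arrays.fill-in-the-reader idea.

```java
import java.util.Arrays;

// Hypothetical stand-in types; not the real Hive/ORC classes.
final class IsRepeatingSketch {
    static final class LongVec {
        final long[] vector;
        boolean isRepeating;
        LongVec(int size) { vector = new long[size]; }
    }

    // Optimized read path as described above: the reader knows the values
    // repeat, so it writes only element 0 and sets the flag.
    static LongVec readRepeated(long value, int size) {
        LongVec cv = new LongVec(size);
        cv.vector[0] = value;
        cv.isRepeating = true;
        return cv;
    }

    // Alternative reader: same flag, but also fills every slot, so callers
    // that ignore or clear the flag still see correct data.
    static LongVec readRepeatedAndFill(long value, int size) {
        LongVec cv = readRepeated(value, size);
        Arrays.fill(cv.vector, 0, size, value);
        return cv;
    }

    // Unsafe caller: clears the flag without materializing the values.
    static void unsetBlindly(LongVec cv) {
        cv.isRepeating = false;
    }

    // Safe caller: copies element 0 into the first `size` slots, then clears.
    static void unsetSafely(LongVec cv, int size) {
        if (cv.isRepeating) {
            Arrays.fill(cv.vector, 0, size, cv.vector[0]);
            cv.isRepeating = false;
        }
    }

    public static void main(String[] args) {
        LongVec a = readRepeated(7L, 4);
        unsetBlindly(a);
        System.out.println(Arrays.toString(a.vector)); // [7, 0, 0, 0]

        LongVec b = readRepeated(7L, 4);
        unsetSafely(b, 4);
        System.out.println(Arrays.toString(b.vector)); // [7, 7, 7, 7]

        LongVec c = readRepeatedAndFill(7L, 4);
        unsetBlindly(c);
        System.out.println(Arrays.toString(c.vector)); // [7, 7, 7, 7]
    }
}
```

The fill in the reader costs one pass over the batch but protects every caller. Fixing callers individually keeps the optimization but relies on finding all of them.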
> translate ShortRepeat/Delta integer encoding into isRepeating on LongCV more
> directly
> -------------------------------------------------------------------------------------
>
> Key: ORC-378
> URL: https://issues.apache.org/jira/browse/ORC-378
> Project: ORC
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Major
>