[ 
https://issues.apache.org/jira/browse/ORC-243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16194917#comment-16194917
 ] 

Owen O'Malley commented on ORC-243:
-----------------------------------

It isn't reasonable to assume that every TreeReader implementation knows all of 
the flags in ColumnVector that need to be reset. That would lead to lots of 
breakages when people add new flags or new TreeReader implementations.

You seem most concerned about the overhead of clearing the isNull array. Is 
that correct? Do you have performance numbers that show it is a problem, or is 
it an abstract concern? I'd propose that you add a new method to ColumnVector, 
something like partialReset(), that resets all of the flags but does not clear 
the isNull array. Then we can change the prerequisite from calling 
ColumnVector.reset to calling partialReset. Note that this requires Hive's 
operators to be completely consistent about never looking at isNull when 
noNulls is set. Given that assumption (which will be hard to verify), not 
clearing the isNull array is acceptable, because the ORC reader will either set 
noNulls or fill in isNull with the correct values.
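
To make the proposal concrete, here is a rough sketch of the difference between 
a full reset and the proposed partialReset(). SketchColumnVector and isNullAt 
are hypothetical stand-ins written for illustration; this is not the real 
ColumnVector class, and partialReset() is only a suggested method at this 
point. The field names mirror the flags discussed above.

```java
import java.util.Arrays;

// Simplified stand-in for a column vector; NOT the real Hive/ORC ColumnVector.
class SketchColumnVector {
    final boolean[] isNull;      // per-row null markers; meaningful only when noNulls is false
    boolean noNulls = true;      // true: no row is null, so isNull must be ignored
    boolean isRepeating = false; // true: row 0 stands for every row in the batch

    SketchColumnVector(int size) {
        isNull = new boolean[size];
    }

    // Full reset: clears the isNull array as well as the flags.
    void reset() {
        Arrays.fill(isNull, false);
        partialReset();
    }

    // Proposed cheaper reset: flags only. isNull may still hold stale values.
    void partialReset() {
        noNulls = true;
        isRepeating = false;
    }

    // Consumer-side rule the proposal depends on: consult isNull only
    // when noNulls is false.
    boolean isNullAt(int row) {
        if (noNulls) {
            return false;
        }
        return isNull[isRepeating ? 0 : row];
    }
}
```

After partialReset(), a stale true entry can remain in isNull, so this is only 
safe if every consumer checks noNulls first, as isNullAt does in the sketch.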

> incorrect isRepeating handling in decimal reader
> ------------------------------------------------
>
>                 Key: ORC-243
>                 URL: https://issues.apache.org/jira/browse/ORC-243
>             Project: ORC
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>            Priority: Blocker
>         Attachments: ORC-243.patch
>
>
> This can lead to incorrect results. 
> I need to look at other readers, will do tomorrow if this looks ok in general.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
