[
https://issues.apache.org/jira/browse/ORC-243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16194917#comment-16194917
]
Owen O'Malley commented on ORC-243:
-----------------------------------
It isn't reasonable to assume that every TreeReader implementation knows all of
the flags in ColumnVector that need to be reset. That would lead to lots of
breakage whenever people add new flags or new TreeReader implementations.

You seem most concerned about the overhead of clearing the isNull array. Is
that correct? Do you have performance numbers that show it is a problem, or
is it an abstract concern?

I'd propose that you add a new method to ColumnVector, something like
partialReset(), that resets all of the flags but doesn't clear the isNull
array. Then we can change the prerequisite from calling ColumnVector.reset to
calling partialReset.

Note that this requires Hive's operators to be completely consistent about
never looking at isNull if noNulls is set. Given that assumption (which will be
hard to verify), not clearing the isNull array is acceptable because the ORC
reader will either set noNulls or fill in isNull with the correct values.
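
To make the proposal concrete, here is a minimal sketch of how partialReset()
might sit next to reset(). SketchColumnVector is a hypothetical stand-in, not
Hive's actual ColumnVector; it models only the noNulls, isRepeating and isNull
fields mentioned above, and isNullAt() is an illustrative helper showing the
rule that consumers must check noNulls before looking at isNull.

```java
import java.util.Arrays;

// Hypothetical stand-in for ColumnVector; only the flags discussed here.
class SketchColumnVector {
    boolean[] isNull;
    boolean noNulls = true;
    boolean isRepeating = false;

    SketchColumnVector(int size) {
        isNull = new boolean[size];
    }

    // Full reset: clears the isNull array and then the flags.
    void reset() {
        Arrays.fill(isNull, false);
        partialReset();
    }

    // Proposed cheaper reset: flags only. isNull may keep stale entries.
    void partialReset() {
        noNulls = true;
        isRepeating = false;
    }

    // Illustrative consumer-side check: isNull is consulted only when
    // noNulls is false, so stale entries are never observed.
    boolean isNullAt(int row) {
        return !noNulls && isNull[isRepeating ? 0 : row];
    }
}
```

With this shape, a reader that calls partialReset() and then either leaves
noNulls set or fills in isNull for every row it produces never exposes a stale
entry, provided every consumer goes through a check like isNullAt().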
> incorrect isRepeating handling in decimal reader
> ------------------------------------------------
>
> Key: ORC-243
> URL: https://issues.apache.org/jira/browse/ORC-243
> Project: ORC
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Blocker
> Attachments: ORC-243.patch
>
>
> This can lead to incorrect results.
> I need to look at other readers, will do tomorrow if this looks ok in general.