[
https://issues.apache.org/jira/browse/ORC-250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16331142#comment-16331142
]
ASF GitHub Bot commented on ORC-250:
------------------------------------
Github user moresandeep commented on the issue:
https://github.com/apache/orc/pull/208
Updated the PR with suggested changes.
> Create sha256 mask
> ------------------
>
> Key: ORC-250
> URL: https://issues.apache.org/jira/browse/ORC-250
> Project: ORC
> Issue Type: Sub-task
> Reporter: Owen O'Malley
> Assignee: Sandeep More
> Priority: Major
>
> We should also create a DataMask that does sha256 of the data:
> * strings should be sha256 of the utf-8 representation of the string, with
> the digest represented as hex digits
> * binary should be sha256 of the binary, with the digest kept as raw bytes
> * integer types should be sha256 of the little endian representation of the
> number, with the result in little endian cut down to the right size
> (1, 2, 4, or 8 bytes)
> * floating point types should be sha256 of the binary representation as
> either 4 (float) or 8 (double) bytes
> * timestamps and dates should convert like integers
> * decimal should convert like 128 bit numbers with the result cut to the
> matching number of bytes
> It isn't clear what we should do for the very small data types:
> * boolean
> * byte
> * short
> I'd lean toward either making them null or passing them through unchanged.
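The string, binary, and integer rules quoted above could be sketched roughly as follows. This is only an illustration in Python using the standard hashlib module, not the code from the pull request. The function names are hypothetical, and two points are assumptions the description leaves open: integers are treated as signed two's complement, and the masked integer is taken from the first bytes of the digest, read as little endian.

```python
import hashlib


def mask_string(value: str) -> str:
    """SHA-256 of the UTF-8 bytes, returned as hex digits."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def mask_binary(value: bytes) -> bytes:
    """SHA-256 of the raw bytes, returned as the raw 32-byte digest."""
    return hashlib.sha256(value).digest()


def mask_integer(value: int, width: int) -> int:
    """Hash the little-endian bytes of value and cut the digest to width bytes.

    Assumptions (not stated in the issue): value is signed two's complement,
    and the result uses the FIRST `width` digest bytes, read as little endian.
    """
    if width not in (1, 2, 4, 8):
        raise ValueError("width must be 1, 2, 4, or 8 bytes")
    raw = value.to_bytes(width, byteorder="little", signed=True)
    digest = hashlib.sha256(raw).digest()
    return int.from_bytes(digest[:width], byteorder="little", signed=True)
```

Because the digest is truncated, distinct integers can map to the same masked value, and the smaller the width the more likely that becomes, which may be part of why the description hesitates over the very small types.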