[ 
https://issues.apache.org/jira/browse/PARQUET-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420243#comment-16420243
 ] 

ASF GitHub Bot commented on PARQUET-1143:
-----------------------------------------

scottcarey commented on issue #430: PARQUET-1143: Update to Parquet format 
2.4.0.
URL: https://github.com/apache/parquet-mr/pull/430#issuecomment-377463522
 
 
   Yeah, I looked a little further into what is needed on the Spark side too.   
Partway through modifying the vectorized readers to use method signatures that 
take a ByteBufferInputStream rather than (byte[], offset), I hit a spot where 
they call back into code here that does not take a ByteBufferInputStream.
   
   It looks like changes on both sides are needed.
   
   I think that whole area of code would work better if coded against the 
DataInput interface instead.  You can wrap a ByteBufferInputStream in a 
DataInputStream, and get free (and decently efficient but not amazing) tools 
for reading ints, etc.  Note that DataInputStream itself reads big-endian, so 
little-endian ints need an Integer.reverseBytes() call or a little-endian 
DataInput implementation.  DataInputStream will be quite a bit faster than 
calling read() 4 times in a row and constructing the int by hand, though its 
technique of maintaining a small buffer for reading primitives can be emulated.
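   
   A rough sketch of the two approaches mentioned above, not code from 
parquet-mr or Spark.  The class and method names are made up for illustration, 
and a plain ByteArrayInputStream stands in for ByteBufferInputStream so the 
example is self-contained; any InputStream should behave the same way here.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class, for illustration only.
final class LittleEndianReadSketch {

    // DataInputStream.readInt() is big-endian, so reverse the bytes to get
    // the little-endian interpretation of the same four bytes.
    static int readIntLE(DataInputStream in) throws IOException {
        return Integer.reverseBytes(in.readInt());
    }

    // Emulates the small-scratch-buffer technique: fill a reusable array of
    // at least 4 bytes with bulk reads, then assemble the int by hand.
    static int readIntLEBuffered(InputStream in, byte[] scratch) throws IOException {
        int filled = 0;
        while (filled < 4) {
            int n = in.read(scratch, filled, 4 - filled);
            if (n < 0) {
                throw new EOFException();
            }
            filled += n;
        }
        return (scratch[0] & 0xFF)
            | ((scratch[1] & 0xFF) << 8)
            | ((scratch[2] & 0xFF) << 16)
            | ((scratch[3] & 0xFF) << 24);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0x78, 0x56, 0x34, 0x12}; // 0x12345678, little-endian

        // ByteArrayInputStream is a stand-in for ByteBufferInputStream.
        DataInputStream wrapped = new DataInputStream(new ByteArrayInputStream(data));
        System.out.println(Integer.toHexString(readIntLE(wrapped))); // 12345678

        byte[] scratch = new byte[4];
        InputStream raw = new ByteArrayInputStream(data);
        System.out.println(Integer.toHexString(readIntLEBuffered(raw, scratch))); // 12345678
    }
}
```

   The first method leans on DataInputStream as-is; the second avoids the 
wrapper and reuses one scratch array across calls, which is roughly the 
emulation I had in mind.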
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Update Java for format 2.4.0 changes
> ------------------------------------
>
>                 Key: PARQUET-1143
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1143
>             Project: Parquet
>          Issue Type: Task
>          Components: parquet-mr
>    Affects Versions: 1.9.0, 1.8.2
>            Reporter: Ryan Blue
>            Assignee: Ryan Blue
>            Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
