[GitHub] [arrow] emkornfield commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

GitBox Sat, 25 Apr 2020 22:58:43 -0700


emkornfield commented on pull request #6985:
URL: https://github.com/apache/arrow/pull/6985#issuecomment-619488093



   @pitrou I think I have addressed your comments.  One of them that went stale 
concerned the complexity of "AppendWord". I removed the parts that did not seem 
to affect performance on the Parquet column-reading benchmarks, but I did spend 
some time trying to maximize word-level parallelism for the unaligned case, 
because I expect that case to be fairly common, at least for repeated fields. 
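
For readers unfamiliar with the unaligned case, here is a minimal sketch of appending up to 64 bits at an arbitrary bit offset with one 64-bit read-modify-write plus one carry byte. It is an illustration only, not the `AppendWord` from this PR: `BitAppender`, its fields, and `GetBit` are hypothetical names, and the code assumes a little-endian host and an LSB-first bitmap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical illustration, not Arrow's implementation.
// Assumes a little-endian host and LSB-first bit order within each byte.
struct BitAppender {
  std::vector<uint8_t> bytes;  // zero-padded past `length` so 8-byte accesses are safe
  int64_t length = 0;          // number of valid bits written so far

  // Append the low `num_bits` bits of `word` (0 <= num_bits <= 64).
  void AppendWord(uint64_t word, int num_bits) {
    assert(num_bits >= 0 && num_bits <= 64);
    if (num_bits == 0) return;
    // Clear bits above num_bits so they cannot leak into later positions.
    if (num_bits < 64) word &= (uint64_t{1} << num_bits) - 1;

    const size_t byte_index = static_cast<size_t>(length / 8);
    const int shift = static_cast<int>(length % 8);

    // An unaligned 64-bit append can touch 9 bytes: 8 for the word, 1 for the carry.
    if (bytes.size() < byte_index + 9) bytes.resize(byte_index + 9, 0);

    // Bits at or beyond `length` are always zero, so OR-ing the shifted word is enough.
    uint64_t current;
    std::memcpy(&current, bytes.data() + byte_index, sizeof(current));
    current |= word << shift;
    std::memcpy(bytes.data() + byte_index, &current, sizeof(current));

    // The top `shift` bits of the word were shifted out; they go into the ninth byte.
    if (shift != 0) {
      bytes[byte_index + 8] |= static_cast<uint8_t>(word >> (64 - shift));
    }
    length += num_bits;
  }

  // Read back bit `i` (must be < length).
  bool GetBit(int64_t i) const {
    return ((bytes[static_cast<size_t>(i / 8)] >> (i % 8)) & 1) != 0;
  }
};
```

In this sketch, appending 3 bits followed by a full 64-bit word exercises the unaligned path: the second call lands at bit offset 3, so its top 3 bits spill into the carry byte. A big-endian host would need a byte swap around the `memcpy` calls or a byte-wise fallback.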
   
   As a point of comparison, using the AppendWord implementation I put in place 
for BigEndian shows much smaller improvements.  Really this should use 
CopyBitmap, but I didn't want to start moving more code than I needed to in 
bit_util.h for this PR. I opened a 
[JIRA (ARROW-8595)](https://issues.apache.org/jira/browse/ARROW-8595) to 
track some improvements in that regard.
   
   If more comments are needed or you think there is a cleaner way of writing 
the code, I'm happy to take input.
   
   @wesm do you want to look at the Parquet-specific logic and comments to make 
sure I captured them correctly?


