[ https://issues.apache.org/jira/browse/PARQUET-671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402877#comment-15402877 ]
Julien Le Dem commented on PARQUET-671:
---------------------------------------

Thanks [~edaniel]
Looking forward to seeing your PR

> Improve performance of RLE/bit-packed decoding in parquet-cpp
> -------------------------------------------------------------
>
>                 Key: PARQUET-671
>                 URL: https://issues.apache.org/jira/browse/PARQUET-671
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-cpp
>            Reporter: Eric Daniel
>            Assignee: Eric Daniel
>
> There are steps that can dramatically improve decoding performance:
> - when decoding repeated values in the RLE/dictionary encoding, do the
>   dictionary lookup only once
> - when decoding bit-packed sequences, do the decoding in batches so the bit
>   unpacker's state can be kept in registers (instead of updating members for
>   every decoded value)
> - use Daniel Lemire's fast unpacking routines whenever possible
>   (https://github.com/lemire/FrameOfReference/)
> I have a PR ready to implement these changes.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
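
For readers following the issue, here is a rough sketch of the first two steps from the description. It is not the code from the PR: DecodeRepeatedRun and UnpackBatch are hypothetical names invented for this illustration, and the unpacker assumes values are packed LSB-first with a bit width of at most 32. Lemire's unpacking routines are not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not parquet-cpp API): expand an RLE run that repeats
// one dictionary index `run_length` times. The dictionary is consulted once
// for the whole run instead of once per output value.
template <typename T>
void DecodeRepeatedRun(const std::vector<T>& dictionary, uint32_t index,
                       std::size_t run_length, T* out) {
  assert(index < dictionary.size());
  const T value = dictionary[index];  // single lookup for the entire run
  std::fill(out, out + run_length, value);
}

// Hypothetical helper (not parquet-cpp API): unpack `count` values of
// `bit_width` bits each from `data`, assuming LSB-first packing.
// The unpacker state (bit buffer, bit count, read position) lives in local
// variables for the whole batch, so the compiler can keep it in registers
// rather than writing object members back after every value.
// The caller must ensure `data` holds at least ceil(count * bit_width / 8)
// bytes. Returns the number of bytes consumed.
std::size_t UnpackBatch(const uint8_t* data, int bit_width, std::size_t count,
                        uint32_t* out) {
  assert(bit_width >= 0 && bit_width <= 32);
  uint64_t buffer = 0;     // bits read from `data` but not yet emitted
  int bits_in_buffer = 0;  // number of valid bits in `buffer`
  std::size_t pos = 0;     // next byte to read from `data`
  const uint64_t mask = (uint64_t{1} << bit_width) - 1;

  for (std::size_t i = 0; i < count; ++i) {
    // Refill one byte at a time until a full value is available.
    while (bits_in_buffer < bit_width) {
      buffer |= static_cast<uint64_t>(data[pos++]) << bits_in_buffer;
      bits_in_buffer += 8;
    }
    out[i] = static_cast<uint32_t>(buffer & mask);
    buffer >>= bit_width;
    bits_in_buffer -= bit_width;
  }
  return pos;
}
```

A decoder built this way would call UnpackBatch once per bit-packed run and update its own read position from the return value, rather than touching its members for every decoded value.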