[jira] [Commented] (PARQUET-671) Improve performance of RLE/bit-packed decoding in parquet-cpp

2016-08-01 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402877#comment-15402877
 ] 

Julien Le Dem commented on PARQUET-671:
---

Thanks [~edaniel].
Looking forward to seeing your PR.

> Improve performance of RLE/bit-packed decoding in parquet-cpp
> -
>
> Key: PARQUET-671
> URL: https://issues.apache.org/jira/browse/PARQUET-671
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-cpp
> Reporter: Eric Daniel
> Assignee: Eric Daniel
>
> Three steps can dramatically improve decoding performance:
> - when decoding repeated values in the RLE/dictionary encoding, do the
> dictionary lookup only once
> - when decoding bit-packed sequences, do the decoding in batches so the bit
> unpacker's state can be kept in registers (instead of updating members for
> every decoded value)
> - use Daniel Lemire's fast unpacking routines whenever possible
> (https://github.com/lemire/FrameOfReference/)
>
> I have a PR ready to implement these changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PARQUET-671) Improve performance of RLE/bit-packed decoding in parquet-cpp

2016-08-01 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated PARQUET-671:
-
Assignee: Eric Daniel






[jira] [Created] (PARQUET-671) Improve performance of RLE/bit-packed decoding in parquet-cpp

2016-08-01 Thread Eric Daniel (JIRA)
Eric Daniel created PARQUET-671:
---

Summary: Improve performance of RLE/bit-packed decoding in parquet-cpp
Key: PARQUET-671
URL: https://issues.apache.org/jira/browse/PARQUET-671
Project: Parquet
Issue Type: Improvement
Components: parquet-cpp
Reporter: Eric Daniel


Three steps can dramatically improve decoding performance:

- when decoding repeated values in the RLE/dictionary encoding, do the
dictionary lookup only once
- when decoding bit-packed sequences, do the decoding in batches so the bit 
unpacker's state can be kept in registers (instead of updating members for 
every decoded value)
- use Daniel Lemire's fast unpacking routines whenever possible 
(https://github.com/lemire/FrameOfReference/)
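
To illustrate the first point, here is a minimal sketch of resolving the dictionary entry once per run rather than once per value. It is not the code from the PR: `RleRun` and `DecodeRuns` are hypothetical names, and the runs are assumed to have been parsed already into (dictionary index, repeat count) pairs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical descriptor for one RLE run: `count` copies of one dictionary index.
struct RleRun {
  uint32_t index;
  size_t count;
};

// Appends the decoded values of all runs to `out`.
// The dictionary is consulted once per run, not once per output value.
template <typename T>
void DecodeRuns(const std::vector<RleRun>& runs, const std::vector<T>& dict,
                std::vector<T>* out) {
  for (const RleRun& run : runs) {
    assert(run.index < dict.size());     // assumption: indices are validated upstream
    const T& value = dict[run.index];    // single lookup for the whole run
    out->insert(out->end(), run.count, value);  // fill `count` copies
  }
}
```

For long runs this replaces `count` indexed loads with one load and a fill, which is where the saving would be expected to come from.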

I have a PR ready to implement these changes.
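
The batching idea in the second point can be sketched as follows. This is an illustration under assumptions, not the parquet-cpp `BitReader` and not Lemire's routines: `BitUnpacker` and `GetBatch` are hypothetical names, values are assumed to be packed starting from the least significant bit of each byte, and the bit width is assumed to be between 1 and 32.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical bit unpacker that decodes values in batches.
class BitUnpacker {
 public:
  BitUnpacker(const uint8_t* data, size_t size_bytes, int bit_width)
      : data_(data), size_bytes_(size_bytes), bit_width_(bit_width) {
    assert(bit_width >= 1 && bit_width <= 32);
  }

  // Decodes up to `n` values into `out` and returns how many were decoded.
  size_t GetBatch(uint32_t* out, size_t n) {
    // Copy the reader state into locals so the compiler can keep it in
    // registers for the duration of the loop.
    const uint8_t* data = data_;
    const size_t size = size_bytes_;
    const int width = bit_width_;
    const uint64_t mask = (uint64_t{1} << width) - 1;
    size_t byte_pos = byte_pos_;
    uint64_t buffer = buffer_;
    int bits = bits_in_buffer_;

    size_t decoded = 0;
    while (decoded < n) {
      // Refill the bit buffer one byte at a time until a full value is available.
      while (bits < width && byte_pos < size) {
        buffer |= static_cast<uint64_t>(data[byte_pos++]) << bits;
        bits += 8;
      }
      if (bits < width) break;  // input exhausted
      out[decoded++] = static_cast<uint32_t>(buffer & mask);
      buffer >>= width;
      bits -= width;
    }

    // Write the state back to the members once per batch, not once per value.
    byte_pos_ = byte_pos;
    buffer_ = buffer;
    bits_in_buffer_ = bits;
    return decoded;
  }

 private:
  const uint8_t* data_;
  size_t size_bytes_;
  int bit_width_;
  size_t byte_pos_ = 0;
  uint64_t buffer_ = 0;
  int bits_in_buffer_ = 0;
};
```

A caller would construct the unpacker over a bit-packed buffer and call `GetBatch` repeatedly with an output array; each call resumes where the previous one stopped.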


