[ 
https://issues.apache.org/jira/browse/PARQUET-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086979#comment-16086979
 ] 

Uwe L. Korn commented on PARQUET-1059:
--------------------------------------

Can you describe a workload where this would make a significant difference? 
Needing delta encoding for the dictionary indices rather suggests that the 
column has many distinct values.
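
To make the point concrete, here is a rough sketch (not Parquet's actual 
on-disk layout) that compares the payload size of plain bit-packing with a 
simplified delta scheme for a list of dictionary IDs. The function names are 
hypothetical, and the delta estimate ignores headers, block structure, and 
the RLE half of the hybrid encoder.

```python
def bit_width(max_value: int) -> int:
    """Bits needed to represent any value in [0, max_value]."""
    return max_value.bit_length()


def bitpacked_bits(ids: list[int]) -> int:
    """Payload size when every ID is stored at the width of the largest ID."""
    return len(ids) * bit_width(max(ids))


def delta_bits(ids: list[int]) -> int:
    """Simplified delta estimate (assumption, not the real format):
    the first ID at full width, then each delta minus the minimum delta,
    stored at the width of the largest adjusted delta."""
    deltas = [b - a for a, b in zip(ids, ids[1:])]
    min_delta = min(deltas)
    width = bit_width(max(d - min_delta for d in deltas))
    return bit_width(max(ids)) + len(deltas) * width


# Few distinct values: IDs are already narrow, so delta gains nothing.
few_distinct = [0, 1, 2, 1, 0, 2, 1, 0]

# Many distinct values with neighbouring IDs: the only case where delta helps.
many_distinct = list(range(1000, 1008))
```

With these inputs, `few_distinct` costs 16 bits either way, while 
`many_distinct` costs 80 bits bit-packed and 10 bits in the delta estimate. 
The gain only shows up once the IDs are wide, that is, once the dictionary 
is large.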

> Improve the RLE encoding for Parquet Dictionary IDs
> ---------------------------------------------------
>
>                 Key: PARQUET-1059
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1059
>             Project: Parquet
>          Issue Type: Improvement
>            Reporter: Dapeng Sun
>
> The IDs in Parquet dictionary encoding are encoded with 
> {{RunLengthBitPackingHybridEncoder}}.
> RunLengthBitPackingHybridEncoder encodes using {{repeat}} and 
> {{bitpacking}}; we should improve it with a method like 
> {{DeltaBinaryPackingWriter}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
