[ https://issues.apache.org/jira/browse/PARQUET-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087069#comment-16087069 ]
Dapeng Sun commented on PARQUET-1059:
-------------------------------------

Hi [~xhochy],
{quote}
Can you describe a workload where this would bring a significant difference?
{quote}
In my case, the values in the column may be increasing or decreasing, but the difference between adjacent values is very small, so their dictionary IDs may also be adjacent or close together. If the ID encoding supported delta encoding, I think it would save more disk space.

> Improve the RLE encoding for Parquet Dictionary IDs
> ---------------------------------------------------
>
>                 Key: PARQUET-1059
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1059
>             Project: Parquet
>          Issue Type: Improvement
>            Reporter: Dapeng Sun
>
> The IDs of Parquet dictionary encoding are written using
> {{RunLengthBitPackingHybridEncoder}}.
> RunLengthBitPackingHybridEncoder handles encoding with {{repeat}} and
> {{bitpacking}}; we should improve it with a method like
> {{DeltaBinaryPackingWriter}}.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
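
To illustrate the workload described in the comment, here is a minimal sketch that compares the bit width needed to bit-pack dictionary IDs directly with the width needed to bit-pack the differences between adjacent IDs. The class DictionaryIdDeltaSketch, its methods, and the sample IDs are hypothetical and are not part of Parquet. The sketch only estimates bits per value and ignores headers, block layout, and run-length runs, so it is not how RunLengthBitPackingHybridEncoder or DeltaBinaryPackingWriter are actually implemented.

```java
// Hypothetical helper, not part of Parquet: a rough per-value bit-width
// estimate for plain bit-packing versus bit-packing of adjacent deltas.
final class DictionaryIdDeltaSketch {

    // Width needed to bit-pack every ID directly: enough bits for the largest ID.
    // Assumes IDs are non-negative, as dictionary IDs are.
    static int plainBitWidth(int[] ids) {
        int max = 0;
        for (int id : ids) {
            max = Math.max(max, id);
        }
        return 32 - Integer.numberOfLeadingZeros(max);
    }

    // Width needed to bit-pack (delta - minDelta) for each adjacent pair.
    // Subtracting the minimum delta keeps the packed values non-negative.
    static int deltaBitWidth(int[] ids) {
        if (ids.length < 2) {
            return 0;
        }
        int minDelta = Integer.MAX_VALUE;
        int maxDelta = Integer.MIN_VALUE;
        for (int i = 1; i < ids.length; i++) {
            int delta = ids[i] - ids[i - 1];
            minDelta = Math.min(minDelta, delta);
            maxDelta = Math.max(maxDelta, delta);
        }
        return 32 - Integer.numberOfLeadingZeros(maxDelta - minDelta);
    }

    public static void main(String[] args) {
        // Assumed sample: IDs that drift up and down by small steps, with no repeats,
        // so run-length encoding finds nothing to collapse.
        int[] ids = {1000, 1001, 1003, 1002, 1004, 1005, 1007, 1006};
        System.out.println("plain bit width: " + plainBitWidth(ids));
        System.out.println("delta bit width: " + deltaBitWidth(ids));
    }
}
```

For these sample IDs the sketch reports 10 bits per value for direct bit-packing and 2 bits per delta. A real delta encoding also has to store the first value and the minimum delta, so the actual saving depends on how many IDs share a block.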