[ 
https://issues.apache.org/jira/browse/PARQUET-852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16447997#comment-16447997
 ] 

ASF GitHub Bot commented on PARQUET-852:
----------------------------------------

gszadovszky opened a new pull request #467: Revert "PARQUET-852: Slowly ramp up 
sizes of byte[] in ByteBasedBitPackingEncoder"
URL: https://github.com/apache/parquet-mr/pull/467
 
   This reverts commit d59b32a9120ad40e2a9f6651b680e84dae1747a6.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Slowly ramp up sizes of byte[] in ByteBasedBitPackingEncoder
> ------------------------------------------------------------
>
>                 Key: PARQUET-852
>                 URL: https://issues.apache.org/jira/browse/PARQUET-852
>             Project: Parquet
>          Issue Type: Improvement
>            Reporter: John Jenkins
>            Priority: Minor
>             Fix For: 1.10.0
>
>
> The current allocation policy for ByteBasedBitPackingEncoder is to allocate 
> 64KB * #bits up-front. As similarly observed in [PARQUET-580], this can lead 
> to significant memory overheads for high-fanout scenarios (many columns 
> and/or open files, in my case using BooleanPlainValuesWriter).
> As done in [PARQUET-585], I'll follow up with a PR that starts with a smaller 
> buffer and works its way up to a max.
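
For readers unfamiliar with the idea in the description above, here is a
minimal Java sketch of a buffer that starts with a small slab and doubles the
slab size up to a cap, instead of allocating the maximum up-front. This is
only an illustration under stated assumptions, not the parquet-mr code: the
class name RampUpByteBuffer, its methods, and the doubling growth policy are
all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical illustration of a "ramp up" allocation policy.
 * Not the actual ByteBasedBitPackingEncoder implementation.
 */
final class RampUpByteBuffer {
    private final int maxSlabSize;
    private final List<byte[]> slabs = new ArrayList<>();
    private int nextSlabSize;   // size of the next slab to allocate
    private byte[] current;     // slab currently being filled
    private int pos;            // write position inside 'current'
    private long size;          // total bytes written

    RampUpByteBuffer(int initialSlabSize, int maxSlabSize) {
        if (initialSlabSize <= 0 || maxSlabSize < initialSlabSize) {
            throw new IllegalArgumentException("need 0 < initial <= max");
        }
        this.nextSlabSize = initialSlabSize;
        this.maxSlabSize = maxSlabSize;
    }

    void write(byte b) {
        // Allocate lazily: nothing is reserved until the first write.
        if (current == null || pos == current.length) {
            addSlab();
        }
        current[pos++] = b;
        size++;
    }

    private void addSlab() {
        current = new byte[nextSlabSize];
        slabs.add(current);
        pos = 0;
        // Assumed growth policy: double each time, capped at maxSlabSize.
        nextSlabSize = (int) Math.min((long) nextSlabSize * 2, maxSlabSize);
    }

    /** Total bytes written so far. */
    long size() {
        return size;
    }

    /** Total bytes reserved across all slabs. */
    long allocatedBytes() {
        long total = 0;
        for (byte[] slab : slabs) {
            total += slab.length;
        }
        return total;
    }

    int slabCount() {
        return slabs.size();
    }
}
```

With many columns or open files, most writers hold only a few values at a
time, so under this policy each one reserves a small first slab and pays for
larger slabs only if it actually fills the earlier ones.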



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
