[ 
https://issues.apache.org/jira/browse/PARQUET-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kudinkin updated PARQUET-2106:
-------------------------------------
    Description: 
While writing out large Parquet tables using Spark, we've noticed that 
BinaryComparator is the source of substantial churn of extremely short-lived 
`HeapByteBuffer` objects: it accounts for up to *16%* of all allocations in 
our benchmarks, putting substantial pressure on the Garbage Collector:

!Screen Shot 2021-12-03 at 3.26.31 PM.png|width=828,height=521!

[^profile_48449_alloc_1638494450_sort_by.html]
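The churn comes from wrapping each `byte[]` in a fresh `HeapByteBuffer` via `ByteBuffer.wrap` solely to compare it. A minimal sketch of the alternative, comparing the backing arrays directly with no per-comparison allocation (class and method names here are hypothetical, not the actual parquet-mr patch):

```java
// Hypothetical sketch, not the actual parquet-mr change: lexicographic
// comparison of two byte ranges as unsigned bytes, with no ByteBuffer.wrap
// and therefore no short-lived HeapByteBuffer allocation per call.
public class UnsignedBytesCompare {

    static int compare(byte[] a, int aOff, int aLen,
                       byte[] b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            // Mask to treat each byte as unsigned, matching the
            // lexicographic ordering ByteBuffer comparison would give
            // for unsigned semantics.
            int x = a[aOff + i] & 0xFF;
            int y = b[bOff + i] & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        // Shared prefix is equal: the shorter range sorts first.
        return aLen - bLen;
    }

    public static void main(String[] args) {
        byte[] p = {0x01, (byte) 0xFF};
        byte[] q = {0x01, 0x02};
        // 0xFF is greater than 0x02 when compared as unsigned bytes.
        System.out.println(compare(p, 0, p.length, q, 0, q.length) > 0);
    }
}
```

On JDK 9+, `java.util.Arrays.compareUnsigned(byte[], int, int, byte[], int, int)` (taking from/to indices rather than offset/length) provides the same allocation-free comparison in the standard library.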

  was:
While writing out large Parquet tables using Spark, we've noticed that 
BinaryComparator is the source of substantial churn of extremely short-lived 
`HeapByteBuffer` objects: 

It accounts for up to *16%* of all allocations in our benchmarks

!Screen Shot 2021-12-03 at 3.26.31 PM.png!

[^profile_48449_alloc_1638494450_sort_by.html]


> BinaryComparator should avoid doing ByteBuffer.wrap
> ---------------------------------------------------
>
>                 Key: PARQUET-2106
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2106
>             Project: Parquet
>          Issue Type: Task
>          Components: parquet-mr
>    Affects Versions: 1.12.2
>            Reporter: Alexey Kudinkin
>            Priority: Major
>         Attachments: Screen Shot 2021-12-03 at 3.26.31 PM.png, 
> profile_48449_alloc_1638494450_sort_by.html



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
