[ https://issues.apache.org/jira/browse/HUDI-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ZiyueGuan updated HUDI-1796:
----------------------------
    Description: 
Situation: In ExternalSpillMap, we need to bound the amount of data held in the 
in-memory map to avoid OOM. Currently we estimate the average size of each 
payload twice, and compute total memory use by multiplying the average payload 
size by the number of payloads. The first estimate is taken when the first 
payload is inserted; the second when 100 payloads are stored in memory.

Problem: If the second estimate underestimates the payload size, an OOM will 
happen.

Plan: Could we have a flag to control whether we want an accurate evaluation?

Currently I have several ideas, but I am not sure which one is best or whether 
there is a better one.
 # Estimate each payload and store its length alongside its value. On every 
update or remove, subtract the old length and add the new one if needed, so 
that we keep the sum of all payload sizes exact. This is the method I 
currently use in production.
 # Do not store the length; instead, re-evaluate the old payload when it is 
popped. This trades time for space compared with method one, and may perform 
better when updates and removes are rare. I did not adopt it because, when I 
profiled the ingestion process with arthas, size estimation showed up as 
time-consuming in the flame graph. I am not sure whether the same holds for 
compaction; my intuition is that HoodieRecordPayload has a quite simple 
structure.
 # A more accurate estimation scheme: evaluate the whole map when its size 
reaches 1, 100, 10,000, and 1,000,000. With samples that large, 
underestimation is less likely.
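A minimal sketch of method 1, assuming a pluggable per-payload size estimator. The class and interface names below are illustrative only, not Hudi's actual API:

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of method 1: store each entry's measured size and
 * adjust a running total on put/remove, so the total payload size stays
 * exact without re-sampling.
 */
public class SizeTrackedMap<K, V> {

    /** Pluggable per-payload size measurement (assumed interface). */
    public interface SizeEstimator<V> {
        long sizeOf(V value);
    }

    private final Map<K, V> inner = new HashMap<>();
    private final Map<K, Long> sizes = new HashMap<>();
    private final SizeEstimator<V> estimator;
    private long totalBytes = 0;

    public SizeTrackedMap(SizeEstimator<V> estimator) {
        this.estimator = estimator;
    }

    public V put(K key, V value) {
        long newSize = estimator.sizeOf(value);
        Long oldSize = sizes.put(key, newSize);
        // Diff out the old length and add the new one, keeping the sum exact.
        totalBytes += newSize - (oldSize == null ? 0 : oldSize);
        return inner.put(key, value);
    }

    public V remove(K key) {
        Long oldSize = sizes.remove(key);
        if (oldSize != null) {
            totalBytes -= oldSize;
        }
        return inner.remove(key);
    }

    /** Exact total payload size held in memory; no sampling needed. */
    public long totalBytes() {
        return totalBytes;
    }
}
```

The extra `sizes` map is the space cost this method pays to avoid re-estimating payloads on every eviction.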

Looking forward to any advice, suggestions, or discussion.



> allow ExternalSpillMap use accurate payload size rather than estimated
> ----------------------------------------------------------------------
>
>                 Key: HUDI-1796
>                 URL: https://issues.apache.org/jira/browse/HUDI-1796
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: ZiyueGuan
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
