[ https://issues.apache.org/jira/browse/DERBY-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Knut Anders Hatlen updated DERBY-5416:
--------------------------------------
Attachment: d5416-1a.diff
The attached patch, d5416-1a.diff, implements the variant of 1) mentioned
above, where we simply replace the initial memory usage with the current memory
usage if we detect that the memory usage has gone down.
Although this is not a perfect solution (the memory estimates are still
inaccurate), I don't think it will make the estimates worse in any situation.
In the common case, it won't affect the estimates at all. And when it does
affect the estimates, it changes them from "way too low" (negative values) to
just "too low".
All the regression tests ran cleanly with the patch. So did Ramin's test case.
> SYSCS_COMPRESS_TABLE causes an OutOfMemoryError when the heap is full at call
> time and then gets mostly garbage collected later on
> ----------------------------------------------------------------------------------------------------------------------------------
>
> Key: DERBY-5416
> URL: https://issues.apache.org/jira/browse/DERBY-5416
> Project: Derby
> Issue Type: Bug
> Components: Store
> Affects Versions: 10.6.2.1, 10.7.1.1, 10.8.1.2
> Reporter: Ramin Baradari
> Assignee: Knut Anders Hatlen
> Priority: Critical
> Labels: derby_triage10_9
> Attachments: compress_test_5416.patch, d5416-1a.diff, lowmem-test.diff
>
>
> When compressing a table with an index that is larger than the maximum heap
> size, and which therefore cannot be held in memory as a whole, an
> OutOfMemoryError can occur.
> For this to happen, the heap usage must be close to the maximum heap size at
> the start of the index recreation, and then, while the entries are being
> sorted, a garbage collection run must clean out most of the heap. This can
> happen because a concurrent process releases a huge chunk of memory, or
> simply because the buffer of a previous table compression has not yet been
> garbage collected.
> The internal heuristic that guesses when more memory can be used by the
> merge inserter then estimates that more memory is available, and the sort
> buffer gets doubled. The buffer size keeps doubling until the heap usage is
> back at the level measured when the merge inserter was first initialized, or
> until the OutOfMemoryError occurs.
> The problem lies in MergeInsert.insert(...). The check whether the buffer
> can be doubled contains the expression "estimatedMemoryUsed < 0", where
> estimatedMemoryUsed is the difference between the current heap usage and the
> heap usage at initialization. Unfortunately, in the aforementioned scenario
> this expression stays true until the heap usage gets close to the maximum
> heap size, so the doubling of the buffer size is not stopped in time.
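> A minimal sketch of that pattern (identifiers such as beginMemoryUsage and
> doubleSortBufferSize are illustrative paraphrases, not Derby's actual code):
>
>     class GrowthCheckSketch {
>         // Baseline heap usage recorded when the merge inserter is
>         // initialized, at a point where the heap happens to be nearly
>         // full.
>         private final long beginMemoryUsage = currentUsage();
>
>         private static long currentUsage() {
>             Runtime jvm = Runtime.getRuntime();
>             return jvm.totalMemory() - jvm.freeMemory();
>         }
>
>         void maybeGrowBuffer() {
>             long estimatedMemoryUsed = currentUsage() - beginMemoryUsage;
>
>             // After a large GC run, the current usage drops far below
>             // the baseline, so estimatedMemoryUsed turns negative and
>             // this test keeps passing until the heap fills up again or
>             // an OutOfMemoryError is thrown.
>             if (estimatedMemoryUsed < 0) {
>                 doubleSortBufferSize();
>             }
>         }
>
>         private void doubleSortBufferSize() {
>             // Stand-in for growing the sort buffer; omitted here.
>         }
>     }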
> I've tested it with 10.6.2.1, 10.7.1.1, and 10.8.1.2, but the actual bug
> most likely exists in prior versions too.
--
This message was sent by Atlassian JIRA
(v6.1.4#6159)