[ 
https://issues.apache.org/jira/browse/IOTDB-84?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangrong closed IOTDB-84.
-------------------------

> Out-of-Memory bug
> -----------------
>
>                 Key: IOTDB-84
>                 URL: https://issues.apache.org/jira/browse/IOTDB-84
>             Project: Apache IoTDB
>          Issue Type: Bug
>            Reporter: kangrong
>            Priority: Major
>              Labels: out-of-memory, pull-request-available
>         Attachments: image-2019-04-22-12-38-04-903.png, 
> image-2019-04-24-11-50-27-220.png, iostat.png
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> An out-of-memory problem occurred during the last long-term test of the 
> branch "add_disabled_mem_control":
> !image-2019-04-22-12-38-04-903.png!
> We analyzed the causes and propose solutions as follows:
>  # *Flushing to disk may double the memory cost*: A storage group 
> maintains a list of ChunkGroups in memory, which are flushed to disk when 
> their occupied memory exceeds the threshold (128 MB by default).
>  ## In the current implementation, when a flush starts, each ChunkGroup 
> is encoded into a new byte array that is kept in memory, and these byte 
> arrays can only be released together, after all ChunkGroups have been 
> encoded. Since an encoded byte array is comparable in size to the original 
> data (0.5× to 1×), this strategy may double the memory usage in the worst 
> case.
>  ## Solution: the flush strategy needs to be redesigned. In TsFile, a Page 
> is the minimal flush unit: a ChunkGroup contains several Chunks, and a 
> Chunk contains several Pages. Once a page is encoded into a byte array, we 
> can flush that byte array to disk and then release it. In this case, the extra 
> memory is at most one page (64 KB by default). This modification involves 
> cascading changes, including to the metadata format and the writing process.
>  # *Memory control strategy*: the memory control strategy needs to be 
> redesigned, for example by assigning 60% of memory to the writing process 
> and 30% to the querying process. The writing memory includes the memtable 
> and the flush process. When an insert arrives, if its required memory exceeds 
> TotalMem * 0.6 - MemTableUsage - FlushUsage, the insert is rejected.
>  # *Are the memory statistics accurate?* In the current code, the memory 
> usage of a TSRecord Java object, corresponding to one insert SQL statement, 
> is calculated by summing its DataPoints. E.g., for "insert into 
> root.a.b.c(timestamp, v1, v2) values(1L, true, 1.2f)", the usage is 
> 8 + 1 + 4 = 13 bytes, which ignores the size of object headers and other 
> overhead. The memory statistics need to be redesigned carefully.
>  # *Is there still a memory leak?* As shown in the log of the last crash 
> caused by an out-of-memory exception, the actual JVM memory was 18 GB, 
> whereas our memory statistics module only counted 8 GB. Besides the 
> inaccuracy mentioned in point 3, we suspect there are still memory leaks or 
> other potential problems. We will continue to debug this.
>  
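The page-level flush proposed in point 1 can be sketched as follows. This is a minimal illustration, not IoTDB's real API: `encodePage` stands in for the TsFile page encoder, and a `ByteArrayOutputStream` stands in for the file stream. The point is that each encoded page is written and released before the next is built, so the extra memory is bounded by one page rather than by a whole ChunkGroup's encoded bytes.

```java
import java.io.ByteArrayOutputStream;
import java.util.List;

// Hypothetical sketch of page-level flushing: each page is encoded,
// written out, and released immediately, so the extra memory held at any
// moment is at most one page (64 KB by default) instead of the whole
// ChunkGroup's encoded byte arrays.
public class PageLevelFlusher {

    // Stand-in for the real TsFile page encoder: big-endian longs.
    static byte[] encodePage(long[] values) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (long v : values) {
            for (int shift = 56; shift >= 0; shift -= 8) {
                buf.write((int) (v >>> shift) & 0xFF);
            }
        }
        return buf.toByteArray();
    }

    // Flush a chunk page by page; 'out' stands in for the TsFile stream.
    // Returns the peak extra bytes held, i.e. at most one encoded page.
    static int flushChunk(List<long[]> pages, ByteArrayOutputStream out) {
        int peakExtra = 0;
        for (long[] page : pages) {
            byte[] encoded = encodePage(page);        // encode ONE page only
            peakExtra = Math.max(peakExtra, encoded.length);
            out.write(encoded, 0, encoded.length);    // flush it
            // 'encoded' becomes unreachable here: extra memory <= one page
        }
        return peakExtra;
    }
}
```

Flushing two ten-value pages this way holds at most 80 encoded bytes at a time, while the old all-ChunkGroups-at-once strategy would hold the full 160.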

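The admission rule in point 2 amounts to a small budget check. The sketch below uses invented names (`WriteMemoryController`, `tryAdmit`); only the rejection condition TotalMem * 0.6 - MemTableUsage - FlushUsage comes from the proposal.

```java
// Hypothetical sketch of the proposed write-path memory budget: writes get
// 60% of total memory, and an insert is rejected when its estimated cost
// exceeds totalMem * 0.6 - memTableUsage - flushUsage.
public class WriteMemoryController {
    final long totalMem;
    long memTableUsage = 0;
    long flushUsage = 0;

    WriteMemoryController(long totalMem) {
        this.totalMem = totalMem;
    }

    // Memory still available to the writing process under the 60% quota.
    long writeBudget() {
        return (long) (totalMem * 0.6) - memTableUsage - flushUsage;
    }

    // Returns true if the insert is admitted; false means reject it.
    boolean tryAdmit(long insertCost) {
        if (insertCost > writeBudget()) {
            return false;                // would exceed the write quota
        }
        memTableUsage += insertCost;     // account the admitted insert
        return true;
    }
}
```

With a 1000-byte total, the write budget starts at 600; after admitting a 500-byte insert, a 200-byte insert is rejected while a 100-byte one still fits.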

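The accounting described in point 3 can be reproduced in a few lines. This is an illustrative model, not IoTDB's actual code: it sums primitive widths exactly as described (8 for the long timestamp plus each value's width), which is why the example insert counts as 13 bytes while the JVM also pays for object headers, references, and alignment that this estimate ignores.

```java
// Hypothetical model of the current per-record accounting: the size of a
// TSRecord is taken as the sum of its primitive value widths, e.g.
// "insert into root.a.b.c(timestamp, v1, v2) values(1L, true, 1.2f)"
// counts 8 (long timestamp) + 1 (boolean) + 4 (float) = 13 bytes,
// ignoring object headers and other JVM overhead.
public class RecordSizeEstimator {
    enum Type { BOOLEAN, INT32, INT64, FLOAT, DOUBLE }

    static int primitiveWidth(Type t) {
        switch (t) {
            case BOOLEAN: return 1;
            case INT32:   return 4;
            case INT64:   return 8;
            case FLOAT:   return 4;
            case DOUBLE:  return 8;
            default: throw new IllegalArgumentException("unknown type: " + t);
        }
    }

    // Current-style estimate: 8 bytes for the timestamp plus each value.
    static int estimate(Type[] valueTypes) {
        int size = 8;  // long timestamp
        for (Type t : valueTypes) {
            size += primitiveWidth(t);
        }
        return size;
    }
}
```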

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)