Good summary. I would like to add a few minor supplements.

1. I think the data of one series can be discarded as soon as its corresponding Chunk is generated, so you do not need to redesign the whole flush process; a simple optimization may do the trick (a minimal sketch of this idea is attached after the quoted message below).

2. Yeah, as long as you can estimate how much memory is occupied by a query. But I suggest we focus on insertion first.

3. Of course not. Accounting for all intermediate objects and object overhead is a long way to go, but we can adjust it little by little (a rough estimator sketch is attached at the end of this mail).

4. I have to point out that you may have a BufferWrite working memtable, a BufferWrite flushing memtable, an Overflow working memtable, and an Overflow flushing memtable at the same time, so simply doubling the estimate may not be enough (see the admission-check sketch after the quoted message).

> On Apr 22, 2019, at 12:10 PM, kangr15 <[email protected]> wrote:
>
> Hi all:
> Sorry for the text format, the following is reorganized:
>
> 1. Flushing to disk may double the memory cost. A storage group maintains a
> list of ChunkGroups in memory and flushes them to disk when its occupied
> memory exceeds the threshold (128 MB by default). In the current
> implementation, when a flush starts, each ChunkGroup is encoded in memory,
> which keeps a new byte array in memory. Only after all ChunkGroups have been
> encoded can their byte arrays be released together. Since a byte array has a
> size comparable to the original data (0.5x to 1x), this strategy may double
> the memory usage in the worst case.
> Solution: the flush strategy needs to be redesigned. In TsFile, a Page is the
> minimal flush unit: a ChunkGroup contains several Chunks and a Chunk contains
> several Pages. Once a Page is encoded into a byte array, we can flush the
> byte array to disk and then release it. In this case the extra memory is at
> most one page size (64 KB by default). This modification involves a series of
> cascading changes, including the metadata format and the writing process.
>
> 2. Memory control strategy. The memory control strategy needs to be
> redesigned, for example assigning 60% of the memory to the writing process
> and 30% to the querying process. The writing memory includes the memtables
> and the flush process. When an Insert arrives, if its required memory exceeds
> TotalMem * 0.6 - MemTableUsage - FlushUsage, the Insert is rejected.
>
> 3. Are the memory statistics accurate? In the current code, the memory usage
> of a TSRecord Java object, which corresponds to one Insert SQL statement, is
> calculated by summing its DataPoints. E.g., for "insert into
> root.a.b.c(timestamp, v1, v2) values(1L, true, 1.2f)" the usage is
> 8 + 1 + 4 = 13 bytes, which ignores object headers and other overhead. The
> accuracy of the memory statistics needs to be reconsidered carefully.
>
> 4. Is there still a memory leak? In the log of the last crash caused by an
> out-of-memory exception, the actual JVM memory was 18 GB, whereas our memory
> statistics module only counted 8 GB. Besides the inaccuracy mentioned in Q3,
> we suspect there is still a memory leak or other potential problems. We will
> continue to debug it.
>
> —
> Best Regards,
> Rong Kang
> School of Software, Tsinghua University
>
> Original Message
> From: [email protected]
> To: [email protected]
> Sent: Monday, Apr 22, 2019, 12:01
> Subject: Out-of-Memory Analysis
> (the unformatted original of the four points above)
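
To make point 1 concrete, here is a minimal sketch of what I have in mind: release the raw data of each series immediately after its Chunk has been encoded and written, instead of holding all encoded ChunkGroups until the end of the flush. The SeriesBuffer and ChunkGroupWriter types are hypothetical placeholders rather than the actual IoTDB classes, so please read this as a sketch of the idea, not a patch:

    import java.io.IOException;
    import java.util.Map;

    public class PerSeriesFlushSketch {

        // Flush one ChunkGroup series by series; the raw data of a series is
        // released right after its Chunk has been encoded and written, so at
        // any moment only one encoded Chunk (not a whole encoded ChunkGroup)
        // is held in memory besides the not-yet-flushed raw data.
        public void flushChunkGroup(Map<String, SeriesBuffer> seriesData,
                                    ChunkGroupWriter writer) throws IOException {
            for (Map.Entry<String, SeriesBuffer> entry : seriesData.entrySet()) {
                SeriesBuffer buffer = entry.getValue();
                byte[] encodedChunk = writer.encodeChunk(entry.getKey(), buffer);
                writer.writeToDisk(encodedChunk);  // flush this Chunk
                buffer.release();                  // discard this series' raw data now
                // encodedChunk becomes unreachable here and can be collected
            }
            writer.endChunkGroup();                // write ChunkGroup footer / metadata
        }

        // Hypothetical collaborators, only to keep the sketch self-contained.
        interface SeriesBuffer { void release(); }

        interface ChunkGroupWriter {
            byte[] encodeChunk(String measurementId, SeriesBuffer buffer) throws IOException;
            void writeToDisk(byte[] chunk) throws IOException;
            void endChunkGroup() throws IOException;
        }
    }

With this, the worst-case extra memory during a flush is roughly one encoded Chunk instead of the whole encoded memtable; the page-level flushing described in the quoted mail would shrink it further, to about one page.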
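Regarding points 2 and 4, the admission check could look roughly like the following. The important part is that the usage side sums all four memtables that may coexist (BufferWrite working, BufferWrite flushing, Overflow working, Overflow flushing) plus the encoded byte arrays held by ongoing flushes. The class and field names and the 0.6 ratio are illustrative assumptions, not existing IoTDB code:

    public class WriteMemoryController {

        private static final double WRITE_RATIO = 0.6;  // share of total memory assigned to writes

        private final long totalMem;                     // e.g. Runtime.getRuntime().maxMemory()

        private volatile long bufferWriteWorkingUsage;
        private volatile long bufferWriteFlushingUsage;
        private volatile long overflowWorkingUsage;
        private volatile long overflowFlushingUsage;
        private volatile long flushEncodingUsage;        // encoded byte arrays not yet released

        public WriteMemoryController(long totalMem) {
            this.totalMem = totalMem;
        }

        // Returns true if an insert that needs requiredBytes may proceed, i.e.
        // requiredBytes <= TotalMem * 0.6 - (all memtable usage) - (flush usage).
        public boolean tryAdmit(long requiredBytes) {
            long used = bufferWriteWorkingUsage + bufferWriteFlushingUsage
                    + overflowWorkingUsage + overflowFlushingUsage
                    + flushEncodingUsage;
            long writeBudget = (long) (totalMem * WRITE_RATIO);
            return requiredBytes <= writeBudget - used;  // otherwise reject the insert
        }
    }

The writing components would update these counters whenever a memtable is created, switched to flushing, or released, so a single working/flushing pair is never assumed.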

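For point 3, a slightly more honest per-record estimate would at least add the JVM object overhead around the TSRecord and each DataPoint. The constants below (12-byte object header, 4-byte compressed references, 8-byte alignment) are typical 64-bit HotSpot assumptions, and the sketch still ignores the measurement-id Strings and the list's backing array, so the real footprint is larger still:

    public class RecordSizeEstimator {

        private static final int OBJECT_HEADER = 12;  // assumed mark word + class pointer
        private static final int REFERENCE = 4;       // assumed compressed oop
        private static final int ALIGNMENT = 8;       // objects padded to 8 bytes

        private static long align(long size) {
            return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
        }

        // Estimate the heap footprint of one TSRecord-like object: a long
        // timestamp plus one DataPoint-like object per value. rawValueSizes
        // holds the primitive payload of each value, e.g. {1, 4} for
        // (true, 1.2f) in the example insert.
        public static long estimate(int[] rawValueSizes) {
            long total = align(OBJECT_HEADER + 8 + REFERENCE);    // record: header + timestamp + list ref
            for (int raw : rawValueSizes) {
                total += align(OBJECT_HEADER + REFERENCE + raw);  // DataPoint: header + name ref + payload
                total += REFERENCE;                               // slot in the backing array
            }
            return total;
        }
    }

For the example insert this gives about 80 bytes instead of 13, which is still an underestimate but already much closer to what the heap actually pays.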