Cpaulyz opened a new pull request, #9659: URL: https://github.com/apache/iotdb/pull/9659
Cherry-pick from #9631.

## Description

During a flush of SchemaFile for numerous Entity/Internal Nodes, the pre-allocation mechanism causes a surge in dirty pages. Finding a suitable SegmentedPage may require a full iteration over the dirty pages when each of them has too little spare space to accommodate another segment. The resulting quadratic time overhead can make the flush appear to be stuck when handling a massive number of child nodes, e.g., 10 million.

### Solution

An array of LinkedLists has been introduced to index dirty pages into tiers based on their spare size, with the tiers derived from the minimum segment size and other step sizes. When a SegmentedPage is marked as dirty, it is added to one of the LinkedLists if its spare size is larger than the minimum segment size.

### Further Improvement

The size of pageInstCache is constrained by the config file, so the cost of retrieval from it will not surge the way it does for dirty pages. However, there is a performance pitfall if the application frequently writes small amounts of data after reading almost full pages: each such write leads to a full iteration over pageInstCache that could be optimized away. Addressing this requires a refactor of the cache mechanism or a complete spare page index, and should be done in future versions.

### Evaluation

In testing, flushing 11149000 devices under the same db node takes about 50s.
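
### Illustrative sketch

The tiered index described under Solution can be pictured with the minimal Java sketch below. It is not the code in this PR: the class `TieredSpareIndex`, the stand-in `Page` type, the method names, and the tier thresholds in `BOUNDS` are all hypothetical, chosen only to show how bucketing pages by spare size avoids a full scan of the dirty pages.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

/** Illustrative sketch only; names and thresholds are assumptions, not IoTDB code. */
final class TieredSpareIndex {
  /** Hypothetical stand-in for a SegmentedPage: a page index and its spare bytes. */
  static final class Page {
    final int index;
    int spare;

    Page(int index, int spare) {
      this.index = index;
      this.spare = spare;
    }
  }

  // Assumed lower bound of each tier, ascending; BOUNDS[0] plays the role of
  // the minimum segment size.
  private static final int[] BOUNDS = {25, 512, 1024, 4096};

  private final List<LinkedList<Page>> tiers = new ArrayList<>();

  TieredSpareIndex() {
    for (int i = 0; i < BOUNDS.length; i++) {
      tiers.add(new LinkedList<>());
    }
  }

  /** Highest tier whose lower bound the spare size reaches, or -1 if below the minimum. */
  private static int tierOf(int spare) {
    int tier = -1;
    for (int i = 0; i < BOUNDS.length && spare >= BOUNDS[i]; i++) {
      tier = i;
    }
    return tier;
  }

  /** Index a dirty page once; pages with too little spare space are left out. */
  void markDirty(Page page) {
    int tier = tierOf(page.spare);
    if (tier >= 0) {
      tiers.get(tier).add(page);
    }
  }

  /**
   * Take a page that is guaranteed to fit the requested bytes, or null if none
   * is indexed. Only tiers whose lower bound covers the request are consulted,
   * so no list is scanned; a page in a lower tier that happens to fit is
   * deliberately ignored in this simplified version.
   */
  Page allocate(int required) {
    for (int i = 0; i < BOUNDS.length; i++) {
      if (BOUNDS[i] < required || tiers.get(i).isEmpty()) {
        continue;
      }
      Page page = tiers.get(i).poll();
      page.spare -= required;
      markDirty(page); // re-index under the reduced spare size
      return page;
    }
    return null; // the caller would allocate a fresh page here
  }

  public static void main(String[] args) {
    TieredSpareIndex idx = new TieredSpareIndex();
    idx.markDirty(new Page(0, 10)); // below the assumed minimum, not indexed
    idx.markDirty(new Page(1, 600));
    idx.markDirty(new Page(2, 5000));
    Page hit = idx.allocate(500);
    System.out.println("page " + hit.index + ", spare left " + hit.spare);
  }
}
```

In this sketch a lookup costs time proportional to the number of tiers rather than the number of dirty pages. The trade-off is that a request may be sent to a fresh page even though some page in a lower tier could have held it.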
