[jira] [Assigned] (IOTDB-5792) Parallel encoding in MemTable flush

Tian Jiang (Jira) Tue, 18 Apr 2023 20:18:04 -0700


     [ 
https://issues.apache.org/jira/browse/IOTDB-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Tian Jiang reassigned IOTDB-5792:
---------------------------------

    Assignee: Tian Jiang

> Parallel encoding in MemTable flush
> -----------------------------------
>
>                 Key: IOTDB-5792
>                 URL: https://issues.apache.org/jira/browse/IOTDB-5792
>             Project: Apache IoTDB
>          Issue Type: Improvement
>          Components: Core/Engine
>            Reporter: Tian Jiang
>            Assignee: Tian Jiang
>            Priority: Major
>              Labels: encoding, flush
>             Fix For: master branch
>
>
> Currently, there is only one encoding task for each MemTable flushing task. 
> In other words, encoding during a MemTable flush is fully serialized. 
> Thus, when the MemTable is large, encoding becomes considerably 
> time-consuming. This is especially true when the computing power of a single 
> core is low, which is common for commercial servers with many cores.
> In one of my experiments, there were 1M time series (datatype = double) in a 
> MemTable, and the average number of points per series was around 300, making 
> the total size of the MemTable about 5GB. Encoding such a MemTable took, 
> incredibly, over 100s. The system easily enters a reject status because the 
> flushing is so slow.
> Since the encoding process is naturally parallelizable (it is a purely 
> in-memory operation with perfect locality), I would like to propose replacing 
> the single-threaded encoding process with multiple threads.
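
The proposal above could be sketched roughly as follows. This is only a
minimal illustration in plain Java, not IoTDB's actual flush code: the class
ParallelEncodeSketch, the methods encodeSeries and encodeAll, and the
"encoding" itself (raw 8-byte doubles) are all hypothetical stand-ins. It
assumes each series can be encoded independently and that results must come
back in the original series order.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelEncodeSketch {

  // Hypothetical placeholder encoder: writes each double as 8 raw bytes.
  // A real encoder would apply an actual time series encoding here.
  static byte[] encodeSeries(double[] values) {
    ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES);
    for (double v : values) {
      buf.putDouble(v);
    }
    return buf.array();
  }

  // Encodes every series on a fixed-size thread pool instead of one thread.
  // Futures are collected in submission order, so the output order matches
  // the input order regardless of which task finishes first.
  static List<byte[]> encodeAll(List<double[]> seriesList, int threads)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<byte[]>> futures = new ArrayList<>(seriesList.size());
      for (double[] series : seriesList) {
        futures.add(pool.submit(() -> encodeSeries(series)));
      }
      List<byte[]> encoded = new ArrayList<>(futures.size());
      for (Future<byte[]> future : futures) {
        encoded.add(future.get()); // blocks until this series is encoded
      }
      return encoded;
    } finally {
      pool.shutdown();
    }
  }
}
```

With around 1M series, submitting one task per series adds scheduling
overhead, so grouping series into batches per task would likely be a better
fit. The sketch keeps one task per series only for brevity.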



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
