[ 
https://issues.apache.org/jira/browse/TUBEMQ-110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guocheng Zhang updated TUBEMQ-110:
----------------------------------
    Fix Version/s:     (was: 0.7.0)

> Optimize Broker storage to increase throughput
> ----------------------------------------------
>
>                 Key: TUBEMQ-110
>                 URL: https://issues.apache.org/jira/browse/TUBEMQ-110
>             Project: Apache TubeMQ
>          Issue Type: Task
>            Reporter: Guocheng Zhang
>            Assignee: Guocheng Zhang
>            Priority: Major
>              Labels: features, performance
>
> I think the Broker's read and write performance still has considerable 
> room for improvement. We need to keep iterating to improve the storage 
> performance of the system. I have listed some considerations below and 
> hope to get some better suggestions:
> 1. Data read and write operations should take the characteristics of the 
> disk into account. For example, a disk uses 512-byte sectors as its storage 
> unit and reads data in batches of 64 KB, and the file system evicts cached 
> pages from memory according to certain rules. If the read and write 
> operations take these factors into account, I believe the current TPS can 
> be higher;
> 2. Storage should address fragmentation of disk space, for example by 
> pre-allocating fixed-length files and reusing aged files, so that disk 
> files can be read sequentially and data read and write speed improves;
> 3. The number of memory cache blocks should be configurable: the memory 
> cache is currently managed with a fixed configuration of 2 memory blocks 
> per topic. We should let users allocate more memory cache space based on 
> the resources actually available;
> 4. More efficient memory-to-disk flushing: at present, the flush operation 
> writes messages from memory to disk one by one. This can be changed to 
> write to disk in batches, one memory block at a time, thereby improving 
> storage efficiency;
> 5. Remove the SSD-assisted consumption function: because SSD capacity is 
> too small, SSD-assisted consumption is not suitable for practical 
> applications. It should be removed to avoid user confusion, and the 
> related configurations and settings need to be cleaned up;
> 6. Stored files should carry a file header that includes the data version 
> information, so that later storage schemes remain seamlessly compatible 
> with the data format of older versions;
> 7. Add a checkpoint mechanism: on restart, the current system checks the 
> validity of only the last file. In fact, when the system is shut down, 
> several consecutive files may still be held in memory, so checking only 
> the last file can easily let abnormal data get mixed into the data stream. 
> We should add a checkpoint mechanism to handle this situation.
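
The batched memory-to-disk write proposed in item 4 could be sketched roughly as follows. This is only an illustration under stated assumptions, not TubeMQ's actual storage code: the class name BatchFlushSketch, the methods packBlock and flushBlock, and the 4-byte length-prefix framing are all hypothetical and chosen for the example. The point is that a whole memory block is packed into one buffer and handed to the channel in one pass, with a single force() at the end, instead of one write per message.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

public class BatchFlushSketch {

    // Pack all messages of one memory block into a single buffer.
    // Assumed framing (hypothetical): 4-byte length prefix, then the payload.
    static ByteBuffer packBlock(List<byte[]> messages) {
        int size = 0;
        for (byte[] m : messages) {
            size += Integer.BYTES + m.length;
        }
        ByteBuffer block = ByteBuffer.allocate(size);
        for (byte[] m : messages) {
            block.putInt(m.length).put(m);
        }
        block.flip(); // switch the buffer from filling to draining
        return block;
    }

    // Append the whole block to the file and sync once, rather than
    // issuing one write (and possibly one sync) per message.
    static long flushBlock(Path file, ByteBuffer block) throws IOException {
        long written = 0;
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            while (block.hasRemaining()) {
                written += ch.write(block); // write may be partial, so loop
            }
            ch.force(false); // one sync of file content per block
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("tube-block", ".dat");
        List<byte[]> messages = List.of(
                "m1".getBytes(StandardCharsets.UTF_8),
                "m2".getBytes(StandardCharsets.UTF_8),
                "m3".getBytes(StandardCharsets.UTF_8));
        long written = flushBlock(file, packBlock(messages));
        System.out.println("bytes written: " + written);
        Files.delete(file);
    }
}
```

A real implementation would probably reuse a preallocated (possibly direct) buffer per memory block and keep the channel open across flushes; both are left out here to keep the sketch short.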



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
