[
https://issues.apache.org/jira/browse/TUBEMQ-110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Guocheng Zhang updated TUBEMQ-110:
----------------------------------
Fix Version/s: (was: 0.7.0)
> Optimize Broker storage to increase throughput
> ----------------------------------------------
>
> Key: TUBEMQ-110
> URL: https://issues.apache.org/jira/browse/TUBEMQ-110
> Project: Apache TubeMQ
> Issue Type: Task
> Reporter: Guocheng Zhang
> Assignee: Guocheng Zhang
> Priority: Major
> Labels: features, performance
>
> I think the Broker's read and write performance still has considerable room
> for improvement, and we need to keep iterating on the storage performance of
> the system. I have listed some considerations below and hope to get some
> better suggestions:
> 1. Data read and write operations should take the characteristics of the
> disk and the file system into account. For example, the disk uses 512-byte
> sectors as its storage unit and reads data in batches of 64 KB, and the file
> system evicts cached pages from memory according to certain rules. If read
> and write operations take these factors into account, I believe the current
> TPS can be higher;
> 2. Storage should address the fragmentation of disk space, for example by
> pre-allocating fixed-length files and reusing aged files, so that disk files
> can be read sequentially and data read and write speed improves;
> 3. The number of memory cache blocks should be configurable: the memory
> cache is currently managed with a fixed configuration of 2 memory blocks per
> topic. We should allow users to allocate more memory cache space based on
> the resources actually available;
> 4. More efficient memory-to-disk flushing: at present, the flush operation
> writes messages from memory to disk one by one. This could be changed to
> write to disk in batches, one memory block at a time, thereby improving
> storage efficiency;
> 5. Remove the SSD auxiliary consumption function: because SSD capacity is
> too small, SSD-assisted consumption is not suitable for practical use. It
> should be removed to avoid user confusion, and the related configurations
> and settings need to be cleaned up;
> 6. Stored files should carry a file header that includes the data version
> information, so that later storage schemes can remain seamlessly compatible
> with the data format of the old version;
> 7. Add a CheckPoint mechanism: when the system restarts, it currently checks
> the validity of only the last file. In fact, when the system shuts down,
> multiple consecutive files may still be partly held in memory, so checking
> only the last file can easily let abnormal data get mixed into the data
> stream. We should add a CheckPoint mechanism to handle this situation.
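
To make item 6 concrete, here is a minimal sketch of a versioned file header.
It is not TubeMQ's actual format. The class name SegmentFileHeader, the magic
value, the 16-byte layout, and every field are assumptions chosen only for
illustration. The idea is that a reader checks the magic and version before
touching the data, so a later format can be detected instead of misread.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical fixed-size header for a stored data file (illustration only).
 * Assumed layout, big-endian, 16 bytes:
 *   magic (4) | version (2) | flags (2) | createTimeMs (8)
 */
final class SegmentFileHeader {
    // Assumed magic value: the ASCII bytes "TBMQ".
    static final int MAGIC = 0x54424D51;
    static final int HEADER_SIZE = 16;
    static final short CURRENT_VERSION = 1;

    final short version;
    final short flags;
    final long createTimeMs;

    SegmentFileHeader(short version, short flags, long createTimeMs) {
        this.version = version;
        this.flags = flags;
        this.createTimeMs = createTimeMs;
    }

    /** Serializes the header into a buffer that is ready to be written. */
    ByteBuffer encode() {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_SIZE);
        buf.putInt(MAGIC)
           .putShort(version)
           .putShort(flags)
           .putLong(createTimeMs);
        buf.flip();
        return buf;
    }

    /** Parses a header and rejects files with no magic or an unknown version. */
    static SegmentFileHeader decode(ByteBuffer buf) {
        if (buf.remaining() < HEADER_SIZE) {
            throw new IllegalArgumentException("truncated header");
        }
        if (buf.getInt() != MAGIC) {
            throw new IllegalArgumentException("bad magic: not a versioned file");
        }
        short version = buf.getShort();
        if (version < 1 || version > CURRENT_VERSION) {
            throw new IllegalArgumentException("unsupported version " + version);
        }
        short flags = buf.getShort();
        long createTimeMs = buf.getLong();
        return new SegmentFileHeader(version, flags, createTimeMs);
    }
}
```

A reader built this way could branch on the decoded version to pick the
matching record parser. Files written before a header existed would fail the
magic check and could fall back to the old format.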
--
This message was sent by Atlassian Jira
(v8.3.4#803005)