On 2026/3/3 14:25, Nithurshen Karthikeyan wrote:
Hi Xiang,
On Mon, Mar 2, 2026 at 9:22 PM Gao Xiang <[email protected]> wrote:
Hi Nithurshen,
Glad to see the improvements; I think there is still more room
for improvement anyway.
Also, there is still some follow-up work. I'm busy for the next
couple of days, but I will release a formal GSoC page later.
Thanks,
Gao Xiang
I completely agree that there is significant room for improvement
beyond this initial batching implementation. In my formal GSoC
proposal, I plan to explore several key areas:
1) Dynamically scaling the batch size based on the algorithm's
complexity. Fast algorithms like LZ4 can handle larger batches to hide
latency, while compute-heavy LZMA/ZSTD may require smaller, more
frequent dispatches to keep cores saturated without bloating memory.
2) Currently, the directory traversal remains serial. I want to
investigate parallelizing the inode verification pass to ensure the
entire fsck process is truly multi-threaded.
3) Implementing a sliding-window or credit-based throttle to prevent
worker threads from over-consuming memory on low-resource devices when
disk I/O is slow.
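
For illustration only, the credit-based throttle in item 3) could be
sketched roughly as below. This is a minimal sketch assuming POSIX
threads; `struct credit_pool` and the `credit_*` helpers are
hypothetical names and not part of erofs-utils. A worker takes credits
equal to the buffer size before allocating and gives them back once
the buffer has been written out.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical byte-credit pool; not an existing erofs-utils API. */
struct credit_pool {
	pthread_mutex_t lock;
	pthread_cond_t avail;
	size_t credits;		/* bytes currently available */
	size_t capacity;	/* upper bound on in-flight buffer bytes */
};

static int credit_pool_init(struct credit_pool *p, size_t capacity)
{
	int err = pthread_mutex_init(&p->lock, NULL);

	if (err)
		return err;
	err = pthread_cond_init(&p->avail, NULL);
	if (err) {
		pthread_mutex_destroy(&p->lock);
		return err;
	}
	p->credits = p->capacity = capacity;
	return 0;
}

/* Block the calling worker until `bytes` credits can be taken. */
static void credit_acquire(struct credit_pool *p, size_t bytes)
{
	assert(bytes <= p->capacity);	/* otherwise it could never succeed */
	pthread_mutex_lock(&p->lock);
	while (p->credits < bytes)
		pthread_cond_wait(&p->avail, &p->lock);
	p->credits -= bytes;
	pthread_mutex_unlock(&p->lock);
}

/* Non-blocking variant: returns 1 if the credits were taken, else 0. */
static int credit_try_acquire(struct credit_pool *p, size_t bytes)
{
	int ok = 0;

	pthread_mutex_lock(&p->lock);
	if (p->credits >= bytes) {
		p->credits -= bytes;
		ok = 1;
	}
	pthread_mutex_unlock(&p->lock);
	return ok;
}

/* Give credits back once the buffer is written out and freed. */
static void credit_release(struct credit_pool *p, size_t bytes)
{
	pthread_mutex_lock(&p->lock);
	p->credits += bytes;
	assert(p->credits <= p->capacity);
	pthread_cond_broadcast(&p->avail);	/* waiters re-check their size */
	pthread_mutex_unlock(&p->lock);
}
```

The capacity bounds the total bytes held in in-flight buffers
regardless of how many workers are running; whether such a bound is
needed at all is a separate question.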
For part 3), the Linux kernel already has dirty page throttling,
and it will block a worker thread once there is enough dirty page
cache.
So I wonder if it's really necessary: apart from that, the
remaining memory (including memory allocated in the compressors) is
all temporary buffers, which should correlate with the number of workers.
For parts 1) and 2), I think they are good.
Also, there is another thing about metadata/directory decompression:
I think that in order to make these faster, we need to implement
caching for them, much like what we do for fragments now.
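
As a rough illustration of the caching idea, a small direct-mapped
cache of decompressed blocks might look like the sketch below. All
names (`struct mcache`, `mcache_get`, `demo_fill`) and the slot count
and block size are assumptions made for this example; it is not
modeled on the existing fragment cache, and it is not thread-safe as
written.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MCACHE_SLOTS	16	/* hypothetical number of cached blocks */
#define MCACHE_BLKSIZE	4096	/* hypothetical decompressed block size */

struct mcache_slot {
	uint64_t addr;		/* block address this slot holds */
	int valid;
	unsigned char data[MCACHE_BLKSIZE];
};

struct mcache {
	struct mcache_slot slots[MCACHE_SLOTS];
	unsigned long hits, misses;
};

/* Callback that decompresses block `addr` into `out`; 0 on success. */
typedef int (*mcache_fill_t)(uint64_t addr, unsigned char *out, void *priv);

static void mcache_init(struct mcache *c)
{
	memset(c, 0, sizeof(*c));
}

/*
 * Return the decompressed block at `addr`, calling `fill` only on a
 * miss. A conflicting block in the same slot is simply overwritten.
 */
static const unsigned char *mcache_get(struct mcache *c, uint64_t addr,
				       mcache_fill_t fill, void *priv)
{
	struct mcache_slot *s = &c->slots[addr % MCACHE_SLOTS];

	if (s->valid && s->addr == addr) {
		c->hits++;
		return s->data;
	}
	c->misses++;
	s->valid = 0;		/* slot contents are about to change */
	if (fill(addr, s->data, priv))
		return NULL;
	s->addr = addr;
	s->valid = 1;
	return s->data;
}

/* Stand-in for a real decompressor: fills the block, counts calls. */
static int demo_fill(uint64_t addr, unsigned char *out, void *priv)
{
	int *calls = priv;

	memset(out, (int)(addr & 0xff), MCACHE_BLKSIZE);
	(*calls)++;
	return 0;
}
```

Repeated lookups of the same directory or metadata block then skip
decompression; a real implementation would need locking (or per-worker
caches) and probably a better eviction policy.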
Before I begin drafting my proposal based on these goals, can you
kindly let me know your thoughts on this?
Thanks for the guidance!
Best regards,
Nithurshen