Hi folks,

As you may have noticed, I'm working on several different things for the
upcoming 5.16 Linux kernel, including multiple device support for
multi-layer container images (for runc and Kata-like containers) and
LZMA algorithm support (I'll send out the formal patchset this week).

Here is the EROFS roadmap for the near/mid term, as far as I know:

Container use cases:
 - Multiple device/blob support (v5.16, me);

 - Other work in progress (our whole team), mainly in the form of the
   new RAFS v6 format, which is an EROFS-compatible format for the
   nydus [1] container image service later.

Embedded device use cases:
 - LZMA compression support, specifically MicroLZMA (v5.16, me, with
   Lasse's kind help/support), with complete in-place I/O and overlapped
   decompression (for embedded boards, or as a secondary auxiliary
   algorithm within a file as a complement for specific access
   patterns):
     https://git.kernel.org/pub/scm/linux/kernel/git/xiang/linux.git -b erofs/lzma

 - Tail-packing inline for compressed files (AFAIK, Yue Hu is currently
   working on this new feature);

 - LZ4 range dictionary support (v5.xx?), which separates a file into
   several sub-file segments and adds an external dictionary for each
   segment (a 4KiB dictionary for each 2MiB segment, for example). I can
   see the benefits for specific datasets and have some demo compressor
   code for this, for example:

     enwik9                       1000000000
     enwik9_4k.erofs.img           558346240
     enwik9_4k.dict.erofs.img      449683456  (2MiB segs with 8KiB dicts)
     enwik9_4k.dict.erofs.img      400093184  (1MiB segs with 32KiB dicts)
     ...

     https://github.com/hsiangkao/erofs-utils.git -b experimental-dictdemo

   I'd like to find volunteers who are also interested in this kind of
   work to optimize compression ratios for specific data patterns.
   (Note that it's not a free lunch, since you need to keep the whole
   dictionaries in memory before decompressing any data in the specific
   range. Again, as far as I observed, it doesn't work for all datasets
   [compared with LZMA], and the dictionary build time is relatively
   slow);

 - Multi-threaded compression for mkfs, including file-level parallel
   compression and sub-file-level parallel compression. File-level
   parallel compression is straightforward. The sub-file-level approach
   is quite similar to range dictionaries: separate each file into
   several segments (e.g. 16MiB) and compress each one individually in
   parallel.

Others:
 - dump.erofs (AFAIK, Wang Qi / Guo Xuenan are working on this?)
     https://lore.kernel.org/r/oszp286mb07097ae45e9d391b0049f661b2...@oszp286mb0709.jpnp286.prod.outlook.com

 - partial page up-to-date support and corresponding read interface;

 - code cleanup / simplification;

 - etc.

[1] https://github.com/dragonflyoss/image-service

Thanks,
Gao Xiang
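
For illustration only, here is a minimal Python sketch of the segment idea
shared by range dictionaries and sub-file-level parallel compression. It is
not the erofs-utils code: zlib's preset-dictionary support stands in for LZ4
(an assumption, since the Python standard library has no LZ4), and
build_dict() is a hypothetical, deliberately naive dictionary builder.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

SEG_SIZE = 2 * 1024 * 1024   # 2MiB segments, matching the example above
DICT_SIZE = 4 * 1024         # 4KiB dictionary per segment


def build_dict(segment, dict_size=DICT_SIZE, sample=64):
    """Hypothetical naive builder: concatenate evenly spaced samples.

    A real builder would select frequently repeated substrings instead.
    """
    if len(segment) <= dict_size:
        return b""  # too small for a dictionary to pay off
    count = dict_size // sample
    step = len(segment) // count
    return b"".join(segment[i * step:i * step + sample] for i in range(count))


def compress_segment(segment):
    """Compress one segment independently against its own dictionary."""
    zdict = build_dict(segment)
    if zdict:
        comp = zlib.compressobj(level=9, zdict=zdict)
    else:
        comp = zlib.compressobj(level=9)
    return zdict, comp.compress(segment) + comp.flush()


def decompress_segment(zdict, payload):
    """The segment's dictionary must be in memory before any decompression."""
    dec = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return dec.decompress(payload) + dec.flush()


def compress_file(data, seg_size=SEG_SIZE, workers=4):
    """Split data into fixed-size segments and compress them in parallel."""
    segments = [data[off:off + seg_size] for off in range(0, len(data), seg_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compress_segment, segments))


def decompress_file(packed):
    """Reassemble the original data from (dictionary, payload) pairs."""
    return b"".join(decompress_segment(d, p) for d, p in packed)
```

Because every segment is compressed independently, the segments can be handed
to worker threads in any order, and a reader only needs the one dictionary
covering the range it wants to decompress.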
