Re: UBSAN array-index-out-of-bounds in read_blocklist
On 08/04/2024 01:56, Ubisectech Sirius wrote:
> Hello.
> We are Ubisectech Sirius Team, the vulnerability lab of China ValiantSec.
> Recently, our team has discovered an issue in Linux kernel 6.8. Attached to
> the email is a PoC file for the issue.

Hello,

I have found the cause of the problem and fixed it. A patch will be
sent to the LKML shortly.

Thanks

Phillip
Squashfs maintainer.

> Stack dump:
> [ cut here ]
> UBSAN: array-index-out-of-bounds in fs/squashfs/file.c:256:34
> index 4294967295 is out of range for type 'meta_entry [127]'
> CPU: 1 PID: 16927 Comm: syz-executor.0 Not tainted 6.8.0 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> Call Trace:
>  __dump_stack lib/dump_stack.c:88 [inline]
>  dump_stack_lvl+0x136/0x150 lib/dump_stack.c:106
>  ubsan_epilogue lib/ubsan.c:217 [inline]
>  __ubsan_handle_out_of_bounds+0xd5/0x130 lib/ubsan.c:347
>  fill_meta_index fs/squashfs/file.c:256 [inline]
>  read_blocklist+0x175e/0x1790 fs/squashfs/file.c:333
>  squashfs_readahead+0x14fa/0x22d0 fs/squashfs/file.c:591
>  read_pages+0x1a2/0xd70 mm/readahead.c:160
>  page_cache_ra_unbounded+0x477/0x5f0 mm/readahead.c:269
>  do_page_cache_ra mm/readahead.c:299 [inline]
>  page_cache_ra_order+0x772/0xa00 mm/readahead.c:544
>  do_sync_mmap_readahead mm/filemap.c:3153 [inline]
>  filemap_fault+0x1691/0x3390 mm/filemap.c:3245
>  __do_fault+0x108/0x490 mm/memory.c:4396
>  do_read_fault mm/memory.c:4758 [inline]
>  do_fault mm/memory.c:4888 [inline]
>  do_pte_missing mm/memory.c:3745 [inline]
>  handle_pte_fault mm/memory.c:5164 [inline]
>  __handle_mm_fault+0x340a/0x48b0 mm/memory.c:5305
>  handle_mm_fault+0x3c2/0xa40 mm/memory.c:5470
>  do_user_addr_fault+0x2ed/0x1010 arch/x86/mm/fault.c:1355
>  handle_page_fault arch/x86/mm/fault.c:1498 [inline]
>  exc_page_fault+0x99/0x180 arch/x86/mm/fault.c:1554
>  asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:570
> RIP: 0033:0x7f8446a28202
> Code: 48 63 c3 5b c3 0f 1f 00 f3 0f 1e fa 48 89 7c 24 f0 48 89 74 24 e8 48 89 54 24 e0 48 8b 4c 24 f0 48 8b 54 24 e8 48 8b 74 24 e0 <8b> 41 40 23 81 00 01 00 00 f3 0f 6f 06 c1 e0 06 48 01 d0 0f 11 00
> RSP: 002b:7f8447755018 EFLAGS: 00010212
> RAX: 7f8446a281e0 RBX: 7f8446bcbf80 RCX: 20ff
> RDX: RSI: 2000 RDI: 20ff
> RBP: 7f8446af14a6 R08: R09:
> R10: 20ff R11: 2000 R12:
> R13: 000b R14: 7f8446bcbf80 R15: 7f8447735000
> ---[ end trace ]---
>
> Thank you for taking the time to read this email and we look forward to
> working with you further.
Re: [PATCH] squashfs: fix inode lookup sanity checks
On 26/02/2021 09:29, Sean Nyekjaer wrote:
When mounting a squashfs image created without inode compression it
fails with: "unable to read inode lookup table"

It turns out that the BLOCK_OFFSET is missing when checking
the SQUASHFS_METADATA_SIZE against the actual size.

Fixes: eabac19e40c0 ("squashfs: add more sanity checks in inode lookup")
CC: sta...@vger.kernel.org
Signed-off-by: Sean Nyekjaer
Acked-by: phil...@squashfs.org.uk.
---
 fs/squashfs/export.c      | 8 ++--
 fs/squashfs/squashfs_fs.h | 1 +
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/squashfs/export.c b/fs/squashfs/export.c
index eb02072d28dd..723763746238 100644
--- a/fs/squashfs/export.c
+++ b/fs/squashfs/export.c
@@ -152,14 +152,18 @@ __le64 *squashfs_read_inode_lookup_table(struct super_block *sb,
 		start = le64_to_cpu(table[n]);
 		end = le64_to_cpu(table[n + 1]);
 
-		if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) {
+		if (start >= end
+		    || (end - start) >
+		    (SQUASHFS_METADATA_SIZE + SQUASHFS_BLOCK_OFFSET)) {
 			kfree(table);
 			return ERR_PTR(-EINVAL);
 		}
 	}
 
 	start = le64_to_cpu(table[indexes - 1]);
-	if (start >= lookup_table_start || (lookup_table_start - start) > SQUASHFS_METADATA_SIZE) {
+	if (start >= lookup_table_start ||
+	    (lookup_table_start - start) >
+	    (SQUASHFS_METADATA_SIZE + SQUASHFS_BLOCK_OFFSET)) {
 		kfree(table);
 		return ERR_PTR(-EINVAL);
 	}
diff --git a/fs/squashfs/squashfs_fs.h b/fs/squashfs/squashfs_fs.h
index 8d64edb80ebf..b3fdc8212c5f 100644
--- a/fs/squashfs/squashfs_fs.h
+++ b/fs/squashfs/squashfs_fs.h
@@ -17,6 +17,7 @@
 
 /* size of metadata (inode and directory) blocks */
 #define SQUASHFS_METADATA_SIZE		8192
+#define SQUASHFS_BLOCK_OFFSET		2
 
 /* default size of block device I/O */
 #ifdef CONFIG_SQUASHFS_4K_DEVBLK_SIZE
Re: [PATCH] Squashfs: fix xattr id and id lookup sanity checks
> On 03/03/2021 00:34 Andrew Morton wrote:
>
>
> On Mon, 1 Mar 2021 07:27:57 + (GMT) Phillip Lougher wrote:
>
> > The checks for maximum metadata block size are
> > missing SQUASHFS_BLOCK_OFFSET (the two byte length
> > count).
>
> What are the user visible consequences of this bug?

The user will be unable to mount the filesystem, because it will fail the
sanity check.

>
> > Cc: sta...@vger.kernel.org
> > Signed-off-by: Phillip Lougher
>
> Fixes: f37aa4c7366e23f ("squashfs: add more sanity checks in id lookup")
>
> yes?

Yes.

Phillip
[PATCH] Squashfs: fix xattr id and id lookup sanity checks
The checks for maximum metadata block size are missing
SQUASHFS_BLOCK_OFFSET (the two byte length count).

Cc: sta...@vger.kernel.org
Signed-off-by: Phillip Lougher
---
 fs/squashfs/id.c       | 6 --
 fs/squashfs/xattr_id.c | 6 --
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/fs/squashfs/id.c b/fs/squashfs/id.c
index 11581bf31af4..ea5387679723 100644
--- a/fs/squashfs/id.c
+++ b/fs/squashfs/id.c
@@ -97,14 +97,16 @@ __le64 *squashfs_read_id_index_table(struct super_block *sb,
 		start = le64_to_cpu(table[n]);
 		end = le64_to_cpu(table[n + 1]);
 
-		if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) {
+		if (start >= end || (end - start) >
+				(SQUASHFS_METADATA_SIZE + SQUASHFS_BLOCK_OFFSET)) {
 			kfree(table);
 			return ERR_PTR(-EINVAL);
 		}
 	}
 
 	start = le64_to_cpu(table[indexes - 1]);
-	if (start >= id_table_start || (id_table_start - start) > SQUASHFS_METADATA_SIZE) {
+	if (start >= id_table_start || (id_table_start - start) >
+				(SQUASHFS_METADATA_SIZE + SQUASHFS_BLOCK_OFFSET)) {
 		kfree(table);
 		return ERR_PTR(-EINVAL);
 	}
diff --git a/fs/squashfs/xattr_id.c b/fs/squashfs/xattr_id.c
index ead66670b41a..087cab8c78f4 100644
--- a/fs/squashfs/xattr_id.c
+++ b/fs/squashfs/xattr_id.c
@@ -109,14 +109,16 @@ __le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 table_start,
 		start = le64_to_cpu(table[n]);
 		end = le64_to_cpu(table[n + 1]);
 
-		if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) {
+		if (start >= end || (end - start) >
+				(SQUASHFS_METADATA_SIZE + SQUASHFS_BLOCK_OFFSET)) {
 			kfree(table);
 			return ERR_PTR(-EINVAL);
 		}
 	}
 
 	start = le64_to_cpu(table[indexes - 1]);
-	if (start >= table_start || (table_start - start) > SQUASHFS_METADATA_SIZE) {
+	if (start >= table_start || (table_start - start) >
+				(SQUASHFS_METADATA_SIZE + SQUASHFS_BLOCK_OFFSET)) {
 		kfree(table);
 		return ERR_PTR(-EINVAL);
 	}
-- 
2.29.2
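
For readers following the related fixes, the corrected bound can be sketched in userspace C. This is an illustration only, not the kernel code: index_table_ok() and the METADATA_SIZE and BLOCK_OFFSET names are hypothetical stand-ins, and the table is assumed to have been converted to CPU byte order already.

```c
#include <stddef.h>
#include <stdint.h>

#define METADATA_SIZE 8192 /* maximum payload of one metadata block */
#define BLOCK_OFFSET  2    /* two byte length count preceding the payload */

/*
 * Hypothetical helper: return 1 if every entry is strictly less than its
 * successor (the last entry is compared against table_start) and no gap
 * exceeds the largest possible on-disk metadata block, length count
 * included.  Return 0 otherwise.
 */
static int index_table_ok(const uint64_t *table, size_t indexes,
			  uint64_t table_start)
{
	const uint64_t max_gap = METADATA_SIZE + BLOCK_OFFSET;
	size_t n;

	if (indexes == 0)
		return 0;

	for (n = 0; n < indexes; n++) {
		uint64_t start = table[n];
		uint64_t end = (n + 1 < indexes) ? table[n + 1] : table_start;

		if (start >= end || end - start > max_gap)
			return 0;
	}
	return 1;
}
```

An uncompressed 8192 byte block occupies 8194 bytes on disk once the length count is included, so a bound of 8192 alone rejects a valid filesystem.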
Re: [PATCH] squashfs: fix inode lookup sanity checks
> On 26/02/2021 09:29 Sean Nyekjaer wrote:
>
>
> When mounting a squashfs image created without inode compression it
> fails with: "unable to read inode lookup table"
>
> It turns out that the BLOCK_OFFSET is missing when checking
> the SQUASHFS_METADATA_SIZE against the actual size.
>
> Fixes: eabac19e40c0 ("squashfs: add more sanity checks in inode lookup")
> CC: sta...@vger.kernel.org
> Signed-off-by: Sean Nyekjaer

Acked-by: Phillip Lougher

> ---
>  fs/squashfs/export.c      | 8 ++--
>  fs/squashfs/squashfs_fs.h | 1 +
>  2 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/fs/squashfs/export.c b/fs/squashfs/export.c
> index eb02072d28dd..723763746238 100644
> --- a/fs/squashfs/export.c
> +++ b/fs/squashfs/export.c
> @@ -152,14 +152,18 @@ __le64 *squashfs_read_inode_lookup_table(struct super_block *sb,
>  		start = le64_to_cpu(table[n]);
>  		end = le64_to_cpu(table[n + 1]);
>
> -		if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) {
> +		if (start >= end
> +		    || (end - start) >
> +		    (SQUASHFS_METADATA_SIZE + SQUASHFS_BLOCK_OFFSET)) {
>  			kfree(table);
>  			return ERR_PTR(-EINVAL);
>  		}
>  	}
>
>  	start = le64_to_cpu(table[indexes - 1]);
> -	if (start >= lookup_table_start || (lookup_table_start - start) > SQUASHFS_METADATA_SIZE) {
> +	if (start >= lookup_table_start ||
> +	    (lookup_table_start - start) >
> +	    (SQUASHFS_METADATA_SIZE + SQUASHFS_BLOCK_OFFSET)) {
>  		kfree(table);
>  		return ERR_PTR(-EINVAL);
>  	}
> diff --git a/fs/squashfs/squashfs_fs.h b/fs/squashfs/squashfs_fs.h
> index 8d64edb80ebf..b3fdc8212c5f 100644
> --- a/fs/squashfs/squashfs_fs.h
> +++ b/fs/squashfs/squashfs_fs.h
> @@ -17,6 +17,7 @@
>
>  /* size of metadata (inode and directory) blocks */
>  #define SQUASHFS_METADATA_SIZE		8192
> +#define SQUASHFS_BLOCK_OFFSET		2
>
>  /* default size of block device I/O */
>  #ifdef CONFIG_SQUASHFS_4K_DEVBLK_SIZE
> --
> 2.29.2
[PATCH V2 2/4] Squashfs: add more sanity checks in id lookup
Syzbot has reported a number of "slab-out-of-bounds reads" and
"use-after-free read" errors which have been identified as being caused
by a corrupted index value read from the inode. This could be because
the metadata block is uncompressed, or because the "compression" bit
has been corrupted (turning a compressed block into an uncompressed
block).

This patch adds additional sanity checks to detect this, and the
following corruption.

1. It checks against corruption of the ids count. This can either
   lead to a larger table being read, or a smaller than expected
   table being read.

   In the case of a too large ids count, this would often have been
   trapped by the existing sanity checks, but this patch introduces
   a more exact check, which can identify too small values.

2. It checks the contents of the index table for corruption.

Reported-by: syzbot+b06d57ba83f604522...@syzkaller.appspotmail.com
Reported-by: syzbot+c021ba012da41ee98...@syzkaller.appspotmail.com
Reported-by: syzbot+5024636e8b5fd19f0...@syzkaller.appspotmail.com
Reported-by: syzbot+bcbc661df46657d0f...@syzkaller.appspotmail.com
Signed-off-by: Phillip Lougher
Cc: sta...@vger.kernel.org
---
 fs/squashfs/id.c             | 40
 fs/squashfs/squashfs_fs_sb.h | 1 +
 fs/squashfs/super.c          | 6 +++---
 fs/squashfs/xattr.h          | 10 -
 4 files changed, 45 insertions(+), 12 deletions(-)

diff --git a/fs/squashfs/id.c b/fs/squashfs/id.c
index 6be5afe7287d..11581bf31af4 100644
--- a/fs/squashfs/id.c
+++ b/fs/squashfs/id.c
@@ -35,10 +35,15 @@ int squashfs_get_id(struct super_block *sb, unsigned int index,
 	struct squashfs_sb_info *msblk = sb->s_fs_info;
 	int block = SQUASHFS_ID_BLOCK(index);
 	int offset = SQUASHFS_ID_BLOCK_OFFSET(index);
-	u64 start_block = le64_to_cpu(msblk->id_table[block]);
+	u64 start_block;
 	__le32 disk_id;
 	int err;
 
+	if (index >= msblk->ids)
+		return -EINVAL;
+
+	start_block = le64_to_cpu(msblk->id_table[block]);
+
 	err = squashfs_read_metadata(sb, &disk_id, &start_block, &offset,
 							sizeof(disk_id));
 	if (err < 0)
@@ -56,7 +61,10 @@ __le64 *squashfs_read_id_index_table(struct super_block *sb,
 		u64 id_table_start, u64 next_table, unsigned short no_ids)
 {
 	unsigned int length = SQUASHFS_ID_BLOCK_BYTES(no_ids);
+	unsigned int indexes = SQUASHFS_ID_BLOCKS(no_ids);
+	int n;
 	__le64 *table;
+	u64 start, end;
 
 	TRACE("In read_id_index_table, length %d\n", length);
 
@@ -67,20 +75,36 @@ __le64 *squashfs_read_id_index_table(struct super_block *sb,
 		return ERR_PTR(-EINVAL);
 
 	/*
-	 * length bytes should not extend into the next table - this check
-	 * also traps instances where id_table_start is incorrectly larger
-	 * than the next table start
+	 * The computed size of the index table (length bytes) should exactly
+	 * match the table start and end points
 	 */
-	if (id_table_start + length > next_table)
+	if (length != (next_table - id_table_start))
 		return ERR_PTR(-EINVAL);
 
 	table = squashfs_read_table(sb, id_table_start, length);
+	if (IS_ERR(table))
+		return table;
 
 	/*
-	 * table[0] points to the first id lookup table metadata block, this
-	 * should be less than id_table_start
+	 * table[0], table[1], ... table[indexes - 1] store the locations
+	 * of the compressed id blocks.  Each entry should be less than
+	 * the next (i.e. table[0] < table[1]), and the difference between them
+	 * should be SQUASHFS_METADATA_SIZE or less.  table[indexes - 1]
+	 * should be less than id_table_start, and again the difference
+	 * should be SQUASHFS_METADATA_SIZE or less
 	 */
-	if (!IS_ERR(table) && le64_to_cpu(table[0]) >= id_table_start) {
+	for (n = 0; n < (indexes - 1); n++) {
+		start = le64_to_cpu(table[n]);
+		end = le64_to_cpu(table[n + 1]);
+
+		if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) {
+			kfree(table);
+			return ERR_PTR(-EINVAL);
+		}
+	}
+
+	start = le64_to_cpu(table[indexes - 1]);
+	if (start >= id_table_start || (id_table_start - start) > SQUASHFS_METADATA_SIZE) {
 		kfree(table);
 		return ERR_PTR(-EINVAL);
 	}
diff --git a/fs/squashfs/squashfs_fs_sb.h b/fs/squashfs/squashfs_fs_sb.h
index 34c21ffb6df3..166e98806265 100644
--- a/fs/squashfs/squashfs_fs_sb.h
+++ b/fs/squashfs/squashfs_fs_sb.h
@@ -64,5 +64,6 @@ struct squashfs_sb_info {
 	unsigned int				inodes;
 	unsigned int
[PATCH V2 0/4] Squashfs: fix BIO migration regression and add sanity checks
Hi all,

Patch [1/4] fixes a regression introduced by the "migrate from
ll_rw_block usage to BIO" patch, which has produced a number of
syzbot/syzkaller reports.

Patches [2/4], [3/4], and [4/4] fix a number of filesystem corruption
issues which have produced syzbot reports in the id, inode and xattr
lookup code.

Each patch has been tested against the syzbot reproducers using the
given kernel configuration. They have the appropriate "Reported-by:"
lines added.

Additionally, all of the reproducer filesystems are indirectly fixed by
patch [4/4], because they all have xattr corruption which is now
detected there.

Additional testing with other configurations and architectures (32-bit,
big endian), and with normal filesystems, has also been done to trap any
inadvertent regressions caused by the additional sanity checks.

V2: Fix two checkpatch.pl errors.

Phillip
[PATCH V2 1/4] Squashfs: avoid out of bounds writes in decompressors
This is a regression introduced by the patch "migrate from ll_rw_block
usage to BIO".

Syzbot/syzkaller has reported a number of "out of bounds writes" and
"unable to handle kernel paging request in squashfs_decompress" errors
which have been identified as a regression introduced by the above
patch.

Specifically, the patch removed the following sanity check

	if (length < 0 || length > output->length ||
			(index + length) > msblk->bytes_used)

This check did two things:

1. It ensured any reads were not beyond the end of the filesystem

2. It ensured that the "length" field read from the filesystem
   was within the expected maximum length. Without this any
   corrupted values can over-run allocated buffers.

Reported-by: syzbot+6fba78f99b9afd4b5...@syzkaller.appspotmail.com
Signed-off-by: Phillip Lougher
Cc: sta...@vger.kernel.org
---
 fs/squashfs/block.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/squashfs/block.c b/fs/squashfs/block.c
index 8a19773b5a0b..45f44425d856 100644
--- a/fs/squashfs/block.c
+++ b/fs/squashfs/block.c
@@ -196,9 +196,15 @@ int squashfs_read_data(struct super_block *sb, u64 index, int length,
 		length = SQUASHFS_COMPRESSED_SIZE(length);
 		index += 2;
 
-		TRACE("Block @ 0x%llx, %scompressed size %d\n", index,
+		TRACE("Block @ 0x%llx, %scompressed size %d\n", index - 2,
 		      compressed ? "" : "un", length);
 	}
+	if (length < 0 || length > output->length ||
+			(index + length) > msblk->bytes_used) {
+		res = -EIO;
+		goto out;
+	}
+
 	if (next_index)
 		*next_index = index + length;
 
-- 
2.20.1
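
As a rough userspace illustration of what the restored check guards against, the three conditions can be tried out in isolation. check_block_bounds() is a hypothetical name, and its plain integer parameters stand in for output->length and msblk->bytes_used; this is a sketch under those assumptions, not the kernel function.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return 0 if a block of `length` bytes starting at
 * byte `index` fits both the output buffer and the filesystem, or -1 if
 * any of the three conditions of the sanity check is violated.
 */
static int check_block_bounds(uint64_t index, int length,
			      int output_length, uint64_t bytes_used)
{
	if (length < 0 || length > output_length ||
	    index + (uint64_t)length > bytes_used)
		return -1;

	return 0;
}
```

A corrupted length larger than the output buffer, or a block that would run past the end of the filesystem, is rejected before any data is read or decompressed.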
[PATCH V2 4/4] Squashfs: add more sanity checks in xattr id lookup
Syzbot has reported a warning where a kmalloc() attempt exceeds the
maximum limit. This has been identified as corruption of the xattr_ids
count when reading the xattr id lookup table.

This patch adds a number of additional sanity checks to detect this
corruption and others.

1. It checks for a corrupted xattr index read from the inode. This could
   be because the metadata block is uncompressed, or because the
   "compression" bit has been corrupted (turning a compressed block
   into an uncompressed block). This would cause an out of bounds read.

2. It checks against corruption of the xattr_ids count. This can either
   lead to the above kmalloc failure, or a smaller than expected
   table being read.

3. It checks the contents of the index table for corruption.

Reported-by: syzbot+2ccea6339d3683608...@syzkaller.appspotmail.com
Signed-off-by: Phillip Lougher
Cc: sta...@vger.kernel.org
---
 fs/squashfs/xattr_id.c | 66 --
 1 file changed, 57 insertions(+), 9 deletions(-)

diff --git a/fs/squashfs/xattr_id.c b/fs/squashfs/xattr_id.c
index d99e08464554..52905ce2b6f7 100644
--- a/fs/squashfs/xattr_id.c
+++ b/fs/squashfs/xattr_id.c
@@ -31,10 +31,15 @@ int squashfs_xattr_lookup(struct super_block *sb, unsigned int index,
 	struct squashfs_sb_info *msblk = sb->s_fs_info;
 	int block = SQUASHFS_XATTR_BLOCK(index);
 	int offset = SQUASHFS_XATTR_BLOCK_OFFSET(index);
-	u64 start_block = le64_to_cpu(msblk->xattr_id_table[block]);
+	u64 start_block;
 	struct squashfs_xattr_id id;
 	int err;
 
+	if (index >= msblk->xattr_ids)
+		return -EINVAL;
+
+	start_block = le64_to_cpu(msblk->xattr_id_table[block]);
+
 	err = squashfs_read_metadata(sb, &id, &start_block, &offset,
 							sizeof(id));
 	if (err < 0)
@@ -50,13 +55,17 @@ int squashfs_xattr_lookup(struct super_block *sb, unsigned int index,
 /*
  * Read uncompressed xattr id lookup table indexes from disk into memory
  */
-__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 start,
+__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 table_start,
 		u64 *xattr_table_start, int *xattr_ids)
 {
-	unsigned int len;
+	struct squashfs_sb_info *msblk = sb->s_fs_info;
+	unsigned int len, indexes;
 	struct squashfs_xattr_id_table *id_table;
+	__le64 *table;
+	u64 start, end;
+	int n;
 
-	id_table = squashfs_read_table(sb, start, sizeof(*id_table));
+	id_table = squashfs_read_table(sb, table_start, sizeof(*id_table));
 	if (IS_ERR(id_table))
 		return (__le64 *) id_table;
 
@@ -70,13 +79,52 @@ __le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 start,
 	if (*xattr_ids == 0)
 		return ERR_PTR(-EINVAL);
 
-	/* xattr_table should be less than start */
-	if (*xattr_table_start >= start)
+	len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids);
+	indexes = SQUASHFS_XATTR_BLOCKS(*xattr_ids);
+
+	/*
+	 * The computed size of the index table (len bytes) should exactly
+	 * match the table start and end points
+	 */
+	start = table_start + sizeof(*id_table);
+	end = msblk->bytes_used;
+
+	if (len != (end - start))
 		return ERR_PTR(-EINVAL);
 
-	len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids);
+	table = squashfs_read_table(sb, start, len);
+	if (IS_ERR(table))
+		return table;
+
+	/* table[0], table[1], ... table[indexes - 1] store the locations
+	 * of the compressed xattr id blocks.  Each entry should be less than
+	 * the next (i.e. table[0] < table[1]), and the difference between them
+	 * should be SQUASHFS_METADATA_SIZE or less.  table[indexes - 1]
+	 * should be less than table_start, and again the difference
+	 * should be SQUASHFS_METADATA_SIZE or less.
+	 *
+	 * Finally xattr_table_start should be less than table[0].
+	 */
+	for (n = 0; n < (indexes - 1); n++) {
+		start = le64_to_cpu(table[n]);
+		end = le64_to_cpu(table[n + 1]);
+
+		if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) {
+			kfree(table);
+			return ERR_PTR(-EINVAL);
+		}
+	}
+
+	start = le64_to_cpu(table[indexes - 1]);
+	if (start >= table_start || (table_start - start) > SQUASHFS_METADATA_SIZE) {
+		kfree(table);
+		return ERR_PTR(-EINVAL);
+	}
 
-	TRACE("In read_xattr_index_table, length %d\n", len);
+	if (*xattr_table_start >= le64_to_cpu(table[0])) {
+		kfree(table);
+		return ERR_PTR(-EINVAL);
+	}
 
-	return squashfs_read_table(sb, start + sizeof(*id_table), len);
+	return table;
 }
-- 
2.20.1
[PATCH V2 3/4] Squashfs: add more sanity checks in inode lookup
Syzbot has reported a "slab-out-of-bounds read" error which has been
identified as being caused by a corrupted "ino_num" value read from the
inode. This could be because the metadata block is uncompressed, or
because the "compression" bit has been corrupted (turning a compressed
block into an uncompressed block).

This patch adds additional sanity checks to detect this, and the
following corruption.

1. It checks against corruption of the inodes count. This can either
   lead to a larger table being read, or a smaller than expected
   table being read.

   In the case of a too large inodes count, this would often have been
   trapped by the existing sanity checks, but this patch introduces
   a more exact check, which can identify too small values.

2. It checks the contents of the index table for corruption.

Reported-by: syzbot+04419e3ff19d2970e...@syzkaller.appspotmail.com
Signed-off-by: Phillip Lougher
Cc: sta...@vger.kernel.org
---
 fs/squashfs/export.c | 41 +
 1 file changed, 33 insertions(+), 8 deletions(-)

diff --git a/fs/squashfs/export.c b/fs/squashfs/export.c
index ae2c87bb0fbe..3f134ba86a45 100644
--- a/fs/squashfs/export.c
+++ b/fs/squashfs/export.c
@@ -41,12 +41,17 @@ static long long squashfs_inode_lookup(struct super_block *sb, int ino_num)
 	struct squashfs_sb_info *msblk = sb->s_fs_info;
 	int blk = SQUASHFS_LOOKUP_BLOCK(ino_num - 1);
 	int offset = SQUASHFS_LOOKUP_BLOCK_OFFSET(ino_num - 1);
-	u64 start = le64_to_cpu(msblk->inode_lookup_table[blk]);
+	u64 start;
 	__le64 ino;
 	int err;
 
 	TRACE("Entered squashfs_inode_lookup, inode_number = %d\n", ino_num);
 
+	if (ino_num == 0 || (ino_num - 1) >= msblk->inodes)
+		return -EINVAL;
+
+	start = le64_to_cpu(msblk->inode_lookup_table[blk]);
+
 	err = squashfs_read_metadata(sb, &ino, &start, &offset, sizeof(ino));
 	if (err < 0)
 		return err;
@@ -111,7 +116,10 @@ __le64 *squashfs_read_inode_lookup_table(struct super_block *sb,
 		u64 lookup_table_start, u64 next_table, unsigned int inodes)
 {
 	unsigned int length = SQUASHFS_LOOKUP_BLOCK_BYTES(inodes);
+	unsigned int indexes = SQUASHFS_LOOKUP_BLOCKS(inodes);
+	int n;
 	__le64 *table;
+	u64 start, end;
 
 	TRACE("In read_inode_lookup_table, length %d\n", length);
 
@@ -121,20 +129,37 @@ __le64 *squashfs_read_inode_lookup_table(struct super_block *sb,
 	if (inodes == 0)
 		return ERR_PTR(-EINVAL);
 
-	/* length bytes should not extend into the next table - this check
-	 * also traps instances where lookup_table_start is incorrectly larger
-	 * than the next table start
+	/*
+	 * The computed size of the lookup table (length bytes) should exactly
+	 * match the table start and end points
 	 */
-	if (lookup_table_start + length > next_table)
+	if (length != (next_table - lookup_table_start))
 		return ERR_PTR(-EINVAL);
 
 	table = squashfs_read_table(sb, lookup_table_start, length);
+	if (IS_ERR(table))
+		return table;
 
 	/*
-	 * table[0] points to the first inode lookup table metadata block,
-	 * this should be less than lookup_table_start
+	 * table[0], table[1], ... table[indexes - 1] store the locations
+	 * of the compressed inode lookup blocks.  Each entry should be
+	 * less than the next (i.e. table[0] < table[1]), and the difference
+	 * between them should be SQUASHFS_METADATA_SIZE or less.
+	 * table[indexes - 1] should be less than lookup_table_start, and
+	 * again the difference should be SQUASHFS_METADATA_SIZE or less
 	 */
-	if (!IS_ERR(table) && le64_to_cpu(table[0]) >= lookup_table_start) {
+	for (n = 0; n < (indexes - 1); n++) {
+		start = le64_to_cpu(table[n]);
+		end = le64_to_cpu(table[n + 1]);
+
+		if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) {
+			kfree(table);
+			return ERR_PTR(-EINVAL);
+		}
+	}
+
+	start = le64_to_cpu(table[indexes - 1]);
+	if (start >= lookup_table_start || (lookup_table_start - start) > SQUASHFS_METADATA_SIZE) {
 		kfree(table);
 		return ERR_PTR(-EINVAL);
 	}
-- 
2.20.1
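
The difference between the old "does not extend into the next table" test and the new exact-size test can be sketched in userspace C. The helper names are hypothetical, and the sketch assumes that each inode lookup entry is 8 bytes, that a metadata block holds 8192 bytes, and that the index table stores one 64-bit location per metadata block; none of this is the kernel code itself.

```c
#include <stdint.h>

#define METADATA_SIZE 8192u /* bytes of lookup entries per metadata block */
#define ENTRY_SIZE    8u    /* assumed size of one inode lookup entry */
#define INDEX_SIZE    8u    /* one 64-bit block location per index */

/* Number of metadata blocks needed to hold `inodes` lookup entries. */
static uint64_t lookup_blocks(uint32_t inodes)
{
	uint64_t bytes = (uint64_t)inodes * ENTRY_SIZE;

	return (bytes + METADATA_SIZE - 1) / METADATA_SIZE;
}

/*
 * Hypothetical helper: return 1 only if the index table computed from the
 * inode count exactly fills the space between table_start and next_table.
 */
static int lookup_table_size_ok(uint32_t inodes, uint64_t table_start,
				uint64_t next_table)
{
	uint64_t length = lookup_blocks(inodes) * INDEX_SIZE;

	if (inodes == 0 || next_table < table_start)
		return 0;

	return length == next_table - table_start;
}
```

Under these assumptions an inode count that is too small leaves a gap before the next table. The old test accepted that case, because the shorter table still did not overlap its neighbour, while the exact comparison rejects it.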
Greylisting on linux-kernel broken?
Hi,

I have sent a patch-set from my usual email address
(phil...@squashfs.org.uk), but after more than 18 hours it has still not
shown up. Instead I have received bounce warning emails saying that the
messages are delayed.

I have sent emails to autoans...@vger.kernel.org and
majord...@vger.kernel.org, and they have not been delivered either.

Is there something wrong here? It is impossible to get an empty email to
autoans...@vger.kernel.org wrong, so AFAIK it is being delayed.

I have used http://vger.kernel.org/mxverify.html to check that the MX
records appear correct.

Thanks for any help.

Phillip Lougher
Squashfs maintainer
[PATCH] squashfs: avoid bio_alloc() failure with 1Mbyte blocks
This is a regression introduced by the "migrate from ll_rw_block usage
to BIO" patch.

bio_alloc() is limited to 256 pages (1 Mbyte). This can cause a failure
when reading 1 Mbyte block filesystems. The problem is that a datablock
can be fully (or almost) uncompressed, requiring 256 pages, but, because
blocks are not aligned to page boundaries, it may require 257 pages to
read.

bio_kmalloc() can handle 1024 pages, and so use this for the edge
condition.

Reported-by: Nicolas Prochazka
Reported-by: Tomoatsu Shimada
Signed-off-by: Phillip Lougher
---
 fs/squashfs/block.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/squashfs/block.c b/fs/squashfs/block.c
index 76bb1c846845..8a19773b5a0b 100644
--- a/fs/squashfs/block.c
+++ b/fs/squashfs/block.c
@@ -87,7 +87,11 @@ static int squashfs_bio_read(struct super_block *sb, u64 index, int length,
 	int error, i;
 	struct bio *bio;
 
-	bio = bio_alloc(GFP_NOIO, page_count);
+	if (page_count <= BIO_MAX_PAGES)
+		bio = bio_alloc(GFP_NOIO, page_count);
+	else
+		bio = bio_kmalloc(GFP_NOIO, page_count);
+
 	if (!bio)
 		return -ENOMEM;
 
-- 
2.20.1
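
The 257 page edge case follows from simple arithmetic, sketched here in userspace C. pages_spanned() is a hypothetical helper, and a 4096 byte page size is assumed (consistent with 256 pages making up 1 Mbyte).

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u /* assumed page size */

/*
 * Hypothetical helper: number of pages touched by a read of `length`
 * bytes starting at byte `offset` on the device.
 */
static unsigned int pages_spanned(uint64_t offset, uint32_t length)
{
	uint64_t first, last;

	if (length == 0)
		return 0;

	first = offset / PAGE_SIZE_BYTES;
	last = (offset + length - 1) / PAGE_SIZE_BYTES;

	return (unsigned int)(last - first + 1);
}
```

A page-aligned 1 Mbyte read touches 256 pages, but the same read starting 100 bytes into a page touches 257, one more than the bio_alloc() limit described above.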
[PATCH] squashfs: fix length field overlap check in metadata reading
This is a regression introduced by the "migrate from ll_rw_block usage
to BIO" patch.

Squashfs packs structures on byte boundaries, and because of that the
length field (of the metadata block) may not be fully in the current
block. The new code rewrote the check for that edge case and made it
faulty.

Reported-by: Bernd Amend
Signed-off-by: Phillip Lougher
---
 fs/squashfs/block.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/squashfs/block.c b/fs/squashfs/block.c
index 64f61330564a..76bb1c846845 100644
--- a/fs/squashfs/block.c
+++ b/fs/squashfs/block.c
@@ -175,7 +175,7 @@ int squashfs_read_data(struct super_block *sb, u64 index, int length,
 		/* Extract the length of the metadata block */
 		data = page_address(bvec->bv_page) + bvec->bv_offset;
 		length = data[offset];
-		if (offset <= bvec->bv_len - 1) {
+		if (offset < bvec->bv_len - 1) {
 			length |= data[offset + 1] << 8;
 		} else {
 			if (WARN_ON_ONCE(!bio_next_segment(bio, &iter_all))) {
-- 
2.20.1
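
The edge case is easier to see outside the kernel. The sketch below reads a two byte little-endian length whose second byte may fall in the next segment. struct segment and read_le16_split() are hypothetical stand-ins for the bio segment handling, not the kernel code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one bio segment. */
struct segment {
	const uint8_t *data;
	size_t len;
};

/*
 * Read a 16-bit little-endian length whose low byte is at `offset` in
 * segs[0].  The high byte is in the same segment only if offset is
 * strictly less than len - 1; otherwise it is the first byte of segs[1].
 * Returns the length, or -1 if the field cannot be read.
 */
static int read_le16_split(const struct segment *segs, size_t nsegs,
			   size_t offset)
{
	int length;

	if (nsegs == 0 || offset >= segs[0].len)
		return -1;

	length = segs[0].data[offset];
	if (offset < segs[0].len - 1) {
		length |= segs[0].data[offset + 1] << 8;
	} else {
		if (nsegs < 2 || segs[1].len == 0)
			return -1;
		length |= segs[1].data[0] << 8;
	}
	return length;
}
```

With "<=" in place of "<", a low byte sitting in the last position of the segment would take the first branch and read one byte past the end of the segment, which is the faulty behaviour the patch corrects.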
Re: Regression: squashfs issues since change "squashfs: migrate from ll_rw_block usage to BIO"
On Fri, Jul 17, 2020 at 12:07 AM Andrew Morton wrote:
>
> On Tue, 14 Jul 2020 21:41:07 +0200 Bernd Amend wrote:
>
> > Hi,
> >
> > With the Linux Kernel version 5.8-rc5/master I am unable to mount some
> > squashfs filesystems compressed with "-comp lz4".
> > If I try to mount them I get the following error:
> > [1.084246] SQUASHFS error: lz4 decompression failed, data probably corrupt
> > [1.084545] SQUASHFS error: Failed to read block 0x873e1001: -5
> > [1.084761] SQUASHFS error: Unable to read metadata cache entry [873e0fff]
> > [1.084983] SQUASHFS error: Unable to read directory block [873e0fff:1586]
> > [1.122564] SQUASHFS error: Unable to read metadata cache entry [873e0fff]
> > [1.122708] SQUASHFS error: Unable to read directory block [873e0fff:1586]
> > [1.122862] Starting init: /sbin/init exists but couldn't execute it (error -5)
> > [1.123027] SQUASHFS error: Unable to read metadata cache entry [873e0fff]
> > [1.123152] SQUASHFS error: Unable to read directory block [873e0fff:1586]
> > [1.123279] Starting init: /etc/init exists but couldn't execute it (error -5)
> > [1.123444] SQUASHFS error: Unable to read metadata cache entry [873e0fff]
> > [1.123573] SQUASHFS error: Unable to read directory block [873e0fff:1586]
> > [1.123713] Starting init: /bin/init exists but couldn't execute it (error -5)
> > [1.123900] SQUASHFS error: Unable to read metadata cache entry [873e0fff]
> >
> > or
> >
> > [ 4960.910693] attempt to access beyond end of device
> > [ 4960.910695] loop0: rw=2048, want=46, limit=40
> > [ 4960.910696] SQUASHFS error: Failed to read block 0x4001: -5
> > [ 4960.910697] SQUASHFS error: Unable to read metadata cache entry [3fff]
> > [ 4960.910698] SQUASHFS error: Unable to read inode 0x20c5000c
> >
> > I bisected the issue to the commit "squashfs: migrate from ll_rw_block
> > usage to BIO"
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/squashfs?id=93e72b3c612adcaca13d874fcc86c53e6c8da541
> >
> > The issue can be reproduced by downloading
> > https://theworldsend.eu/demo.squashfs (20K) and the following command
> > line.
> > # mount demo.squashfs mnt && ls mnt && umount mnt
> >
> > The same squashfs can be mounted using Linux <=5.7.8.
> > The kernel config is identical to the Arch Linux Kernel configuration,
> > build using gcc 9 and 10 on x86_64.
>
> Thanks.  I queued a reversion patch.  I'll go ahead with this if we are
> unable to get this fixed in the next week or so.
>

Yes, there is a bug in the patch. I have tracked it down today, and
I will send out a fix patch tomorrow.

The bug is here:

+	/* Extract the length of the metadata block */
+	data = page_address(bvec->bv_page) + bvec->bv_offset;
+	length = data[offset];
+	if (offset <= bvec->bv_len - 1) {

This check is wrong, it should be

+	if (offset < bvec->bv_len - 1) {

Phillip

> Are you able to check that the below fixes things up?
>
> Thanks.
>
>
> From: Andrew Morton
> Subject: revert "squashfs: migrate from ll_rw_block usage to BIO"
>
> Revert 93e72b3c612adc ("squashfs: migrate from ll_rw_block usage to BIO")
> due to a regression reported by Bernd Amend.
>
> Link: http://lkml.kernel.org/r/caf31+h5zb7zn73obrc5svlzgfstnyye5tkvr7-6atuoqrry...@mail.gmail.com
> Reported-by: Bernd Amend
> Cc: Philippe Liard
> Cc: Christoph Hellwig
> Cc: Adrien Schildknecht
> Cc: Phillip Lougher
> Cc: Guenter Roeck
> Cc: Daniel Rosenberg
> Signed-off-by: Andrew Morton
> ---
>
>  fs/squashfs/block.c                     | 273 ++
>  fs/squashfs/decompressor.h              |   5
>  fs/squashfs/decompressor_multi.c        |   9
>  fs/squashfs/decompressor_multi_percpu.c |   6
>  fs/squashfs/decompressor_single.c       |   9
>  fs/squashfs/lz4_wrapper.c               |  17 -
>  fs/squashfs/lzo_wrapper.c               |  17 -
>  fs/squashfs/squashfs.h                  |   4
>  fs/squashfs/xz_wrapper.c                |  51 +---
>  fs/squashfs/zlib_wrapper.c              |  63 ++---
>  fs/squashfs/zstd_wrapper.c              |  62 ++--
>  11 files changed, 237 insertions(+), 279 deletions(-)
>
> --- a/fs/squashfs/block.c~revert-squashfs-migrate-from-ll_rw_block-usage-to-bio
> +++ a/fs/squashfs/block.c
> @@ -13,7 +13,6 @@
>   * datablocks and me
[ANN] Squashfs tools 4.4 released
Hi, I'm pleased to announce the release of Squashfs tools 4.4. This is the first release in over 5 years, and there are substantial improvements: reproducible builds, new compressors, CVE fixes, security hardening and new options for Mksquashfs/Unsquashfs. The new release can be downloaded here: http://sourceforge.net/projects/squashfs/files/latest/download?source=files Summary of changes in Squashfs tools 4.4 1. Mksquashfs now generates reproducible images by default. Mkfs time and file timestamps can also be specified. 2. Support for the Zstandard (ZSTD) compression algorithm has been added. 3. Pseudo files now support symbolic links. 4. CVE-2015-4645 and CVE-2015-4646 have been fixed. 5. Unsquashfs has been further hardened against corrupted filestems. 6. Unsquashfs is now more strict about error handling. 7. Miscellaneous new options and major bug fixes for Mksquashfs. 8. Miscellaneous new options and major bug fixes for Unsquashfs. 9. Squashfs-tools 4.4 is compatible with all earlier 4.x filesystems and releases. 1. Introducing reproducible builds -- Ever since Mksquashfs was parallelised back in 2006, there has been a certain randomness in how fragments and multi-block files are ordered in the output filesystem even if the input remains the same. This is because the multiple parallel threads can be scheduled differently between Mksquashfs runs. For example, the thread given fragment 10 to compress may finish before the thread given fragment 9 to compress on one run (writing fragment 10 to the output filesystem before fragment 9), but, on the next run it could be vice-versa. There are many different scheduling scenarios here, all of which can have a knock on effect causing different scheduling and ordering later in the filesystem too. Mkquashfs doesn't care about the ordering of fragments and multi-block files within the filesystem, as this does not affect the correctness of the filesystem. 
In fact, not caring about the ordering, as it doesn't matter, allows
Mksquashfs to run as fast as possible, maximising CPU and I/O
performance.

But, in the last couple of years, Squashfs has become used in scenarios
(cloud etc.) where this randomness is causing problems. Specifically,
this appears to be where downloaders, installers etc. try to work out
the differences between Squashfs filesystem updates to minimise the
amount of data that needs to be transferred to update an image.

Additionally, the notion of reproducible builds has arisen in the last
couple of years, that is, the same source and build environment etc.
should be able to (re-)generate identical output. This is usually for
verification and security, allowing binaries/distributions to be checked
for malicious activity. See https://reproducible-builds.org/ for more
information.

Mksquashfs now generates reproducible images by default. Images
generated by Mksquashfs will be ordered identically to previous runs if
the same input has been supplied, and the same options used.

1.1.1 Dealing with timestamps

Timestamps embedded in the filesystem will still cause differences. Each
new run of Mksquashfs will produce a different mkfs (make filesystem)
timestamp in the super-block. Moreover, if any file timestamps have
changed (even if the content hasn't), this will produce a difference.

To prevent timestamps from producing differences, the following new
Mksquashfs options have been added.

1.1.2 -mkfs-time

This option takes a positive time value (which is the number of seconds
since the epoch of 1970-01-01 00:00:00 UTC), and sets the file system
timestamp to that.

Squashfs uses an unsigned 32-bit integer to store time, and the time
given should be in that range.

Obviously you can use the date command to convert dates into this value,
i.e.
% mksquashfs source source.sqsh -mkfs-time $(date +%s -d "Jan 1 2019 19:00")

1.1.3 -all-time

This option takes a positive time value (which is the number of seconds
since the epoch of 1970-01-01 00:00:00 UTC), and sets the timestamp on
all files to that (but not the mkfs time).

1.1.4 environment variable SOURCE_DATE_EPOCH

As an alternative to the above command line options, you can set the
environment variable SOURCE_DATE_EPOCH to a time value.

This value will be used to set the mkfs time. Also, any file timestamps
which are after SOURCE_DATE_EPOCH will be clamped to SOURCE_DATE_EPOCH.

See https://reproducible-builds.org/docs/source-date-epoch/ for more
information.

Note: SOURCE_DATE_EPOCH and the command line options cannot be used at
the same time. They are different ways to do the same thing, and both
have FORCE semantics, which means they can't be overridden elsewhere
(otherwise it would defeat the purpose).

1.1.5 -not-reproducible

This option tells Mksquashfs that the files do not have to be strictly
ordered. This will make Mksquashfs behave like version 4.3.

2. Zstandard (ZSTD) compression added
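Going back to the timestamp options in sections 1.1.2 to 1.1.4, the rules can be modelled roughly as follows. This is a sketch of the behaviour as described in the text, not Mksquashfs code: `check_time`, `apply_all_time` and `apply_source_date_epoch` are hypothetical helpers, and treating 0 as an acceptable value is an assumption.

```python
U32_MAX = 2**32 - 1  # Squashfs stores times as unsigned 32-bit integers


def check_time(value):
    """Reject values that do not fit the unsigned 32-bit time field."""
    if not 0 <= value <= U32_MAX:
        raise ValueError(f"time {value} is outside the unsigned 32-bit range")
    return value


def apply_all_time(mtimes, value):
    """Model of -all-time: every file gets the same timestamp."""
    value = check_time(value)
    return {path: value for path in mtimes}


def apply_source_date_epoch(mtimes, epoch):
    """Model of SOURCE_DATE_EPOCH: it sets the mkfs time, and only file
    timestamps later than the epoch are clamped down to it."""
    epoch = check_time(epoch)
    clamped = {path: min(mtime, epoch) for path, mtime in mtimes.items()}
    return epoch, clamped


files = {"old.txt": 100, "new.txt": 2000}
print(apply_all_time(files, 1000))           # {'old.txt': 1000, 'new.txt': 1000}
print(apply_source_date_epoch(files, 1000))  # (1000, {'old.txt': 100, 'new.txt': 1000})
```

The difference between the two is that -all-time overwrites every file timestamp, whereas SOURCE_DATE_EPOCH leaves older timestamps alone, so a file that has not changed keeps its original time.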
Re: Bug#921146: Program mksquashfs from squashfs-tools 1:4.3-11 does not make use all CPU cores
On Thu, Aug 1, 2019 at 11:26 PM Don Armstrong wrote:
>
> On Thu, 01 Aug 2019, Phillip Lougher wrote:
> > That patch is a laughable piece of rubbish. I believe both you
> > people (the Debian maintainer and author) are in total denial
> > about your incompetence in this matter. This is obviously just my
> > opinion I've formed over the last couple of months, in case you want to
> > claim that it is libellous.
>
> This isn't an appropriate tone for Debian mailing lists or our bug
> tracking system.
>
> It's fine to disagree on technical matters, but it's not appropriate to
> claim that people are incompetent or that they are making rubbish.
>

I am only defending myself from the slurs and false information being
spread by your maintainer. I would not be doing this otherwise.

Cheers

Phillip

> Please stop.
>
> --
> Don Armstrong                      https://www.donarmstrong.com
>
> To punish me for my contempt of authority, Fate has made me an
> authority myself
>  -- Albert Einstein
Re: [PATCH 0/2] Make squashfs fragments' cache size more configurable
On Thu, Oct 19, 2017 at 12:50 AM, Qixuan Wu <wuqix...@huawei.com> wrote:
> Hi All,
>
> Currently, squashfs fragments' cache size is only determined by
> config option CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE. Users have
> no way to change the value when they get the binary kernel.

Thank you for the patches, but they're both pointless and dangerous.

Let's be clear here: you're trying to change an "expert only" kernel
configuration option into a user-changeable option. This is stupid
because, for good reason, it is not meant for non-experts to change.

The fragment cache size isn't some small tweak to the operation of
Squashfs; it fundamentally affects both the performance and memory
overhead of Squashfs. As such, right from its introduction in 2003, it
has been an "expert only" configuration option at build time. Even then
it is made clear that the default has been carefully chosen, and it
should only be changed in exceptional circumstances.

This basically means don't change the default unless you really know
what you're doing, and this means tracing of Squashfs against your
use-case to determine caching behaviour. There is absolutely no other
reason why you'd want to change the default.

This also means it should be restricted to kernel configuration time
only. Let's be clear again: very few people should ever want to change
the default, and the "experts" that do want to do so can do so when
configuring the kernel. If you're not in a position to change it at
kernel configuration time then by definition you're not an expert, and
you shouldn't be able to change it anyway, and certainly not as a user.

There is absolutely no use-case here to make this a user-changeable
option. I can see no upsides in doing this, only downsides.

Frankly, if you need to change this value at module insert time then
there is something wrong with your system or build process.

If you want this because you want to build the kernel/modules once, and
then post-facto configure them for various products, then it is your
build process that is broken.

If you want this because you want to dynamically change Squashfs memory
usage/caching behaviour post kernel configuration time, it suggests
you're trying to adapt Squashfs's footprint based on available memory.
This is an abuse of the option, as it's only meant to be used after
detailed tracing/analysis and certainly not used to accommodate
unforeseen dynamic low memory situations. If that's the reason for
needing this option, you should be looking to solve it elsewhere.

Ultimately this has been an "expert" kernel configuration only option
since its introduction in 2003, and I have never been asked to change
it, and I believe this is because people recognise it as such. I suspect
you're trying to change this for fundamentally bogus reasons.

Moreover, Squashfs is used in many different use-cases and
distributions, and I'm not going to make this a user-changeable option
allowing users to insert the Squashfs module in such a way that will
break its performance.

So NACK.

Phillip Lougher (Squashfs maintainer)

> Now make it be configured when booting or inserting module.
> Actually, it's better that a config option in a number format
> in .config file can be reconfigured during booting or inserting
> module.
>
> Thanks
> Qixuan
>
> Qixuan Wu (2):
>   Squashfs: Let the number of fragments cached configurable
>   Documentation/kernel-parameters.txt: Add kernel parameter of squashfs
>     fragments' cache size
>
>  Documentation/admin-guide/kernel-parameters.txt |  7
>  fs/squashfs/super.c                             | 43 -
>  2 files changed, 49 insertions(+), 1 deletion(-)
>
> --
> 2.7.4
>
Re: [PATCH 1/5] Squashfs: remove the FILE_CACHE option
On Fri, Sep 22, 2017 at 10:55 PM, Daniel Rosenberg <dro...@google.com> wrote:
> From: Adrien Schildknecht <adri...@google.com>
>
> FILE_DIRECT is working fine and offers faster results and lower memory
> footprint.
>
> Removing FILE_CACHE makes our life easier because we don't have to
> maintain 2 different functions that do the same thing.

When I added FILE_DIRECT, I deliberately retained the original
FILE_CACHE behaviour and kept it as default, and I spent a lot of effort
to do so. It basically required complete rewriting to enable FILE_CACHE
and FILE_DIRECT to co-exist.

FILE_CACHE wasn't simply some old cruft I left in there because I
couldn't be bothered to remove it.

There is a good reason for keeping the FILE_CACHE behaviour. While
FILE_DIRECT improves the performance of Squashfs by eliminating a
memcpy, and removing contention on a lock, it also increases the amount
of I/O and CPU Squashfs can use at any one time (by enabling multiple
Squashfs operations to run in parallel). This changes scheduling and can
take resources away from other tasks.

Who knows how many low-end embedded systems are out there that rely on
the original slower behaviour of FILE_CACHE and will break if it is
removed.

I didn't want to remove the FILE_CACHE behaviour and get a lot of people
complaining (and quite rightly too) that I'd removed a feature they were
relying on, and I don't intend to now either.

You quite simply don't remove a feature that people may be relying on.

Let's be clear. Your reason for removing FILE_CACHE is simply because
when adding your new whizzo feature, you needed to update both the
FILE_CACHE and FILE_DIRECT code, and as *you* didn't use FILE_CACHE, you
couldn't be bothered to update it, and removed it instead.

This is not a valid reason for removing it.

NACK

Phillip

>
> Signed-off-by: Adrien Schildknecht <adri...@google.com>
> Signed-off-by: Daniel Rosenberg <dro...@google.com>
> ---
>  fs/squashfs/Kconfig      | 28
>  fs/squashfs/Makefile     |  3 +--
>  fs/squashfs/file_cache.c | 38 --
>  fs/squashfs/page_actor.h | 42 +-
>  4 files changed, 2 insertions(+), 109 deletions(-)
>  delete mode 100644 fs/squashfs/file_cache.c
>
> diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
> index 1adb3346b9d6..6c81bf620067 100644
> --- a/fs/squashfs/Kconfig
> +++ b/fs/squashfs/Kconfig
> @@ -25,34 +25,6 @@ config SQUASHFS
>
>           If unsure, say N.
>
> -choice
> -        prompt "File decompression options"
> -        depends on SQUASHFS
> -        help
> -          Squashfs now supports two options for decompressing file
> -          data.  Traditionally Squashfs has decompressed into an
> -          intermediate buffer and then memcopied it into the page cache.
> -          Squashfs now supports the ability to decompress directly into
> -          the page cache.
> -
> -          If unsure, select "Decompress file data into an intermediate buffer"
> -
> -config SQUASHFS_FILE_CACHE
> -        bool "Decompress file data into an intermediate buffer"
> -        help
> -          Decompress file data into an intermediate buffer and then
> -          memcopy it into the page cache.
> -
> -config SQUASHFS_FILE_DIRECT
> -        bool "Decompress files directly into the page cache"
> -        help
> -          Directly decompress file data into the page cache.
> -          Doing so can significantly improve performance because
> -          it eliminates a memcpy and it also removes the lock contention
> -          on the single buffer.
> -
> -endchoice
> -
>  choice
>          prompt "Decompressor parallelisation options"
>          depends on SQUASHFS
> diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
> index 6655631c53ae..225330ab7723 100644
> --- a/fs/squashfs/Makefile
> +++ b/fs/squashfs/Makefile
> @@ -5,8 +5,7 @@
>  obj-$(CONFIG_SQUASHFS) += squashfs.o
>  squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o
>  squashfs-y += namei.o super.o symlink.o decompressor.o
> -squashfs-$(CONFIG_SQUASHFS_FILE_CACHE) += file_cache.o
> -squashfs-$(CONFIG_SQUASHFS_FILE_DIRECT) += file_direct.o page_actor.o
> +squashfs-y += file_direct.o page_actor.o
>  squashfs-$(CONFIG_SQUASHFS_DECOMP_SINGLE) += decompressor_single.o
>  squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI) += decompressor_multi.o
>  squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU) += decompressor_multi_percpu.o
> diff --git a/fs/squashfs/file_cache.c b/fs/squashfs/file_cache.c
> deleted file mode 100644
> index f2310d2a2019..
> --- a/fs/squashfs/file_cache.c
> +++ /dev/null
> @@ -1,38 +0,0 @@
> -/*
> - * Co
Re: [GIT PULL v2] zstd support (lib, btrfs, squashfs, nocrypto)
On Mon, Sep 11, 2017 at 8:24 PM, Chris Mason wrote:
> Hi Linus,
>
> Nick Terrell's patch series to add zstd support to the kernel has been
> floating around for a while. After talking with Dave Sterba, Herbert
> and Phillip, we decided to send the whole thing in as one pull request.

As the Squashfs maintainer I have ACKed the Squashfs patch, and I
confirm that I've agreed to have it included in this one pull request.

Phillip

>
> Herbert had asked about the crypto patch when we discussed the pull, but
> I didn't realize he really meant not-right-now. I've rebased it out of
> this branch, and none of the other patches depended on it.
>
> I have things in my zstd-minimal branch:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git zstd-minimal
>
> There's a trivial conflict with the main btrfs pull from last week.
> Dave's pull deletes BTRFS_COMPRESS_LAST in fs/btrfs/compression.h, and
> I've put the sample resolution in a branch named zstd-4.14-merge.
>
> zstd is a big win in speed over zlib and in compression ratio over lzo,
> and the compression team here at FB has gotten great results using it in
> production. Nick will continue to update the kernel side with new
> improvements from the open source zstd userland code.
>
> Nick has a number of benchmarks for the main zstd code in his lib/zstd
> commit:
>
>
> I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
> The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
> 16 GB of RAM, and a SSD. I benchmarked using `silesia.tar` [3], which is
> 211,988,480 B large. Run the following commands for the benchmark:
>
>     sudo modprobe zstd_compress_test
>     sudo mknod zstd_compress_test c 245 0
>     sudo cp silesia.tar zstd_compress_test
>
> The time is reported by the time of the userland `cp`.
> The MB/s is computed with
>
>     1,536,217,008 B / time(buffer size, hash)
>
> which includes the time to copy from userland.
> The Adjusted MB/s is computed with
>
>     1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)).
>
> The memory reported is the amount of memory the compressor requests.
>
> | Method   | Size (B) | Time (s) | Ratio | MB/s    | Adj MB/s | Mem (MB) |
> |----------|----------|----------|-------|---------|----------|----------|
> | none     | 11988480 |    0.100 |     1 | 2119.88 |        - |        - |
> | zstd -1  | 73645762 |    1.044 | 2.878 |  203.05 |   224.56 |     1.23 |
> | zstd -3  | 66988878 |    1.761 | 3.165 |  120.38 |   127.63 |     2.47 |
> | zstd -5  | 65001259 |    2.563 | 3.261 |   82.71 |    86.07 |     2.86 |
> | zstd -10 | 60165346 |   13.242 | 3.523 |   16.01 |    16.13 |    13.22 |
> | zstd -15 | 58009756 |   47.601 | 3.654 |    4.45 |     4.46 |    21.61 |
> | zstd -19 | 54014593 |  102.835 | 3.925 |    2.06 |     2.06 |    60.15 |
> | zlib -1  | 77260026 |    2.895 | 2.744 |   73.23 |    75.85 |     0.27 |
> | zlib -3  | 72972206 |    4.116 | 2.905 |   51.50 |    52.79 |     0.27 |
> | zlib -6  | 68190360 |    9.633 | 3.109 |   22.01 |    22.24 |     0.27 |
> | zlib -9  | 67613382 |   22.554 | 3.135 |    9.40 |     9.44 |     0.27 |
>
> I benchmarked zstd decompression using the same method on the same machine.
> The benchmark file is located in the upstream zstd repo under
> `contrib/linux-kernel/zstd_decompress_test.c` [4]. The memory reported is
> the amount of memory required to decompress data compressed with the given
> compression level. If you know the maximum size of your input, you can
> reduce the memory usage of decompression irrespective of the compression
> level.
>
> | Method   | Time (s) | MB/s    | Adjusted MB/s | Memory (MB) |
> |----------|----------|---------|---------------|-------------|
> | none     |    0.025 | 8479.54 |             - |           - |
> | zstd -1  |    0.358 |  592.15 |        636.60 |        0.84 |
> | zstd -3  |    0.396 |  535.32 |        571.40 |        1.46 |
> | zstd -5  |    0.396 |  535.32 |        571.40 |        1.46 |
> | zstd -10 |    0.374 |  566.81 |        607.42 |        2.51 |
> | zstd -15 |    0.379 |  559.34 |        598.84 |        4.61 |
> | zstd -19 |    0.412 |  514.54 |        547.77 |        8.80 |
> | zlib -1  |    0.940 |  225.52 |        231.68 |        0.04 |
> | zlib -3  |    0.883 |  240.08 |        247.07 |        0.04 |
> | zlib -6  |    0.844 |  251.17 |        258.84 |        0.04 |
> | zlib -9  |    0.837 |  253.27 |        287.64 |        0.04 |
>
> ===
>
> I ran a long series of tests and benchmarks on the btrfs side and
> the gains are very similar to the core benchmarks Nick ran.
>
> Nick Terrell (3) commits (+14222/-12):
>     btrfs: Add zstd support (+468/-12)
>     lib: Add zstd modules (+13014/-0)
>     lib: Add xxhash module (+740/-0)
>
> Sean Purcell (1) commits (+178/-0):
>     squashfs: Add zstd support
>
> Total: (4) commits (+14400/-12)
>
>  fs/btrfs/Kconfig |2 +
>  fs/btrfs/Makefile
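The throughput columns in the quoted benchmark tables can be re-derived with a few lines of arithmetic. The sketch below is illustrative only: the helper names are made up, and it assumes the input size is the 211,988,480 B quoted for `silesia.tar`, which is the figure the zstd -1 rows are consistent with, rather than the byte count that appears in the quoted formula text.

```python
SILESIA_BYTES = 211_988_480  # size of silesia.tar quoted in the message


def ratio(compressed_bytes):
    """Compression ratio: input size over compressed size."""
    return SILESIA_BYTES / compressed_bytes


def mb_per_s(seconds):
    """Throughput including the time to copy from userland (1 MB = 10^6 B)."""
    return SILESIA_BYTES / seconds / 1e6


def adjusted_mb_per_s(seconds, baseline_seconds):
    """Throughput with the 'none' row's copy time subtracted."""
    return SILESIA_BYTES / (seconds - baseline_seconds) / 1e6


# zstd -1 compression row: 73645762 B in 1.044 s, 'none' baseline 0.100 s
print(round(ratio(73_645_762), 3))                # 2.878
print(round(mb_per_s(1.044), 2))                  # 203.05
print(round(adjusted_mb_per_s(1.044, 0.100), 2))  # 224.56

# zstd -1 decompression row: 0.358 s, 'none' baseline 0.025 s
print(round(mb_per_s(0.358), 2))                  # 592.15
print(round(adjusted_mb_per_s(0.358, 0.025), 2))  # 636.6
```

The adjusted figure is always the higher of the two because it removes the fixed cost of the userland copy, which matters most for the fastest methods.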
Re: [PATCH v4 4/5] squashfs: Add zstd support
On Sun, Aug 13, 2017 at 9:31 AM, Geert Uytterhoeven wrote:
> On Fri, Aug 4, 2017 at 10:19 PM, Nick Terrell wrote:
>> --- a/fs/squashfs/decompressor.c
>> +++ b/fs/squashfs/decompressor.c
>> @@ -65,6 +65,12 @@ static const struct squashfs_decompressor squashfs_zlib_comp_ops = {
>>  };
>>  #endif
>>
>> +#ifndef CONFIG_SQUASHFS_ZSTD
>> +static const struct squashfs_decompressor squashfs_zstd_comp_ops = {
>> +       NULL, NULL, NULL, NULL, ZSTD_COMPRESSION, "zstd", 0
>
> Can you please use designated initializers?

I prefer it as it is. It matches the coding style that I used in the
rest of that file to declare the "unsupported" compressor entries (if
this patch uses a different style it would look odd). There are no
pointers to functions being assigned here, and it makes it a short and
concise one-liner.

Phillip

>
>> +};
>> +#endif
>> +
>>  static const struct squashfs_decompressor squashfs_unknown_comp_ops = {
>>         NULL, NULL, NULL, NULL, 0, "unknown", 0
>>  };
>
>> +const struct squashfs_decompressor squashfs_zstd_comp_ops = {
>> +       .init = zstd_init,
>> +       .free = zstd_free,
>> +       .decompress = zstd_uncompress,
>> +       .id = ZSTD_COMPRESSION,
>> +       .name = "zstd",
>> +       .supported = 1
>> +};
>
> ... like you did here.
>
> Gr{oetje,eeting}s,
>
>                         Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
Re: [PATCH v3 4/4] squashfs: Add zstd support
On Mon, Jul 31, 2017 at 2:50 AM, Phillip Lougher <phillip.loug...@gmail.com> wrote:
> On Thu, Jul 20, 2017 at 10:27 PM, Nick Terrell <terre...@fb.com> wrote:
>> Add zstd compression and decompression support to SquashFS. zstd is a
>> great fit for SquashFS because it can compress at ratios approaching xz,
>> while decompressing twice as fast as zlib. For SquashFS in particular,
>> it can decompress as fast as lzo and lz4. It also has the flexibility
>> to turn down the compression ratio for faster compression times.
>
> Hi Nick,
>
> This patch (and none of the previous versions) is showing up on
> squashfs-de...@lists.sourceforge.net. I also think you should have
> emailed me directly as a courtesy, as I'm the Squashfs author and
> maintainer.

OK, you're not subscribed to squashfs-devel. I have accepted your
previous posts, but, please subscribe.

Thanks

Phillip

>
>> | Method         | Ratio | Compression MB/s | Decompression MB/s |
>> |----------------|-------|------------------|--------------------|
>> | gzip           |  2.92 |               15 |                128 |
>> | lzo            |  2.64 |              9.5 |                217 |
>> | lz4            |  2.12 |               94 |                218 |
>> | xz             |  3.43 |              5.5 |                 35 |
>> | xz 256 KB      |  3.53 |              5.4 |                 40 |
>> | zstd 1         |  2.71 |               96 |                210 |
>> | zstd 5         |  2.93 |               69 |                198 |
>> | zstd 10        |  3.01 |               41 |                225 |
>> | zstd 15        |  3.13 |             11.4 |                224 |
>> | zstd 16 256 KB |  3.24 |              8.1 |                210 |
>
> Those numbers look good to me.
>
>>
>> This patch was written by Sean Purcell <m...@seanp.xyz>, but I will be
>> taking over the submission process.
>>
>> [1] http://releases.ubuntu.com/16.10/
>> [2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/squashfs-benchmark.sh
>
> I can't find your patch that adds zstd to the user-land
> Squashfs-tools. That would be handy to do any testing :-)
>
> Phillip
>
>>
>> zstd source repository: https://github.com/facebook/zstd
>>
>> Cc: Sean Purcell <m...@seanp.xyz>
>> Signed-off-by: Nick Terrell <terre...@fb.com>
>> ---
>>  fs/squashfs/Kconfig        |  14 +
>>  fs/squashfs/Makefile       |   1 +
>>  fs/squashfs/decompressor.c |   7 +++
>>  fs/squashfs/decompressor.h |   4 ++
>>  fs/squashfs/squashfs_fs.h  |   1 +
>>  fs/squashfs/zstd_wrapper.c | 150 +
>>  6 files changed, 177 insertions(+)
>>  create mode 100644 fs/squashfs/zstd_wrapper.c
>>
>> diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
>> index ffb093e..1adb334 100644
>> --- a/fs/squashfs/Kconfig
>> +++ b/fs/squashfs/Kconfig
>> @@ -165,6 +165,20 @@ config SQUASHFS_XZ
>>
>>  	  If unsure, say N.
>>
>> +config SQUASHFS_ZSTD
>> +	bool "Include support for ZSTD compressed file systems"
>> +	depends on SQUASHFS
>> +	select ZSTD_DECOMPRESS
>> +	help
>> +	  Saying Y here includes support for reading Squashfs file systems
>> +	  compressed with ZSTD compression. ZSTD gives better compression than
>> +	  the default ZLIB compression, while using less CPU.
>> +
>> +	  ZSTD is not the standard compression used in Squashfs and so most
>> +	  file systems will be readable without selecting this option.
>> +
>> +	  If unsure, say N.
>> +
>>  config SQUASHFS_4K_DEVBLK_SIZE
>>  	bool "Use 4K device block size?"
>>  	depends on SQUASHFS
>> diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
>> index 246a6f3..6655631 100644
>> --- a/fs/squashfs/Makefile
>> +++ b/fs/squashfs/Makefile
>> @@ -15,3 +15,4 @@ squashfs-$(CONFIG_SQUASHFS_LZ4) += lz4_wrapper.o
>>  squashfs-$(CONFIG_SQUASHFS_LZO) += lzo_wrapper.o
>>  squashfs-$(CONFIG_SQUASHFS_XZ) += xz_wrapper.o
>>  squashfs-$(CONFIG_SQUASHFS_ZLIB) += zlib_wrapper.o
>> +squashfs-$(CONFIG_SQUASHFS_ZSTD) += zstd_wrapper.o
>> diff --git a/fs/squashfs/decompressor.c b/fs/squashfs/decompressor.c
>> index d2bc136..8366398 100644
>> --- a/fs/squashfs/decompressor.c
>> +++ b/fs/squashfs/decompressor.c
>> @@ -65,6 +65,12 @@ static const struct squashfs_decompressor squashfs_zlib_comp_ops = {
>>  };
>>  #endif
>>
>> +#
Re: [PATCH v3 4/4] squashfs: Add zstd support
On Thu, Jul 20, 2017 at 10:27 PM, Nick Terrell wrote:
> Add zstd compression and decompression support to SquashFS. zstd is a
> great fit for SquashFS because it can compress at ratios approaching xz,
> while decompressing twice as fast as zlib. For SquashFS in particular,
> it can decompress as fast as lzo and lz4. It also has the flexibility
> to turn down the compression ratio for faster compression times.

Hi Nick,

This patch (and none of the previous versions) is showing up on
squashfs-de...@lists.sourceforge.net. I also think you should have
emailed me directly as a courtesy, as I'm the Squashfs author and
maintainer.

> | Method         | Ratio | Compression MB/s | Decompression MB/s |
> |----------------|-------|------------------|--------------------|
> | gzip           |  2.92 |               15 |                128 |
> | lzo            |  2.64 |              9.5 |                217 |
> | lz4            |  2.12 |               94 |                218 |
> | xz             |  3.43 |              5.5 |                 35 |
> | xz 256 KB      |  3.53 |              5.4 |                 40 |
> | zstd 1         |  2.71 |               96 |                210 |
> | zstd 5         |  2.93 |               69 |                198 |
> | zstd 10        |  3.01 |               41 |                225 |
> | zstd 15        |  3.13 |             11.4 |                224 |
> | zstd 16 256 KB |  3.24 |              8.1 |                210 |

Those numbers look good to me.

>
> This patch was written by Sean Purcell, but I will be
> taking over the submission process.
>
> [1] http://releases.ubuntu.com/16.10/
> [2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/squashfs-benchmark.sh

I can't find your patch that adds zstd to the user-land
Squashfs-tools. That would be handy to do any testing :-)

Phillip

>
> zstd source repository: https://github.com/facebook/zstd
>
> Cc: Sean Purcell
> Signed-off-by: Nick Terrell
> ---
>  fs/squashfs/Kconfig        |  14 +
>  fs/squashfs/Makefile       |   1 +
>  fs/squashfs/decompressor.c |   7 +++
>  fs/squashfs/decompressor.h |   4 ++
>  fs/squashfs/squashfs_fs.h  |   1 +
>  fs/squashfs/zstd_wrapper.c | 150 +
>  6 files changed, 177 insertions(+)
>  create mode 100644 fs/squashfs/zstd_wrapper.c
>
> diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
> index ffb093e..1adb334 100644
> --- a/fs/squashfs/Kconfig
> +++ b/fs/squashfs/Kconfig
> @@ -165,6 +165,20 @@ config SQUASHFS_XZ
>
>  	  If unsure, say N.
>
> +config SQUASHFS_ZSTD
> +	bool "Include support for ZSTD compressed file systems"
> +	depends on SQUASHFS
> +	select ZSTD_DECOMPRESS
> +	help
> +	  Saying Y here includes support for reading Squashfs file systems
> +	  compressed with ZSTD compression. ZSTD gives better compression than
> +	  the default ZLIB compression, while using less CPU.
> +
> +	  ZSTD is not the standard compression used in Squashfs and so most
> +	  file systems will be readable without selecting this option.
> +
> +	  If unsure, say N.
> +
>  config SQUASHFS_4K_DEVBLK_SIZE
>  	bool "Use 4K device block size?"
>  	depends on SQUASHFS
> diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
> index 246a6f3..6655631 100644
> --- a/fs/squashfs/Makefile
> +++ b/fs/squashfs/Makefile
> @@ -15,3 +15,4 @@ squashfs-$(CONFIG_SQUASHFS_LZ4) += lz4_wrapper.o
>  squashfs-$(CONFIG_SQUASHFS_LZO) += lzo_wrapper.o
>  squashfs-$(CONFIG_SQUASHFS_XZ) += xz_wrapper.o
>  squashfs-$(CONFIG_SQUASHFS_ZLIB) += zlib_wrapper.o
> +squashfs-$(CONFIG_SQUASHFS_ZSTD) += zstd_wrapper.o
> diff --git a/fs/squashfs/decompressor.c b/fs/squashfs/decompressor.c
> index d2bc136..8366398 100644
> --- a/fs/squashfs/decompressor.c
> +++ b/fs/squashfs/decompressor.c
> @@ -65,6 +65,12 @@ static const struct squashfs_decompressor squashfs_zlib_comp_ops = {
>  };
>  #endif
>
> +#ifndef CONFIG_SQUASHFS_ZSTD
> +static const struct squashfs_decompressor squashfs_zstd_comp_ops = {
> +	NULL, NULL, NULL, NULL, ZSTD_COMPRESSION, "zstd", 0
> +};
> +#endif
> +
>  static const struct squashfs_decompressor squashfs_unknown_comp_ops = {
>  	NULL, NULL, NULL, NULL, 0, "unknown", 0
>  };
> @@ -75,6 +81,7 @@ static const struct squashfs_decompressor *decompressor[] = {
>  	&squashfs_lzo_comp_ops,
>  	&squashfs_xz_comp_ops,
>  	&squashfs_lzma_unsupported_comp_ops,
> +	&squashfs_zstd_comp_ops,
>  	&squashfs_unknown_comp_ops
>  };
>
> diff --git a/fs/squashfs/decompressor.h b/fs/squashfs/decompressor.h
> index a25713c..0f5a8e4 100644
> --- a/fs/squashfs/decompressor.h
> +++ b/fs/squashfs/decompressor.h
> @@ -58,4 +58,8 @@ extern const struct squashfs_decompressor squashfs_lzo_comp_ops;
>  extern const struct squashfs_decompressor squashfs_zlib_comp_ops;
>  #endif
>
> +#ifdef CONFIG_SQUASHFS_ZSTD
> +extern const struct squashfs_decompressor squashfs_zstd_comp_ops;
> +#endif
> +
>  #endif
>
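The quoted patch adds a stub zstd entry when CONFIG_SQUASHFS_ZSTD is off and places it in a table that ends with the "unknown" entry. A minimal sketch of how such a table can be searched by compression id follows, assuming the zero-id "unknown" entry doubles as the terminator. The `demo_` types, names and id values are hypothetical and reduced to three fields; this is an illustration, not the kernel's code.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct squashfs_decompressor. */
struct demo_decompressor {
	int id;            /* compression id stored in the superblock */
	const char *name;  /* human-readable name for messages */
	int supported;     /* 0 for a stub entry, 1 for a real one */
};

/* Illustrative entries; the id values are assumptions for this sketch. */
static const struct demo_decompressor demo_zlib = { 1, "zlib", 1 };
static const struct demo_decompressor demo_zstd_stub = { 6, "zstd", 0 };
static const struct demo_decompressor demo_unknown = { 0, "unknown", 0 };

/* The "unknown" entry (id 0) must stay last: it terminates the search. */
static const struct demo_decompressor *demo_table[] = {
	&demo_zlib,
	&demo_zstd_stub,
	&demo_unknown
};

/* Returns the matching entry, or the "unknown" entry if no id matches. */
static const struct demo_decompressor *demo_lookup(int id)
{
	int i;

	for (i = 0; demo_table[i]->id; i++)
		if (demo_table[i]->id == id)
			break;

	return demo_table[i];
}
```

With a stub in the table, `demo_lookup(6)` returns an entry whose name is "zstd" and whose `supported` flag is 0, so a caller can report the algorithm by name instead of falling through to "unknown".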
Re: squashfs-tools: Add -offset option to skip n bytes at the start of the file
On Sun, Jan 29, 2017 at 2:00 PM, Ritesh Raj Sarraf wrote:
> On Sun, 2017-01-29 at 03:33 -0800, probonopd wrote:
>> Yes, I am not happy about that.
>> You could either push your change to the squashfs folks
>> Sent a PR half a year ago but no reaction: plougher/squashfs-tools#13 - is
>> something wrong with it?
>
> That repo seems inactive. So does the git repo at kernel.org. but I would rely
> on that one.
>

The repo at github is dead, I only used it during the time kernel.org
was offline due to the hack.

The repo at kernel.org is the official repo. It hasn't been updated
in a while simply because the tools are considered stable, and I have
not had a burning desire to add new features for a while.

> Let's try fresh now. Given there's been not much development in the upstream
> repo, I've just taken your patch, as is, for submission.
>
>> Or you could just carry the delta and ask the user to fetch the squashfs-tools
>> source themselves
>> Do you mean "carry the delta" as in "provide a diff"?
>
> Doing now.
>
> Dear Squashfs Team,
>

That's me.

> Can you please review/include the attached patch into squashfs-tools ?
> The mentioned changes are needed by AppImageKit.
>

Well from a quick look at the attached patches, I see no reason not to
pull them.

I'll have a look over at github because this is the first I've seen of
these patches, and I'll see if I can pull them from there. If not,
I'll ask the author to send them to me at this address.

phil...@lougher.demon.co.uk is dead because my ISP arbitrarily ceased
email support sometime ago (and I'd had that email for 20 years). It
would appear phillip@squashfs.org.uk is still redirected to that, and
I'll fix that.

Phillip

> What is AppImage
>
> The AppImage format is a format for packaging applications in a way that allows
> them to run on a variety of different target systems (base operating systems,
> distributions) without further modification.
>
> https://en.wikipedia.org/wiki/AppImage
>
> AppImageKit is a concrete implementation of the AppImage format and provides
> tools such as appimagetool and appimaged for conveniently handling AppImages.
>
> https://github.com/probonopd/AppImageKit
>
>
>
> --
> Ritesh Raj Sarraf
> RESEARCHUT - http://www.researchut.com
> "Necessity is the mother of invention."
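To make "skip n bytes at the start of the file" concrete, here is a small sketch that checks whether a Squashfs image begins at a given byte offset inside a larger file, by looking for the four magic bytes "hsqs" that open a little-endian Squashfs 4.0 superblock. The function name `demo_has_squashfs_magic_at` is hypothetical, and this is not how squashfs-tools or the patch under discussion implement the option; it only illustrates the idea of an image embedded after some leading data.

```c
#include <stdio.h>

/*
 * Hypothetical helper: returns 1 if the four bytes at `offset` in the
 * file at `path` are the Squashfs magic "hsqs", 0 if they are something
 * else, and -1 if the file cannot be opened or is too short.
 */
static int demo_has_squashfs_magic_at(const char *path, long offset)
{
	unsigned char buf[4];
	FILE *f = fopen(path, "rb");

	if (f == NULL)
		return -1;

	/* Skip the leading bytes, then read the candidate magic. */
	if (fseek(f, offset, SEEK_SET) != 0 || fread(buf, 1, 4, f) != 4) {
		fclose(f);
		return -1;
	}
	fclose(f);

	return buf[0] == 'h' && buf[1] == 's' && buf[2] == 'q' && buf[3] == 's';
}
```

A caller that knows how many bytes precede the image would pass that length as `offset`. A full reader would go on to parse the rest of the superblock relative to the same offset, which is outside the scope of this sketch.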
Re: [RFC] Using squashfs, kernel will hung task with no free memory?
On 22/01/15 02:28, long.wanglong wrote:
> hi,
>
> I have encountered kernel hung task when running stability and stress test.
>
> test scenarios:
> 1) the kernel hungtask settings are following:
>      hung_task_panic = 1
>      hung_task_timeout_secs = 120
> 2) the rootfs type is squashfs (read-only)
>
> what the test does is to fork many child process and each process will alloc
> memory. when there is no free memory in the system, OOM killer is triggered,
> and then the kernel triggers hung task (after about five minutes).
> the reason for hung task is that some process keep D states for 120 seconds.
>
> if there is no free memory in the system, many process state is D, they enter
> into D state by kernel path `squashfs_cache_get()--->wait_event()`.
>
> the backtrace is:
>
> [ 313.950118] [<c02d2014>] (__schedule+0x448/0x5cc) from [<c014e510>] (squashfs_cache_get+0x120/0x3ec)
> [ 314.059660] [<c014e510>] (squashfs_cache_get+0x120/0x3ec) from [<c014fd1c>] (squashfs_readpage+0x748/0xa2c)
> [ 314.176497] [<c014fd1c>] (squashfs_readpage+0x748/0xa2c) from [<c00b7be0>] (__do_page_cache_readahead+0x1ac/0x200)
> [ 314.300621] [<c00b7be0>] (__do_page_cache_readahead+0x1ac/0x200) from [<c00b7e98>] (ra_submit+0x24/0x28)
> [ 314.414325] [<c00b7e98>] (ra_submit+0x24/0x28) from [<c00b043c>] (filemap_fault+0x16c/0x3f0)
> [ 314.515521] [<c00b043c>] (filemap_fault+0x16c/0x3f0) from [<c00c94e0>] (__do_fault+0xc0/0x570)
> [ 314.618802] [<c00c94e0>] (__do_fault+0xc0/0x570) from [<c00cbdc4>] (handle_pte_fault+0x47c/0x1048)
> [ 314.726250] [<c00cbdc4>] (handle_pte_fault+0x47c/0x1048) from [<c00cd928>] (handle_mm_fault+0x164/0x218)
> [ 314.839959] [<c00cd928>] (handle_mm_fault+0x164/0x218) from [<c02d4878>] (do_page_fault.part.7+0x108/0x360)
> [ 314.956788] [<c02d4878>] (do_page_fault.part.7+0x108/0x360) from [<c02d4afc>] (do_page_fault+0x2c/0x70)
> [ 315.069442] [<c02d4afc>] (do_page_fault+0x2c/0x70) from [<c00084cc>] (do_PrefetchAbort+0x2c/0x90)
> [ 315.175850] [<c00084cc>] (do_PrefetchAbort+0x2c/0x90) from [<c02d3674>] (ret_from_exception+0x0/0x10)
>
> when a task is already exiting because of OOM killer, the next time OOM killer
> will kill the same task. so, if the first time of OOM killer select a task (A)
> that in D state (the task ignore exit signal because of D state), then the
> next time of OOM killer will also kill task A. In this scenario, oom killer
> will not free memory.
>
> with no free memory, many process sleep in function squashfs_cache_get.
> about 2 minutes, the system hung task and panic.
>
> because of OOM feature and squashfs, on heavy system, This problem is easily
> reproduce.
>
> Is this a problem about squashfs or about the OOM killer. Can anyone give me
> some good ideas about this?

This is not a Squashfs issue, it is a well known problem with
the OOM killer trying to kill tasks which are slow to exit (being
in D state). Just google "OOM hung task" to see how long this
issue has been around.

The OOM killer is worse than useless in embedded systems because
its behaviour is unpredictable and can leave a system in a
zombified or half zombified state. For this reason many
embedded systems disable the OOM killer entirely, and ensure
there is adequate memory backed up by a watchdog which reboots
a hung system.

Phillip

> Best Regards
> Wang Long

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[GIT PULL] Squashfs updates for 3.19
Hi Linus,

Please pull the following Squashfs updates...

Thanks

Phillip

The following changes since commit 5d01410fe4d92081f349b013a2e7a95429e4f2c9:

  Linux 3.18-rc6 (2014-11-23 15:25:20 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next.git tags/squashfs-updates

for you to fetch changes up to 62421645bb702c077ee5a462815525106cb53bcf:

  Squashfs: Add LZ4 compression configuration option (2014-11-27 18:48:44 +)

These patches optionally add LZ4 compression support to Squashfs.

LZ4 is a lightweight compression algorithm which can be used on
embedded systems to reduce CPU and memory overhead (in comparison
to the standard zlib compression).

These patches add the wrapper code to allow Squashfs to use the
existing LZ4 decompression code, and the necessary configuration
option.

Phillip Lougher (2):
      Squashfs: add LZ4 compression support
      Squashfs: Add LZ4 compression configuration option

 Documentation/filesystems/squashfs.txt |   8 +-
 fs/squashfs/Kconfig                    |  15
 fs/squashfs/Makefile                   |   1 +
 fs/squashfs/decompressor.c             |   7 ++
 fs/squashfs/decompressor.h             |   4 +
 fs/squashfs/lz4_wrapper.c              | 142
 fs/squashfs/squashfs_fs.h              |   1 +
 7 files changed, 174 insertions(+), 4 deletions(-)
 create mode 100644 fs/squashfs/lz4_wrapper.c
Re: [PATCH 0/2] Squashfs: add LZ4 compression
On 12/12/14 15:56, Bruno Wolff III wrote:
> On Fri, Dec 12, 2014 at 13:23:19 +0100, toki clover wrote:
>> Now, I did not see any Linux FS devs activity/response to this... What a
>> waste of time because if those patch don't make it for this merge window,
>> rebasing/reposting will be, again, necessary.
>
> The patches got pulled into linux-next. For XZ, the patches sat in linux-next
> for a full release cycle. I don't know whether or not if that is the plan for
> the lz4 patches.

The plan is to send out the merge request this weekend. I've been
too busy this week with my main job to do the merge request, sorry
(I'll probably be working past midnight again tonight, the weeks
leading up to Christmas are always really busy).

Phillip
[PATCH 2/2 V2] Squashfs: Add LZ4 compression configuration option
Add the glue code, and also update the documentation.

Signed-off-by: Phillip Lougher <phil...@squashfs.org.uk>
---

V2 patch: when porting the code to the latest -rc I forgot to update the
case where CONFIG_SQUASHFS_LZ4 is not defined, leading to a build error.

Trivial change:

+#ifndef CONFIG_SQUASHFS_LZ4
+static const struct squashfs_decompressor squashfs_lz4_comp_ops = {
+	NULL, NULL, NULL, LZ4_COMPRESSION, "lz4", 0

should be

+#ifndef CONFIG_SQUASHFS_LZ4
+static const struct squashfs_decompressor squashfs_lz4_comp_ops = {
+	NULL, NULL, NULL, NULL, LZ4_COMPRESSION, "lz4", 0

 Documentation/filesystems/squashfs.txt |    8 ++++----
 fs/squashfs/Kconfig                    |   15 +++++++++++++++
 fs/squashfs/Makefile                   |    1 +
 fs/squashfs/decompressor.c             |    7 +++++++
 fs/squashfs/decompressor.h             |    4 ++++
 5 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/Documentation/filesystems/squashfs.txt b/Documentation/filesystems/squashfs.txt
index 403c090..e5274f8 100644
--- a/Documentation/filesystems/squashfs.txt
+++ b/Documentation/filesystems/squashfs.txt
@@ -2,10 +2,10 @@ SQUASHFS 4.0 FILESYSTEM
 ===
 
 Squashfs is a compressed read-only filesystem for Linux.
-It uses zlib/lzo/xz compression to compress files, inodes and directories.
-Inodes in the system are very small and all blocks are packed to minimise
-data overhead. Block sizes greater than 4K are supported up to a maximum
-of 1Mbytes (default block size 128K).
+It uses zlib, lz4, lzo, or xz compression to compress files, inodes and
+directories. Inodes in the system are very small and all blocks are packed to
+minimise data overhead. Block sizes greater than 4K are supported up to a
+maximum of 1Mbytes (default block size 128K).
 
 Squashfs is intended for general read-only filesystem use, for archival
 use (i.e. in cases where a .tar.gz file may be used), and in constrained
diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
index b6fa865..ffb093e 100644
--- a/fs/squashfs/Kconfig
+++ b/fs/squashfs/Kconfig
@@ -120,6 +120,21 @@ config SQUASHFS_ZLIB
 
 	  If unsure, say Y.
 
+config SQUASHFS_LZ4
+	bool "Include support for LZ4 compressed file systems"
+	depends on SQUASHFS
+	select LZ4_DECOMPRESS
+	help
+	  Saying Y here includes support for reading Squashfs file systems
+	  compressed with LZ4 compression. LZ4 compression is mainly
+	  aimed at embedded systems with slower CPUs where the overheads
+	  of zlib are too high.
+
+	  LZ4 is not the standard compression used in Squashfs and so most
+	  file systems will be readable without selecting this option.
+
+	  If unsure, say N.
+
 config SQUASHFS_LZO
 	bool "Include support for LZO compressed file systems"
 	depends on SQUASHFS
diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
index 4132520..246a6f3 100644
--- a/fs/squashfs/Makefile
+++ b/fs/squashfs/Makefile
@@ -11,6 +11,7 @@ squashfs-$(CONFIG_SQUASHFS_DECOMP_SINGLE) += decompressor_single.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI) += decompressor_multi.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU) += decompressor_multi_percpu.o
 squashfs-$(CONFIG_SQUASHFS_XATTR) += xattr.o xattr_id.o
+squashfs-$(CONFIG_SQUASHFS_LZ4) += lz4_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_LZO) += lzo_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_XZ) += xz_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_ZLIB) += zlib_wrapper.o
diff --git a/fs/squashfs/decompressor.c b/fs/squashfs/decompressor.c
index ac22fe7..e9034bf 100644
--- a/fs/squashfs/decompressor.c
+++ b/fs/squashfs/decompressor.c
@@ -41,6 +41,12 @@ static const struct squashfs_decompressor squashfs_lzma_unsupported_comp_ops = {
 	NULL, NULL, NULL, NULL, LZMA_COMPRESSION, "lzma", 0
 };
 
+#ifndef CONFIG_SQUASHFS_LZ4
+static const struct squashfs_decompressor squashfs_lz4_comp_ops = {
+	NULL, NULL, NULL, NULL, LZ4_COMPRESSION, "lz4", 0
+};
+#endif
+
 #ifndef CONFIG_SQUASHFS_LZO
 static const struct squashfs_decompressor squashfs_lzo_comp_ops = {
 	NULL, NULL, NULL, NULL, LZO_COMPRESSION, "lzo", 0
@@ -65,6 +71,7 @@ static const struct squashfs_decompressor squashfs_unknown_comp_ops = {
 
 static const struct squashfs_decompressor *decompressor[] = {
 	&squashfs_zlib_comp_ops,
+	&squashfs_lz4_comp_ops,
 	&squashfs_lzo_comp_ops,
 	&squashfs_xz_comp_ops,
 	&squashfs_lzma_unsupported_comp_ops,
diff --git a/fs/squashfs/decompressor.h b/fs/squashfs/decompressor.h
index af09853..a25713c 100644
--- a/fs/squashfs/decompressor.h
+++ b/fs/squashfs/decompressor.h
@@ -46,6 +46,10 @@ static inline void *squashfs_comp_opts(struct squashfs_sb_info *msblk,
 extern const struct squashfs_decompressor squashfs_xz_comp_ops;
 #endif
 
+#ifdef CONFIG_SQUASHFS_LZ4
+extern const struct squashfs_decompressor squashfs_lz4_comp_ops;
+#endif
+
 #ifdef CONFIG_SQUASHFS_LZO
 extern const struct squashfs_decompressor squashfs_lzo_comp_ops;
 #endif
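The V2 fix comes down to positional initialisers: the stub needs one NULL per function pointer before the id. The following is a minimal userspace sketch of that point, not kernel code. The struct decompressor_stub and its field types are hypothetical stand-ins for struct squashfs_decompressor; only the field order (four callbacks, then id, name, supported) is taken from the initialisers in the patch.

```c
#include <stddef.h>

#define LZ4_COMPRESSION 5	/* value from the squashfs_fs.h hunk in patch 1/2 */

/*
 * Hypothetical, simplified stand-in for struct squashfs_decompressor.
 * The callback signatures are assumptions made for this sketch.
 */
struct decompressor_stub {
	void *(*init)(void *);
	void *(*comp_opts)(void *, int);
	void (*free)(void *);
	int (*decompress)(void *);
	int id;
	const char *name;
	int supported;
};

/*
 * Positional form, as in the V2 patch: one NULL per callback. With only
 * three NULLs, LZ4_COMPRESSION would be used to initialise the
 * decompress pointer and "lz4" the integer id, which a compiler will
 * normally diagnose.
 */
static const struct decompressor_stub lz4_positional = {
	NULL, NULL, NULL, NULL, LZ4_COMPRESSION, "lz4", 0
};

/*
 * Designated form: unnamed members are zero-initialised, so the stub
 * does not depend on how many callbacks the struct has.
 */
static const struct decompressor_stub lz4_designated = {
	.id = LZ4_COMPRESSION,
	.name = "lz4",
	.supported = 0,
};
```

Both objects describe the same "known but not built in" decompressor. The designated form is the style the real wrapper in patch 1/2 already uses for its own ops structure.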
[PATCH 1/2] Squashfs: add LZ4 compression support
Add support for reading file systems compressed with the
LZ4 compression algorithm.

This patch adds the LZ4 decompressor wrapper code.

Signed-off-by: Phillip Lougher <phil...@squashfs.org.uk>
---
 fs/squashfs/lz4_wrapper.c |  142 +
 fs/squashfs/squashfs_fs.h |    1 +
 2 files changed, 143 insertions(+)
 create mode 100644 fs/squashfs/lz4_wrapper.c

diff --git a/fs/squashfs/lz4_wrapper.c b/fs/squashfs/lz4_wrapper.c
new file mode 100644
index 0000000..c31e2bc
--- /dev/null
+++ b/fs/squashfs/lz4_wrapper.c
@@ -0,0 +1,142 @@
+/*
+ * Copyright (c) 2013, 2014
+ * Phillip Lougher <phil...@squashfs.org.uk>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ */
+
+#include <linux/buffer_head.h>
+#include <linux/mutex.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <linux/lz4.h>
+
+#include "squashfs_fs.h"
+#include "squashfs_fs_sb.h"
+#include "squashfs.h"
+#include "decompressor.h"
+#include "page_actor.h"
+
+#define LZ4_LEGACY	1
+
+struct lz4_comp_opts {
+	__le32 version;
+	__le32 flags;
+};
+
+struct squashfs_lz4 {
+	void *input;
+	void *output;
+};
+
+
+static void *lz4_comp_opts(struct squashfs_sb_info *msblk,
+	void *buff, int len)
+{
+	struct lz4_comp_opts *comp_opts = buff;
+
+	/* LZ4 compressed filesystems always have compression options */
+	if (comp_opts == NULL || len < sizeof(*comp_opts))
+		return ERR_PTR(-EIO);
+
+	if (le32_to_cpu(comp_opts->version) != LZ4_LEGACY) {
+		/* LZ4 format currently used by the kernel is the 'legacy'
+		 * format */
+		ERROR("Unknown LZ4 version\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	return NULL;
+}
+
+
+static void *lz4_init(struct squashfs_sb_info *msblk, void *buff)
+{
+	int block_size = max_t(int, msblk->block_size, SQUASHFS_METADATA_SIZE);
+	struct squashfs_lz4 *stream;
+
+	stream = kzalloc(sizeof(*stream), GFP_KERNEL);
+	if (stream == NULL)
+		goto failed;
+	stream->input = vmalloc(block_size);
+	if (stream->input == NULL)
+		goto failed2;
+	stream->output = vmalloc(block_size);
+	if (stream->output == NULL)
+		goto failed3;
+
+	return stream;
+
+failed3:
+	vfree(stream->input);
+failed2:
+	kfree(stream);
+failed:
+	ERROR("Failed to initialise LZ4 decompressor\n");
+	return ERR_PTR(-ENOMEM);
+}
+
+
+static void lz4_free(void *strm)
+{
+	struct squashfs_lz4 *stream = strm;
+
+	if (stream) {
+		vfree(stream->input);
+		vfree(stream->output);
+	}
+	kfree(stream);
+}
+
+
+static int lz4_uncompress(struct squashfs_sb_info *msblk, void *strm,
+	struct buffer_head **bh, int b, int offset, int length,
+	struct squashfs_page_actor *output)
+{
+	struct squashfs_lz4 *stream = strm;
+	void *buff = stream->input, *data;
+	int avail, i, bytes = length, res;
+	size_t dest_len = output->length;
+
+	for (i = 0; i < b; i++) {
+		avail = min(bytes, msblk->devblksize - offset);
+		memcpy(buff, bh[i]->b_data + offset, avail);
+		buff += avail;
+		bytes -= avail;
+		offset = 0;
+		put_bh(bh[i]);
+	}
+
+	res = lz4_decompress_unknownoutputsize(stream->input, length,
+					stream->output, &dest_len);
+	if (res)
+		return -EIO;
+
+	bytes = dest_len;
+	data = squashfs_first_page(output);
+	buff = stream->output;
+	while (data) {
+		if (bytes <= PAGE_CACHE_SIZE) {
+			memcpy(data, buff, bytes);
+			break;
+		}
+		memcpy(data, buff, PAGE_CACHE_SIZE);
+		buff += PAGE_CACHE_SIZE;
+		bytes -= PAGE_CACHE_SIZE;
+		data = squashfs_next_page(output);
+	}
+	squashfs_finish_page(output);
+
+	return dest_len;
+}
+
+const struct squashfs_decompressor squashfs_lz4_comp_ops = {
+	.init = lz4_init,
+	.comp_opts = lz4_comp_opts,
+	.free = lz4_free,
+	.decompress = lz4_uncompress,
+	.id = LZ4_COMPRESSION,
+	.name = "lz4",
+	.supported = 1
+};
diff --git a/fs/squashfs/squashfs_fs.h b/fs/squashfs/squashfs_fs.h
index 4b2beda..506f4ba 100644
--- a/fs/squashfs/squashfs_fs.h
+++ b/fs/squashfs/squashfs_fs.h
@@ -240,6 +240,7 @@ struct meta_index {
 #define LZMA_COMPRESSION	2
 #define LZO_COMPRESSION		3
 #define XZ_COMPRESSION		4
+#define LZ4_COMPRESSION		5
 
 struct squashfs_super_block {
 	__le32			s_magic;
-- 
1.7.10.4
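The tail of lz4_uncompress() copies the linear output buffer into the destination pages one page at a time, with a short final copy. A rough userspace sketch of that loop is given below. The function name copy_to_pages, the plain array of page pointers, and the page_size parameter are all hypothetical; they stand in for the kernel's squashfs_page_actor iteration and PAGE_CACHE_SIZE, which are not reproduced here.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy `bytes` bytes from the linear buffer `src` into up to `npages`
 * destination buffers of `page_size` bytes each. Full pages are copied
 * whole; the last page receives only the remaining bytes. Returns the
 * number of bytes actually copied.
 *
 * Hypothetical helper for illustration; not part of the patch.
 */
static size_t copy_to_pages(const void *src, size_t bytes,
			    void **pages, size_t npages, size_t page_size)
{
	const char *buff = src;
	size_t copied = 0;
	size_t i;

	for (i = 0; i < npages && bytes > 0; i++) {
		/* Last (possibly partial) page: copy what is left. */
		size_t chunk = bytes <= page_size ? bytes : page_size;

		memcpy(pages[i], buff, chunk);
		buff += chunk;
		bytes -= chunk;
		copied += chunk;
	}

	return copied;
}
```

With three 4-byte pages and a 10-byte source, the first two pages are filled completely and the third receives the final 2 bytes. Unlike the kernel loop, this sketch also stops when it runs out of pages, which is a design choice of the sketch and not something the patch does.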
[PATCH 0/2] Squashfs: add LZ4 compression
Hi,

The following is a repost of the Squashfs LZ4 compression patches
updated to the latest -rc kernel. These patches were originally sent
last year, but they did not receive much interest at the time.

These patches are being reposted by popular demand. I have received
many private and public requests to repost these patches and to get
them mainlined. My intention is to submit them in the next kernel
merge window.

If you want LZ4 support in Squashfs, now is a good time to publicly
support the inclusion of these patches.

Phillip Lougher (2):
  Squashfs: add LZ4 compression support
  Squashfs: Add LZ4 compression configuration option

 Documentation/filesystems/squashfs.txt |    8 +-
 fs/squashfs/Kconfig                    |   15
 fs/squashfs/Makefile                   |    1 +
 fs/squashfs/decompressor.c             |    7 ++
 fs/squashfs/decompressor.h             |    4 +
 fs/squashfs/lz4_wrapper.c              |  142
 fs/squashfs/squashfs_fs.h              |    1 +
 7 files changed, 174 insertions(+), 4 deletions(-)
 create mode 100644 fs/squashfs/lz4_wrapper.c

These patches are also available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-lz4.git master

Squashfs-tools 4.3 already supports LZ4 compression.

Phillip
[PATCH 2/2] Squashfs: Add LZ4 compression configuration option
Add the glue code, and also update the documentation.

Signed-off-by: Phillip Lougher <phil...@squashfs.org.uk>
---
 Documentation/filesystems/squashfs.txt |    8 ++++----
 fs/squashfs/Kconfig                    |   15 +++++++++++++++
 fs/squashfs/Makefile                   |    1 +
 fs/squashfs/decompressor.c             |    7 +++++++
 fs/squashfs/decompressor.h             |    4 ++++
 5 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/Documentation/filesystems/squashfs.txt b/Documentation/filesystems/squashfs.txt
index 403c090..e5274f8 100644
--- a/Documentation/filesystems/squashfs.txt
+++ b/Documentation/filesystems/squashfs.txt
@@ -2,10 +2,10 @@ SQUASHFS 4.0 FILESYSTEM
 ===
 
 Squashfs is a compressed read-only filesystem for Linux.
-It uses zlib/lzo/xz compression to compress files, inodes and directories.
-Inodes in the system are very small and all blocks are packed to minimise
-data overhead. Block sizes greater than 4K are supported up to a maximum
-of 1Mbytes (default block size 128K).
+It uses zlib, lz4, lzo, or xz compression to compress files, inodes and
+directories. Inodes in the system are very small and all blocks are packed to
+minimise data overhead. Block sizes greater than 4K are supported up to a
+maximum of 1Mbytes (default block size 128K).
 
 Squashfs is intended for general read-only filesystem use, for archival
 use (i.e. in cases where a .tar.gz file may be used), and in constrained
diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
index b6fa865..ffb093e 100644
--- a/fs/squashfs/Kconfig
+++ b/fs/squashfs/Kconfig
@@ -120,6 +120,21 @@ config SQUASHFS_ZLIB
 
 	  If unsure, say Y.
 
+config SQUASHFS_LZ4
+	bool "Include support for LZ4 compressed file systems"
+	depends on SQUASHFS
+	select LZ4_DECOMPRESS
+	help
+	  Saying Y here includes support for reading Squashfs file systems
+	  compressed with LZ4 compression. LZ4 compression is mainly
+	  aimed at embedded systems with slower CPUs where the overheads
+	  of zlib are too high.
+
+	  LZ4 is not the standard compression used in Squashfs and so most
+	  file systems will be readable without selecting this option.
+
+	  If unsure, say N.
+
 config SQUASHFS_LZO
 	bool "Include support for LZO compressed file systems"
 	depends on SQUASHFS
diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
index 4132520..246a6f3 100644
--- a/fs/squashfs/Makefile
+++ b/fs/squashfs/Makefile
@@ -11,6 +11,7 @@ squashfs-$(CONFIG_SQUASHFS_DECOMP_SINGLE) += decompressor_single.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI) += decompressor_multi.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU) += decompressor_multi_percpu.o
 squashfs-$(CONFIG_SQUASHFS_XATTR) += xattr.o xattr_id.o
+squashfs-$(CONFIG_SQUASHFS_LZ4) += lz4_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_LZO) += lzo_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_XZ) += xz_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_ZLIB) += zlib_wrapper.o
diff --git a/fs/squashfs/decompressor.c b/fs/squashfs/decompressor.c
index ac22fe7..505d6aa 100644
--- a/fs/squashfs/decompressor.c
+++ b/fs/squashfs/decompressor.c
@@ -41,6 +41,12 @@ static const struct squashfs_decompressor squashfs_lzma_unsupported_comp_ops = {
 	NULL, NULL, NULL, NULL, LZMA_COMPRESSION, "lzma", 0
 };
 
+#ifndef CONFIG_SQUASHFS_LZ4
+static const struct squashfs_decompressor squashfs_lz4_comp_ops = {
+	NULL, NULL, NULL, LZ4_COMPRESSION, "lz4", 0
+};
+#endif
+
 #ifndef CONFIG_SQUASHFS_LZO
 static const struct squashfs_decompressor squashfs_lzo_comp_ops = {
 	NULL, NULL, NULL, NULL, LZO_COMPRESSION, "lzo", 0
@@ -65,6 +71,7 @@ static const struct squashfs_decompressor squashfs_unknown_comp_ops = {
 
 static const struct squashfs_decompressor *decompressor[] = {
 	&squashfs_zlib_comp_ops,
+	&squashfs_lz4_comp_ops,
 	&squashfs_lzo_comp_ops,
 	&squashfs_xz_comp_ops,
 	&squashfs_lzma_unsupported_comp_ops,
diff --git a/fs/squashfs/decompressor.h b/fs/squashfs/decompressor.h
index af09853..a25713c 100644
--- a/fs/squashfs/decompressor.h
+++ b/fs/squashfs/decompressor.h
@@ -46,6 +46,10 @@ static inline void *squashfs_comp_opts(struct squashfs_sb_info *msblk,
 extern const struct squashfs_decompressor squashfs_xz_comp_ops;
 #endif
 
+#ifdef CONFIG_SQUASHFS_LZ4
+extern const struct squashfs_decompressor squashfs_lz4_comp_ops;
+#endif
+
 #ifdef CONFIG_SQUASHFS_LZO
 extern const struct squashfs_decompressor squashfs_lzo_comp_ops;
 #endif
-- 
1.7.10.4
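The glue code registers either the real LZ4 ops or an unsupported stub in the decompressor[] table, so a filesystem's compression id always resolves to some entry. The patch does not show how the table is searched, so the sketch below is an assumption: it supposes the table ends with a catch-all entry whose id is 0 (modelled on squashfs_unknown_comp_ops) and that lookup is a linear scan. The names struct decomp, table and lookup_decomp are hypothetical, and the zlib id of 1 is likewise assumed; only the LZ4 id of 5 comes from the patch.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down table entry: callbacks omitted. */
struct decomp {
	int id;
	const char *name;
	int supported;
};

static const struct decomp zlib_ops    = { 1, "zlib", 1 };	/* id assumed */
static const struct decomp lz4_stub    = { 5, "lz4", 0 };	/* built without LZ4 */
static const struct decomp unknown_ops = { 0, "unknown", 0 };	/* terminator */

static const struct decomp *const table[] = {
	&zlib_ops,
	&lz4_stub,
	&unknown_ops,
};

/*
 * Linear scan that stops at the matching id or at the id-0 terminator,
 * so the caller always gets a valid entry and can test ->supported.
 */
static const struct decomp *lookup_decomp(int id)
{
	size_t i;

	for (i = 0; table[i]->id != 0; i++)
		if (table[i]->id == id)
			break;

	return table[i];
}
```

Under these assumptions, looking up id 5 in a build without LZ4 returns the stub, whose name lets the caller report which compressor is missing instead of a bare "unknown" entry.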
[ANN] Squashfs tools 4.3 released
Hi,

I'm pleased to announce the release of Squashfs tools 4.3. This is the first release in over 3 years, and there are substantial improvements to stability, new compression options and compressors, speed optimisations, new options for Mksquashfs/Unsquashfs, and fixes for CVE-2012-4024 and CVE-2012-4025.

The new release can be downloaded here:

http://sourceforge.net/projects/squashfs/files/latest/download?source=files

Changes in Squashfs-tools 4.3
-----------------------------

1. Stability improvements. Better checking of user input for out of range/invalid values. Better handling of corrupted Squashfs filesystems (Mksquashfs append mode, and Unsquashfs). Better handling of buffer overflow/underflow.

2. GZIP compressor now supports compression options, allowing different compression levels to be used.

3. Rewritten LZO compressor with compression options, allowing different LZO algorithms and different compression levels to be used.

4. New LZ4 compressor (note not yet in mainline kernel).

5. Better default memory usage for Mksquashfs. Mksquashfs by default now uses 25% of physical memory.

6. Duplicate checking in Mksquashfs further optimised. With certain "problem filesystems" there is a greater than 2x performance improvement. Filesystems with a lot of duplicates should see at least a 10-20% speed improvement.

7. The -stat option in Unsquashfs now displays the compression options used to generate the original filesystem. Previously -stat only displayed the compression algorithm used.

8. The file being compressed/uncompressed in Mksquashfs/Unsquashfs is now displayed if CTRL-\ (SIGQUIT from keyboard) is typed.

9. The status of the internal queues/caches in Mksquashfs/Unsquashfs is now displayed if CTRL-\ (SIGQUIT from keyboard) is typed twice within one second. Normally only useful for "power users", but it can be used to discover if there are any bottlenecks affecting performance (the bottleneck will normally be the compressors/fragment compressors).

10. Miscellaneous new options for Mksquashfs/Unsquashfs to fine tune behaviour.

11. Fixes for CVE-2012-4024 and CVE-2012-4025.

Compatibility
-------------

Mksquashfs 4.3 generates 4.0 filesystems. These filesystems are fully compatible/interchangeable with filesystems generated by Mksquashfs 4.0 and are mountable on 2.6.29 and later kernels.

Compressors
-----------

New compression options and compressors are now supported. The new options and compressors are:

1. gzip

   -Xcompression-level <compression-level>
        <compression-level> should be 1 .. 9 (default 9)
   -Xwindow-size <window-size>
        <window-size> should be 8 .. 15 (default 15)
   -Xstrategy strategy1,strategy2,...,strategyN
        Compress using strategy1,strategy2,...,strategyN in turn and choose the best compression. Available strategies: default, filtered, huffman_only, run_length_encoded and fixed

2. lzo

   -Xalgorithm <algorithm>
        Where <algorithm> is one of: lzo1x_1, lzo1x_1_11, lzo1x_1_12, lzo1x_1_15, lzo1x_999 (default)
   -Xcompression-level <compression-level>
        <compression-level> should be 1 .. 9 (default 8)
        Only applies to the lzo1x_999 algorithm

3. lz4

   -Xhc
        Compress using LZ4 High Compression

The compression specific options are, obviously, specific to the compressor in question, and you should read the compressor documentation and check their web sites to understand their behaviour.

In general the defaults used by Mksquashfs for each compressor are optimised to give the best performance for each compressor, where what constitutes best depends on the compressor. For gzip/xz best means highest compression (trying multiple filters/strategies can improve compression, but this is extremely expensive computationally, and hence not suitable for the defaults); for LZO/LZ4 best means a tradeoff between compression and (de)-compression overhead (LZO/LZ4 by definition are intended for weaker processors).

New Mksquashfs options
----------------------

1. -mem <size>

   Set the amount of memory used by Mksquashfs to <size> bytes. G/M and K post-fixes are supported.

   By default Mksquashfs uses 25% of the physical memory. Increasing this with the -mem option can increase performance (note it does not have any effect on compression). Reducing it can prevent thrashing if the system is busy and there is not 25% of physical memory free (again, note it does not have any effect on compression).

2. -exit-on-error

   By default Mksquashfs treats certain errors as benign; if these errors occur Mksquashfs prints the error on the console but continues. These errors are typically failure to read a file from the source filesystem. This is deliberate: in many cases users prefer Mksquashfs to flag the error but continue rather than abort what may be hours of
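The K/M/G post-fixes accepted by -mem can be illustrated with a small user-space sketch. This is not Mksquashfs code: parse_mem_size is a hypothetical name, and the sketch assumes the post-fixes are powers of 1024 and upper case only, which the announcement does not state.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: convert a string such as "512M" into bytes.
 * Assumes K/M/G mean 2^10, 2^20 and 2^30 (an assumption, not taken
 * from the Mksquashfs source). Returns -1 on malformed input.
 * Overflow of very large values is not checked in this sketch.
 */
static long long parse_mem_size(const char *s)
{
	char *end;
	long long value = strtoll(s, &end, 10);
	int shift = 0;

	if (end == s || value < 0)
		return -1;		/* no digits, or negative */

	switch (*end) {
	case 'K': shift = 10; end++; break;
	case 'M': shift = 20; end++; break;
	case 'G': shift = 30; end++; break;
	default: break;			/* plain byte count */
	}

	if (*end != '\0')
		return -1;		/* trailing junk such as "1MB" */

	return value * (1LL << shift);
}
```

Under these assumptions, parse_mem_size("512M") yields 536870912 bytes and a plain "100" is taken as 100 bytes.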
Re: [PATCH] xz: make XZ_DEC_BCJ filters non-optional
On 05/03/14 16:24, Lasse Collin wrote:
On 2014-03-05 Phillip Lougher wrote:

(BTW Kyle you should have CC'd me on the patch as a courtesy).

I could have done that too but somehow I didn't, sorry.

np

But speaking as the Squashfs author, the lack of BCJ support for an architecture creates a subtle failure mode in Squashfs. This is because not all blocks in a Squashfs filesystem get compressed with a BCJ filter. At compression time each block is compressed without any BCJ filter, and then with the BCJ filter(s) selected on the command line, and the best compression for *that* block is chosen. What this means is kernels without a particular BCJ filter can still read the Squashfs metadata (mount, ls etc.) and read many of the files; it is only some files that mysteriously fail with a decompression error. As such this will be (and has been) invariably treated as a bug in Squashfs.

There is an easy way to make Squashfs give an error message in the kernel log. xz_dec_run() gives XZ_OPTIONS_ERROR when valid-looking but unsupported input is detected. Currently Squashfs treats all error codes from xz_dec_run() the same way so the reason for the decompression error is silently lost.

Yes, that is deliberate. It's to give a generic easy to understand error message for potentially novice users that may be running Linux from LiveCDs.

When I wrote the original zlib support for Squashfs, I put in a lot of debug information in the zlib error messages, e.g.

	ERROR("zlib_inflate returned unexpected result"
		" 0x%x, srclength %d, avail_in %d,"
		" avail_out %d\n", zlib_err, srclength,
		msblk->stream.avail_in, msblk->stream.avail_out);

But after mainlining Squashfs in 2009, I started to get increasing complaints that the error messages were too technical (full of hex numbers) and confusing to new users - the kind of people who maybe burn a corrupt liveCD and then get screenfuls of these errors full of different numbers coming up on the screen. They would do a websearch and discover that the errors meant "corrupted disk", and then ask why didn't it just say that, and not give all those numbers. Or worse they'd just silently give up and go back to Windows.

So in March 2009 I changed it to the error message

	ERROR("zlib_inflate error, data probably corrupt\n")

With *no* numbers ... and I copied the same approach for xz.

In kernel 3.13 (released earlier this year) I went further and pulled out the error message printing from the compression wrappers, and made it a single generic message, because I realised there was no longer any decompressor specific error handling (just the same message in each wrapper).

	ERROR("%s decompression failed, data probably corrupt\n",
		msblk->decompressor->name);

Putting back separate error messages into each decompressor wrapper, and then putting back different error messages based on the error code return seems like a retrograde step because distros don't like them.

Below is an *untested* fix. I'm not sure about the exact wording of the error message, so feel free to improve it.

diff -Narup linux-3.14-rc5.orig/fs/squashfs/xz_wrapper.c linux-3.14-rc5/fs/squashfs/xz_wrapper.c
--- linux-3.14-rc5.orig/fs/squashfs/xz_wrapper.c	2014-03-03 04:56:16.0 +0200
+++ linux-3.14-rc5/fs/squashfs/xz_wrapper.c	2014-03-05 18:08:58.729643127 +0200
@@ -170,8 +170,13 @@ static int squashfs_xz_uncompress(struct

 	squashfs_finish_page(output);

-	if (xz_err != XZ_STREAM_END || k < b)
+	if (xz_err != XZ_STREAM_END || k < b) {
+		if (xz_err == XZ_OPTIONS_ERROR)
+			ERROR("Unsupported XZ-compressed data; check the XZ "
+				"options in the kernel config\n");
+
 		goto out;
+	}

 	return total + stream->buf.out_pos;

Moreover, without expert knowledge of Squashfs, and the config options, most people will not have a clue how to fix the issue.

This is why I prefer the first option, which is to reinstate the enabling of all filters by default, and then to allow people to remove the filters they don't want.

I will submit the first option. In the other email Florian Fainelli seemed to be OK with that too.

BTW there is a potential additional fix for Squashfs that will make its handling of (lack of) BCJ filters more intelligent at mount time, but this of course only addresses Squashfs, and it relies on an additional call into XZ being added.

The BCJ filters specified at filesystem creation are stored in the compression options part of the superblock, and are known at mount time. Squashfs should check that these filters are supported by the kernel and refuse to mount it otherwise.

This has not been done because AFAIK there is no way to query XZ to determine which BCJ filters are supported (beyond passing it a test stream
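The per-block choice described in this thread (compress without a BCJ filter, then with each selected filter, keep the smallest) is what makes the failure intermittent. A minimal user-space sketch follows; the function names, the slot numbering and the bitmask of supported filters are all hypothetical and are not Squashfs or XZ interfaces.

```c
#include <stddef.h>

/*
 * compressed_size[0] is the size with no BCJ filter; slots 1..n-1 are
 * hypothetical filter slots chosen at filesystem creation time.
 * Returns the slot giving the smallest output, preferring "no filter"
 * on a tie.
 */
static int pick_best_filter(const size_t *compressed_size, int n)
{
	int best = 0;

	for (int i = 1; i < n; i++)
		if (compressed_size[i] < compressed_size[best])
			best = i;
	return best;
}

/*
 * A reader can decode a block if the block used no filter, or if the
 * bit for the chosen filter is set in the reader's (hypothetical)
 * supported-filter mask.
 */
static int block_readable(int chosen, unsigned int supported_mask)
{
	return chosen == 0 || (supported_mask & (1u << chosen)) != 0;
}
```

In this model a kernel built without filter slot 1 still reads every block for which slot 0 won, and fails only on blocks where the filter gave the smaller result, which matches the "only some files fail" symptom.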
Re: [PATCH] Squashfs: add asynchronous read support
On 16/12/13 05:30, Chanho Min wrote:

This patch removes synchronous wait for the up-to-date of buffer in the file system level. Instead all operations after submit_bh are moved into the End-of-IO handler and its associated workqueue. It decompresses/copies data into pages and unlock them asynchronously.

This patch enhances the performance of Squashfs in most cases. Especially, large file reading is improved significantly.

Hi,

The following is the summarised results of a set of comprehensive tests of the asynchronous patch against the current synchronous Squashfs readpage implementation.

The following tables should be fairly self-explanatory, but the testing methodology was:

Generate a series of Squashfs filesystems, with block size 1024K, 512K, 256K, 128K and 64K.

Then for each filesystem

    Run "dd if=/mnt/file of=/dev/null bs=X"
    Where X is 4K, 8K, 16K, 32K, 64K, 128K, 256K, 512K, and 1024K

For each dd, run it against six different Squashfs modules, configured with the following different options:

1. SQUASHFS_READ_DATA_ASYNC selected, SQUASHFS_DECOMP_SINGLE selected
   i.e. Async patch and single threaded decompression
   == Asyn Single in following tables

2. SQUASHFS_READ_DATA_ASYNC *not* selected, SQUASHFS_DECOMP_SINGLE selected
   i.e. No Async patch and single threaded decompression
   == No Asyn Single in following tables

3. SQUASHFS_READ_DATA_ASYNC selected, SQUASHFS_DECOMP_MULTI selected
   i.e. Async patch and multi-threaded decompression
   == Asyn Multi in following tables

4. SQUASHFS_READ_DATA_ASYNC *not* selected, SQUASHFS_DECOMP_MULTI selected
   i.e. No Async patch and multi-threaded decompression
   == No Asyn Multi in following tables

5. SQUASHFS_READ_DATA_ASYNC selected, SQUASHFS_DECOMP_MULTI_PERCPU selected
   i.e. Async patch and percpu multi-threaded decompression
   == Asyn Percpu in following tables

6. SQUASHFS_READ_DATA_ASYNC *not* selected, SQUASHFS_DECOMP_MULTI_PERCPU selected
   i.e. No Async patch and percpu multi-threaded decompression
   == No Asyn Percpu in following tables

The figures in the following tables are the MB/s reported by dd.

The tests were performed on a KVM guest with 4 cores and 4 GB of memory, running on a core i5 based host.

The Squashfs filesystem was on "/dev/hdb".

/mnt/file is a 3 GB file, average compression 22% (635 MB)

Squashfs: gzip filesystem 1024K blocks

         Asyn     No Asyn  Asyn     No Asyn  Asyn     No Asyn
         Single   Single   Multi    Multi    Percpu   Percpu
--------------------------------------------------------------
4K:      89.4     97.5     89.9     98.1     90.6     99.1
8K:      89.9     99.0     89.7     99.4     90.3     99.4
16K:     90.6     99.8     90.8     100      90.2     97.0
32K:     90.3     98.7     90.3     98.0     89.9     101
64K:     90.3     97.6     90.2     97.1     90.1     99.7
128K:    90.4     98.6     90.2     97.6     90.7     98.5
256K:    89.7     96.9     89.8     99.2     90.2     101
512K:    89.7     98.9     90.8     98.1     89.4     97.8
1024K:   89.3     98.0     89.6     98.6     88.7     96.4

Squashfs: gzip filesystem 512K blocks

         Asyn     No Asyn  Asyn     No Asyn  Asyn     No Asyn
         Single   Single   Multi    Multi    Percpu   Percpu
--------------------------------------------------------------
4K:      68.5     94.9     67.6     99.0     68.9     97.0
8K:      69.3     101      68.9     94.3     69.0     97.2
16K:     68.9     98.6     69.4     98.9     68.8     98.0
32K:     68.6     96.5     69.4     98.9     69.4     108
64K:     68.7     92.9     69.7     101      68.8     98.2
128K:    67.4     102      68.7     90.3     69.4     100
256K:    68.7     95.1     68.2     99.7     68.5     97.7
512K:    69.9     114      82.0     104      74.2     94.4
1024K:   71.6     105      79.2     105      69.1     98.0

Squashfs: gzip filesystem 256K blocks

         Asyn     No Asyn  Asyn     No Asyn  Asyn     No Asyn
         Single   Single   Multi    Multi    Percpu   Percpu
--------------------------------------------------------------
4K:      53.6     92.2     54.6     87.5     53.7     82.1
8K:      53.5     87.3     53.5     85.0     53.5     85.7
16K:     53.1     89.0     53.8     95.7     53.5     91.1
32K:     54.0     95.9     53.8     98.7     53.9     85.3
64K:     53.7     86.9     53.4     103      53.4     86.3
128K:    53.2     94.4     53.6     100      53.7     97.9
256K:    55.5     101      53.0     94.1     53.3     87.0
512K:    53.1     93.0     53.4     87.7     53.2     89.8
1024K:   53.2     91.4     52.7     91.3     53.0     95.4

A couple of points about the above can be noticed:

1. With a Squashfs block size of 256K and greater, Squashfs readpage() does its own readahead. This means the asynchronous readpage is never called multiply (to run in parallel), because there is never any more work to do after the first readpage().

   The above results therefore reflect the basic performance of the asynchronous readpage implementation versus the synchronous readpage implementation.

2. It can be seen in all cases the asynchronous readpage implementation performs worse than the synchronous readpage implementation.
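To put a number on "performs worse", the relative shortfall of the asynchronous figure against the synchronous one can be computed from any pair of columns in the tables above. The helper below is only an illustrative calculation; slowdown_percent is a hypothetical name and is not part of the test setup described in this mail.

```c
/*
 * Relative throughput shortfall of the asynchronous run, in percent
 * of the synchronous figure. Both arguments are MB/s as reported by
 * dd in the tables above.
 */
static double slowdown_percent(double async_mbs, double sync_mbs)
{
	return (sync_mbs - async_mbs) / sync_mbs * 100.0;
}
```

For the "Single" columns at bs=4K this gives roughly 8.3% with 1024K blocks (89.4 against 97.5) and roughly 41.9% with 256K blocks (53.6 against 92.2), so the gap widens as the Squashfs block size shrinks.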
[GIT PULL] Squashfs bug fixes for 3.13
Hi Linus,

Please consider pulling the following Squashfs bug fix.

Thanks

Phillip

The following changes since commit ed4f381ec15e5f11724cdbc68cffd2c22d1eaebd:

  Squashfs: Check stream is not NULL in decompressor_multi.c (2013-11-20 03:59:20 +)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next.git tags/squashfs-fixes

for you to fetch changes up to 6d565409503f4e1f74ac30de14e8c91a2b826cd8:

  Squashfs: fix failure to unlock pages on decompress error (2013-11-24 01:02:50 +)

----------------------------------------------------------------
Just a single bug fix to the new "directly decompress into the page cache" code.

----------------------------------------------------------------
Phillip Lougher (1):
      Squashfs: fix failure to unlock pages on decompress error

 fs/squashfs/file_direct.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH] Squashfs: fix failure to unlock pages on decompress error
Direct decompression into the page cache. If we fall back to using an intermediate buffer (because we cannot grab all the page cache pages) and we get a decompress fail, we forgot to release the pages.

Reported-by: Roman Peniaev <r.peni...@gmail.com>
Signed-off-by: Phillip Lougher <phil...@squashfs.org.uk>
---
 fs/squashfs/file_direct.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
index 2943b2b..62a0de6 100644
--- a/fs/squashfs/file_direct.c
+++ b/fs/squashfs/file_direct.c
@@ -84,6 +84,9 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize)
 		 */
 		res = squashfs_read_cache(target_page, block, bsize, pages, page);
+		if (res < 0)
+			goto mark_errored;
+
 		goto out;
 	}

@@ -119,7 +122,7 @@ mark_errored:
 	 * dealt with by the caller
 	 */
 	for (i = 0; i < pages; i++) {
-		if (page[i] == target_page)
+		if (page[i] == NULL || page[i] == target_page)
 			continue;
 		flush_dcache_page(page[i]);
 		SetPageError(page[i]);
--
1.7.10.4
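The effect of the second hunk can be modelled outside the kernel. The sketch below uses a mock page structure and a hypothetical release_on_error() helper; it assumes the error path both flags and unlocks each page it owns, which goes slightly beyond the lines visible in the diff above.

```c
#include <stddef.h>

/* Mock page: stands in for the kernel's struct page, illustration only. */
struct mock_page {
	int locked;
	int error;
};

/*
 * Flag and unlock every page that was grabbed, except the target page,
 * which the caller deals with. NULL slots are pages that could not be
 * grabbed (the fallback case) and must be skipped rather than
 * dereferenced. Returns the number of pages released.
 */
static int release_on_error(struct mock_page **page, int pages,
			    struct mock_page *target)
{
	int released = 0;

	for (int i = 0; i < pages; i++) {
		if (page[i] == NULL || page[i] == target)
			continue;
		page[i]->error = 1;
		page[i]->locked = 0;
		released++;
	}
	return released;
}
```

Without the NULL check, the fallback path would dereference the empty slots as soon as it was routed to this loop, so the two hunks of the patch depend on each other.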
Re: [PATCH 6/7] Squashfs: Directly decompress into the page cache for file data
On 22/11/13 07:40, Roman Peniaev wrote:

Hello, Phillip.

one remark below:

+static int squashfs_read_cache(struct page *target_page, u64 block, int bsize,
+	int pages, struct page **page)
+{
+	struct inode *i = target_page->mapping->host;
+	struct squashfs_cache_entry *buffer = squashfs_get_datablock(i->i_sb,
+						 block, bsize);
+	int bytes = buffer->length, res = buffer->error, n, offset = 0;
+	void *pageaddr;
+
+	if (res) {
+		ERROR("Unable to read page, block %llx, size %x\n", block,
+			bsize);
+		goto out;

have you forgotten to unlock the pages on error path?
in case of error squashfs_readpage will unlock only target page.

Yup, a fix for -rc1 is on its way. This error path failed to trigger in my stress tests because it is pretty hard to hit, because it relies on decompress failure at the same time as we've raced against another process also grabbing the pages.

Phillip

+	}
+
+	for (n = 0; n < pages && bytes > 0; n++,
+			bytes -= PAGE_CACHE_SIZE, offset += PAGE_CACHE_SIZE) {
+		int avail = min_t(int, bytes, PAGE_CACHE_SIZE);
Re: [RFC PATCH] double free in decompressor.c
On 22/11/13 21:50, Geyslan Gregório Bem wrote:

Coverity caught double free possibility (CID 1130962).

I can patch this, but I have to know if it is correct to free comp_opts in the function squashfs_decompressor_create() or if it had to be done in the caller. My bet is the caller.

128 void *squashfs_decompressor_setup(struct super_block *sb, unsigned short flags)
129 {
130         struct squashfs_sb_info *msblk = sb->s_fs_info;
131         void *stream, *comp_opts = get_comp_opts(sb, flags);
132
    1. Condition "IS_ERR(comp_opts)", taking false branch
133         if (IS_ERR(comp_opts))
134                 return comp_opts;
135
    2. freed_arg: "squashfs_decompressor_create(struct squashfs_sb_info *, void *)" frees "comp_opts".
136         stream = squashfs_decompressor_create(msblk, comp_opts);
    3. Condition "IS_ERR(stream)", taking true branch
137         if (IS_ERR(stream))

FALSE positive.

squashfs_decompressor_create() frees comp_opts only on success. If IS_ERR(stream) is true, then comp_opts has not been freed by squashfs_decompressor_create().

Phillip

    CID 1130962 (#1 of 1): Double free (USE_AFTER_FREE)
    4. double_free: Calling "kfree(void const *)" frees pointer "comp_opts" which has already been freed.
138                 kfree(comp_opts);
139
140         return stream;
141 }
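The ownership rule stated in the reply (the create function frees the options only on success, so the caller frees them only on failure) can be shown with a user-space model. demo_create() and demo_setup() are hypothetical names, malloc()/free() stand in for the kernel allocators, NULL stands in for an IS_ERR() pointer, and the counter exists only so the behaviour can be checked.

```c
#include <stdlib.h>

/*
 * On success: takes ownership of opts, frees it, returns a new stream.
 * On failure: returns NULL and leaves opts untouched for the caller.
 * *frees counts how many times opts was freed.
 */
static void *demo_create(void *opts, int fail, int *frees)
{
	void *stream;

	if (fail)
		return NULL;

	stream = malloc(1);
	if (stream == NULL)
		return NULL;		/* still a failure: opts not freed */

	free(opts);
	(*frees)++;
	return stream;
}

/* Caller: frees opts only when demo_create() reports failure. */
static void *demo_setup(int fail, int *frees)
{
	void *opts = malloc(16);
	void *stream;

	if (opts == NULL)
		return NULL;

	stream = demo_create(opts, fail, frees);
	if (stream == NULL) {
		free(opts);
		(*frees)++;
	}
	return stream;
}
```

On either path the options buffer is freed exactly once, which is why the reported double free does not occur as long as the callee keeps to this convention.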
[GIT PULL] Squashfs updates for 3.13
Hi Linus,

Please pull the following Squashfs updates ...

Thanks

Phillip

The following changes since commit 959f58544b7f20c92d5eb43d1232c96c15c01bfb:

  Linux 3.12-rc7 (2013-10-27 16:12:03 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next.git tags/squashfs-updates

for you to fetch changes up to ed4f381ec15e5f11724cdbc68cffd2c22d1eaebd:

  Squashfs: Check stream is not NULL in decompressor_multi.c (2013-11-20 03:59:20 +)

These patches optionally improve the multi-threading performance of Squashfs by adding parallel decompression, and direct decompression into the page cache, eliminating an intermediate buffer (removing memcpy overhead and lock contention).

Minchan Kim (1):
      squashfs: Enhance parallel I/O

Phillip Lougher (6):
      Squashfs: Refactor decompressor interface and code
      Squashfs: add multi-threaded decompression using percpu variable
      Squashfs: Generalise paging handling in the decompressors
      Squashfs: Restructure squashfs_readpage()
      Squashfs: Directly decompress into the page cache for file data
      Squashfs: Check stream is not NULL in decompressor_multi.c

 fs/squashfs/Kconfig                     |  72
 fs/squashfs/Makefile                    |   5 +
 fs/squashfs/block.c                     |  36 +++---
 fs/squashfs/cache.c                     |  28 -
 fs/squashfs/decompressor.c              |  59 +++---
 fs/squashfs/decompressor.h              |  24 ++--
 fs/squashfs/decompressor_multi.c        | 198
 fs/squashfs/decompressor_multi_percpu.c |  97
 fs/squashfs/decompressor_single.c       |  85 ++
 fs/squashfs/file.c                      | 142 ---
 fs/squashfs/file_cache.c                |  38 ++
 fs/squashfs/file_direct.c               | 173
 fs/squashfs/lzo_wrapper.c               |  47
 fs/squashfs/page_actor.c                | 100
 fs/squashfs/page_actor.h                |  81 +
 fs/squashfs/squashfs.h                  |  20 +++-
 fs/squashfs/squashfs_fs_sb.h            |   4 +-
 fs/squashfs/super.c                     |  10 +-
 fs/squashfs/xz_wrapper.c                | 105 +
 fs/squashfs/zlib_wrapper.c              |  64 ---
 20 files changed, 1145 insertions(+), 243 deletions(-)
 create mode 100644 fs/squashfs/decompressor_multi.c
 create mode 100644 fs/squashfs/decompressor_multi_percpu.c
 create mode 100644 fs/squashfs/decompressor_single.c
 create mode 100644 fs/squashfs/file_cache.c
 create mode 100644 fs/squashfs/file_direct.c
 create mode 100644 fs/squashfs/page_actor.c
 create mode 100644 fs/squashfs/page_actor.h
Re: [PATCH 3/7] Squashfs: add multi-threaded decompression using percpu variables (V2)
On 20/11/13 08:33, Minchan Kim wrote: On Wed, Nov 20, 2013 at 01:48:06AM +, Phillip Lougher wrote: Add a multi-threaded decompression implementation which uses percpu variables. Using percpu variables has advantages and disadvantages over implementations which do not use percpu variables. Advantages: * the nature of percpu variables ensures decompression is load-balanced across the multiple cores. * simplicity. Disadvantages: it limits decompression to one thread per core. V2: * squashfs_decompressor_create: improve error handling path, re freeing of decompressors and comp_opts * decompressor_multi_percpu.c: include percpu.h header * Kconfig: indentation Signed-off-by: Phillip Lougher --- fs/squashfs/Kconfig | 57 ++- fs/squashfs/Makefile| 10 +--- fs/squashfs/decompressor_multi_percpu.c | 98 + 3 files changed, 145 insertions(+), 20 deletions(-) create mode 100644 fs/squashfs/decompressor_multi_percpu.c diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig index 1c6d340..159bd66 100644 --- a/fs/squashfs/Kconfig +++ b/fs/squashfs/Kconfig @@ -25,6 +25,50 @@ config SQUASHFS If unsure, say N. +choice + prompt "Decompressor parallelisation options" Nitpick: How about adding default explicitly? default SQUASHFS_DECOMP_SINGLE I initially did that :-) only to get an error returned from kbuild that defaults on choices were not supported. PITA. + depends on SQUASHFS + help + Squashfs now supports three parallelisation options for + decompression. Each one exhibits various trade-offs between + decompression performance and CPU and memory usage. + + If in doubt, select "Single threaded compression" + +config SQUASHFS_DECOMP_SINGLE + bool "Single threaded compression" + help + Traditionally Squashfs has used single-threaded decompression. + Only one block (data or metadata) can be decompressed at any + one time. This limits CPU and memory usage to a minimum. 
+ +config SQUASHFS_DECOMP_MULTI + bool "Use multiple decompressors for parallel I/O" + help + By default Squashfs uses a single decompressor but it gives + poor performance on parallel I/O workloads when using multiple CPU + machines due to waiting on decompressor availability. + + If you have a parallel I/O workload and your system has enough memory, + using this option may improve overall I/O performance. + + This decompressor implementation uses up to two parallel + decompressors per core. It dynamically allocates decompressors + on a demand basis. + +config SQUASHFS_DECOMP_MULTI_PERCPU + bool "Use percpu multiple decompressors for parallel I/O" + help + By default Squashfs uses a single decompressor but it gives + poor performance on parallel I/O workloads when using multiple CPU + machines due to waiting on decompressor availability. + + This decompressor implementation uses a maximum of one + decompressor per core. It uses percpu variables to ensure Minor: ^ unnecessary white space. Two spaces after the full stop, starting a new sentence? As a kid back in the 1970s and early 80s that's what I was taught. Maybe it's become old fashioned and I've not noticed. http://www.rednovalabs.com/generation-gap-one-space-or-two-after-a-period/ + decompression is load-balanced across the cores. Actually, I am not sure it's good idea to mention percpu in description. Normal people wouldn't know that and I think what they can want to know is what's benefit compared to SQUASHFS_DECOMP_MULTI. The people who have asked me for the percpu implementation will know :-) Some people want to have the percpu implementation. Some don't. Some people who want the percpu implementation *only* want the percpu implementation. Some people who don't want the percpu implementation *only* want the non-percpu implementation. People are different, and try as I might, that's not going to go away anytime soon. I have spent more time arguing over this 90 odd lines of code than anything else. 
Should I call the percpu implementation the new "bike shed"? Because that's how it seems to me at the moment. Life's too short to get hung up about this. I'm adding both implementations because that keeps more people happy than the alternative. Maybe I should just drop both implementations for this merge window, and invite everyone else to fight it out over which single implementation they want, and I'll take a ring side seat while it happens. How about this? This decompressor implementation uses a maximum of one decompressor per core and the decompressor is allocated statically so
[PATCH 7/7] Squashfs: Check stream is not NULL in decompressor_multi.c
Fix static checker complaint that stream is not checked in squashfs_decompressor_destroy().

Reported-by: Dan Carpenter
Signed-off-by: Phillip Lougher
Reviewed-by: Minchan Kim
---
 fs/squashfs/decompressor_multi.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/squashfs/decompressor_multi.c b/fs/squashfs/decompressor_multi.c
index ae54675..d6008a6 100644
--- a/fs/squashfs/decompressor_multi.c
+++ b/fs/squashfs/decompressor_multi.c
@@ -119,11 +119,10 @@ void squashfs_decompressor_destroy(struct squashfs_sb_info *msblk)
 			kfree(decomp_strm);
 			stream->avail_decomp--;
 		}
+		WARN_ON(stream->avail_decomp);
+		kfree(stream->comp_opts);
+		kfree(stream);
 	}
-
-	WARN_ON(stream->avail_decomp);
-	kfree(stream->comp_opts);
-	kfree(stream);
 }
-- 
1.8.3.2
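The effect of moving the three statements inside the NULL check can be sketched in userspace C. This is a rough analogue, not the kernel code: struct stream_pool, pool_create() and pool_destroy() are hypothetical names, a singly linked list replaces the kernel list_head, and assert() stands in for WARN_ON().

```c
#include <assert.h>
#include <stdlib.h>

struct decomp_stream {
	struct decomp_stream *next;
};

struct stream_pool {
	void *comp_opts;
	struct decomp_stream *head;
	int avail_decomp;
};

/* Build a pool holding up to n decompressor entries (fewer if malloc fails). */
static struct stream_pool *pool_create(int n)
{
	struct stream_pool *pool = calloc(1, sizeof(*pool));
	int i;

	if (pool == NULL)
		return NULL;
	pool->comp_opts = malloc(8);
	for (i = 0; i < n; i++) {
		struct decomp_stream *d = malloc(sizeof(*d));

		if (d == NULL)
			break;
		d->next = pool->head;
		pool->head = d;
		pool->avail_decomp++;
	}
	return pool;
}

/* Returns the number of entries released; a NULL pool is a no-op. */
static int pool_destroy(struct stream_pool *pool)
{
	int released = 0;

	if (pool) {
		while (pool->head) {
			struct decomp_stream *d = pool->head;

			pool->head = d->next;
			free(d);
			pool->avail_decomp--;
			released++;
		}
		/* These dereference pool, so they belong inside the NULL check. */
		assert(pool->avail_decomp == 0);
		free(pool->comp_opts);
		free(pool);
	}
	return released;
}
```

With the final three statements inside the if block, calling the destroy function on a pool that was never created is safe, which is the case the static checker flagged.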
[PATCH 4/7] Squashfs: Generalise paging handling in the decompressors
Further generalise the decompressors by adding a page handler abstraction. This adds helpers to allow the decompressors to access and process the output buffers in an implementation independent manner.

This allows different types of output buffer to be passed to the decompressors, with the implementation specific aspects handled at decompression time, but without the knowledge being held in the decompressor wrapper code.

This will allow the decompressors to handle Squashfs cache buffers, and page cache pages.

This patch adds the abstraction and an implementation for the caches.

Signed-off-by: Phillip Lougher
Reviewed-by: Minchan Kim
---
 fs/squashfs/block.c                     | 27 ++
 fs/squashfs/cache.c                     | 28 +++
 fs/squashfs/decompressor.c              | 14 --
 fs/squashfs/decompressor.h              |  5 ++--
 fs/squashfs/decompressor_multi.c        |  7 ++---
 fs/squashfs/decompressor_multi_percpu.c |  9 +++---
 fs/squashfs/decompressor_single.c       |  9 +++---
 fs/squashfs/lzo_wrapper.c               | 27 --
 fs/squashfs/page_actor.h                | 49 +
 fs/squashfs/squashfs.h                  |  8 +++---
 fs/squashfs/squashfs_fs_sb.h            |  1 +
 fs/squashfs/xz_wrapper.c                | 22 +--
 fs/squashfs/zlib_wrapper.c              | 24 ++--
 13 files changed, 163 insertions(+), 67 deletions(-)
 create mode 100644 fs/squashfs/page_actor.h

diff --git a/fs/squashfs/block.c b/fs/squashfs/block.c
index 4dd4025..0cea9b9 100644
--- a/fs/squashfs/block.c
+++ b/fs/squashfs/block.c
@@ -36,6 +36,7 @@
 #include "squashfs_fs_sb.h"
 #include "squashfs.h"
 #include "decompressor.h"
+#include "page_actor.h"
 
 /*
  * Read the metadata block length, this is stored in the first two
@@ -86,16 +87,16 @@ static struct buffer_head *get_block_length(struct super_block *sb,
  * generated a larger block - this does occasionally happen with compression
  * algorithms).
  */
-int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
-			int length, u64 *next_index, int srclength, int pages)
+int squashfs_read_data(struct super_block *sb, u64 index, int length,
+		u64 *next_index, struct squashfs_page_actor *output)
 {
 	struct squashfs_sb_info *msblk = sb->s_fs_info;
 	struct buffer_head **bh;
 	int offset = index & ((1 << msblk->devblksize_log2) - 1);
 	u64 cur_index = index >> msblk->devblksize_log2;
-	int bytes, compressed, b = 0, k = 0, page = 0, avail, i;
+	int bytes, compressed, b = 0, k = 0, avail, i;
 
-	bh = kcalloc(((srclength + msblk->devblksize - 1)
+	bh = kcalloc(((output->length + msblk->devblksize - 1)
 		>> msblk->devblksize_log2) + 1, sizeof(*bh), GFP_KERNEL);
 	if (bh == NULL)
 		return -ENOMEM;
@@ -111,9 +112,9 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 			*next_index = index + length;
 
 		TRACE("Block @ 0x%llx, %scompressed size %d, src size %d\n",
-			index, compressed ? "" : "un", length, srclength);
+			index, compressed ? "" : "un", length, output->length);
 
-		if (length < 0 || length > srclength ||
+		if (length < 0 || length > output->length ||
 				(index + length) > msblk->bytes_used)
 			goto read_failure;
 
@@ -145,7 +146,7 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 		TRACE("Block @ 0x%llx, %scompressed size %d\n", index,
 				compressed ? "" : "un", length);
 
-		if (length < 0 || length > srclength ||
+		if (length < 0 || length > output->length ||
 				(index + length) > msblk->bytes_used)
 			goto block_release;
 
@@ -165,8 +166,8 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 	}
 
 	if (compressed) {
-		length = squashfs_decompress(msblk, buffer, bh, b, offset,
-			length, srclength, pages);
+		length = squashfs_decompress(msblk, bh, b, offset, length,
+			output);
 		if (length < 0)
 			goto read_failure;
 	} else {
@@ -174,19 +175,20 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 		 * Block is uncompressed.
 		 */
 		int in, pg_offset = 0;
+		void *data = squashfs_first_page(output);
 
 		for (bytes = length; k < b;
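The page handler idea can be sketched as a small iterator over output buffers. The following is a userspace illustration only: the patch text above shows struct squashfs_page_actor, its length field and squashfs_first_page(), but the field layout, the actor_next() helper, SKETCH_PAGE_SIZE and copy_out() below are assumptions made for the example, not the contents of page_actor.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8	/* tiny "page" so the example is easy to follow */

/* Assumed layout: an array of output buffers plus an iteration cursor. */
struct page_actor {
	void **page;	/* output buffers, each SKETCH_PAGE_SIZE bytes */
	int pages;	/* number of buffers */
	int length;	/* total capacity in bytes */
	int next_page;	/* index of the next buffer to hand out */
};

static void actor_init(struct page_actor *a, void **page, int pages)
{
	assert(pages > 0);
	a->page = page;
	a->pages = pages;
	a->length = pages * SKETCH_PAGE_SIZE;
	a->next_page = 0;
}

/* Restart the iteration and return the first output buffer. */
static void *actor_first(struct page_actor *a)
{
	a->next_page = 1;
	return a->page[0];
}

/* Return the next output buffer, or NULL when all have been handed out. */
static void *actor_next(struct page_actor *a)
{
	return a->next_page == a->pages ? NULL : a->page[a->next_page++];
}

/*
 * Copy len bytes from a flat source into the actor's buffers. The producer
 * only sees actor_first()/actor_next(), never the buffer array itself.
 * Returns the number of bytes copied, or -1 if len does not fit.
 */
static int copy_out(struct page_actor *out, const char *src, int len)
{
	int copied = 0;
	void *data;

	if (len < 0 || len > out->length)
		return -1;
	for (data = actor_first(out); data != NULL && copied < len;
			data = actor_next(out)) {
		int chunk = len - copied < SKETCH_PAGE_SIZE ?
				len - copied : SKETCH_PAGE_SIZE;

		memcpy(data, src + copied, chunk);
		copied += chunk;
	}
	return copied;
}
```

Because the producer never indexes the buffer array directly, a different actor implementation (for example one that maps page cache pages on demand) could be substituted without changing copy_out().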
[PATCH 6/7] Squashfs: Directly decompress into the page cache for file data
This introduces an implementation of squashfs_readpage_block() that directly decompresses into the page cache. This uses the previously added page handler abstraction to push down the necessary kmap_atomic/kunmap_atomic operations on the page cache buffers into the decompressors. This enables direct copying into the page cache without using the slow kmap/kunmap calls. The code detects when multiple threads are racing in squashfs_readpage() to decompress the same block, and avoids this regression by falling back to using an intermediate buffer. This patch enhances the performance of Squashfs significantly when multiple processes are accessing the filesystem simultaneously because it not only reduces memcopying, but it more importantly eliminates the lock contention on the intermediate buffer. Using single-thread decompression. dd if=file1 of=/dev/null bs=4096 & dd if=file2 of=/dev/null bs=4096 & dd if=file3 of=/dev/null bs=4096 & dd if=file4 of=/dev/null bs=4096 Before: 629145600 bytes (629 MB) copied, 45.8046 s, 13.7 MB/s After: 629145600 bytes (629 MB) copied, 9.29414 s, 67.7 MB/s Signed-off-by: Phillip Lougher Reviewed-by: Minchan Kim --- fs/squashfs/Kconfig | 28 fs/squashfs/Makefile | 4 +- fs/squashfs/file_direct.c | 173 ++ fs/squashfs/page_actor.c | 100 +++ fs/squashfs/page_actor.h | 32 + 5 files changed, 336 insertions(+), 1 deletion(-) create mode 100644 fs/squashfs/file_direct.c create mode 100644 fs/squashfs/page_actor.c diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig index 159bd66..b6fa865 100644 --- a/fs/squashfs/Kconfig +++ b/fs/squashfs/Kconfig @@ -26,6 +26,34 @@ config SQUASHFS If unsure, say N. choice + prompt "File decompression options" + depends on SQUASHFS + help + Squashfs now supports two options for decompressing file + data. Traditionally Squashfs has decompressed into an + intermediate buffer and then memcopied it into the page cache. + Squashfs now supports the ability to decompress directly into + the page cache. 
+ + If unsure, select "Decompress file data into an intermediate buffer" + +config SQUASHFS_FILE_CACHE + bool "Decompress file data into an intermediate buffer" + help + Decompress file data into an intermediate buffer and then + memcopy it into the page cache. + +config SQUASHFS_FILE_DIRECT + bool "Decompress files directly into the page cache" + help + Directly decompress file data into the page cache. + Doing so can significantly improve performance because + it eliminates a memcpy and it also removes the lock contention + on the single buffer. + +endchoice + +choice prompt "Decompressor parallelisation options" depends on SQUASHFS help diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile index e01ba11..4132520 100644 --- a/fs/squashfs/Makefile +++ b/fs/squashfs/Makefile @@ -4,7 +4,9 @@ obj-$(CONFIG_SQUASHFS) += squashfs.o squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o -squashfs-y += namei.o super.o symlink.o decompressor.o file_cache.o +squashfs-y += namei.o super.o symlink.o decompressor.o +squashfs-$(CONFIG_SQUASHFS_FILE_CACHE) += file_cache.o +squashfs-$(CONFIG_SQUASHFS_FILE_DIRECT) += file_direct.o page_actor.o squashfs-$(CONFIG_SQUASHFS_DECOMP_SINGLE) += decompressor_single.o squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI) += decompressor_multi.o squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU) += decompressor_multi_percpu.o diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c new file mode 100644 index 000..2943b2b --- /dev/null +++ b/fs/squashfs/file_direct.c @@ -0,0 +1,173 @@ +/* + * Copyright (c) 2013 + * Phillip Lougher + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. 
+ */ + +#include +#include +#include +#include +#include +#include +#include + +#include "squashfs_fs.h" +#include "squashfs_fs_sb.h" +#include "squashfs_fs_i.h" +#include "squashfs.h" +#include "page_actor.h" + +static int squashfs_read_cache(struct page *target_page, u64 block, int bsize, + int pages, struct page **page); + +/* Read separately compressed datablock directly into page cache */ +int squashfs_readpage_block(struct page *target_page, u64 block, int bsize) + +{ + struct inode *inode = target_page->mapping->host; + struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info; + + int file_end = (i_size_read(inode) - 1) >> PAGE_CACHE_SHIFT; + int mask = (1 << (msblk->block_log - PAGE_CACHE_SHIFT)) - 1; + int start_index = target_page->index
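The index arithmetic at the top of squashfs_readpage_block() can be worked through with concrete numbers. The sketch below is a userspace illustration: the mask and file_end expressions are taken from the patch text, while start_index & ~mask and start_index | mask follow the older squashfs_readpage() quoted in patch 5/7 and are assumed to apply here as well. struct page_range, block_page_range() and last_page_index() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

struct page_range {
	int start_index;	/* first page index covered by the datablock */
	int end_index;		/* last page index covered by the datablock */
};

/*
 * block_log is log2 of the Squashfs block size, page_shift is log2 of the
 * page size. A block spans 2^(block_log - page_shift) consecutive pages.
 */
static struct page_range block_page_range(int page_index, int block_log,
		int page_shift)
{
	struct page_range r;
	int mask;

	assert(block_log >= page_shift);
	mask = (1 << (block_log - page_shift)) - 1;
	r.start_index = page_index & ~mask;	/* round down to block start */
	r.end_index = r.start_index | mask;	/* last page of the same block */
	return r;
}

/* Index of the last page of a non-empty file of file_size bytes. */
static int last_page_index(int64_t file_size, int page_shift)
{
	assert(file_size > 0);
	return (int)((file_size - 1) >> page_shift);
}
```

For 128 KiB blocks and 4 KiB pages the mask is 31, so page 37 belongs to the block covering pages 32 to 63. The 629145600-byte files from the benchmark end at page index 153599.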
[PATCH 5/7] Squashfs: Restructure squashfs_readpage()
Restructure squashfs_readpage() splitting it into separate functions for datablocks, fragments and sparse blocks.

Move the memcpying (from squashfs cache entry) implementation of squashfs_readpage_block into file_cache.c.

This allows different implementations to be supported.

Signed-off-by: Phillip Lougher
Reviewed-by: Minchan Kim
---
 fs/squashfs/Makefile     |   2 +-
 fs/squashfs/file.c       | 142 ---
 fs/squashfs/file_cache.c |  38 +
 fs/squashfs/squashfs.h   |   7 +++
 4 files changed, 118 insertions(+), 71 deletions(-)
 create mode 100644 fs/squashfs/file_cache.c

diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
index 5833b96..e01ba11 100644
--- a/fs/squashfs/Makefile
+++ b/fs/squashfs/Makefile
@@ -4,7 +4,7 @@
 obj-$(CONFIG_SQUASHFS) += squashfs.o
 squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o
-squashfs-y += namei.o super.o symlink.o decompressor.o
+squashfs-y += namei.o super.o symlink.o decompressor.o file_cache.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_SINGLE) += decompressor_single.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI) += decompressor_multi.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU) += decompressor_multi_percpu.o
diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
index 8ca62c2..e5c9689 100644
--- a/fs/squashfs/file.c
+++ b/fs/squashfs/file.c
@@ -370,77 +370,15 @@ static int read_blocklist(struct inode *inode, int index, u64 *block)
 	return le32_to_cpu(size);
 }
 
-
-static int squashfs_readpage(struct file *file, struct page *page)
+/* Copy data into page cache */
+void squashfs_copy_cache(struct page *page, struct squashfs_cache_entry *buffer,
+	int bytes, int offset)
 {
 	struct inode *inode = page->mapping->host;
 	struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
-	int bytes, i, offset = 0, sparse = 0;
-	struct squashfs_cache_entry *buffer = NULL;
 	void *pageaddr;
-
-	int mask = (1 << (msblk->block_log - PAGE_CACHE_SHIFT)) - 1;
-	int index = page->index >> (msblk->block_log - PAGE_CACHE_SHIFT);
-	int start_index = page->index & ~mask;
-	int end_index = start_index | mask;
-	int file_end = i_size_read(inode) >> msblk->block_log;
-
-	TRACE("Entered squashfs_readpage, page index %lx, start block %llx\n",
-				page->index, squashfs_i(inode)->start);
-
-	if (page->index >= ((i_size_read(inode) + PAGE_CACHE_SIZE - 1) >>
-					PAGE_CACHE_SHIFT))
-		goto out;
-
-	if (index < file_end || squashfs_i(inode)->fragment_block ==
-					SQUASHFS_INVALID_BLK) {
-		/*
-		 * Reading a datablock from disk. Need to read block list
-		 * to get location and block size.
-		 */
-		u64 block = 0;
-		int bsize = read_blocklist(inode, index, &block);
-		if (bsize < 0)
-			goto error_out;
-
-		if (bsize == 0) { /* hole */
-			bytes = index == file_end ?
-				(i_size_read(inode) & (msblk->block_size - 1)) :
-				 msblk->block_size;
-			sparse = 1;
-		} else {
-			/*
-			 * Read and decompress datablock.
-			 */
-			buffer = squashfs_get_datablock(inode->i_sb,
-								block, bsize);
-			if (buffer->error) {
-				ERROR("Unable to read page, block %llx, size %x"
-					"\n", block, bsize);
-				squashfs_cache_put(buffer);
-				goto error_out;
-			}
-			bytes = buffer->length;
-		}
-	} else {
-		/*
-		 * Datablock is stored inside a fragment (tail-end packed
-		 * block).
-		 */
-		buffer = squashfs_get_fragment(inode->i_sb,
-				squashfs_i(inode)->fragment_block,
-				squashfs_i(inode)->fragment_size);
-
-		if (buffer->error) {
-			ERROR("Unable to read page, block %llx, size %x\n",
-				squashfs_i(inode)->fragment_block,
-				squashfs_i(inode)->fragment_size);
-			squashfs_cache_put(buffer);
-			goto error_out;
-		}
-		bytes = i_size_read(inode) & (msblk->block_size - 1);
-		offset = squashfs_i(inode)->fragment_offs
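The three cases that the restructuring separates (datablock, sparse block, fragment) follow from the branch conditions in the removed squashfs_readpage() code. The sketch below restates that decision in userspace C as an illustration; enum block_kind, classify_block() and sparse_bytes() are hypothetical names, has_fragment stands in for fragment_block != SQUASHFS_INVALID_BLK, and bsize is the value read_blocklist() would return.

```c
#include <assert.h>
#include <stdint.h>

enum block_kind { BLOCK_DATA, BLOCK_SPARSE, BLOCK_FRAGMENT };

/*
 * Decide how block number index of a file is stored. Blocks before the
 * last one, or any block of a file without a fragment, come from the block
 * list; a size of 0 there marks a hole. Otherwise the tail is in a fragment.
 */
static enum block_kind classify_block(int64_t file_size, int block_log,
		int index, int has_fragment, int bsize)
{
	int file_end = (int)(file_size >> block_log);

	if (index < file_end || !has_fragment)
		return bsize == 0 ? BLOCK_SPARSE : BLOCK_DATA;
	return BLOCK_FRAGMENT;
}

/* Bytes covered by a hole: a full block, or the tail if it is the last block. */
static int sparse_bytes(int64_t file_size, int block_log, int index)
{
	int block_size = 1 << block_log;
	int file_end = (int)(file_size >> block_log);

	return index == file_end ?
		(int)(file_size & (block_size - 1)) : block_size;
}
```

For a 300000-byte file with 128 KiB blocks, file_end is 2, so blocks 0 and 1 are full blocks and block 2 holds the remaining 37856 bytes, either in a fragment or, if the file has none, in the block list.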
[PATCH 3/7] Squashfs: add multi-threaded decompression using percpu variables (V2)
Add a multi-threaded decompression implementation which uses percpu variables. Using percpu variables has advantages and disadvantages over implementations which do not use percpu variables. Advantages: * the nature of percpu variables ensures decompression is load-balanced across the multiple cores. * simplicity. Disadvantages: it limits decompression to one thread per core. V2: * squashfs_decompressor_create: improve error handling path, re freeing of decompressors and comp_opts * decompressor_multi_percpu.c: include percpu.h header * Kconfig: indentation Signed-off-by: Phillip Lougher --- fs/squashfs/Kconfig | 57 ++- fs/squashfs/Makefile| 10 +--- fs/squashfs/decompressor_multi_percpu.c | 98 + 3 files changed, 145 insertions(+), 20 deletions(-) create mode 100644 fs/squashfs/decompressor_multi_percpu.c diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig index 1c6d340..159bd66 100644 --- a/fs/squashfs/Kconfig +++ b/fs/squashfs/Kconfig @@ -25,6 +25,50 @@ config SQUASHFS If unsure, say N. +choice + prompt "Decompressor parallelisation options" + depends on SQUASHFS + help + Squashfs now supports three parallelisation options for + decompression. Each one exhibits various trade-offs between + decompression performance and CPU and memory usage. + + If in doubt, select "Single threaded compression" + +config SQUASHFS_DECOMP_SINGLE + bool "Single threaded compression" + help + Traditionally Squashfs has used single-threaded decompression. + Only one block (data or metadata) can be decompressed at any + one time. This limits CPU and memory usage to a minimum. + +config SQUASHFS_DECOMP_MULTI + bool "Use multiple decompressors for parallel I/O" + help + By default Squashfs uses a single decompressor but it gives + poor performance on parallel I/O workloads when using multiple CPU + machines due to waiting on decompressor availability. + + If you have a parallel I/O workload and your system has enough memory, + using this option may improve overall I/O performance. 
+ + This decompressor implementation uses up to two parallel + decompressors per core. It dynamically allocates decompressors + on a demand basis. + +config SQUASHFS_DECOMP_MULTI_PERCPU + bool "Use percpu multiple decompressors for parallel I/O" + help + By default Squashfs uses a single decompressor but it gives + poor performance on parallel I/O workloads when using multiple CPU + machines due to waiting on decompressor availability. + + This decompressor implementation uses a maximum of one + decompressor per core. It uses percpu variables to ensure + decompression is load-balanced across the cores. + +endchoice + config SQUASHFS_XATTR bool "Squashfs XATTR support" depends on SQUASHFS @@ -63,19 +107,6 @@ config SQUASHFS_LZO If unsure, say N. -config SQUASHFS_MULTI_DECOMPRESSOR - bool "Use multiple decompressors for handling parallel I/O" - depends on SQUASHFS - help - By default Squashfs uses a single decompressor but it gives - poor performance on parallel I/O workloads when using multiple CPU - machines due to waiting on decompressor availability. - - If you have a parallel I/O workload and your system has enough memory, - using this option may improve overall I/O performance. - - If unsure, say N. 
- config SQUASHFS_XZ bool "Include support for XZ compressed file systems" depends on SQUASHFS diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile index dfebc3b..5833b96 100644 --- a/fs/squashfs/Makefile +++ b/fs/squashfs/Makefile @@ -5,14 +5,10 @@ obj-$(CONFIG_SQUASHFS) += squashfs.o squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o squashfs-y += namei.o super.o symlink.o decompressor.o - +squashfs-$(CONFIG_SQUASHFS_DECOMP_SINGLE) += decompressor_single.o +squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI) += decompressor_multi.o +squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU) += decompressor_multi_percpu.o squashfs-$(CONFIG_SQUASHFS_XATTR) += xattr.o xattr_id.o squashfs-$(CONFIG_SQUASHFS_LZO) += lzo_wrapper.o squashfs-$(CONFIG_SQUASHFS_XZ) += xz_wrapper.o squashfs-$(CONFIG_SQUASHFS_ZLIB) += zlib_wrapper.o - -ifdef CONFIG_SQUASHFS_MULTI_DECOMPRESSOR - squashfs-y += decompressor_multi.o -else - squashfs-y += decompressor_single.o -endif diff --git a/fs/squashfs/decompressor_multi_percpu.c b/fs/squashfs/decompressor_multi_percpu.c new file mode 100644 index 000..0e7b679 --- /dev/null +++ b/fs/squashfs/decompressor_multi_percpu.c @@
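The trade-off between the three Kconfig choices comes down to how many blocks can be decompressed at once. A minimal sketch of that arithmetic, using the limits stated in the help text above (one decompressor, up to two per core, at most one per core); enum decomp_mode and max_decompressors() are hypothetical names used only for this illustration.

```c
#include <assert.h>

/* Hypothetical enum mirroring the three Kconfig choices in the patch. */
enum decomp_mode {
	DECOMP_SINGLE,		/* SQUASHFS_DECOMP_SINGLE */
	DECOMP_MULTI,		/* SQUASHFS_DECOMP_MULTI */
	DECOMP_MULTI_PERCPU	/* SQUASHFS_DECOMP_MULTI_PERCPU */
};

/* Upper bound on concurrent decompressions for a given number of cores. */
static int max_decompressors(enum decomp_mode mode, int online_cpus)
{
	assert(online_cpus > 0);
	switch (mode) {
	case DECOMP_MULTI:
		return online_cpus * 2;	/* up to two per core, allocated on demand */
	case DECOMP_MULTI_PERCPU:
		return online_cpus;	/* at most one per core */
	case DECOMP_SINGLE:
	default:
		return 1;		/* one block at a time */
	}
}
```

On a 4-core machine this gives limits of 1, 8 and 4 respectively, which is also a rough guide to the relative memory cost of each option.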
[PATCH 2/7] squashfs: Enhance parallel I/O
From: Minchan Kim <minc...@kernel.org>

Squashfs currently uses only one stream buffer for decompression, which
hurts parallel read performance. This patch adds support for multiple
decompressors to improve parallel I/O performance.

Test: four 1G files read with dd on a KVM machine which has 2 CPUs and
4G of memory.

dd if=test/test1.dat of=/dev/null &
dd if=test/test2.dat of=/dev/null &
dd if=test/test3.dat of=/dev/null &
dd if=test/test4.dat of=/dev/null &

old : 1m39s -> new : 9s

* From v1
  * Change comp_strm with decomp_strm - Phillip
  * Change/add comments - Phillip

Signed-off-by: Minchan Kim <minc...@kernel.org>
Signed-off-by: Phillip Lougher <phil...@squashfs.org.uk>
---
 fs/squashfs/Kconfig              |  13 +++
 fs/squashfs/Makefile             |   9 +-
 fs/squashfs/decompressor_multi.c | 200 +++
 3 files changed, 221 insertions(+), 1 deletion(-)
 create mode 100644 fs/squashfs/decompressor_multi.c

diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
index c70111e..1c6d340 100644
--- a/fs/squashfs/Kconfig
+++ b/fs/squashfs/Kconfig
@@ -63,6 +63,19 @@ config SQUASHFS_LZO

 	  If unsure, say N.

+config SQUASHFS_MULTI_DECOMPRESSOR
+	bool "Use multiple decompressors for handling parallel I/O"
+	depends on SQUASHFS
+	help
+	  By default Squashfs uses a single decompressor but it gives
+	  poor performance on parallel I/O workloads when using multiple CPU
+	  machines due to waiting on decompressor availability.
+
+	  If you have a parallel I/O workload and your system has enough memory,
+	  using this option may improve overall I/O performance.
+
+	  If unsure, say N.
+
 config SQUASHFS_XZ
 	bool "Include support for XZ compressed file systems"
 	depends on SQUASHFS
diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
index c223c84..dfebc3b 100644
--- a/fs/squashfs/Makefile
+++ b/fs/squashfs/Makefile
@@ -4,8 +4,15 @@

 obj-$(CONFIG_SQUASHFS) += squashfs.o
 squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o
-squashfs-y += namei.o super.o symlink.o decompressor.o decompressor_single.o
+squashfs-y += namei.o super.o symlink.o decompressor.o
+
 squashfs-$(CONFIG_SQUASHFS_XATTR) += xattr.o xattr_id.o
 squashfs-$(CONFIG_SQUASHFS_LZO) += lzo_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_XZ) += xz_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_ZLIB) += zlib_wrapper.o
+
+ifdef CONFIG_SQUASHFS_MULTI_DECOMPRESSOR
+	squashfs-y += decompressor_multi.o
+else
+	squashfs-y += decompressor_single.o
+endif
diff --git a/fs/squashfs/decompressor_multi.c b/fs/squashfs/decompressor_multi.c
new file mode 100644
index 000..462731d
--- /dev/null
+++ b/fs/squashfs/decompressor_multi.c
@@ -0,0 +1,200 @@
+/*
+ * Copyright (c) 2013
+ * Minchan Kim <minc...@kernel.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ */
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/slab.h>
+#include <linux/buffer_head.h>
+#include <linux/sched.h>
+#include <linux/wait.h>
+#include <linux/cpumask.h>
+
+#include "squashfs_fs.h"
+#include "squashfs_fs_sb.h"
+#include "decompressor.h"
+#include "squashfs.h"
+
+/*
+ * This file implements multi-threaded decompression in the
+ * decompressor framework
+ */
+
+
+/*
+ * The reason that multiply two is that a CPU can request new I/O
+ * while it is waiting previous request.
+ */
+#define MAX_DECOMPRESSOR (num_online_cpus() * 2)
+
+
+int squashfs_max_decompressors(void)
+{
+	return MAX_DECOMPRESSOR;
+}
+
+
+struct squashfs_stream {
+	void			*comp_opts;
+	struct list_head	strm_list;
+	struct mutex		mutex;
+	int			avail_decomp;
+	wait_queue_head_t	wait;
+};
+
+
+struct decomp_stream {
+	void *stream;
+	struct list_head list;
+};
+
+
+static void put_decomp_stream(struct decomp_stream *decomp_strm,
+				struct squashfs_stream *stream)
+{
+	mutex_lock(&stream->mutex);
+	list_add(&decomp_strm->list, &stream->strm_list);
+	mutex_unlock(&stream->mutex);
+	wake_up(&stream->wait);
+}
+
+void *squashfs_decompressor_create(struct squashfs_sb_info *msblk,
+				void *comp_opts)
+{
+	struct squashfs_stream *stream;
+	struct decomp_stream *decomp_strm = NULL;
+	int err = -ENOMEM;
+
+	stream = kzalloc(sizeof(*stream), GFP_KERNEL);
+	if (!stream)
+		goto out;
+
+	stream->comp_opts = comp_opts;
+	mutex_init(&stream->mutex);
+	INIT_LIST_HEAD(&stream->strm_list);
+	init_waitqueue_head(&stream->wait);
+
+	/*
+	 * We should have a decompressor at least as default
+	 * so if we fail to allocate new decompressor dynamically,
+	 * we could always fall back to default decompressor and
+	 * file system works.
+	 */
+	decomp_strm = kmalloc(sizeof(*decomp_strm), GFP_KERNEL);
+	if (!decomp_strm)
+		goto out;
+
+	dec
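The patch text above is cut off before the get side of the stream list is shown, so the following is only a userspace sketch of the general idea the commit message describes: a capped pool of decompressor streams, with callers sleeping when every stream is busy. It uses pthreads instead of the kernel mutex and wait queue, and every name in it (stream_pool, pool_get, pool_put, and so on) is hypothetical rather than taken from the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for one decompressor instance. */
struct decomp_stream {
	void *state;                    /* would hold the zlib/xz/lzo workspace */
	struct decomp_stream *next;     /* free-list link */
};

/* Hypothetical pool; the cap plays the role of MAX_DECOMPRESSOR. */
struct stream_pool {
	pthread_mutex_t lock;
	pthread_cond_t wait;
	struct decomp_stream *free_list; /* idle streams */
	int allocated;                   /* streams created so far */
	int max;                         /* upper bound on streams */
};

void pool_init(struct stream_pool *p, int max)
{
	pthread_mutex_init(&p->lock, NULL);
	pthread_cond_init(&p->wait, NULL);
	p->free_list = NULL;
	p->allocated = 0;
	p->max = max;
}

/* Take an idle stream, create one if under the cap, otherwise sleep. */
struct decomp_stream *pool_get(struct stream_pool *p)
{
	struct decomp_stream *s = NULL;

	pthread_mutex_lock(&p->lock);
	for (;;) {
		if (p->free_list) {
			s = p->free_list;
			p->free_list = s->next;
			break;
		}
		if (p->allocated < p->max) {
			s = calloc(1, sizeof(*s));
			if (s) {
				p->allocated++;
				break;
			}
			if (p->allocated == 0)
				break;  /* nothing to wait for, report failure */
		}
		/* All streams busy (or allocation failed): wait for a put. */
		pthread_cond_wait(&p->wait, &p->lock);
	}
	pthread_mutex_unlock(&p->lock);
	return s;
}

/* Return a stream to the pool and wake one waiter. */
void pool_put(struct stream_pool *p, struct decomp_stream *s)
{
	assert(s != NULL);
	pthread_mutex_lock(&p->lock);
	s->next = p->free_list;
	p->free_list = s;
	pthread_mutex_unlock(&p->lock);
	pthread_cond_signal(&p->wait);
}
```

A caller would bracket each decompression with pool_get() and pool_put(). Whether the real patch allocates under the lock or falls back in exactly this way is not visible in the truncated text, so treat those details as assumptions.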
[PATCH 1/7] Squashfs: Refactor decompressor interface and code
The decompressor interface and code was written from
the point of view of single-threaded operation. In doing
so it mixed a lot of single-threaded implementation specific
aspects into the decompressor code and elsewhere which makes it
difficult to seamlessly support multiple different decompressor
implementations.

This patch does the following:

1. It removes compressor_options parsing from the decompressor
   init() function. This allows the decompressor init() function
   to be dynamically called to instantiate multiple decompressors,
   without the compressor options needing to be read and parsed each
   time.

2. It moves threading and all sleeping operations out of the
   decompressors. In doing so, it makes the decompressors
   non-blocking wrappers which only deal with interfacing with
   the decompressor implementation.

3. It splits decompressor.[ch] into decompressor generic functions
   in decompressor.[ch], and moves the single threaded
   decompressor implementation into decompressor_single.c.

The result of this patch is Squashfs should now be able to
support multiple decompressors by adding new decompressor_xxx.c
files with specialised implementations of the functions in
decompressor_single.c

Signed-off-by: Phillip Lougher <phil...@squashfs.org.uk>
Reviewed-by: Minchan Kim <minc...@kernel.org>
---
 fs/squashfs/Makefile              |  2 +-
 fs/squashfs/block.c               | 11 +++--
 fs/squashfs/decompressor.c        | 47 ++---
 fs/squashfs/decompressor.h        | 21 +++--
 fs/squashfs/decompressor_single.c | 86 +
 fs/squashfs/lzo_wrapper.c         | 24 +++
 fs/squashfs/squashfs.h            |  9 +++-
 fs/squashfs/squashfs_fs_sb.h      |  3 +-
 fs/squashfs/super.c               | 10 ++---
 fs/squashfs/xz_wrapper.c          | 89 +--
 fs/squashfs/zlib_wrapper.c        | 50 +++---
 11 files changed, 216 insertions(+), 136 deletions(-)
 create mode 100644 fs/squashfs/decompressor_single.c

diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
index 110b047..c223c84 100644
--- a/fs/squashfs/Makefile
+++ b/fs/squashfs/Makefile
@@ -4,7 +4,7 @@

 obj-$(CONFIG_SQUASHFS) += squashfs.o
 squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o
-squashfs-y += namei.o super.o symlink.o decompressor.o
+squashfs-y += namei.o super.o symlink.o decompressor.o decompressor_single.o
 squashfs-$(CONFIG_SQUASHFS_XATTR) += xattr.o xattr_id.o
 squashfs-$(CONFIG_SQUASHFS_LZO) += lzo_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_XZ) += xz_wrapper.o
diff --git a/fs/squashfs/block.c b/fs/squashfs/block.c
index 41d108e..4dd4025 100644
--- a/fs/squashfs/block.c
+++ b/fs/squashfs/block.c
@@ -93,7 +93,7 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 	struct buffer_head **bh;
 	int offset = index & ((1 << msblk->devblksize_log2) - 1);
 	u64 cur_index = index >> msblk->devblksize_log2;
-	int bytes, compressed, b = 0, k = 0, page = 0, avail;
+	int bytes, compressed, b = 0, k = 0, page = 0, avail, i;

 	bh = kcalloc(((srclength + msblk->devblksize - 1)
 		>> msblk->devblksize_log2) + 1, sizeof(*bh), GFP_KERNEL);
@@ -158,6 +158,12 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 		ll_rw_block(READ, b - 1, bh + 1);
 	}

+	for (i = 0; i < b; i++) {
+		wait_on_buffer(bh[i]);
+		if (!buffer_uptodate(bh[i]))
+			goto block_release;
+	}
+
 	if (compressed) {
 		length = squashfs_decompress(msblk, buffer, bh, b, offset,
 			length, srclength, pages);
@@ -172,9 +178,6 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 		for (bytes = length; k < b; k++) {
 			in = min(bytes, msblk->devblksize - offset);
 			bytes -= in;
-			wait_on_buffer(bh[k]);
-			if (!buffer_uptodate(bh[k]))
-				goto block_release;
 			while (in) {
 				if (pg_offset == PAGE_CACHE_SIZE) {
 					page++;
diff --git a/fs/squashfs/decompressor.c b/fs/squashfs/decompressor.c
index 3f6271d..234291f 100644
--- a/fs/squashfs/decompressor.c
+++ b/fs/squashfs/decompressor.c
@@ -37,29 +37,29 @@
  */

 static const struct squashfs_decompressor squashfs_lzma_unsupported_comp_ops = {
-	NULL, NULL, NULL, LZMA_COMPRESSION, "lzma", 0
+	NULL, NULL, NULL, NULL, LZMA_COMPRESSION, "lzma", 0
 };

 #ifndef CONFIG_SQUASHFS_LZO
 static const struct squashfs_decompressor squashfs_lzo_comp_ops = {
-	NULL, NULL, NULL, LZO_COMPRESSION, "lzo", 0
+	NULL, NULL, NULL, NULL, LZO_COMPRESSION, "lzo", 0
 };
 #endif

 #ifndef CONFIG_SQUASHFS_
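Point 1 of the description (parse the compressor options once, then call init() as often as needed) may be easier to see in a small standalone sketch. The ops table below only imitates the shape hinted at by the extra NULL added to the initialisers above. The struct layout, the "demo" compressor and all function names are assumptions made for illustration and are not the Squashfs definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical parsed options, produced once per mounted filesystem. */
struct demo_opts {
	unsigned int dict_size;
};

/* Hypothetical per-instance decompressor state. */
struct demo_stream {
	unsigned int dict_size;
	unsigned char *workspace;
};

/* Hypothetical ops table: option parsing is separate from init(). */
struct decompressor_ops {
	void *(*comp_opts)(const unsigned char *raw, size_t len);
	void *(*init)(const void *comp_opts);
	void (*free)(void *strm);
	const char *name;
};

/* Parse a 4-byte little-endian dictionary size from the raw options. */
static void *demo_comp_opts(const unsigned char *raw, size_t len)
{
	struct demo_opts *opts;

	if (len < 4)
		return NULL;
	opts = malloc(sizeof(*opts));
	if (!opts)
		return NULL;
	opts->dict_size = raw[0] | (raw[1] << 8) | (raw[2] << 16) |
			  ((unsigned int)raw[3] << 24);
	return opts;
}

/* Build one decompressor instance from already-parsed options. */
static void *demo_init(const void *comp_opts)
{
	const struct demo_opts *opts = comp_opts;
	struct demo_stream *s;

	assert(opts != NULL);
	s = malloc(sizeof(*s));
	if (!s)
		return NULL;
	s->dict_size = opts->dict_size;
	s->workspace = malloc(opts->dict_size);
	if (!s->workspace) {
		free(s);
		return NULL;
	}
	return s;
}

static void demo_free(void *strm)
{
	struct demo_stream *s = strm;

	if (s) {
		free(s->workspace);
		free(s);
	}
}

const struct decompressor_ops demo_ops = {
	demo_comp_opts, demo_init, demo_free, "demo"
};
```

With this split, a multi-threaded implementation can call demo_ops.init() once per stream while sharing a single parsed options object, which is the property the commit message is after.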
[PATCH 0/7] Squashfs: performance enhancement patches
Hi,

This is the latest version of the performance improvement patches.
Patch 3/7 is a V2 of the percpu patch following review.

If there are no further review comments/objections I intend to get
these merged. Any objections, then please object now.

Thanks

Phillip

Minchan Kim (1):
  squashfs: Enhance parallel I/O

Phillip Lougher (6):
  Squashfs: Refactor decompressor interface and code
  Squashfs: Add multi-threaded decompression using percpu variables (V2)
  Squashfs: Generalise paging handling in the decompressors
  Squashfs: Restructure squashfs_readpage()
  Squashfs: Directly decompress into the page cache for file data
  Squashfs: Check stream is not NULL in decompressor_multi.c

 fs/squashfs/Kconfig                     |  72
 fs/squashfs/Makefile                    |   5
 fs/squashfs/block.c                     |  36 --
 fs/squashfs/cache.c                     |  28 +
 fs/squashfs/decompressor.c              |  59 +---
 fs/squashfs/decompressor.h              |  24 ++-
 fs/squashfs/decompressor_multi.c        | 198
 fs/squashfs/decompressor_multi_percpu.c |  97 +++
 fs/squashfs/decompressor_single.c       |  85
 fs/squashfs/file.c                      | 142 +++---
 fs/squashfs/file_cache.c                |  38 +++
 fs/squashfs/file_direct.c               | 173
 fs/squashfs/lzo_wrapper.c               |  47 +
 fs/squashfs/page_actor.c                | 100 +
 fs/squashfs/page_actor.h                |  81 +
 fs/squashfs/squashfs.h                  |  20 +++--
 fs/squashfs/squashfs_fs_sb.h            |   4 +--
 fs/squashfs/super.c                     |  10 +++
 fs/squashfs/xz_wrapper.c                | 105
 fs/squashfs/zlib_wrapper.c              |  64 ---
 20 files changed, 1145 insertions(+), 243 deletions(-)
 create mode 100644 fs/squashfs/decompressor_multi.c
 create mode 100644 fs/squashfs/decompressor_multi_percpu.c
 create mode 100644 fs/squashfs/decompressor_single.c
 create mode 100644 fs/squashfs/file_cache.c
 create mode 100644 fs/squashfs/file_direct.c
 create mode 100644 fs/squashfs/page_actor.c
 create mode 100644 fs/squashfs/page_actor.h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH 6/7] Squashfs: Directly decompress into the page cache for file data
This introduces an implementation of squashfs_readpage_block()
that directly decompresses into the page cache.

This uses the previously added page handler abstraction to push
down the necessary kmap_atomic/kunmap_atomic operations on the
page cache buffers into the decompressors. This enables
direct copying into the page cache without using the slow
kmap/kunmap calls.

The code detects when multiple threads are racing in
squashfs_readpage() to decompress the same block, and avoids
this regression by falling back to using an intermediate
buffer.

This patch enhances the performance of Squashfs significantly
when multiple processes are accessing the filesystem simultaneously
because it not only reduces memcopying, but it more importantly
eliminates the lock contention on the intermediate buffer.

Using single-thread decompression.

dd if=file1 of=/dev/null bs=4096 &
dd if=file2 of=/dev/null bs=4096 &
dd if=file3 of=/dev/null bs=4096 &
dd if=file4 of=/dev/null bs=4096

Before:

629145600 bytes (629 MB) copied, 45.8046 s, 13.7 MB/s

After:

629145600 bytes (629 MB) copied, 9.29414 s, 67.7 MB/s

Signed-off-by: Phillip Lougher <phil...@squashfs.org.uk>
Reviewed-by: Minchan Kim <minc...@kernel.org>
---
 fs/squashfs/Kconfig       |  28
 fs/squashfs/Makefile      |   4 +-
 fs/squashfs/file_direct.c | 173 ++
 fs/squashfs/page_actor.c  | 100 +++
 fs/squashfs/page_actor.h  |  32 +
 5 files changed, 336 insertions(+), 1 deletion(-)
 create mode 100644 fs/squashfs/file_direct.c
 create mode 100644 fs/squashfs/page_actor.c

diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
index 159bd66..b6fa865 100644
--- a/fs/squashfs/Kconfig
+++ b/fs/squashfs/Kconfig
@@ -26,6 +26,34 @@ config SQUASHFS
 	  If unsure, say N.

 choice
+	prompt "File decompression options"
+	depends on SQUASHFS
+	help
+	  Squashfs now supports two options for decompressing file
+	  data. Traditionally Squashfs has decompressed into an
+	  intermediate buffer and then memcopied it into the page cache.
+	  Squashfs now supports the ability to decompress directly into
+	  the page cache.
+
+	  If unsure, select "Decompress file data into an intermediate buffer"
+
+config SQUASHFS_FILE_CACHE
+	bool "Decompress file data into an intermediate buffer"
+	help
+	  Decompress file data into an intermediate buffer and then
+	  memcopy it into the page cache.
+
+config SQUASHFS_FILE_DIRECT
+	bool "Decompress files directly into the page cache"
+	help
+	  Directly decompress file data into the page cache.
+	  Doing so can significantly improve performance because
+	  it eliminates a memcpy and it also removes the lock contention
+	  on the single buffer.
+
+endchoice
+
+choice
 	prompt "Decompressor parallelisation options"
 	depends on SQUASHFS
 	help
diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
index e01ba11..4132520 100644
--- a/fs/squashfs/Makefile
+++ b/fs/squashfs/Makefile
@@ -4,7 +4,9 @@

 obj-$(CONFIG_SQUASHFS) += squashfs.o
 squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o
-squashfs-y += namei.o super.o symlink.o decompressor.o file_cache.o
+squashfs-y += namei.o super.o symlink.o decompressor.o
+squashfs-$(CONFIG_SQUASHFS_FILE_CACHE) += file_cache.o
+squashfs-$(CONFIG_SQUASHFS_FILE_DIRECT) += file_direct.o page_actor.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_SINGLE) += decompressor_single.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI) += decompressor_multi.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU) += decompressor_multi_percpu.o
diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
new file mode 100644
index 000..2943b2b
--- /dev/null
+++ b/fs/squashfs/file_direct.c
@@ -0,0 +1,173 @@
+/*
+ * Copyright (c) 2013
+ * Phillip Lougher <phil...@squashfs.org.uk>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ */
+
+#include <linux/fs.h>
+#include <linux/vfs.h>
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/string.h>
+#include <linux/pagemap.h>
+#include <linux/mutex.h>
+
+#include "squashfs_fs.h"
+#include "squashfs_fs_sb.h"
+#include "squashfs_fs_i.h"
+#include "squashfs.h"
+#include "page_actor.h"
+
+static int squashfs_read_cache(struct page *target_page, u64 block, int bsize,
+	int pages, struct page **page);
+
+/* Read separately compressed datablock directly into page cache */
+int squashfs_readpage_block(struct page *target_page, u64 block, int bsize)
+
+{
+	struct inode *inode = target_page->mapping->host;
+	struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
+
+	int file_end = (i_size_read(inode) - 1) >> PAGE_CACHE_SHIFT;
+	int mask = (1 << (msblk->block_log - PAGE_CACHE_SHIFT)) - 1;
+	int start_index = target_page->index & ~mask;
+	int end_index
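The index arithmetic at the top of squashfs_readpage_block() is terse, so here is a small worked sketch of it in plain C. It computes which page-cache pages belong to the same Squashfs datablock as a given page. The mask and start_index expressions follow the patch, and end_index = start_index | mask is the form used in the older squashfs_readpage() code. The clamp to file_end and the helper and struct names are assumptions, since the patch text is truncated before that point.

```c
#include <assert.h>

/* Hypothetical result type: inclusive page range covered by one block. */
struct page_span {
	long start;
	long end;
	int count;
};

/*
 * page_index: index of the page being read
 * block_log:  log2 of the Squashfs block size (e.g. 17 for 128K)
 * page_shift: log2 of the page size (e.g. 12 for 4K)
 * file_end:   index of the last page of the file
 */
struct page_span block_page_span(long page_index, int block_log,
				 int page_shift, long file_end)
{
	struct page_span span;
	long mask;

	assert(block_log >= page_shift);
	mask = (1L << (block_log - page_shift)) - 1; /* pages per block - 1 */

	span.start = page_index & ~mask;  /* first page of the block */
	span.end = span.start | mask;     /* last page of the block */
	if (span.end > file_end)          /* assumed clamp for the tail block */
		span.end = file_end;
	span.count = (int)(span.end - span.start + 1);
	return span;
}
```

For a 128K block and 4K pages the mask is 31, so page 70 falls in the block covering pages 64 to 95. If the file ends at page 80, only pages 64 to 80 would need to be filled.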
[PATCH 5/7] Squashfs: Restructure squashfs_readpage()
Restructure squashfs_readpage() splitting it into separate
functions for datablocks, fragments and sparse blocks.

Move the memcpying (from squashfs cache entry) implementation of
squashfs_readpage_block into file_cache.c

This allows different implementations to be supported.

Signed-off-by: Phillip Lougher <phil...@squashfs.org.uk>
Reviewed-by: Minchan Kim <minc...@kernel.org>
---
 fs/squashfs/Makefile     |   2 +-
 fs/squashfs/file.c       | 142 ---
 fs/squashfs/file_cache.c |  38 +
 fs/squashfs/squashfs.h   |   7 +++
 4 files changed, 118 insertions(+), 71 deletions(-)
 create mode 100644 fs/squashfs/file_cache.c

diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
index 5833b96..e01ba11 100644
--- a/fs/squashfs/Makefile
+++ b/fs/squashfs/Makefile
@@ -4,7 +4,7 @@

 obj-$(CONFIG_SQUASHFS) += squashfs.o
 squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o
-squashfs-y += namei.o super.o symlink.o decompressor.o
+squashfs-y += namei.o super.o symlink.o decompressor.o file_cache.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_SINGLE) += decompressor_single.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI) += decompressor_multi.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU) += decompressor_multi_percpu.o
diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
index 8ca62c2..e5c9689 100644
--- a/fs/squashfs/file.c
+++ b/fs/squashfs/file.c
@@ -370,77 +370,15 @@ static int read_blocklist(struct inode *inode, int index, u64 *block)
 	return le32_to_cpu(size);
 }

-
-static int squashfs_readpage(struct file *file, struct page *page)
+/* Copy data into page cache */
+void squashfs_copy_cache(struct page *page, struct squashfs_cache_entry *buffer,
+	int bytes, int offset)
 {
 	struct inode *inode = page->mapping->host;
 	struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
-	int bytes, i, offset = 0, sparse = 0;
-	struct squashfs_cache_entry *buffer = NULL;
 	void *pageaddr;
-
-	int mask = (1 << (msblk->block_log - PAGE_CACHE_SHIFT)) - 1;
-	int index = page->index >> (msblk->block_log - PAGE_CACHE_SHIFT);
-	int start_index = page->index & ~mask;
-	int end_index = start_index | mask;
-	int file_end = i_size_read(inode) >> msblk->block_log;
-
-	TRACE("Entered squashfs_readpage, page index %lx, start block %llx\n",
-				page->index, squashfs_i(inode)->start);
-
-	if (page->index >= ((i_size_read(inode) + PAGE_CACHE_SIZE - 1) >>
-					PAGE_CACHE_SHIFT))
-		goto out;
-
-	if (index < file_end || squashfs_i(inode)->fragment_block ==
-					SQUASHFS_INVALID_BLK) {
-		/*
-		 * Reading a datablock from disk. Need to read block list
-		 * to get location and block size.
-		 */
-		u64 block = 0;
-		int bsize = read_blocklist(inode, index, &block);
-		if (bsize < 0)
-			goto error_out;
-
-		if (bsize == 0) { /* hole */
-			bytes = index == file_end ?
-				(i_size_read(inode) & (msblk->block_size - 1)) :
-				 msblk->block_size;
-			sparse = 1;
-		} else {
-			/*
-			 * Read and decompress datablock.
-			 */
-			buffer = squashfs_get_datablock(inode->i_sb,
-								block, bsize);
-			if (buffer->error) {
-				ERROR("Unable to read page, block %llx, size %x"
-					"\n", block, bsize);
-				squashfs_cache_put(buffer);
-				goto error_out;
-			}
-			bytes = buffer->length;
-		}
-	} else {
-		/*
-		 * Datablock is stored inside a fragment (tail-end packed
-		 * block).
-		 */
-		buffer = squashfs_get_fragment(inode->i_sb,
-				squashfs_i(inode)->fragment_block,
-				squashfs_i(inode)->fragment_size);
-
-		if (buffer->error) {
-			ERROR("Unable to read page, block %llx, size %x\n",
-				squashfs_i(inode)->fragment_block,
-				squashfs_i(inode)->fragment_size);
-			squashfs_cache_put(buffer);
-			goto error_out;
-		}
-		bytes = i_size_read(inode) & (msblk->block_size - 1);
-		offset = squashfs_i(inode)->fragment_offset;
-	}
+	int i, mask = (1 << (msblk->block_log - PAGE_CACHE_SHIFT)) - 1;
+	int start_index = page->index & ~mask, end_index = start_index | mask
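The three cases that the restructured squashfs_readpage() separates (datablock, sparse block, fragment) follow from the condition in the removed code above. As a reading aid, here is that decision on its own as a pure function. The enum, the function name and the INVALID_BLK stand-in are hypothetical, and bsize is passed in rather than obtained from read_blocklist().

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for SQUASHFS_INVALID_BLK; the value here is an assumption. */
#define INVALID_BLK UINT64_MAX

enum page_source {
	SRC_DATABLOCK,  /* separately compressed block on disk */
	SRC_SPARSE,     /* hole: block list entry of size 0 */
	SRC_FRAGMENT    /* tail-end packed into a fragment block */
};

/*
 * Decide where the data for a page comes from, mirroring the test
 * "index < file_end || fragment_block == SQUASHFS_INVALID_BLK".
 */
enum page_source classify_page(long page_index, long long i_size,
			       int block_log, int page_shift,
			       uint64_t fragment_block, int bsize)
{
	long index;
	long file_end;

	assert(block_log >= page_shift);
	index = page_index >> (block_log - page_shift); /* block number */
	file_end = (long)(i_size >> block_log);         /* whole blocks */

	if (index < file_end || fragment_block == INVALID_BLK)
		return bsize == 0 ? SRC_SPARSE : SRC_DATABLOCK;
	return SRC_FRAGMENT;
}
```

Only the tail of a file, the part past the last whole block, can live in a fragment, and only when the inode actually has one. Everything else is either a real datablock or a hole.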
[PATCH 3/7] Squashfs: add multi-threaded decompression using percpu variables (V2)
Add a multi-threaded decompression implementation which uses
percpu variables.

Using percpu variables has advantages and disadvantages over
implementations which do not use percpu variables.

Advantages:
  * the nature of percpu variables ensures decompression is
    load-balanced across the multiple cores.
  * simplicity.

Disadvantages: it limits decompression to one thread per core.

V2:
  * squashfs_decompressor_create: improve error handling path, re
    freeing of decompressors and comp_opts
  * decompressor_multi_percpu.c: include percpu.h header
  * Kconfig: indentation

Signed-off-by: Phillip Lougher <phil...@squashfs.org.uk>
---
 fs/squashfs/Kconfig                     | 57 ++-
 fs/squashfs/Makefile                    | 10 +---
 fs/squashfs/decompressor_multi_percpu.c | 98 +
 3 files changed, 145 insertions(+), 20 deletions(-)
 create mode 100644 fs/squashfs/decompressor_multi_percpu.c

diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
index 1c6d340..159bd66 100644
--- a/fs/squashfs/Kconfig
+++ b/fs/squashfs/Kconfig
@@ -25,6 +25,50 @@ config SQUASHFS

 	  If unsure, say N.

+choice
+	prompt "Decompressor parallelisation options"
+	depends on SQUASHFS
+	help
+	  Squashfs now supports three parallelisation options for
+	  decompression. Each one exhibits various trade-offs between
+	  decompression performance and CPU and memory usage.
+
+	  If in doubt, select "Single threaded compression"
+
+config SQUASHFS_DECOMP_SINGLE
+	bool "Single threaded compression"
+	help
+	  Traditionally Squashfs has used single-threaded decompression.
+	  Only one block (data or metadata) can be decompressed at any
+	  one time. This limits CPU and memory usage to a minimum.
+
+config SQUASHFS_DECOMP_MULTI
+	bool "Use multiple decompressors for parallel I/O"
+	help
+	  By default Squashfs uses a single decompressor but it gives
+	  poor performance on parallel I/O workloads when using multiple CPU
+	  machines due to waiting on decompressor availability.
+
+	  If you have a parallel I/O workload and your system has enough memory,
+	  using this option may improve overall I/O performance.
+
+	  This decompressor implementation uses up to two parallel
+	  decompressors per core. It dynamically allocates decompressors
+	  on a demand basis.
+
+config SQUASHFS_DECOMP_MULTI_PERCPU
+	bool "Use percpu multiple decompressors for parallel I/O"
+	help
+	  By default Squashfs uses a single decompressor but it gives
+	  poor performance on parallel I/O workloads when using multiple CPU
+	  machines due to waiting on decompressor availability.
+
+	  This decompressor implementation uses a maximum of one
+	  decompressor per core. It uses percpu variables to ensure
+	  decompression is load-balanced across the cores.
+
+endchoice
+
 config SQUASHFS_XATTR
 	bool "Squashfs XATTR support"
 	depends on SQUASHFS
@@ -63,19 +107,6 @@ config SQUASHFS_LZO

 	  If unsure, say N.

-config SQUASHFS_MULTI_DECOMPRESSOR
-	bool "Use multiple decompressors for handling parallel I/O"
-	depends on SQUASHFS
-	help
-	  By default Squashfs uses a single decompressor but it gives
-	  poor performance on parallel I/O workloads when using multiple CPU
-	  machines due to waiting on decompressor availability.
-
-	  If you have a parallel I/O workload and your system has enough memory,
-	  using this option may improve overall I/O performance.
-
-	  If unsure, say N.
-
 config SQUASHFS_XZ
 	bool "Include support for XZ compressed file systems"
 	depends on SQUASHFS
diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
index dfebc3b..5833b96 100644
--- a/fs/squashfs/Makefile
+++ b/fs/squashfs/Makefile
@@ -5,14 +5,10 @@
 obj-$(CONFIG_SQUASHFS) += squashfs.o
 squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o
 squashfs-y += namei.o super.o symlink.o decompressor.o
-
+squashfs-$(CONFIG_SQUASHFS_DECOMP_SINGLE) += decompressor_single.o
+squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI) += decompressor_multi.o
+squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU) += decompressor_multi_percpu.o
 squashfs-$(CONFIG_SQUASHFS_XATTR) += xattr.o xattr_id.o
 squashfs-$(CONFIG_SQUASHFS_LZO) += lzo_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_XZ) += xz_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_ZLIB) += zlib_wrapper.o
-
-ifdef CONFIG_SQUASHFS_MULTI_DECOMPRESSOR
-	squashfs-y += decompressor_multi.o
-else
-	squashfs-y += decompressor_single.o
-endif
diff --git a/fs/squashfs/decompressor_multi_percpu.c b/fs/squashfs/decompressor_multi_percpu.c
new file mode 100644
index 000..0e7b679
--- /dev/null
+++ b/fs/squashfs/decompressor_multi_percpu.c
@@ -0,0 +1,98 @@
+/*
+ * Copyright (c) 2013
+ * Phillip Lougher phil
[PATCH 7/7] Squashfs: Check stream is not NULL in decompressor_multi.c
Fix static checker complaint that stream is not checked in
squashfs_decompressor_destroy().

Reported-by: Dan Carpenter dan.carpen...@oracle.com
Signed-off-by: Phillip Lougher phil...@squashfs.org.uk
Reviewed-by: Minchan Kim minc...@kernel.org
---
 fs/squashfs/decompressor_multi.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/squashfs/decompressor_multi.c b/fs/squashfs/decompressor_multi.c
index ae54675..d6008a6 100644
--- a/fs/squashfs/decompressor_multi.c
+++ b/fs/squashfs/decompressor_multi.c
@@ -119,11 +119,10 @@ void squashfs_decompressor_destroy(struct squashfs_sb_info *msblk)
 			kfree(decomp_strm);
 			stream->avail_decomp--;
 		}
+		WARN_ON(stream->avail_decomp);
+		kfree(stream->comp_opts);
+		kfree(stream);
 	}
-
-	WARN_ON(stream->avail_decomp);
-	kfree(stream->comp_opts);
-	kfree(stream);
 }
--
1.8.3.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH 4/7] Squashfs: Generalise paging handling in the decompressors
Further generalise the decompressors by adding a page handler
abstraction.  This adds helpers to allow the decompressors to
access and process the output buffers in an implementation
independent manner.

This allows different types of output buffer to be passed to the
decompressors, with the implementation specific aspects handled at
decompression time, but without the knowledge being held in the
decompressor wrapper code.

This will allow the decompressors to handle Squashfs cache buffers,
and page cache pages.

This patch adds the abstraction and an implementation for the caches.

Signed-off-by: Phillip Lougher phil...@squashfs.org.uk
Reviewed-by: Minchan Kim minc...@kernel.org
---
 fs/squashfs/block.c                     | 27 ++
 fs/squashfs/cache.c                     | 28 +++
 fs/squashfs/decompressor.c              | 14 --
 fs/squashfs/decompressor.h              |  5 ++--
 fs/squashfs/decompressor_multi.c        |  7 ++---
 fs/squashfs/decompressor_multi_percpu.c |  9 +++---
 fs/squashfs/decompressor_single.c       |  9 +++---
 fs/squashfs/lzo_wrapper.c               | 27 --
 fs/squashfs/page_actor.h                | 49 +
 fs/squashfs/squashfs.h                  |  8 +++---
 fs/squashfs/squashfs_fs_sb.h            |  1 +
 fs/squashfs/xz_wrapper.c                | 22 +--
 fs/squashfs/zlib_wrapper.c              | 24 ++--
 13 files changed, 163 insertions(+), 67 deletions(-)
 create mode 100644 fs/squashfs/page_actor.h

diff --git a/fs/squashfs/block.c b/fs/squashfs/block.c
index 4dd4025..0cea9b9 100644
--- a/fs/squashfs/block.c
+++ b/fs/squashfs/block.c
@@ -36,6 +36,7 @@
 #include "squashfs_fs_sb.h"
 #include "squashfs.h"
 #include "decompressor.h"
+#include "page_actor.h"

 /*
  * Read the metadata block length, this is stored in the first two
@@ -86,16 +87,16 @@ static struct buffer_head *get_block_length(struct super_block *sb,
  * generated a larger block - this does occasionally happen with compression
  * algorithms).
  */
-int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
-			int length, u64 *next_index, int srclength, int pages)
+int squashfs_read_data(struct super_block *sb, u64 index, int length,
+		u64 *next_index, struct squashfs_page_actor *output)
 {
 	struct squashfs_sb_info *msblk = sb->s_fs_info;
 	struct buffer_head **bh;
 	int offset = index & ((1 << msblk->devblksize_log2) - 1);
 	u64 cur_index = index >> msblk->devblksize_log2;
-	int bytes, compressed, b = 0, k = 0, page = 0, avail, i;
+	int bytes, compressed, b = 0, k = 0, avail, i;

-	bh = kcalloc(((srclength + msblk->devblksize - 1)
+	bh = kcalloc(((output->length + msblk->devblksize - 1)
 		>> msblk->devblksize_log2) + 1, sizeof(*bh), GFP_KERNEL);
 	if (bh == NULL)
 		return -ENOMEM;
@@ -111,9 +112,9 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 			*next_index = index + length;

 		TRACE("Block @ 0x%llx, %scompressed size %d, src size %d\n",
-			index, compressed ? "" : "un", length, srclength);
+			index, compressed ? "" : "un", length, output->length);

-		if (length < 0 || length > srclength ||
+		if (length < 0 || length > output->length ||
 				(index + length) > msblk->bytes_used)
 			goto read_failure;

@@ -145,7 +146,7 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 		TRACE("Block @ 0x%llx, %scompressed size %d\n", index,
 				compressed ? "" : "un", length);

-		if (length < 0 || length > srclength ||
+		if (length < 0 || length > output->length ||
 				(index + length) > msblk->bytes_used)
 			goto block_release;

@@ -165,8 +166,8 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 	}

 	if (compressed) {
-		length = squashfs_decompress(msblk, buffer, bh, b, offset,
-			length, srclength, pages);
+		length = squashfs_decompress(msblk, bh, b, offset, length,
+			output);
 		if (length < 0)
 			goto read_failure;
 	} else {
@@ -174,19 +175,20 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 		 * Block is uncompressed.
 		 */
 		int in, pg_offset = 0;
+		void *data = squashfs_first_page(output);

 		for (bytes = length; k < b; k++) {
 			in = min(bytes, msblk->devblksize - offset);
 			bytes -= in;
 			while (in) {
 				if (pg_offset == PAGE_CACHE_SIZE
Re: [PATCH 3/6] Squashfs: add multi-threaded decompression using percpu variables
CCing Junjiro Okijima and Stephen Hemminger

On 08/11/13 02:42, Minchan Kim wrote:
> Hello Phillip,
>
> On Thu, Nov 07, 2013 at 08:24:22PM +, Phillip Lougher wrote:
>> Add a multi-threaded decompression implementation which uses
>> percpu variables.
>>
>> Using percpu variables has advantages and disadvantages over
>> implementations which do not use percpu variables.
>>
>> Advantages: the nature of percpu variables ensures decompression is
>> load-balanced across the multiple cores.
>>
>> Disadvantages: it limits decompression to one thread per core.
>
> At a glance, I understand your concern but I don't see benefit to make
> this feature as separate new config because we can modify the number of
> decompressor per core in the future.
> I don't want to create new config SQUASHFS_DECOMP_MULTI_3,
> SQUASHFS_DECOMP_MULTI_4 and so on. :)

You misunderstand

I have been sent two multi-threaded implementations in the past which
use percpu variables:

1. First patch set:

   http://www.spinics.net/lists/linux-fsdevel/msg34365.html

   Later in early 2011, I explained why I'd not merged the
   patches, and promised to do so when I got time

   http://www.spinics.net/lists/linux-fsdevel/msg42392.html

2. Second patch set sent in 2011

   http://www.spinics.net/lists/linux-fsdevel/msg44111.html

So, these patches have been in my inbox, waiting until I got
time to refactor Squashfs so that they could be merged... and
I finally got to do this last month, which is why I'm merging
a combined version of both patches now.

As to why have *two* implementations, I previously explained these
two approaches are complementary, and merging both allows the
user to decide which method of parallelising Squashfs they want
to do.

The percpu implementation is a good approach to parallelising
Squashfs.  It is extremely simple, both in code and overhead.
The decompression hotpath simply consists of taking a percpu
variable, doing the decompression, and then a release.

Looking at code sizes:

 fs/squashfs/decompressor_multi.c        |  199 +++
 fs/squashfs/decompressor_multi_percpu.c |  104
 fs/squashfs/decompressor_single.c       |   85 +

The simplicity of the percpu approach is readily apparent, at 104
lines it is only slightly larger than the single threaded
implementation.

Personally I like both approaches, and I have no reason not to
merge both implementations I have been sent.

But what does the community think here?  Do you want the percpu
implementation?  Do you see value in having two implementations?
Feedback is appreciated.

> How about this?
>
> 1. Let's make CONFIG_DECOMPRESSOR_MAX which could be tuned by admin
>    in Kconfig. default is CPU *2 or CPU, Otherwise, we can put it
>    to sysfs so user can tune it in runtime.
>
> 2. put decompressor shrink logic by slab shrinker so if system has
>    memory pressure, we could catch the event and free some of
>    decompressor but memory pressure is not severe again in the
>    future, we can create new decompressor until reaching threshold
>    user define. We could know system memory is enough by GFP_NOWAIT,
>    not GFP_KERNEL in get_decomp_stream's allocation indirectly.

This adds extra complexity to an implementation already 199 lines long
(as opposed to 104 for the percpu implementation).  The whole point of
the percpu implementation is to add a simple implementation that
may suit many systems.

Phillip

> In short, let's make decompressor_multi as dynamically tuned system
> and user can limit the max.
>
>> Signed-off-by: Phillip Lougher
>> ---
>>  fs/squashfs/Kconfig                     |  57 +
>>  fs/squashfs/Makefile                    |  10 +--
>>  fs/squashfs/decompressor_multi_percpu.c | 105 +++
>>  3 files changed, 152 insertions(+), 20 deletions(-)
>>  create mode 100644 fs/squashfs/decompressor_multi_percpu.c
>>
>> diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
>> index 1c6d340..c92c75f 100644
>> --- a/fs/squashfs/Kconfig
>> +++ b/fs/squashfs/Kconfig
>> @@ -25,6 +25,50 @@ config SQUASHFS
>>
>>  	  If unsure, say N.
>>
>> +choice
>> +	prompt "Decompressor parallelisation options"
>> +	depends on SQUASHFS
>> +	help
>> +	  Squashfs now supports three parallelisation options for
>> +	  decompression.  Each one exhibits various trade-offs between
>> +	  decompression performance and CPU and memory usage.
>> +
>> +	  If in doubt, select "Single threaded compression"
>> +
>> +config SQUASHFS_DECOMP_SINGLE
>> +	bool "Single threaded compression"
>> +	help
>> +	  Traditionally Squashfs has used single-threaded decompression.
>> +	  Only one block (data or metadata) can be decompressed at any
>> +	  one time.  This limits CPU and memory usage to a minimum.
>> +
>> +config SQUASHFS_DECOMP_MULTI
>> +	bool "Use multiple decompressors for parallel I/O"
>> +	help
>> +	  By default Squashfs uses a single decompressor but it gives
>> +	  poor performance on parallel I/O workloads when u
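
The "take a percpu variable, decompress, release" hot path described in the reply above can be sketched in plain userspace C. This is only an illustration under stated assumptions: slot_stream, stream_get(), stream_put() and decompress_on() are hypothetical names, the cpu argument stands in for the id of the executing core, and a memcpy() stands in for the real decompressor. In the kernel the take/release pair would presumably map onto the percpu accessors (for example get_cpu_ptr()/put_cpu_ptr()), but the body of decompressor_multi_percpu.c is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NR_SLOTS 4	/* stands in for the number of cores (assumption) */

/* Hypothetical per-core decompressor state: exactly one per slot. */
struct slot_stream {
	int busy;		/* set while the slot's decompressor is in use */
	unsigned long uses;	/* number of blocks this slot has handled */
};

static struct slot_stream slots[NR_SLOTS];

/* "Take" the stream that belongs to the calling core. */
static struct slot_stream *stream_get(unsigned int cpu)
{
	struct slot_stream *s = &slots[cpu % NR_SLOTS];

	/* One decompressor per core: it must never be taken twice. */
	assert(!s->busy);
	s->busy = 1;
	return s;
}

/* "Release" the stream once decompression has finished. */
static void stream_put(struct slot_stream *s)
{
	assert(s->busy);
	s->busy = 0;
}

/*
 * Hot path: take, decompress, release.  The "decompression" is a plain
 * copy so that the example stays self-contained.
 */
static int decompress_on(unsigned int cpu, const char *in, size_t len,
			 char *out, size_t out_len)
{
	struct slot_stream *s;

	if (len > out_len)
		return -1;

	s = stream_get(cpu);
	memcpy(out, in, len);
	s->uses++;
	stream_put(s);
	return (int)len;
}
```

There is no list, no wait queue and no allocation on this path, which is consistent with the size difference quoted above (104 lines against 199), and the one-slot-per-core layout is also where the "one thread per core" limit comes from.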
[PATCH 4/4] Squashfs: Check stream is not NULL in decompressor_multi.c
Fix static checker complaint that stream is not checked in
squashfs_decompressor_destroy().

Reported-by: Dan Carpenter
Signed-off-by: Phillip Lougher
---
 fs/squashfs/decompressor_multi.c |    7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/squashfs/decompressor_multi.c b/fs/squashfs/decompressor_multi.c
index ae54675..d6008a6 100644
--- a/fs/squashfs/decompressor_multi.c
+++ b/fs/squashfs/decompressor_multi.c
@@ -119,11 +119,10 @@ void squashfs_decompressor_destroy(struct squashfs_sb_info *msblk)
 			kfree(decomp_strm);
 			stream->avail_decomp--;
 		}
+		WARN_ON(stream->avail_decomp);
+		kfree(stream->comp_opts);
+		kfree(stream);
 	}
-
-	WARN_ON(stream->avail_decomp);
-	kfree(stream->comp_opts);
-	kfree(stream);
 }
--
1.7.10.4
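
The effect of the hunk above is that every dereference of stream ends up inside the NULL check. A minimal userspace sketch of that shape follows. The types decomp_node, stream and sb_info and the function stream_destroy() are hypothetical stand-ins rather than the kernel structures, free() stands in for kfree(), and assert() stands in for WARN_ON().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the structures touched by the patch. */
struct decomp_node {
	struct decomp_node *next;
};

struct stream {
	struct decomp_node *list;	/* idle decompressors */
	int avail_decomp;		/* decompressors still allocated */
	void *comp_opts;
};

struct sb_info {
	struct stream *stream;		/* NULL if setup never completed */
};

/* Returns the number of decompressors freed so the effect can be checked. */
static int stream_destroy(struct sb_info *sbi)
{
	struct stream *stream = sbi->stream;
	int freed = 0;

	if (stream) {
		while (stream->list) {
			struct decomp_node *n = stream->list;

			stream->list = n->next;
			free(n);
			stream->avail_decomp--;
			freed++;
		}
		/* All uses of stream stay inside the NULL check. */
		assert(stream->avail_decomp == 0);
		free(stream->comp_opts);
		free(stream);
		sbi->stream = NULL;
	}
	return freed;
}
```

Calling stream_destroy() on an sb_info whose stream is NULL is then a no-op, which is the case the static checker was concerned about.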
[PATCH 3/4] Squashfs: Directly decompress into the page cache for file data (V3)
This introduces an implementation of squashfs_readpage_block()
that directly decompresses into the page cache.

This uses the previously added page handler abstraction to push
down the necessary kmap_atomic/kunmap_atomic operations on the
page cache buffers into the decompressors.  This enables
direct copying into the page cache without using the slow
kmap/kunmap calls.

The code detects when multiple threads are racing in
squashfs_readpage() to decompress the same block, and avoids
this regression by falling back to using an intermediate
buffer.

This patch enhances the performance of Squashfs significantly
when multiple processes are accessing the filesystem simultaneously
because it not only reduces memcopying, but it more importantly
eliminates the lock contention on the intermediate buffer.

Using single-thread decompression.

	dd if=file1 of=/dev/null bs=4096 &
	dd if=file2 of=/dev/null bs=4096 &
	dd if=file3 of=/dev/null bs=4096 &
	dd if=file4 of=/dev/null bs=4096

Before:

629145600 bytes (629 MB) copied, 45.8046 s, 13.7 MB/s

After:

629145600 bytes (629 MB) copied, 9.29414 s, 67.7 MB/s

V3:
  * Kconfig: fix mempcpy typo
  * squashfs_readpage_block(): Do not zero pages on decompress error
  * squashfs_read_cache(): free page in caller not here
  * page_actor.c: add comment regarding sleeping
  * page_actor.c: optimise code slightly

Signed-off-by: Phillip Lougher
---
 fs/squashfs/Kconfig       |   28
 fs/squashfs/Makefile      |    4 +-
 fs/squashfs/file_direct.c |  173 +
 fs/squashfs/page_actor.c  |  100 ++
 fs/squashfs/page_actor.h  |   32 +
 5 files changed, 336 insertions(+), 1 deletion(-)
 create mode 100644 fs/squashfs/file_direct.c
 create mode 100644 fs/squashfs/page_actor.c

diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
index c92c75f..e88fc37 100644
--- a/fs/squashfs/Kconfig
+++ b/fs/squashfs/Kconfig
@@ -26,6 +26,34 @@ config SQUASHFS
 	  If unsure, say N.

 choice
+	prompt "File decompression options"
+	depends on SQUASHFS
+	help
+	  Squashfs now supports two options for decompressing file
+	  data.  Traditionally Squashfs has decompressed into an
+	  intermediate buffer and then memcopied it into the page cache.
+	  Squashfs now supports the ability to decompress directly into
+	  the page cache.
+
+	  If unsure, select "Decompress file data into an intermediate buffer"
+
+config SQUASHFS_FILE_CACHE
+	bool "Decompress file data into an intermediate buffer"
+	help
+	  Decompress file data into an intermediate buffer and then
+	  memcopy it into the page cache.
+
+config SQUASHFS_FILE_DIRECT
+	bool "Decompress files directly into the page cache"
+	help
+	  Directly decompress file data into the page cache.
+	  Doing so can significantly improve performance because
+	  it eliminates a memcpy and it also removes the lock contention
+	  on the single buffer.
+
+endchoice
+
+choice
 	prompt "Decompressor parallelisation options"
 	depends on SQUASHFS
 	help
diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
index e01ba11..4132520 100644
--- a/fs/squashfs/Makefile
+++ b/fs/squashfs/Makefile
@@ -4,7 +4,9 @@

 obj-$(CONFIG_SQUASHFS) += squashfs.o
 squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o
-squashfs-y += namei.o super.o symlink.o decompressor.o file_cache.o
+squashfs-y += namei.o super.o symlink.o decompressor.o
+squashfs-$(CONFIG_SQUASHFS_FILE_CACHE) += file_cache.o
+squashfs-$(CONFIG_SQUASHFS_FILE_DIRECT) += file_direct.o page_actor.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_SINGLE) += decompressor_single.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI) += decompressor_multi.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU) += decompressor_multi_percpu.o
diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
new file mode 100644
index 000..2943b2b
--- /dev/null
+++ b/fs/squashfs/file_direct.c
@@ -0,0 +1,173 @@
+/*
+ * Copyright (c) 2013
+ * Phillip Lougher
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "squashfs_fs.h"
+#include "squashfs_fs_sb.h"
+#include "squashfs_fs_i.h"
+#include "squashfs.h"
+#include "page_actor.h"
+
+static int squashfs_read_cache(struct page *target_page, u64 block, int bsize,
+	int pages, struct page **page);
+
+/* Read separately compressed datablock directly into page cache */
+int squashfs_readpage_block(struct page *target_page, u64 block, int bsize)
+
+{
+	struct inode *inode = target_page->mapping->host;
+	struct squashfs_sb_info *msblk = inode->
[PATCH 1/4] Squashfs: Refactor decompressor interface and code (V3)
The decompressor interface and code was written from
the point of view of single-threaded operation.  In doing
so it mixed a lot of single-threaded implementation specific
aspects into the decompressor code and elsewhere which makes it
difficult to seamlessly support multiple different decompressor
implementations.

This patch does the following:

1.  It removes compressor_options parsing from the decompressor
    init() function.  This allows the decompressor init() function
    to be dynamically called to instantiate multiple decompressors,
    without the compressor options needing to be read and parsed each
    time.

2.  It moves threading and all sleeping operations out of the
    decompressors.  In doing so, it makes the decompressors
    non-blocking wrappers which only deal with interfacing with
    the decompressor implementation.

3.  It splits decompressor.[ch] into decompressor generic functions
    in decompressor.[ch], and moves the single threaded
    decompressor implementation into decompressor_single.c.

The result of this patch is Squashfs should now be able to
support multiple decompressors by adding new decompressor_xxx.c
files with specialised implementations of the functions in
decompressor_single.c

V3:
  * decompressor_single.c: Remove kfree(comp_opts) in error path
  * of squashfs_decompressor_create()
  * Double free of comp_opts found by static analysis, reported by
  * Dan Carpenter

Signed-off-by: Phillip Lougher
---
 fs/squashfs/Makefile              |    2 +-
 fs/squashfs/block.c               |   11 +++--
 fs/squashfs/decompressor.c        |   47 +---
 fs/squashfs/decompressor.h        |   21 +++--
 fs/squashfs/decompressor_single.c |   86 +++
 fs/squashfs/lzo_wrapper.c         |   24 +++---
 fs/squashfs/squashfs.h            |    9 +++-
 fs/squashfs/squashfs_fs_sb.h      |    3 +-
 fs/squashfs/super.c               |   10 ++---
 fs/squashfs/xz_wrapper.c          |   89 -
 fs/squashfs/zlib_wrapper.c        |   50 +++--
 11 files changed, 216 insertions(+), 136 deletions(-)
 create mode 100644 fs/squashfs/decompressor_single.c

diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
index 110b047..c223c84 100644
--- a/fs/squashfs/Makefile
+++ b/fs/squashfs/Makefile
@@ -4,7 +4,7 @@

 obj-$(CONFIG_SQUASHFS) += squashfs.o
 squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o
-squashfs-y += namei.o super.o symlink.o decompressor.o
+squashfs-y += namei.o super.o symlink.o decompressor.o decompressor_single.o
 squashfs-$(CONFIG_SQUASHFS_XATTR) += xattr.o xattr_id.o
 squashfs-$(CONFIG_SQUASHFS_LZO) += lzo_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_XZ) += xz_wrapper.o
diff --git a/fs/squashfs/block.c b/fs/squashfs/block.c
index 41d108e..4dd4025 100644
--- a/fs/squashfs/block.c
+++ b/fs/squashfs/block.c
@@ -93,7 +93,7 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 	struct buffer_head **bh;
 	int offset = index & ((1 << msblk->devblksize_log2) - 1);
 	u64 cur_index = index >> msblk->devblksize_log2;
-	int bytes, compressed, b = 0, k = 0, page = 0, avail;
+	int bytes, compressed, b = 0, k = 0, page = 0, avail, i;

 	bh = kcalloc(((srclength + msblk->devblksize - 1)
 		>> msblk->devblksize_log2) + 1, sizeof(*bh), GFP_KERNEL);
@@ -158,6 +158,12 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 		ll_rw_block(READ, b - 1, bh + 1);
 	}

+	for (i = 0; i < b; i++) {
+		wait_on_buffer(bh[i]);
+		if (!buffer_uptodate(bh[i]))
+			goto block_release;
+	}
+
 	if (compressed) {
 		length = squashfs_decompress(msblk, buffer, bh, b, offset,
 			length, srclength, pages);
@@ -172,9 +178,6 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 		for (bytes = length; k < b; k++) {
 			in = min(bytes, msblk->devblksize - offset);
 			bytes -= in;
-			wait_on_buffer(bh[k]);
-			if (!buffer_uptodate(bh[k]))
-				goto block_release;
 			while (in) {
 				if (pg_offset == PAGE_CACHE_SIZE) {
 					page++;
diff --git a/fs/squashfs/decompressor.c b/fs/squashfs/decompressor.c
index 3f6271d..234291f 100644
--- a/fs/squashfs/decompressor.c
+++ b/fs/squashfs/decompressor.c
@@ -37,29 +37,29 @@
  */

 static const struct squashfs_decompressor squashfs_lzma_unsupported_comp_ops = {
-	NULL, NULL, NULL, LZMA_COMPRESSION, "lzma", 0
+	NULL, NULL, NULL, NULL, LZMA_COMPRESSION, "lzma", 0
 };

 #ifndef CONFIG_SQUASHFS_LZO
 static const struct squashfs_decompressor squashfs_lzo_c
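
The block.c hunks above rely on some shift-and-mask arithmetic and on the new "wait for every buffer first" loop. The sketch below restates both in userspace C. The helper names blk_offset(), blk_number(), bh_slots() and all_uptodate() are made up for illustration, and the reading of the trailing "+ 1" in the kcalloc() size as room for a read that does not start on a device block boundary is an interpretation, not something the patch states.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of a filesystem position within its device block. */
static int blk_offset(uint64_t index, int devblksize_log2)
{
	return (int)(index & ((1ULL << devblksize_log2) - 1));
}

/* Number of the device block that contains the position. */
static uint64_t blk_number(uint64_t index, int devblksize_log2)
{
	return index >> devblksize_log2;
}

/*
 * Buffer-head slots to allocate for srclength bytes: round up to whole
 * device blocks, then add one (presumably for an unaligned start).
 */
static int bh_slots(int srclength, int devblksize_log2)
{
	int devblksize;

	assert(srclength >= 0);
	assert(devblksize_log2 > 0 && devblksize_log2 < 31);
	devblksize = 1 << devblksize_log2;
	return ((srclength + devblksize - 1) >> devblksize_log2) + 1;
}

/*
 * Mirror of the added loop: check every buffer before any decompression
 * starts, so the decompressor itself never has to wait or sleep.
 */
static int all_uptodate(const int *uptodate, int b)
{
	for (int i = 0; i < b; i++)
		if (!uptodate[i])
			return 0;
	return 1;
}
```

With 1024-byte device blocks (devblksize_log2 = 10), position 4660 lies 564 bytes into block 4, and an 8192-byte source length reserves 9 slots.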
[PATCH 2/4] Squashfs: Generalise paging handling in the decompressors (V3)
Further generalise the decompressors by adding a page handler
abstraction.  This adds helpers to allow the decompressors to
access and process the output buffers in an implementation
independent manner.

This allows different types of output buffer to be passed to the
decompressors, with the implementation specific aspects handled at
decompression time, but without the knowledge being held in the
decompressor wrapper code.

This will allow the decompressors to handle Squashfs cache buffers,
and page cache pages.

This patch adds the abstraction and an implementation for the caches.

V3: slightly optimise code in page_actor.h

Signed-off-by: Phillip Lougher
---
 fs/squashfs/block.c                     |   27 +
 fs/squashfs/cache.c                     |   28 ++
 fs/squashfs/decompressor.c              |   14 +++--
 fs/squashfs/decompressor.h              |    5 ++--
 fs/squashfs/decompressor_multi.c        |    7 ++---
 fs/squashfs/decompressor_multi_percpu.c |    9 +++---
 fs/squashfs/decompressor_single.c       |    9 +++---
 fs/squashfs/lzo_wrapper.c               |   27 +++--
 fs/squashfs/page_actor.h                |   49 +++
 fs/squashfs/squashfs.h                  |    8 ++---
 fs/squashfs/squashfs_fs_sb.h            |    1 +
 fs/squashfs/xz_wrapper.c                |   22 --
 fs/squashfs/zlib_wrapper.c              |   24 +--
 13 files changed, 163 insertions(+), 67 deletions(-)
 create mode 100644 fs/squashfs/page_actor.h

diff --git a/fs/squashfs/block.c b/fs/squashfs/block.c
index 4dd4025..0cea9b9 100644
--- a/fs/squashfs/block.c
+++ b/fs/squashfs/block.c
@@ -36,6 +36,7 @@
 #include "squashfs_fs_sb.h"
 #include "squashfs.h"
 #include "decompressor.h"
+#include "page_actor.h"

 /*
  * Read the metadata block length, this is stored in the first two
@@ -86,16 +87,16 @@ static struct buffer_head *get_block_length(struct super_block *sb,
  * generated a larger block - this does occasionally happen with compression
  * algorithms).
  */
-int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
-			int length, u64 *next_index, int srclength, int pages)
+int squashfs_read_data(struct super_block *sb, u64 index, int length,
+		u64 *next_index, struct squashfs_page_actor *output)
 {
 	struct squashfs_sb_info *msblk = sb->s_fs_info;
 	struct buffer_head **bh;
 	int offset = index & ((1 << msblk->devblksize_log2) - 1);
 	u64 cur_index = index >> msblk->devblksize_log2;
-	int bytes, compressed, b = 0, k = 0, page = 0, avail, i;
+	int bytes, compressed, b = 0, k = 0, avail, i;

-	bh = kcalloc(((srclength + msblk->devblksize - 1)
+	bh = kcalloc(((output->length + msblk->devblksize - 1)
 		>> msblk->devblksize_log2) + 1, sizeof(*bh), GFP_KERNEL);
 	if (bh == NULL)
 		return -ENOMEM;
@@ -111,9 +112,9 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 			*next_index = index + length;

 		TRACE("Block @ 0x%llx, %scompressed size %d, src size %d\n",
-			index, compressed ? "" : "un", length, srclength);
+			index, compressed ? "" : "un", length, output->length);

-		if (length < 0 || length > srclength ||
+		if (length < 0 || length > output->length ||
 				(index + length) > msblk->bytes_used)
 			goto read_failure;

@@ -145,7 +146,7 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 		TRACE("Block @ 0x%llx, %scompressed size %d\n", index,
 				compressed ? "" : "un", length);

-		if (length < 0 || length > srclength ||
+		if (length < 0 || length > output->length ||
 				(index + length) > msblk->bytes_used)
 			goto block_release;

@@ -165,8 +166,8 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 	}

 	if (compressed) {
-		length = squashfs_decompress(msblk, buffer, bh, b, offset,
-			length, srclength, pages);
+		length = squashfs_decompress(msblk, bh, b, offset, length,
+			output);
 		if (length < 0)
 			goto read_failure;
 	} else {
@@ -174,19 +175,20 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 		 * Block is uncompressed.
 		 */
 		int in, pg_offset = 0;
+		void *data = squash
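
The page handler idea in this patch (the caller passes an opaque "actor" instead of a buffer array plus a page count, and the consumer just asks for the first and then the next output page) can be sketched in userspace C as follows. The patch itself names struct squashfs_page_actor and squashfs_first_page(); everything in the sketch, including page_actor, actor_first_page(), actor_next_page() and actor_write(), is a hypothetical cut-down analogue and not the contents of page_actor.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cut-down actor: walks an array of equally sized buffers. */
struct page_actor {
	void **page;		/* output buffers */
	int pages;		/* number of buffers */
	int next_page;		/* index of the buffer handed out next */
	size_t page_size;	/* size of each buffer in bytes */
};

/* Restart the walk and return the first buffer (NULL if there is none). */
static void *actor_first_page(struct page_actor *a)
{
	a->next_page = 1;
	return a->pages > 0 ? a->page[0] : NULL;
}

/* Return the next buffer, or NULL once the buffers are exhausted. */
static void *actor_next_page(struct page_actor *a)
{
	return a->next_page < a->pages ? a->page[a->next_page++] : NULL;
}

/*
 * Consumer side: copy len bytes into the output through the actor only.
 * Returns the number of bytes written, which is less than len when the
 * actor runs out of buffers.
 */
static size_t actor_write(struct page_actor *a, const char *src, size_t len)
{
	size_t done = 0;
	char *dst = actor_first_page(a);

	assert(a->page_size > 0);
	while (dst && done < len) {
		size_t left = len - done;
		size_t n = left < a->page_size ? left : a->page_size;

		memcpy(dst, src + done, n);
		done += n;
		dst = actor_next_page(a);
	}
	return done;
}
```

Because actor_write() never indexes the buffer array itself, a different actor (for instance one that maps page cache pages on demand) could be substituted without touching the consumer, which is the point the commit message makes about keeping that knowledge out of the decompressor wrapper code.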
[PATCH 0/4] Squashfs: performance improvement patches (V3)
Hi,

This is a partial patchset of V3 patches following review by myself
and others.  Changes are described in the individual patches.  There
is one additional patch which fixes a static checker error.

Squashfs: Refactor decompressor interface and code (V3)
Squashfs: Generalise paging handling in the decompressors (V3)
Squashfs: Directly decompress into the page cache for file data (V3)
Squashfs: Check stream is not NULL in decompressor_multi.c
[PATCH 2/4] Squashfs: Generalise paging handling in the decompressors (V3)
Further generalise the decompressors by adding a page handler
abstraction. This adds helpers to allow the decompressors to
access and process the output buffers in an implementation-independent
manner.

This allows different types of output buffer to be passed to the
decompressors, with the implementation-specific aspects handled at
decompression time, but without the knowledge being held in the
decompressor wrapper code.

This will allow the decompressors to handle Squashfs cache buffers,
and page cache pages.

This patch adds the abstraction and an implementation for the caches.

V3: slightly optimise code in page_actor.h

Signed-off-by: Phillip Lougher <phil...@squashfs.org.uk>
---
 fs/squashfs/block.c                     | 27 +
 fs/squashfs/cache.c                     | 28 ++
 fs/squashfs/decompressor.c              | 14 +++--
 fs/squashfs/decompressor.h              |  5 ++--
 fs/squashfs/decompressor_multi.c        |  7 ++---
 fs/squashfs/decompressor_multi_percpu.c |  9 +++---
 fs/squashfs/decompressor_single.c       |  9 +++---
 fs/squashfs/lzo_wrapper.c               | 27 +++--
 fs/squashfs/page_actor.h                | 49 +++
 fs/squashfs/squashfs.h                  |  8 ++---
 fs/squashfs/squashfs_fs_sb.h            |  1 +
 fs/squashfs/xz_wrapper.c                | 22 --
 fs/squashfs/zlib_wrapper.c              | 24 +--
 13 files changed, 163 insertions(+), 67 deletions(-)
 create mode 100644 fs/squashfs/page_actor.h

diff --git a/fs/squashfs/block.c b/fs/squashfs/block.c
index 4dd4025..0cea9b9 100644
--- a/fs/squashfs/block.c
+++ b/fs/squashfs/block.c
@@ -36,6 +36,7 @@
 #include "squashfs_fs_sb.h"
 #include "squashfs.h"
 #include "decompressor.h"
+#include "page_actor.h"

 /*
  * Read the metadata block length, this is stored in the first two
@@ -86,16 +87,16 @@ static struct buffer_head *get_block_length(struct super_block *sb,
  * generated a larger block - this does occasionally happen with compression
  * algorithms).
  */
-int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
-			int length, u64 *next_index, int srclength, int pages)
+int squashfs_read_data(struct super_block *sb, u64 index, int length,
+		u64 *next_index, struct squashfs_page_actor *output)
 {
 	struct squashfs_sb_info *msblk = sb->s_fs_info;
 	struct buffer_head **bh;
 	int offset = index & ((1 << msblk->devblksize_log2) - 1);
 	u64 cur_index = index >> msblk->devblksize_log2;
-	int bytes, compressed, b = 0, k = 0, page = 0, avail, i;
+	int bytes, compressed, b = 0, k = 0, avail, i;

-	bh = kcalloc(((srclength + msblk->devblksize - 1)
+	bh = kcalloc(((output->length + msblk->devblksize - 1)
 		>> msblk->devblksize_log2) + 1, sizeof(*bh), GFP_KERNEL);
 	if (bh == NULL)
 		return -ENOMEM;
@@ -111,9 +112,9 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 			*next_index = index + length;

 		TRACE("Block @ 0x%llx, %scompressed size %d, src size %d\n",
-			index, compressed ? "" : "un", length, srclength);
+			index, compressed ? "" : "un", length, output->length);

-		if (length < 0 || length > srclength ||
+		if (length < 0 || length > output->length ||
 				(index + length) > msblk->bytes_used)
 			goto read_failure;

@@ -145,7 +146,7 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 		TRACE("Block @ 0x%llx, %scompressed size %d\n", index,
 				compressed ? "" : "un", length);

-		if (length < 0 || length > srclength ||
+		if (length < 0 || length > output->length ||
 				(index + length) > msblk->bytes_used)
 			goto block_release;

@@ -165,8 +166,8 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 	}

 	if (compressed) {
-		length = squashfs_decompress(msblk, buffer, bh, b, offset,
-			length, srclength, pages);
+		length = squashfs_decompress(msblk, bh, b, offset, length,
+			output);
 		if (length < 0)
 			goto read_failure;
 	} else {
@@ -174,19 +175,20 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 		 * Block is uncompressed.
 		 */
 		int in, pg_offset = 0;
+		void *data = squashfs_first_page(output);

 		for (bytes = length; k < b; k++) {
 			in = min(bytes, msblk->devblksize - offset);
 			bytes -= in;
 			while (in) {
 				if (pg_offset == PAGE_CACHE_SIZE
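
As a rough illustration of the page handler idea described in this patch, a userspace sketch might look like the following. Only squashfs_first_page() and the output->length field are visible in the diff above; the struct layout, the sketch_* names and the sketch_copy_out() helper are assumptions made for illustration and are not taken from page_actor.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for a page actor; not the kernel struct. */
struct sketch_page_actor {
	void **page;	/* array of output buffers ("pages") */
	int pages;	/* number of entries in page[] */
	int length;	/* total output capacity in bytes */
	int next_page;	/* index of the next buffer to hand out */
};

/* Rewind the actor and return the first output buffer. */
static void *sketch_first_page(struct sketch_page_actor *actor)
{
	assert(actor->pages > 0);
	actor->next_page = 1;
	return actor->page[0];
}

/* Return the next output buffer, or NULL once all have been handed out. */
static void *sketch_next_page(struct sketch_page_actor *actor)
{
	if (actor->next_page == actor->pages)
		return NULL;
	return actor->page[actor->next_page++];
}

/*
 * Copy src into the actor's buffers, page_size bytes at a time, the way an
 * uncompressed block could be copied out. The caller never touches the
 * buffer array directly. Returns bytes copied, or -1 if the data does not fit.
 */
static int sketch_copy_out(struct sketch_page_actor *actor, const char *src,
			   int bytes, int page_size)
{
	void *data = sketch_first_page(actor);
	int copied = 0;

	if (bytes > actor->length)
		return -1;

	while (bytes > 0 && data != NULL) {
		int chunk = bytes < page_size ? bytes : page_size;

		memcpy(data, src + copied, chunk);
		copied += chunk;
		bytes -= chunk;
		data = sketch_next_page(actor);
	}
	return bytes == 0 ? copied : -1;
}
```

The point of the indirection is that the copy loop only asks for "the next buffer", so the same loop could be fed cache buffers or page cache pages by swapping the actor.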
[PATCH 4/4] Squashfs: Check stream is not NULL in decompressor_multi.c
Fix static checker complaint that stream is not checked in
squashfs_decompressor_destroy().

Reported-by: Dan Carpenter <dan.carpen...@oracle.com>
Signed-off-by: Phillip Lougher <phil...@squashfs.org.uk>
---
 fs/squashfs/decompressor_multi.c |  7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/squashfs/decompressor_multi.c b/fs/squashfs/decompressor_multi.c
index ae54675..d6008a6 100644
--- a/fs/squashfs/decompressor_multi.c
+++ b/fs/squashfs/decompressor_multi.c
@@ -119,11 +119,10 @@ void squashfs_decompressor_destroy(struct squashfs_sb_info *msblk)
 			kfree(decomp_strm);
 			stream->avail_decomp--;
 		}
+		WARN_ON(stream->avail_decomp);
+		kfree(stream->comp_opts);
+		kfree(stream);
 	}
-
-	WARN_ON(stream->avail_decomp);
-	kfree(stream->comp_opts);
-	kfree(stream);
 }
--
1.7.10.4
[PATCH 3/4] Squashfs: Directly decompress into the page cache for file data (V3)
This introduces an implementation of squashfs_readpage_block()
that directly decompresses into the page cache.

This uses the previously added page handler abstraction to push
down the necessary kmap_atomic/kunmap_atomic operations on the
page cache buffers into the decompressors. This enables
direct copying into the page cache without using the slow
kmap/kunmap calls.

The code detects when multiple threads are racing in
squashfs_readpage() to decompress the same block, and avoids
this regression by falling back to using an intermediate
buffer.

This patch enhances the performance of Squashfs significantly
when multiple processes are accessing the filesystem simultaneously
because it not only reduces memcopying, but it more importantly
eliminates the lock contention on the intermediate buffer.

Using single-thread decompression.

        dd if=file1 of=/dev/null bs=4096 &
        dd if=file2 of=/dev/null bs=4096 &
        dd if=file3 of=/dev/null bs=4096 &
        dd if=file4 of=/dev/null bs=4096

Before:

629145600 bytes (629 MB) copied, 45.8046 s, 13.7 MB/s

After:

629145600 bytes (629 MB) copied, 9.29414 s, 67.7 MB/s

V3:
  * Kconfig: fix mempcpy typo
  * squashfs_readpage_block(): Do not zero pages on decompress error
  * squashfs_read_cache(): free page in caller not here
  * page_actor.c: add comment regarding sleeping
  * page_actor.c: optimise code slightly

Signed-off-by: Phillip Lougher <phil...@squashfs.org.uk>
---
 fs/squashfs/Kconfig       |  28
 fs/squashfs/Makefile      |   4 +-
 fs/squashfs/file_direct.c | 173 +
 fs/squashfs/page_actor.c  | 100 ++
 fs/squashfs/page_actor.h  |  32 +
 5 files changed, 336 insertions(+), 1 deletion(-)
 create mode 100644 fs/squashfs/file_direct.c
 create mode 100644 fs/squashfs/page_actor.c

diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
index c92c75f..e88fc37 100644
--- a/fs/squashfs/Kconfig
+++ b/fs/squashfs/Kconfig
@@ -26,6 +26,34 @@ config SQUASHFS
 	  If unsure, say N.

 choice
+	prompt "File decompression options"
+	depends on SQUASHFS
+	help
+	  Squashfs now supports two options for decompressing file
+	  data. Traditionally Squashfs has decompressed into an
+	  intermediate buffer and then memcopied it into the page cache.
+	  Squashfs now supports the ability to decompress directly into
+	  the page cache.
+
+	  If unsure, select "Decompress file data into an intermediate buffer"
+
+config SQUASHFS_FILE_CACHE
+	bool "Decompress file data into an intermediate buffer"
+	help
+	  Decompress file data into an intermediate buffer and then
+	  memcopy it into the page cache.
+
+config SQUASHFS_FILE_DIRECT
+	bool "Decompress files directly into the page cache"
+	help
+	  Directly decompress file data into the page cache.
+	  Doing so can significantly improve performance because
+	  it eliminates a memcpy and it also removes the lock contention
+	  on the single buffer.
+
+endchoice
+
+choice
 	prompt "Decompressor parallelisation options"
 	depends on SQUASHFS
 	help
diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
index e01ba11..4132520 100644
--- a/fs/squashfs/Makefile
+++ b/fs/squashfs/Makefile
@@ -4,7 +4,9 @@
 obj-$(CONFIG_SQUASHFS) += squashfs.o
 squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o
-squashfs-y += namei.o super.o symlink.o decompressor.o file_cache.o
+squashfs-y += namei.o super.o symlink.o decompressor.o
+squashfs-$(CONFIG_SQUASHFS_FILE_CACHE) += file_cache.o
+squashfs-$(CONFIG_SQUASHFS_FILE_DIRECT) += file_direct.o page_actor.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_SINGLE) += decompressor_single.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI) += decompressor_multi.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU) += decompressor_multi_percpu.o
diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
new file mode 100644
index 000..2943b2b
--- /dev/null
+++ b/fs/squashfs/file_direct.c
@@ -0,0 +1,173 @@
+/*
+ * Copyright (c) 2013
+ * Phillip Lougher <phil...@squashfs.org.uk>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ */
+
+#include <linux/fs.h>
+#include <linux/vfs.h>
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/string.h>
+#include <linux/pagemap.h>
+#include <linux/mutex.h>
+
+#include "squashfs_fs.h"
+#include "squashfs_fs_sb.h"
+#include "squashfs_fs_i.h"
+#include "squashfs.h"
+#include "page_actor.h"
+
+static int squashfs_read_cache(struct page *target_page, u64 block, int bsize,
+	int pages, struct page **page);
+
+/* Read separately compressed datablock directly into page cache */
+int squashfs_readpage_block(struct page *target_page, u64 block, int bsize)
+
+{
+	struct inode *inode = target_page->mapping->host;
+	struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info
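
The before/after figures quoted in this patch description can be cross-checked with a few lines of arithmetic. The sketch below is not part of the patch; the helper names are made up, and it assumes dd's convention of 1 MB = 10^6 bytes.

```c
#include <assert.h>

/* Throughput in MB/s, with MB = 10^6 bytes as dd reports it. */
static double throughput_mb_per_s(double bytes, double seconds)
{
	assert(seconds > 0.0);
	return bytes / seconds / 1e6;
}

/* How many times faster the "after" run completed. */
static double speedup(double before_seconds, double after_seconds)
{
	assert(after_seconds > 0.0);
	return before_seconds / after_seconds;
}
```

With the quoted numbers, 629145600 bytes in 45.8046 s is roughly 13.7 MB/s, the same amount in 9.29414 s is roughly 67.7 MB/s, and the ratio of the two run times is a little over 4.9.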
[PATCH 4/6] Squashfs: Generalise paging handling in the decompressors (V2)
Further generalise the decompressors by adding a page handler
abstraction. This adds helpers to allow the decompressors to
access and process the output buffers in an implementation-independent
manner.

This allows different types of output buffer to be passed to the
decompressors, with the implementation-specific aspects handled at
decompression time, but without the knowledge being held in the
decompressor wrapper code.

This will allow the decompressors to handle Squashfs cache buffers,
and page cache pages.

This patch adds the abstraction and an implementation for the caches.

V2: also update the code in decompressor_multi*.c

Signed-off-by: Phillip Lougher
---
 fs/squashfs/block.c                     | 27 +---
 fs/squashfs/cache.c                     | 28 +---
 fs/squashfs/decompressor.c              | 14 ++--
 fs/squashfs/decompressor.h              |  5 +--
 fs/squashfs/decompressor_multi.c        |  7 ++--
 fs/squashfs/decompressor_multi_percpu.c |  9 +++---
 fs/squashfs/decompressor_single.c       |  9 +++---
 fs/squashfs/lzo_wrapper.c               | 27 ++--
 fs/squashfs/page_actor.h                | 54 +++
 fs/squashfs/squashfs.h                  |  8 ++---
 fs/squashfs/squashfs_fs_sb.h            |  1 +
 fs/squashfs/xz_wrapper.c                | 22 +++--
 fs/squashfs/zlib_wrapper.c              | 24 --
 13 files changed, 168 insertions(+), 67 deletions(-)
 create mode 100644 fs/squashfs/page_actor.h

diff --git a/fs/squashfs/block.c b/fs/squashfs/block.c
index 4dd4025..0cea9b9 100644
--- a/fs/squashfs/block.c
+++ b/fs/squashfs/block.c
@@ -36,6 +36,7 @@
 #include "squashfs_fs_sb.h"
 #include "squashfs.h"
 #include "decompressor.h"
+#include "page_actor.h"

 /*
  * Read the metadata block length, this is stored in the first two
@@ -86,16 +87,16 @@ static struct buffer_head *get_block_length(struct super_block *sb,
  * generated a larger block - this does occasionally happen with compression
  * algorithms).
  */
-int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
-			int length, u64 *next_index, int srclength, int pages)
+int squashfs_read_data(struct super_block *sb, u64 index, int length,
+		u64 *next_index, struct squashfs_page_actor *output)
 {
 	struct squashfs_sb_info *msblk = sb->s_fs_info;
 	struct buffer_head **bh;
 	int offset = index & ((1 << msblk->devblksize_log2) - 1);
 	u64 cur_index = index >> msblk->devblksize_log2;
-	int bytes, compressed, b = 0, k = 0, page = 0, avail, i;
+	int bytes, compressed, b = 0, k = 0, avail, i;

-	bh = kcalloc(((srclength + msblk->devblksize - 1)
+	bh = kcalloc(((output->length + msblk->devblksize - 1)
 		>> msblk->devblksize_log2) + 1, sizeof(*bh), GFP_KERNEL);
 	if (bh == NULL)
 		return -ENOMEM;
@@ -111,9 +112,9 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 			*next_index = index + length;

 		TRACE("Block @ 0x%llx, %scompressed size %d, src size %d\n",
-			index, compressed ? "" : "un", length, srclength);
+			index, compressed ? "" : "un", length, output->length);

-		if (length < 0 || length > srclength ||
+		if (length < 0 || length > output->length ||
 				(index + length) > msblk->bytes_used)
 			goto read_failure;

@@ -145,7 +146,7 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 		TRACE("Block @ 0x%llx, %scompressed size %d\n", index,
 				compressed ? "" : "un", length);

-		if (length < 0 || length > srclength ||
+		if (length < 0 || length > output->length ||
 				(index + length) > msblk->bytes_used)
 			goto block_release;

@@ -165,8 +166,8 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 	}

 	if (compressed) {
-		length = squashfs_decompress(msblk, buffer, bh, b, offset,
-			length, srclength, pages);
+		length = squashfs_decompress(msblk, bh, b, offset, length,
+			output);
 		if (length < 0)
 			goto read_failure;
 	} else {
@@ -174,19 +175,20 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 		 * Block is uncompressed.
 		 */
 		int in, pg_offset = 0;
+		void *data = squash
[PATCH 6/6] Squashfs: Directly decompress into the page cache for file data (V2)
This introduces an implementation of squashfs_readpage_block()
that directly decompresses into the page cache.

This uses the previously added page handler abstraction to push
down the necessary kmap_atomic/kunmap_atomic operations on the
page cache buffers into the decompressors. This enables
direct copying into the page cache without using the slow
kmap/kunmap calls.

The code detects when multiple threads are racing in
squashfs_readpage() to decompress the same block, and avoids
this regression by falling back to using an intermediate
buffer.

This patch enhances the performance of Squashfs significantly
when multiple processes are accessing the filesystem simultaneously
because it not only reduces memcopying, but it more importantly
eliminates the lock contention on the intermediate buffer.

Using single-thread decompression.

        dd if=file1 of=/dev/null bs=4096 &
        dd if=file2 of=/dev/null bs=4096 &
        dd if=file3 of=/dev/null bs=4096 &
        dd if=file4 of=/dev/null bs=4096

Before:

629145600 bytes (629 MB) copied, 45.8046 s, 13.7 MB/s

After:

629145600 bytes (629 MB) copied, 9.29414 s, 67.7 MB/s

V2:
  * update comment adding failure to grab pages could be
    because we've been VM reclaimed, but the other pages are
    still in the page cache and uptodate.
  * Make Kconfig option a choice, making the either-other nature of
    the option more explicit, and also tidying up the ifdef in the
    Makefile

Signed-off-by: Phillip Lougher
---
 fs/squashfs/Kconfig       |  28 +++
 fs/squashfs/Makefile      |   4 +-
 fs/squashfs/file_direct.c | 178 +
 fs/squashfs/page_actor.c  | 104 ++
 fs/squashfs/page_actor.h  |  32
 5 files changed, 345 insertions(+), 1 deletion(-)
 create mode 100644 fs/squashfs/file_direct.c
 create mode 100644 fs/squashfs/page_actor.c

diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
index c92c75f..3a21adf 100644
--- a/fs/squashfs/Kconfig
+++ b/fs/squashfs/Kconfig
@@ -26,6 +26,34 @@ config SQUASHFS
 	  If unsure, say N.

 choice
+	prompt "File decompression options"
+	depends on SQUASHFS
+	help
+	  Squashfs now supports two options for decompressing file
+	  data. Traditionally Squashfs has decompressed into an
+	  intermediate buffer and then memcopied it into the page cache.
+	  Squashfs now supports the ability to decompress directly into
+	  the page cache.
+
+	  If unsure, select "Decompress file data into an intermediate buffer"
+
+config SQUASHFS_FILE_CACHE
+	bool "Decompress file data into an intermediate buffer"
+	help
+	  Decompress file data into an intermediate buffer and then
+	  memcopy it into the page cache.
+
+config SQUASHFS_FILE_DIRECT
+	bool "Decompress files directly into the page cache"
+	help
+	  Directly decompress file data into the page cache.
+	  Doing so can significantly improve performance because
+	  it eliminates a mempcpy and it also removes the lock contention
+	  on the single buffer.
+
+endchoice
+
+choice
 	prompt "Decompressor parallelisation options"
 	depends on SQUASHFS
 	help
diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
index 908c0d9..4132520 100644
--- a/fs/squashfs/Makefile
+++ b/fs/squashfs/Makefile
@@ -4,7 +4,9 @@
 obj-$(CONFIG_SQUASHFS) += squashfs.o
 squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o
-squashfs-y += namei.o super.o symlink.o decompressor.o file_cache.c
+squashfs-y += namei.o super.o symlink.o decompressor.o
+squashfs-$(CONFIG_SQUASHFS_FILE_CACHE) += file_cache.o
+squashfs-$(CONFIG_SQUASHFS_FILE_DIRECT) += file_direct.o page_actor.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_SINGLE) += decompressor_single.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI) += decompressor_multi.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU) += decompressor_multi_percpu.o
diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
new file mode 100644
index 000..d020d94
--- /dev/null
+++ b/fs/squashfs/file_direct.c
@@ -0,0 +1,178 @@
+/*
+ * Copyright (c) 2013
+ * Phillip Lougher
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "squashfs_fs.h"
+#include "squashfs_fs_sb.h"
+#include "squashfs_fs_i.h"
+#include "squashfs.h"
+#include "page_actor.h"
+
+static int squashfs_read_cache(struct page *target_page, u64 block, int bsize,
+	int pages, struct page **page);
+
+/* Read separately compressed datablock directly into page cache */
+int squashfs_readpage_block(struct page *target_page, u64 block, int bsize)
+
+{
+	struct inode *inode = target_page->mapping-
[PATCH 1/6] Squashfs: Refactor decompressor interface and code (V2)
The decompressor interface and code were written from
the point of view of single-threaded operation. In doing
so it mixed a lot of single-threaded implementation-specific
aspects into the decompressor code and elsewhere, which makes it
difficult to seamlessly support multiple different decompressor
implementations.

This patch does the following:

1. It removes compressor_options parsing from the decompressor
   init() function. This allows the decompressor init() function
   to be dynamically called to instantiate multiple decompressors,
   without the compressor options needing to be read and parsed each
   time.

2. It moves threading and all sleeping operations out of the
   decompressors. In doing so, it makes the decompressors
   non-blocking wrappers which only deal with interfacing with
   the decompressor implementation.

3. It splits decompressor.[ch] into decompressor generic functions
   in decompressor.[ch], and moves the single threaded
   decompressor implementation into decompressor_single.c.

The result of this patch is that Squashfs should now be able to
support multiple decompressors by adding new decompressor_xxx.c
files with specialised implementations of the functions in
decompressor_single.c

V2:
  * decompressor_single.c: Shorten header comment by removing GPL
    licence text
  * decompressor_single.c: Remove comp_opts from struct squashfs_stream,
    it's not needed in this implementation
  * Fix checkpatch.pl errors

Signed-off-by: Phillip Lougher
Reviewed-by: Minchan Kim
---
 fs/squashfs/Makefile              |  2 +-
 fs/squashfs/block.c               | 11 +++--
 fs/squashfs/decompressor.c        | 47 +---
 fs/squashfs/decompressor.h        | 21 +++--
 fs/squashfs/decompressor_single.c | 86 +++
 fs/squashfs/lzo_wrapper.c         | 24 +++---
 fs/squashfs/squashfs.h            |  9 +++-
 fs/squashfs/squashfs_fs_sb.h      |  3 +-
 fs/squashfs/super.c               | 10 ++---
 fs/squashfs/xz_wrapper.c          | 89 -
 fs/squashfs/zlib_wrapper.c        | 50 +++--
 11 files changed, 216 insertions(+), 136 deletions(-)
 create mode 100644 fs/squashfs/decompressor_single.c

diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
index 110b047..c223c84 100644
--- a/fs/squashfs/Makefile
+++ b/fs/squashfs/Makefile
@@ -4,7 +4,7 @@
 obj-$(CONFIG_SQUASHFS) += squashfs.o
 squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o
-squashfs-y += namei.o super.o symlink.o decompressor.o
+squashfs-y += namei.o super.o symlink.o decompressor.o decompressor_single.o
 squashfs-$(CONFIG_SQUASHFS_XATTR) += xattr.o xattr_id.o
 squashfs-$(CONFIG_SQUASHFS_LZO) += lzo_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_XZ) += xz_wrapper.o
diff --git a/fs/squashfs/block.c b/fs/squashfs/block.c
index 41d108e..4dd4025 100644
--- a/fs/squashfs/block.c
+++ b/fs/squashfs/block.c
@@ -93,7 +93,7 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 	struct buffer_head **bh;
 	int offset = index & ((1 << msblk->devblksize_log2) - 1);
 	u64 cur_index = index >> msblk->devblksize_log2;
-	int bytes, compressed, b = 0, k = 0, page = 0, avail;
+	int bytes, compressed, b = 0, k = 0, page = 0, avail, i;

 	bh = kcalloc(((srclength + msblk->devblksize - 1)
 		>> msblk->devblksize_log2) + 1, sizeof(*bh), GFP_KERNEL);
@@ -158,6 +158,12 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 		ll_rw_block(READ, b - 1, bh + 1);
 	}

+	for (i = 0; i < b; i++) {
+		wait_on_buffer(bh[i]);
+		if (!buffer_uptodate(bh[i]))
+			goto block_release;
+	}
+
 	if (compressed) {
 		length = squashfs_decompress(msblk, buffer, bh, b, offset,
 			length, srclength, pages);
@@ -172,9 +178,6 @@ int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
 		for (bytes = length; k < b; k++) {
 			in = min(bytes, msblk->devblksize - offset);
 			bytes -= in;
-			wait_on_buffer(bh[k]);
-			if (!buffer_uptodate(bh[k]))
-				goto block_release;
 			while (in) {
 				if (pg_offset == PAGE_CACHE_SIZE) {
 					page++;
diff --git a/fs/squashfs/decompressor.c b/fs/squashfs/decompressor.c
index 3f6271d..234291f 100644
--- a/fs/squashfs/decompressor.c
+++ b/fs/squashfs/decompressor.c
@@ -37,29 +37,29 @@
  */

 static const struct squashfs_decompressor squashfs_lzma_unsupported_comp_ops = {
-	NULL, NULL, NULL, LZMA_COMPRESSION, "lzma", 0
+	NULL, NULL, NULL, NULL, LZMA_COMPRESSION, "lzma", 0
 };

 #ifndef
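
The first point above, parsing the compressor options once and keeping init() free to be called repeatedly, can be pictured with a small userspace sketch. Everything here is hypothetical: the sketch_decompressor layout, the "store" pass-through codec and its option format are invented for illustration and do not reproduce the fields of struct squashfs_decompressor.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical ops table: option parsing is separate from stream creation. */
struct sketch_decompressor {
	void *(*init)(void *comp_opts);			/* may be called many times */
	void *(*comp_opts)(const void *raw, int len);	/* parse options once */
	void (*free)(void *stream);
	int (*decompress)(void *stream, const char *in, int in_len,
			  char *out, int out_len);
	const char *name;
};

struct store_opts { int max_len; };
struct store_stream { int max_len; };

/* Parse the (made-up) option blob: a single int, else a default limit. */
static void *store_comp_opts(const void *raw, int len)
{
	struct store_opts *opts = malloc(sizeof(*opts));

	if (opts == NULL)
		return NULL;
	opts->max_len = 4096;
	if (raw != NULL && len == (int)sizeof(int))
		memcpy(&opts->max_len, raw, sizeof(int));
	return opts;
}

/* Create one stream from already-parsed options. */
static void *store_init(void *comp_opts)
{
	const struct store_opts *opts = comp_opts;
	struct store_stream *strm = malloc(sizeof(*strm));

	if (strm != NULL)
		strm->max_len = opts->max_len;
	return strm;
}

static void store_free(void *stream)
{
	free(stream);
}

/* Pass-through "decompression": copy input to output, no sleeping. */
static int store_decompress(void *stream, const char *in, int in_len,
			    char *out, int out_len)
{
	const struct store_stream *strm = stream;

	if (in_len > out_len || in_len > strm->max_len)
		return -1;
	memcpy(out, in, in_len);
	return in_len;
}

static const struct sketch_decompressor store_ops = {
	store_init, store_comp_opts, store_free, store_decompress, "store"
};
```

Because the options are parsed up front, a caller can create as many streams as it likes from the same parsed options, which is what a multi-stream implementation would need.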
[PATCH 2/6] Squashfs: enhance parallel I/O
From: Minchan Kim

Squashfs currently uses only one stream buffer for decompression,
which hurts parallel read performance. This patch adds support for
multiple decompressors to improve parallel I/O performance.

Test: four 1G files read with dd on a KVM machine which has 2 CPUs
and 4G of memory.

dd if=test/test1.dat of=/dev/null &
dd if=test/test2.dat of=/dev/null &
dd if=test/test3.dat of=/dev/null &
dd if=test/test4.dat of=/dev/null &

old : 1m39s -> new : 9s

* From v1
  * Change comp_strm with decomp_strm - Phillip
  * Change/add comments - Phillip

Signed-off-by: Minchan Kim
Signed-off-by: Phillip Lougher
---
 fs/squashfs/Kconfig              |  13 +++
 fs/squashfs/Makefile             |   9 +-
 fs/squashfs/decompressor_multi.c | 200 ++
 3 files changed, 221 insertions(+), 1 deletion(-)
 create mode 100644 fs/squashfs/decompressor_multi.c

diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
index c70111e..1c6d340 100644
--- a/fs/squashfs/Kconfig
+++ b/fs/squashfs/Kconfig
@@ -63,6 +63,19 @@ config SQUASHFS_LZO

 	  If unsure, say N.

+config SQUASHFS_MULTI_DECOMPRESSOR
+	bool "Use multiple decompressors for handling parallel I/O"
+	depends on SQUASHFS
+	help
+	  By default Squashfs uses a single decompressor but it gives
+	  poor performance on parallel I/O workloads when using multiple CPU
+	  machines due to waiting on decompressor availability.
+
+	  If you have a parallel I/O workload and your system has enough memory,
+	  using this option may improve overall I/O performance.
+
+	  If unsure, say N.
+
 config SQUASHFS_XZ
 	bool "Include support for XZ compressed file systems"
 	depends on SQUASHFS
diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
index c223c84..dfebc3b 100644
--- a/fs/squashfs/Makefile
+++ b/fs/squashfs/Makefile
@@ -4,8 +4,15 @@
 obj-$(CONFIG_SQUASHFS) += squashfs.o
 squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o
-squashfs-y += namei.o super.o symlink.o decompressor.o decompressor_single.o
+squashfs-y += namei.o super.o symlink.o decompressor.o
+
 squashfs-$(CONFIG_SQUASHFS_XATTR) += xattr.o xattr_id.o
 squashfs-$(CONFIG_SQUASHFS_LZO) += lzo_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_XZ) += xz_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_ZLIB) += zlib_wrapper.o
+
+ifdef CONFIG_SQUASHFS_MULTI_DECOMPRESSOR
+	squashfs-y		+= decompressor_multi.o
+else
+	squashfs-y		+= decompressor_single.o
+endif
diff --git a/fs/squashfs/decompressor_multi.c b/fs/squashfs/decompressor_multi.c
new file mode 100644
index 000..462731d
--- /dev/null
+++ b/fs/squashfs/decompressor_multi.c
@@ -0,0 +1,200 @@
+/*
+ *  Copyright (c) 2013
+ *  Minchan Kim
+ *
+ *  This work is licensed under the terms of the GNU GPL, version 2. See
+ *  the COPYING file in the top-level directory.
+ */
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "squashfs_fs.h"
+#include "squashfs_fs_sb.h"
+#include "decompressor.h"
+#include "squashfs.h"
+
+/*
+ * This file implements multi-threaded decompression in the
+ * decompressor framework
+ */
+
+
+/*
+ * The reason that multiply two is that a CPU can request new I/O
+ * while it is waiting previous request.
+ */
+#define MAX_DECOMPRESSOR	(num_online_cpus() * 2)
+
+
+int squashfs_max_decompressors(void)
+{
+	return MAX_DECOMPRESSOR;
+}
+
+
+struct squashfs_stream {
+	void			*comp_opts;
+	struct list_head	strm_list;
+	struct mutex		mutex;
+	int			avail_decomp;
+	wait_queue_head_t	wait;
+};
+
+
+struct decomp_stream {
+	void *stream;
+	struct list_head list;
+};
+
+
+static void put_decomp_stream(struct decomp_stream *decomp_strm,
+				struct squashfs_stream *stream)
+{
+	mutex_lock(&stream->mutex);
+	list_add(&decomp_strm->list, &stream->strm_list);
+	mutex_unlock(&stream->mutex);
+	wake_up(&stream->wait);
+}
+
+void *squashfs_decompressor_create(struct squashfs_sb_info *msblk,
+				void *comp_opts)
+{
+	struct squashfs_stream *stream;
+	struct decomp_stream *decomp_strm = NULL;
+	int err = -ENOMEM;
+
+	stream = kzalloc(sizeof(*stream), GFP_KERNEL);
+	if (!stream)
+		goto out;
+
+	stream->comp_opts = comp_opts;
+	mutex_init(&stream->mutex);
+	INIT_LIST_HEAD(&stream->strm_list);
+	init_waitqueue_head(&stream->wait);
+
+	/*
+	 * We should have a decompressor at least as default
+	 * so if we fail to allocate new decompressor dynamically,
+	 * we could always fall back to default decompressor and
+	 * file system works.
+	 */
+	decomp_strm = kmalloc(sizeof(*decomp_strm), GFP_KERNEL);
+	i
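
The pooling scheme that this patch describes (a mutex-protected free list, a wait queue, and a cap on the number of streams) can be approximated in userspace with POSIX threads. The sketch below is an assumption-laden illustration, not the kernel code: the sketch_* and pool_* names are invented, pthread primitives stand in for the kernel mutex and wait queue, and the cap is passed in by the caller rather than derived from num_online_cpus().

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-stream state; a real stream would hold codec workspace. */
struct sketch_stream {
	struct sketch_stream *next;
	int id;
};

struct sketch_pool {
	pthread_mutex_t mutex;
	pthread_cond_t wait;
	struct sketch_stream *free_list;	/* idle streams */
	int avail;				/* streams allocated so far */
	int max;				/* upper bound on streams */
};

static void pool_init(struct sketch_pool *pool, int max)
{
	pthread_mutex_init(&pool->mutex, NULL);
	pthread_cond_init(&pool->wait, NULL);
	pool->free_list = NULL;
	pool->avail = 0;
	pool->max = max;
}

/*
 * Take an idle stream if there is one, otherwise allocate a new one while
 * under the cap, otherwise sleep until another thread returns a stream.
 * Returns NULL only if no stream exists and allocation fails.
 */
static struct sketch_stream *pool_get(struct sketch_pool *pool)
{
	struct sketch_stream *strm = NULL;

	pthread_mutex_lock(&pool->mutex);
	for (;;) {
		if (pool->free_list != NULL) {
			strm = pool->free_list;
			pool->free_list = strm->next;
			break;
		}
		if (pool->avail < pool->max) {
			strm = malloc(sizeof(*strm));
			if (strm != NULL) {
				strm->id = pool->avail++;
				break;
			}
			if (pool->avail == 0)
				break;	/* nothing to wait for */
		}
		pthread_cond_wait(&pool->wait, &pool->mutex);
	}
	if (strm != NULL)
		strm->next = NULL;
	pthread_mutex_unlock(&pool->mutex);
	return strm;
}

/* Return a stream to the free list and wake one waiter. */
static void pool_put(struct sketch_pool *pool, struct sketch_stream *strm)
{
	pthread_mutex_lock(&pool->mutex);
	strm->next = pool->free_list;
	pool->free_list = strm;
	pthread_mutex_unlock(&pool->mutex);
	pthread_cond_signal(&pool->wait);
}
```

A caller mirroring the patch's policy would pass twice the number of online CPUs as max, for example a value obtained from sysconf(_SC_NPROCESSORS_ONLN) on Linux.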
[PATCH 3/6] Squashfs: add multi-threaded decompression using percpu variables
Add a multi-threaded decompression implementation which uses
percpu variables.

Using percpu variables has advantages and disadvantages over
implementations which do not use percpu variables.

Advantages: the nature of percpu variables ensures decompression is
load-balanced across the multiple cores.

Disadvantages: it limits decompression to one thread per core.

Signed-off-by: Phillip Lougher
---
 fs/squashfs/Kconfig                     |  57 +
 fs/squashfs/Makefile                    |  10 +--
 fs/squashfs/decompressor_multi_percpu.c | 105 +++
 3 files changed, 152 insertions(+), 20 deletions(-)
 create mode 100644 fs/squashfs/decompressor_multi_percpu.c

diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
index 1c6d340..c92c75f 100644
--- a/fs/squashfs/Kconfig
+++ b/fs/squashfs/Kconfig
@@ -25,6 +25,50 @@ config SQUASHFS

 	  If unsure, say N.

+choice
+	prompt "Decompressor parallelisation options"
+	depends on SQUASHFS
+	help
+	  Squashfs now supports three parallelisation options for
+	  decompression. Each one exhibits various trade-offs between
+	  decompression performance and CPU and memory usage.
+
+	  If in doubt, select "Single threaded compression"
+
+config SQUASHFS_DECOMP_SINGLE
+	bool "Single threaded compression"
+	help
+	  Traditionally Squashfs has used single-threaded decompression.
+	  Only one block (data or metadata) can be decompressed at any
+	  one time. This limits CPU and memory usage to a minimum.
+
+config SQUASHFS_DECOMP_MULTI
+	bool "Use multiple decompressors for parallel I/O"
+	help
+	  By default Squashfs uses a single decompressor but it gives
+	  poor performance on parallel I/O workloads when using multiple CPU
+	  machines due to waiting on decompressor availability.
+
+	  If you have a parallel I/O workload and your system has enough memory,
+	  using this option may improve overall I/O performance.
+
+	  This decompressor implementation uses up to two parallel
+	  decompressors per core. It dynamically allocates decompressors
+	  on a demand basis.
+
+config SQUASHFS_DECOMP_MULTI_PERCPU
+	bool "Use percpu multiple decompressors for parallel I/O"
+	help
+	  By default Squashfs uses a single decompressor but it gives
+	  poor performance on parallel I/O workloads when using multiple CPU
+	  machines due to waiting on decompressor availability.
+
+	  This decompressor implementation uses a maximum of one
+	  decompressor per core. It uses percpu variables to ensure
+	  decompression is load-balanced across the cores.
+
+endchoice
+
 config SQUASHFS_XATTR
 	bool "Squashfs XATTR support"
 	depends on SQUASHFS
@@ -63,19 +107,6 @@ config SQUASHFS_LZO

 	  If unsure, say N.

-config SQUASHFS_MULTI_DECOMPRESSOR
-	bool "Use multiple decompressors for handling parallel I/O"
-	depends on SQUASHFS
-	help
-	  By default Squashfs uses a single decompressor but it gives
-	  poor performance on parallel I/O workloads when using multiple CPU
-	  machines due to waiting on decompressor availability.
-
-	  If you have a parallel I/O workload and your system has enough memory,
-	  using this option may improve overall I/O performance.
-
-	  If unsure, say N.
-
 config SQUASHFS_XZ
 	bool "Include support for XZ compressed file systems"
 	depends on SQUASHFS
diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
index dfebc3b..5833b96 100644
--- a/fs/squashfs/Makefile
+++ b/fs/squashfs/Makefile
@@ -5,14 +5,10 @@
 obj-$(CONFIG_SQUASHFS) += squashfs.o
 squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o
 squashfs-y += namei.o super.o symlink.o decompressor.o
-
+squashfs-$(CONFIG_SQUASHFS_DECOMP_SINGLE) += decompressor_single.o
+squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI) += decompressor_multi.o
+squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU) += decompressor_multi_percpu.o
 squashfs-$(CONFIG_SQUASHFS_XATTR) += xattr.o xattr_id.o
 squashfs-$(CONFIG_SQUASHFS_LZO) += lzo_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_XZ) += xz_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_ZLIB) += zlib_wrapper.o
-
-ifdef CONFIG_SQUASHFS_MULTI_DECOMPRESSOR
-	squashfs-y		+= decompressor_multi.o
-else
-	squashfs-y		+= decompressor_single.o
-endif
diff --git a/fs/squashfs/decompressor_multi_percpu.c b/fs/squashfs/decompressor_multi_percpu.c
new file mode 100644
index 000..b5598ab
--- /dev/null
+++ b/fs/squashfs/decompressor_multi_percpu.c
@@ -0,0 +1,105 @@
+/*
+ * Copyright (c) 2013
+ * Phillip Lougher
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ */
+
+#include
+#include
+#includ
[PATCH 5/6] Squashfs: restructure squashfs_readpage()
Restructure squashfs_readpage() splitting it into separate functions for
datablocks, fragments and sparse blocks.

Move the memcpying (from squashfs cache entry) implementation of
squashfs_readpage_block into file_cache.c

This allows different implementations to be supported.

Signed-off-by: Phillip Lougher <phil...@squashfs.org.uk>
---
 fs/squashfs/Makefile     |    2 +-
 fs/squashfs/file.c       |  142 +++---
 fs/squashfs/file_cache.c |   38 +
 fs/squashfs/squashfs.h   |    7 +++
 4 files changed, 118 insertions(+), 71 deletions(-)
 create mode 100644 fs/squashfs/file_cache.c

diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
index 5833b96..908c0d9 100644
--- a/fs/squashfs/Makefile
+++ b/fs/squashfs/Makefile
@@ -4,7 +4,7 @@
 
 obj-$(CONFIG_SQUASHFS) += squashfs.o
 squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o
-squashfs-y += namei.o super.o symlink.o decompressor.o
+squashfs-y += namei.o super.o symlink.o decompressor.o file_cache.c
 squashfs-$(CONFIG_SQUASHFS_DECOMP_SINGLE) += decompressor_single.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI) += decompressor_multi.o
 squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU) += decompressor_multi_percpu.o
diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
index 8ca62c2..e5c9689 100644
--- a/fs/squashfs/file.c
+++ b/fs/squashfs/file.c
@@ -370,77 +370,15 @@ static int read_blocklist(struct inode *inode, int index, u64 *block)
 	return le32_to_cpu(size);
 }
 
-
-static int squashfs_readpage(struct file *file, struct page *page)
+/* Copy data into page cache */
+void squashfs_copy_cache(struct page *page, struct squashfs_cache_entry *buffer,
+	int bytes, int offset)
 {
 	struct inode *inode = page->mapping->host;
 	struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
-	int bytes, i, offset = 0, sparse = 0;
-	struct squashfs_cache_entry *buffer = NULL;
 	void *pageaddr;
-
-	int mask = (1 << (msblk->block_log - PAGE_CACHE_SHIFT)) - 1;
-	int index = page->index >> (msblk->block_log - PAGE_CACHE_SHIFT);
-	int start_index = page->index & ~mask;
-	int end_index = start_index | mask;
-	int file_end = i_size_read(inode) >> msblk->block_log;
-
-	TRACE("Entered squashfs_readpage, page index %lx, start block %llx\n",
-				page->index, squashfs_i(inode)->start);
-
-	if (page->index >= ((i_size_read(inode) + PAGE_CACHE_SIZE - 1) >>
-					PAGE_CACHE_SHIFT))
-		goto out;
-
-	if (index < file_end || squashfs_i(inode)->fragment_block ==
-					SQUASHFS_INVALID_BLK) {
-		/*
-		 * Reading a datablock from disk. Need to read block list
-		 * to get location and block size.
-		 */
-		u64 block = 0;
-		int bsize = read_blocklist(inode, index, &block);
-		if (bsize < 0)
-			goto error_out;
-
-		if (bsize == 0) { /* hole */
-			bytes = index == file_end ?
-				(i_size_read(inode) & (msblk->block_size - 1)) :
-				 msblk->block_size;
-			sparse = 1;
-		} else {
-			/*
-			 * Read and decompress datablock.
-			 */
-			buffer = squashfs_get_datablock(inode->i_sb,
-								block, bsize);
-			if (buffer->error) {
-				ERROR("Unable to read page, block %llx, size %x"
-					"\n", block, bsize);
-				squashfs_cache_put(buffer);
-				goto error_out;
-			}
-			bytes = buffer->length;
-		}
-	} else {
-		/*
-		 * Datablock is stored inside a fragment (tail-end packed
-		 * block).
-		 */
-		buffer = squashfs_get_fragment(inode->i_sb,
-				squashfs_i(inode)->fragment_block,
-				squashfs_i(inode)->fragment_size);
-
-		if (buffer->error) {
-			ERROR("Unable to read page, block %llx, size %x\n",
-				squashfs_i(inode)->fragment_block,
-				squashfs_i(inode)->fragment_size);
-			squashfs_cache_put(buffer);
-			goto error_out;
-		}
-		bytes = i_size_read(inode) & (msblk->block_size - 1);
-		offset = squashfs_i(inode)->fragment_offset;
-	}
+	int i, mask = (1 << (msblk->block_log - PAGE_CACHE_SHIFT)) - 1;
+	int start_index = page->index & ~mask, end_index = start_index | mask;
 
 	/*
 	 * Loop copying datablock
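The mask/shift arithmetic in the hunk above is easier to follow outside the kernel. Below is a minimal userspace sketch of it, not the kernel code itself: the names page_to_block(), struct block_span and SKETCH_PAGE_SHIFT are hypothetical, and a 4 KiB page size (shift of 12) is assumed purely for illustration.

```c
#include <stdint.h>

/* Assumption for illustration only: 4 KiB pages, i.e. a page shift of 12. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical helper type: which datablock a page lives in, and the
 * first/last page index covered by that datablock. */
struct block_span {
	uint64_t block_index;
	uint64_t start_index;
	uint64_t end_index;
};

/*
 * Mirror the arithmetic from the patch: a datablock of size
 * (1 << block_log) spans (1 << (block_log - page shift)) pages, so the
 * low bits of the page index select the page within the block.
 */
static struct block_span page_to_block(uint64_t page_index, unsigned block_log)
{
	unsigned shift = block_log - SKETCH_PAGE_SHIFT;
	uint64_t mask = ((uint64_t)1 << shift) - 1;
	struct block_span span;

	span.block_index = page_index >> shift;    /* index into the block list */
	span.start_index = page_index & ~mask;     /* first page of the block */
	span.end_index = span.start_index | mask;  /* last page of the block */
	return span;
}
```

With 128 KiB datablocks (block_log of 17) each block covers 32 pages, so page 37 falls in block 1, which spans pages 32 to 63. This is the range the copy loop walks when it pushes a decompressed block into the page cache.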
[PATCH 0/6] Squashfs performance improvements
Hi

This patch-set is mainly a compilation of the various Squashfs performance
improvement patches sent to the mailing list over the last month or so.

These patches do the following:

  * Add support for different decompressor implementations
  * Add support for parallel decompression, with or without percpu variables
  * Add support for direct decompression into the page cache, rather than
    using an intermediate buffer.

Patches 1, 4 and 6 are revised V2 versions following review.

Patch 2 is Minchan Kim's multi-threading patch unmodified, it is included
here for completeness.

Patch 3 is new, it is a multi-threaded implementation which uses percpu
variables to do load-balancing across multiple cores, providing one
decompressor per core. This implementation is intended to be complementary
to Minchan Kim's patch.

Each parallelisation implementation exhibits trade-offs between
decompression performance and CPU and memory usage. The multi-threading
implementation without percpu variables offers the ability to do two
simultaneous decompressions per core, which offers maximum performance but
at the expense of very high CPU usage and memory overhead. For many
multi-core embedded systems with weak CPUs such resource use may be
prohibitive. The percpu implementation inherently limits CPU usage to one
decompression per core, and because of the use of percpu variables it
ensures decompression is load balanced too.

Minchan Kim (1):
  Squashfs: enhance parallel I/O

Phillip Lougher (5):
  Squashfs: Refactor decompressor interface and code (V2)
  Squashfs: add multi-threaded decompression using percpu variables
  Squashfs: Generalise paging handling in the decompressors (V2)
  Squashfs: restructure squashfs_readpage()
  Squashfs: Directly decompress into the page cache for file data (V2)

 fs/squashfs/Kconfig                     |   72 +++
 fs/squashfs/Makefile                    |    5 +
 fs/squashfs/block.c                     |   36 +++---
 fs/squashfs/cache.c                     |   28 -
 fs/squashfs/decompressor.c              |   59 ++---
 fs/squashfs/decompressor.h              |   24 ++--
 fs/squashfs/decompressor_multi.c        |  199 +++
 fs/squashfs/decompressor_multi_percpu.c |  104
 fs/squashfs/decompressor_single.c       |   85 +
 fs/squashfs/file.c                      |  142 +++---
 fs/squashfs/file_cache.c                |   38 ++
 fs/squashfs/file_direct.c               |  178 +++
 fs/squashfs/lzo_wrapper.c               |   47
 fs/squashfs/page_actor.c                |  104
 fs/squashfs/page_actor.h                |   86 +
 fs/squashfs/squashfs.h                  |   20 +++-
 fs/squashfs/squashfs_fs_sb.h            |    4 +-
 fs/squashfs/super.c                     |   10 +-
 fs/squashfs/xz_wrapper.c                |  105 +---
 fs/squashfs/zlib_wrapper.c              |   64 --
 20 files changed, 1167 insertions(+), 243 deletions(-)
 create mode 100644 fs/squashfs/decompressor_multi.c
 create mode 100644 fs/squashfs/decompressor_multi_percpu.c
 create mode 100644 fs/squashfs/decompressor_single.c
 create mode 100644 fs/squashfs/file_cache.c
 create mode 100644 fs/squashfs/file_direct.c
 create mode 100644 fs/squashfs/page_actor.c
 create mode 100644 fs/squashfs/page_actor.h

Phillip
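To make the trade-off described in the cover letter concrete, here is a small userspace sketch of the upper bound on simultaneous decompressions under each option. It only encodes the limits stated in the text (one at a time, up to two per core, one per core); the enum and function names are hypothetical and do not come from the patches.

```c
/* Hypothetical labels for the three parallelisation options. */
enum decomp_mode {
	DECOMP_SINGLE,        /* one decompression at any one time */
	DECOMP_MULTI,         /* up to two decompressors per core */
	DECOMP_MULTI_PERCPU   /* at most one decompressor per core */
};

/*
 * Upper bound on simultaneous decompressions for a given option and
 * core count. Each in-flight decompression needs its own decompressor
 * state, so this bound also scales the memory overhead.
 */
static unsigned max_parallel_decompressions(enum decomp_mode mode,
					    unsigned ncores)
{
	switch (mode) {
	case DECOMP_SINGLE:
		return 1;
	case DECOMP_MULTI:
		return 2 * ncores;
	case DECOMP_MULTI_PERCPU:
		return ncores;
	}
	return 1; /* unknown mode: fall back to the most conservative limit */
}
```

On a four-core system this gives limits of 1, 8 and 4 respectively, which is why the non-percpu multi-threaded option costs the most CPU and memory while the percpu option caps usage at one decompression per core.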
[PATCH 3/6] Squashfs: add multi-threaded decompression using percpu variables
Add a multi-threaded decompression implementation which uses percpu
variables.

Using percpu variables has advantages and disadvantages over
implementations which do not use percpu variables.

Advantages: the nature of percpu variables ensures decompression is
load-balanced across the multiple cores.

Disadvantages: it limits decompression to one thread per core.

Signed-off-by: Phillip Lougher <phil...@squashfs.org.uk>
---
 fs/squashfs/Kconfig                     |   57 +
 fs/squashfs/Makefile                    |   10 +--
 fs/squashfs/decompressor_multi_percpu.c |  105 +++
 3 files changed, 152 insertions(+), 20 deletions(-)
 create mode 100644 fs/squashfs/decompressor_multi_percpu.c

diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
index 1c6d340..c92c75f 100644
--- a/fs/squashfs/Kconfig
+++ b/fs/squashfs/Kconfig
@@ -25,6 +25,50 @@ config SQUASHFS
 
 	  If unsure, say N.
 
+choice
+	prompt "Decompressor parallelisation options"
+	depends on SQUASHFS
+	help
+	  Squashfs now supports three parallelisation options for
+	  decompression. Each one exhibits various trade-offs between
+	  decompression performance and CPU and memory usage.
+
+	  If in doubt, select "Single threaded compression"
+
+config SQUASHFS_DECOMP_SINGLE
+	bool "Single threaded compression"
+	help
+	  Traditionally Squashfs has used single-threaded decompression.
+	  Only one block (data or metadata) can be decompressed at any
+	  one time. This limits CPU and memory usage to a minimum.
+
+config SQUASHFS_DECOMP_MULTI
+	bool "Use multiple decompressors for parallel I/O"
+	help
+	  By default Squashfs uses a single decompressor but it gives
+	  poor performance on parallel I/O workloads when using multiple CPU
+	  machines due to waiting on decompressor availability.
+
+	  If you have a parallel I/O workload and your system has enough memory,
+	  using this option may improve overall I/O performance.
+
+	  This decompressor implementation uses up to two parallel
+	  decompressors per core. It dynamically allocates decompressors
+	  on a demand basis.
+
+config SQUASHFS_DECOMP_MULTI_PERCPU
+	bool "Use percpu multiple decompressors for parallel I/O"
+	help
+	  By default Squashfs uses a single decompressor but it gives
+	  poor performance on parallel I/O workloads when using multiple CPU
+	  machines due to waiting on decompressor availability.
+
+	  This decompressor implementation uses a maximum of one
+	  decompressor per core. It uses percpu variables to ensure
+	  decompression is load-balanced across the cores.
+
+endchoice
+
 config SQUASHFS_XATTR
 	bool "Squashfs XATTR support"
 	depends on SQUASHFS
@@ -63,19 +107,6 @@ config SQUASHFS_LZO
 
 	  If unsure, say N.
 
-config SQUASHFS_MULTI_DECOMPRESSOR
-	bool "Use multiple decompressors for handling parallel I/O"
-	depends on SQUASHFS
-	help
-	  By default Squashfs uses a single decompressor but it gives
-	  poor performance on parallel I/O workloads when using multiple CPU
-	  machines due to waiting on decompressor availability.
-
-	  If you have a parallel I/O workload and your system has enough memory,
-	  using this option may improve overall I/O performance.
-
-	  If unsure, say N.
-
 config SQUASHFS_XZ
 	bool "Include support for XZ compressed file systems"
 	depends on SQUASHFS
diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
index dfebc3b..5833b96 100644
--- a/fs/squashfs/Makefile
+++ b/fs/squashfs/Makefile
@@ -5,14 +5,10 @@
 obj-$(CONFIG_SQUASHFS) += squashfs.o
 squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o
 squashfs-y += namei.o super.o symlink.o decompressor.o
-
+squashfs-$(CONFIG_SQUASHFS_DECOMP_SINGLE) += decompressor_single.o
+squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI) += decompressor_multi.o
+squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU) += decompressor_multi_percpu.o
 squashfs-$(CONFIG_SQUASHFS_XATTR) += xattr.o xattr_id.o
 squashfs-$(CONFIG_SQUASHFS_LZO) += lzo_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_XZ) += xz_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_ZLIB) += zlib_wrapper.o
-
-ifdef CONFIG_SQUASHFS_MULTI_DECOMPRESSOR
-	squashfs-y += decompressor_multi.o
-else
-	squashfs-y += decompressor_single.o
-endif
diff --git a/fs/squashfs/decompressor_multi_percpu.c b/fs/squashfs/decompressor_multi_percpu.c
new file mode 100644
index 000..b5598ab
--- /dev/null
+++ b/fs/squashfs/decompressor_multi_percpu.c
@@ -0,0 +1,105 @@
+/*
+ * Copyright (c) 2013
+ * Phillip Lougher <phil...@squashfs.org.uk>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ */
+
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/slab.h>
+#include