[PATCH] Btrfs: fix a bug when llseek for delalloc bytes behind prealloc extents
xfstests case 285 complains. It it because btrfs did not try to find unwritten delalloc bytes(only dirty pages, not yet writeback) behind prealloc extents, it ends up finding nothing while we're with SEEK_DATA. Signed-off-by: Liu Bo bo.li@oracle.com --- fs/btrfs/file.c |9 ++--- fs/btrfs/inode.c | 11 --- 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 1e16b6d..f76b1fd 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2308,9 +2308,12 @@ static int find_desired_extent(struct inode *inode, loff_t *offset, int whence) } } - *offset = start; - free_extent_map(em); - break; + if (!test_bit(EXTENT_FLAG_PREALLOC, + em-flags)) { + *offset = start; + free_extent_map(em); + break; + } } } diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 16d9e8e..7ac41aa 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -5572,10 +5572,13 @@ struct extent_map *btrfs_get_extent_fiemap(struct inode *inode, struct page *pag return em; if (em) { /* -* if our em maps to a hole, there might -* actually be delalloc bytes behind it +* if our em maps to +* - a hole or +* - a pre-alloc extent, +* there might actually be delalloc bytes behind it. */ - if (em-block_start != EXTENT_MAP_HOLE) + if (em-block_start != EXTENT_MAP_HOLE + !test_bit(EXTENT_FLAG_PREALLOC, em-flags)) return em; else hole_em = em; @@ -5657,6 +5660,8 @@ struct extent_map *btrfs_get_extent_fiemap(struct inode *inode, struct page *pag */ em-block_start = hole_em-block_start; em-block_len = hole_len; + if (test_bit(EXTENT_FLAG_PREALLOC, hole_em-flags)) + set_bit(EXTENT_FLAG_PREALLOC, em-flags); } else { em-start = range_start; em-len = found; -- 1.7.7.6 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [bug] oops in linux-3.8-rc2
Hi, On Mon, Jan 07, 2013 at 01:51:21AM +0100, Tom Gundersen wrote: I ran into a bug today with 3.8-rc2 [0], following an make modules_install make install of the kernel. I have been using the same kernel without problems for some days, the only change immediately before the oops was that I switched from plain partitions to using multi-device and subvolumes. [0]: https://plus.google.com/114015603831160344127/posts/fzEYnF64NkZ BUG at mm/page-writeback.c:2164 2160 int clear_page_dirty_for_io(struct page *page) 2161 { 2162 struct address_space *mapping = page_mapping(page); 2163 2164 BUG_ON(!PageLocked(page)); btrfs on the stack is only __btrfs_buffered_write + btrfs_file_aio_write, ie. just a buffered write, originating from pwrite64. The bug_on condition is clear, the question is why the page is not locked. david -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [bug] oops in linux-3.8-rc2
On Mon, Jan 07, 2013 at 04:10:37AM -0700, David Sterba wrote: Hi, On Mon, Jan 07, 2013 at 01:51:21AM +0100, Tom Gundersen wrote: I ran into a bug today with 3.8-rc2 [0], following an make modules_install make install of the kernel. I have been using the same kernel without problems for some days, the only change immediately before the oops was that I switched from plain partitions to using multi-device and subvolumes. [0]: https://plus.google.com/114015603831160344127/posts/fzEYnF64NkZ BUG at mm/page-writeback.c:2164 2160 int clear_page_dirty_for_io(struct page *page) 2161 { 2162 struct address_space *mapping = page_mapping(page); 2163 2164 BUG_ON(!PageLocked(page)); btrfs on the stack is only __btrfs_buffered_write + btrfs_file_aio_write, ie. just a buffered write, originating from pwrite64. The bug_on condition is clear, the question is why the page is not locked. The big question is, can you make it happen again? -chris -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Open for contribution towards xfstests for btrfs
On Mon, Jan 07, 2013 at 03:48:43PM +0530, Kiran Patil wrote: Hello, We have a team of 5 students who would like to contribute to btrfs filesystem testing using xfstests. Is there space for them to contribute? xfstests is lacking of testcases targetting for btrfs send/recieve, you may be interested in it. thanks, liubo If yes, to whom do they need to keep in touch for guidance. Thank you, Kiran Patil (Director) Green Turtles Technologies, INDIA. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Open for contribution towards xfstests for btrfs
On 07.01.2013 14:01, Liu Bo wrote: On Mon, Jan 07, 2013 at 03:48:43PM +0530, Kiran Patil wrote: Hello, We have a team of 5 students who would like to contribute to btrfs filesystem testing using xfstests. Is there space for them to contribute? xfstests is lacking of testcases targetting for btrfs send/recieve, you may be interested in it. There is even a basis for that already, we have a testsuite for the zfs send - btrfs receive feature at git://git.kernel.org/pub/scm/linux/kernel/git/arne/far-progs.git see test.pl. It can also already be used to test btrfs send - btrfs receive. -Arne thanks, liubo If yes, to whom do they need to keep in touch for guidance. Thank you, Kiran Patil (Director) Green Turtles Technologies, INDIA. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Open for contribution towards xfstests for btrfs
On Mon, Jan 07, 2013 at 03:48:43PM +0530, Kiran Patil wrote: We have a team of 5 students who would like to contribute to btrfs filesystem testing using xfstests. By any chance, are these the same people behind http://sourceforge.net/projects/btrfstest/ ? Is there space for them to contribute? Sure. As an example of where to start, look at a some patches from Miao Xie with reproducers for several bugs. This IMHO is a smooth path to start with xfstests, the task is to embody the test itself into the teitesinfrastructure. The more advanced tasks: * look at commits in btrfs and try to write a specific test for them (depends on the nature of the bug, the difficulty ranges from easy to impossible) * write test coverage for a particular feature (eg. for Jeff Liu's label setting patches) * several btrfs specific xfstests were sent to the mailinglists over time, I'm sure not all of them were merged If yes, to whom do they need to keep in touch for guidance. You can learn a lot from existing xfstests and review comments of proposed tests. Feel free to ask in the respective mailinglists or IRC channels. david -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: /boot as a btrfs subvolume
On 01/06/2013 09:00 PM, Chris Murphy wrote: On Jan 6, 2013, at 1:05 PM, Gene Czarcinski g...@czarc.net wrote: On 01/06/2013 02:17 PM, Swâmi Petaramesh wrote: Le 06/01/2013 20:11, Chris Murphy a écrit : If you use UUID, and you use subvol=, and you don't rename/move your subvolume, it's perfectly safe. Nevertheless, GRUB becoming subvolid aware seems like a good idea to me, but I have no idea what's involved in that. I actually run several machines on which I have /boot in a separate BTRFS subvol, without any issue. I have a multiboot between several different distros (typically Ubuntu, Mint, LMDE, Bodhi... All Ubuntu derivatives except for LMDE which is Debian-based...) sharing the same BTRFS container and using different subvols i.e. UBUNTU/@boot, LMDE/@boot etc... Works just great. I assume you have a grub partition (or its equivalent) with a grub.cfg file having menuentry definitions [pointing to the different grub.cfg file for each system ... that seems to work well (at least for me). Currently, os-prober does not support btrfs. I have taken a little look at the grub2 source code and there is some mention of both btrfs and zfs (and also btrfs subvolumes) in the changelogs. However, it is not clear to me (and I have not had the time yet) to explore exactly what the source code is doing or not doing. Well, at least with the f18 version of GRUB 2 2.00, whether alternative Btrfs bootable systems are mounted or not, -mkconfig isn't searching/finding for the /etc/fstab and /etc/default/grub of the other system like it appears to do with other file systems. I don't get any additional entries other than the currently booted Btrfs system. So it looks like I'd need to manually add a configfile menu entry, pointing to each Btrfs bootable system. Chainloading from one grub to another is not useful. Better if they're all on the same GRUB, and use configfile. There appear to be two situations where you have multiple software systems installed on the same hardware (real or virtual) -- root on a btrfs subvolume or /boot installed on an LV. I worked up a patch to os-prober and a small update to grub2 which addresses the root on a btrfs subvolume -- see https://bugzilla.redhat.com/show_bug.cgi?id=888341 I have not looked into the /boot in a LV so it may involve os-prober also and I am also not sure that it will work with /boot on a btrfs subvolume. Since developing those patches, I have had second thoughts about how multiboot should be done and now believe that it should involve a grub partition with a simple grub.cfg file and os-prober disabled. The simple grub.cfg file had menuentry definitions which point to the real grub.cfg file for each system. One of these simple definitions would look like: menuentry F18 on BTRFS { insmod part_msdos insmod btrfs set root='hd0,msdos5' configfile /boot1/grub2/grub.cfg } where /boot is on btrfs subvol boot1 I do not actually use a grub partition but, instead, a minimal system with grub2 and installation into the MBR. Gene -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Btrfs: replace simple_strtoull() with kstrtoull() in btrfs_ioctl_resize()
On Sun, Jan 06, 2013 at 11:53:41AM +0800, Jeff Liu wrote: - devid = simple_strtoull(devstr, end, 10); + ret = kstrtoull(devstr, 10, devid); + if (ret) { + pr_err(btrfs: resizer unable to parse device %s\n, + devstr); Code looks ok, I would prefer the message error text to say: pr_err(btrfs: resize unable to parse device id %s\n, devstr); but that's maybe just me :) david -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Btrfs: fix off-by-one in lseek
On Mon, Jan 07, 2013 at 11:53:08AM +0800, Liu Bo wrote: Lock end is inclusive. Signed-off-by: Liu Bo bo.li@oracle.com --- fs/btrfs/file.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 77061bf..1e16b6d 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2241,6 +2241,7 @@ static int find_desired_extent(struct inode *inode, loff_t *offset, int whence) if (lockend = lockstart) lockend = lockstart + root-sectorsize; + lockend--; len = lockend - lockstart + 1; len = max_t(u64, len, root-sectorsize); Fix looks ok. I think this should be caught at runtime as well, the number of ways how the lock start and end are passed is not small and it need not be always possible to catch it from reading sources. The range is inclusive, so it's 'lock end % 2 == 1' right? (in the bit manipulating primitives). david -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Btrfs: wipe all the superblock [redhat bugzilla 889888]
On Sun, Jan 06, 2013 at 07:28:55PM +0100, Goffredo Baroncelli wrote: Currently wipefs doesn't clear all the superblock of btrfs. Only the first one is cleared. Btrfs has three superblocks. The first one is placed at 64KB, the second one at 64MB, the third one at 256GB. It can have as much as 4 superblock backup copies: Superblock offset 0 is 65536 (0x1, block=16/0x10) Superblock offset 1 is 67108864 (0x400, block=16384/0x4000) Superblock offset 2 is 274877906944 (0x40, block=67108864/0x400) Superblock offset 3 is 1125899906842624 (0x4, block=274877906944/0x40) Superblock offset 4 is 4611686018427387904 (0x4000, block=1125899906842624/0x4) If the first superblock is valid except that the magic field is zeroed, btrfs skips the check of the other superblocks. If the first superblock is fully invalid, btrfs checks for the other superblock. So zeroing the first superblock magic field at the beginning seems that the filesystem is wiped. But when the first superblock is overwritten (e.g. by another filesystem), then the other two superblocks may be considered valid, and the filesystem may resurrect. And for that purpose all the superblock copies should be taken into account, regardless of the tricks that btrfs_mount applies. david -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Open for contribution towards xfstests for btrfs
On Mon, Jan 7, 2013 at 9:28 PM, Eric Sandeen sand...@redhat.com wrote: On 1/7/13 4:18 AM, Kiran Patil wrote: Hello, We have a team of 5 students who would like to contribute to btrfs filesystem testing using xfstests. Is there space for them to contribute? If yes, to whom do they need to keep in touch for guidance. (cc: xfs list) Are you thinking of writing new tests, or doing more formal testing with the existing testsuite? Both would be useful. In particular, any time there's a bugfix, a regression test is helpful. Generic new tests of btrfs functionality would be good, too. I would suggest starting with a couple new tests, and cc: the xfs list for review. In the course of the review you'll get a sense of the various conventions used in xfstests, etc. Simply running xfstests over various btrfs configurations would also be useful, to be sure any regressions caught by existing tests don't go unnoticed. thanks, -Eric Dear Eric and David, Thanks a lot for your inputs. We will work on them and soon we will start our work. Thank you, -Kiran -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [bug] csum mismatches and failed xfstests with 3.8-rc1 and rc2
with top commit 5f243b9b46a22e5790dbbc36f574c2417af49a41 (something post -rc2) I see more checksum errors $ dmesg|grep csum|wc -l 100 and they appeared in a period of like last 30 minutes. previous test rounds were clean, and I can see that the same test sequenece run 3 time in a row with the same mount parameters without a mkfs between runs. Test messages seem to be related to no-space conditions. MKFS_OPTIONS -- /dev/sda9 MOUNT_OPTIONS -- -o space_cache,noatime /dev/sda9 /mnt/a2 091 41s ... [17:18:32] [17:18:33] [failed, exit status 1] - output mismatch (see 091.out.bad) --- 091.out 2011-11-01 10:31:12.0 +0100 +++ 091.out.bad 2013-01-07 17:18:33.0 +0100 @@ -1,7 +1,6 @@ QA output created by 091 fsx -N 1 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -fsx -N 1 -o 8192 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -fsx -N 1 -o 32768 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -fsx -N 1 -o 8192 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -fsx -N 1 -o 32768 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -fsx -N 1 -o 128000 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z -W +./091: line 46: 11355 Segmentation fault $here/ltp/fsx $args $TEST_DIR/junk $seq.full 21 +fsx -N 1 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z -R -W +mapped writes DISABLED +truncating to largest ever: 0x12a00 133 220s ...[17:28:51] [17:31:13] - output mismatch (see 133.out.bad) --- 133.out 2011-02-11 11:42:31.0 +0100 +++ 133.out.bad 2013-01-07 17:31:13.0 +0100 @@ -2,4 +2,5 @@ Buffered writer, buffered reader Direct writer, buffered reader Buffered writer, direct reader +pread64: Input/output error Direct writer, direct reader 240 1s ... [17:40:51] [17:40:51] [failed, exit status 11] - output mismatch (see 240.out.bad) --- 240.out 2011-08-10 17:17:23.0 +0200 +++ 240.out.bad 2013-01-07 17:40:51.0 +0100 @@ -1,2 +1,4 @@ QA output created by 240 Silence is golden. +AIO write offset 33280 expected 4096 got -5 +short read() at offset 29184 247 41s ... [17:42:16] [17:42:36] [failed, exit status 1] - output mismatch (see 247.out.bad) --- 247.out 2011-08-10 17:17:23.0 +0200 +++ 247.out.bad 2013-01-07 17:42:36.0 +0100 @@ -1,2 +1,2 @@ QA output created by 247 -Silence is golden. +Bus error 263 237s ...[17:45:00] [17:45:01] [failed, exit status 1] - output mismatch (see 263.out.bad) --- 263.out 2011-11-01 10:31:12.0 +0100 +++ 263.out.bad 2013-01-07 17:45:01.0 +0100 @@ -1,3 +1,100 @@ QA output created by 263 fsx -N 1 -o 8192 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z -fsx -N 1 -o 128000 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z +fsx -N 1 -o 8192 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z +truncating to largest ever: 0x12a00 +truncating to largest ever: 0x75400 +dowrite: write: Input/output error +LOG DUMP (91 total operations): +1( 1 mod 256): SKIPPED (no operation) +2( 2 mod 256): MAPWRITE 0x62600 thru 0x626e8 (0xe9 bytes) +3( 3 mod 256): FALLOC 0x2e0f2 thru 0x2f438 (0x1346 bytes) INTERIOR +4( 4 mod 256): TRUNCATE DOWN from 0x626e9 to 0x12a00 +5( 5 mod 256): MAPREAD 0x0 thru 0x90e(0x90f bytes) +6( 6 mod 256): FALLOC 0x7048 thru 0x80ca(0x1082 bytes) INTERIOR +7( 7 mod 256): WRITE0x5ea00 thru 0x5f3ff (0xa00 bytes) HOLE +8( 8 mod 256): SKIPPED (no operation) +9( 9 mod 256): FALLOC 0x4957f thru 0x4af1f (0x19a0 bytes) INTERIOR +10( 10 mod 256): SKIPPED (no operation) +11( 11 mod 256): WRITE0x10a00 thru 0x127ff (0x1e00 bytes) +12( 12 mod 256): MAPWRITE 0x53800 thru 0x54384 (0xb85 bytes) +13( 13 mod 256): MAPWRITE 0x5ae00 thru 0x5c8a9 (0x1aaa bytes) +14( 14 mod 256): READ 0x4a000 thru 0x4afff (0x1000 bytes) +15( 15 mod 256): SKIPPED (no operation) +16( 16 mod 256): MAPREAD 0x36000 thru 0x36acf (0xad0 bytes) +17( 17 mod 256): SKIPPED (no operation) +18( 18 mod 256): READ 0x12000 thru 0x12fff (0x1000 bytes) +19( 19 mod 256): MAPWRITE 0x17600 thru 0x17b42 (0x543 bytes) +20( 20 mod 256): READ 0x29000 thru 0x29fff (0x1000 bytes) +21( 21 mod 256): FALLOC 0xea89 thru 0x1023e (0x17b5 bytes) INTERIOR +22( 22 mod 256): FALLOC 0x569aa thru 0x5847f (0x1ad5 bytes) INTERIOR +23( 23 mod 256): WRITE0x35c00 thru 0x369ff (0xe00 bytes) +24( 24 mod 256): SKIPPED (no operation) +25( 25 mod 256): SKIPPED (no operation) +26( 26 mod 256): MAPREAD 0x32000 thru 0x32188 (0x189 bytes) +27( 27 mod 256): MAPREAD 0x3f000 thru 0x40e43 (0x1e44 bytes) +28( 28 mod 256): WRITE0x6f600 thru 0x705ff (0x1000 bytes) HOLE +29( 29 mod 256): MAPREAD 0x5b000 thru 0x5c99e (0x199f bytes) +30( 30 mod 256): TRUNCATE UP from 0x70600 to 0x75400 +31( 31 mod 256): MAPREAD 0x4000 thru 0x54d5 (0x14d6 bytes) +32( 32 mod 256): SKIPPED (no operation) +33( 33 mod 256): FALLOC 0x31d49 thru 0x32872 (0xb29 bytes) INTERIOR +34( 34 mod 256): FALLOC 0x2bbb3 thru 0x2bca7 (0xf4 bytes) INTERIOR +35( 35 mod 256): MAPREAD 0x68000 thru 0x68d8e (0xd8f bytes) +36( 36 mod
Re: [bug] csum mismatches and failed xfstests with 3.8-rc1 and rc2
On Mon, Jan 07, 2013 at 06:03:51PM +0100, David Sterba wrote: with top commit 5f243b9b46a22e5790dbbc36f574c2417af49a41 (something post -rc2) I see more checksum errors $ dmesg|grep csum|wc -l 100 more of dmesg: [15303.739076] btrfs csum failed ino 63791 off 368640 csum 3994424334 private 783210346 [15303.748113] [ cut here ] [15303.752052] kernel BUG at mm/page-writeback.c:2164! [15303.752052] invalid opcode: [#1] SMP [15303.752052] Modules linked in: dm_crypt loop btrfs [15303.752052] CPU 0 [15303.752052] Pid: 11355, comm: fsx Not tainted 3.8.0-rc2-default+ #228 Intel Corporation Santa Rosa platform/Matanzas [15303.752052] RIP: 0010:[81118239] [81118239] clear_page_dirty_for_io+0x119/0x130 [15303.752052] RSP: 0018:88002024fb28 EFLAGS: 00010246 [15303.752052] RAX: 4802 RBX: ea00013c68f0 RCX: [15303.752052] RDX: 0011 RSI: 81151320 RDI: ea00013c68f0 [15303.752052] RBP: 88002024fb38 R08: R09: 88004fa65640 [15303.752052] R10: R11: R12: 88006dd6e780 [15303.752052] R13: R14: 880063f4fac0 R15: 0001 [15303.752052] FS: 7f7067547700() GS:88007d80() knlGS: [15303.752052] CS: 0010 DS: ES: CR0: 8005003b [15303.752052] CR2: 7f7066d2cb10 CR3: 1dafe000 CR4: 07f0 [15303.752052] DR0: DR1: DR2: [15303.752052] DR3: DR6: 0ff0 DR7: 0400 [15303.752052] Process fsx (pid: 11355, threadinfo 88002024e000, task 880058334440) [15303.752052] Stack: [15303.752052] 880063f4fac0 0001 88002024fc08 a0044092 [15303.752052] 88002024fbd0 8850 00022000 1000819572cb [15303.752052] 0005b000 797903b8 0005ae00 [15303.752052] Call Trace: [15303.752052] [a0044092] prepare_pages+0x302/0x3a0 [btrfs] [15303.752052] [a00455ed] __btrfs_buffered_write+0x1ad/0x370 [btrfs] [15303.752052] [a0045cab] btrfs_file_aio_write+0x4fb/0x5e0 [btrfs] [15303.752052] [8115e04a] do_sync_write+0xaa/0xf0 [15303.752052] [8115e74e] vfs_write+0xce/0x190 [15303.752052] [8115eab2] sys_write+0x62/0xb0 [15303.752052] [8139a2de] ? trace_hardirqs_on_thunk+0x3a/0x3f [15303.752052] [81960699] system_call_fastpath+0x16/0x1b [15303.752052] Code: 00 00 48 89 df e8 38 fb ff ff e9 73 ff ff ff 0f 1f 00 48 89 df 57 9d 66 66 90 66 90 e8 51 e5 f8 ff b8 01 00 00 00 e9 3a ff ff ff 0f 0b 49 c7 c4 20 7f e4 81 e9 07 ff ff ff 66 0f 1f 84 00 00 00 [15303.752052] RIP [81118239] clear_page_dirty_for_io+0x119/0x130 [15303.752052] RSP 88002024fb28 [15304.049369] ---[ end trace e69c5ab147a5d842 ]--- [15304.358163] [ cut here ] [15304.364297] WARNING: at fs/btrfs/inode.c:7123 btrfs_destroy_inode+0x23e/0x2e0 [btrfs]() [15304.373804] Hardware name: Santa Rosa platform [15304.373809] Modules linked in: dm_crypt loop btrfs [15304.373813] Pid: 11387, comm: umount Tainted: G D 3.8.0-rc2-default+ #228 [15304.373814] Call Trace: [15304.373822] [8104c6bf] warn_slowpath_common+0x7f/0xc0 [15304.373831] [8104c71a] warn_slowpath_null+0x1a/0x20 [15304.373851] [a00435be] btrfs_destroy_inode+0x23e/0x2e0 [btrfs] [15304.373857] [81178a4c] destroy_inode+0x3c/0x70 [15304.373862] [81178ba6] evict+0x126/0x1c0 [15304.373867] [81178c8f] dispose_list+0x4f/0x60 [15304.373873] [811799f4] evict_inodes+0x114/0x130 [15304.373879] [81160903] generic_shutdown_super+0x53/0xf0 [15304.373886] [81160a36] kill_anon_super+0x16/0x30 [15304.373898] [a0004d9a] btrfs_kill_super+0x1a/0x90 [btrfs] [15304.373903] [81161902] ? deactivate_super+0x42/0x70 [15304.373909] [81160c8d] deactivate_locked_super+0x3d/0x90 [15304.373914] [8116190a] deactivate_super+0x4a/0x70 [15304.373920] [8117dc90] mntput_no_expire+0x100/0x160 [15304.373925] [8117ecd1] sys_umount+0x71/0x3c0 [15304.373931] [81960699] system_call_fastpath+0x16/0x1b [15304.373934] ---[ end trace e69c5ab147a5d843 ]--- [15304.373937] [ cut here ] [15304.373954] WARNING: at fs/btrfs/inode.c:7124 btrfs_destroy_inode+0x2c6/0x2e0 [btrfs]() [15304.373955] Hardware name: Santa Rosa platform [15304.373963] Modules linked in: dm_crypt loop btrfs [15304.373968] Pid: 11387, comm: umount Tainted: G D W 3.8.0-rc2-default+ #228 [15304.373969] Call Trace: [15304.373974] [8104c6bf] warn_slowpath_common+0x7f/0xc0 [15304.373979] [8104c71a] warn_slowpath_null+0x1a/0x20 [15304.373998] [a0043646] btrfs_destroy_inode+0x2c6/0x2e0 [btrfs] [15304.374003] [81178a4c] destroy_inode+0x3c/0x70
Re: Open for contribution towards xfstests for btrfs
On Mon, Jan 7, 2013 at 10:31 PM, Kiran Patil kirantpa...@gmail.com wrote: On Mon, Jan 7, 2013 at 9:28 PM, Eric Sandeen sand...@redhat.com wrote: On 1/7/13 4:18 AM, Kiran Patil wrote: Hello, We have a team of 5 students who would like to contribute to btrfs filesystem testing using xfstests. Is there space for them to contribute? If yes, to whom do they need to keep in touch for guidance. (cc: xfs list) Are you thinking of writing new tests, or doing more formal testing with the existing testsuite? Both would be useful. In particular, any time there's a bugfix, a regression test is helpful. Generic new tests of btrfs functionality would be good, too. I would suggest starting with a couple new tests, and cc: the xfs list for review. In the course of the review you'll get a sense of the various conventions used in xfstests, etc. Simply running xfstests over various btrfs configurations would also be useful, to be sure any regressions caught by existing tests don't go unnoticed. thanks, -Eric Dear Eric and David, Thanks a lot for your inputs. We will work on them and soon we will start our work. Thank you, -Kiran Sorry, I forgot to say thanks to Liu and Arne. Thank you Liu and Arne for your suggestions. -Kiran -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Online Deduplication for Btrfs (Master's thesis)
On Tue, Dec 18, 2012 at 2:31 AM, Chris Mason chris.ma...@fusionio.com wrote: On Mon, Dec 17, 2012 at 06:33:24AM -0700, Alexander Block wrote: I did some research on deduplication in the past and there are some problems that you will face. I'll try to list some of them (for sure not all). Thanks Alexander for writing all of this up. There are a lot of great points here, but I'll summarize with: [ many challenges to online dedup ] [ offline dedup is the best way ] So, the big problem with offline dedup is you're suddenly read bound. I don't disagree that offline makes a lot of the dedup problems easier, and Alexander describes a very interesting system here. I've tried to avoid features that rely on scanning though, just because idle disk time may not really exist. But with scrub, we have the scan as a feature, and it may make a lot of sense to leverage that. online dedup has a different set of tradeoffs, but as Alexander says the hard part really is the data structure to index the hashes. I think there are a few different options here, including changing the file extent pointers to point to a sha instead of a logical disk offset. I am not sure if I am following the approach with replacing the pointers, could you please explain? I agree that the data structure is the key part in designing the feature. So, part of my answer really depends on where you want to go with your thesis. I expect the data structure work for efficient hash lookup is going to be closer to what your course work requires? No, I don't think it is really necessary that the data structure should be close to my course work. What is your opinion on the most suitable data structure to index the hashes? Thanks! Martin -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Open for contribution towards xfstests for btrfs
On 1/7/13 9:31 AM, David Sterba wrote: On Mon, Jan 07, 2013 at 03:48:43PM +0530, Kiran Patil wrote: We have a team of 5 students who would like to contribute to btrfs filesystem testing using xfstests. By any chance, are these the same people behind http://sourceforge.net/projects/btrfstest/ ? If so, you might want to consider that http://www.tuxera.com/community/posix-test-suite/ probably already covers some of the tests that btrfstest is (re)writing. i.e.: +/* Creating Link on non-existing prefix path should return ENOTDIR and +/* Creating link on ENAMETOOLONG if either file name exceeds maximum characters are almost certainly covered in the above posix test suite. FWIW, that testsuite has a few failures on btrfs (not all unique to btrfs, IIRC): Test Summary Report --- /root/pjd-fstest/pjd-fstest-20090130-RC/tests/chown/00.t (Wstat: 0 Tests: 171 Failed: 6) Failed tests: 84, 88, 141, 145, 149, 153 /root/pjd-fstest/pjd-fstest-20090130-RC/tests/truncate/00.t (Wstat: 0 Tests: 21 Failed: 1) Failed test: 15 /root/pjd-fstest/pjd-fstest-20090130-RC/tests/xacl/00.t(Wstat: 0 Tests: 45 Failed: 1) Failed test: 45 -Eric -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Btrfs: wipe all the superblock [redhat bugzilla 889888]
Hi David, On 01/07/2013 05:33 PM, David Sterba wrote: On Sun, Jan 06, 2013 at 07:28:55PM +0100, Goffredo Baroncelli wrote: Currently wipefs doesn't clear all the superblock of btrfs. Only the first one is cleared. Btrfs has three superblocks. The first one is placed at 64KB, the second one at 64MB, the third one at 256GB. It can have as much as 4 superblock backup copies: Superblock offset 0 is 65536 (0x1, block=16/0x10) Superblock offset 1 is 67108864 (0x400, block=16384/0x4000) Superblock offset 2 is 274877906944 (0x40, block=67108864/0x400) Superblock offset 3 is 1125899906842624 (0x4, block=274877906944/0x40) Superblock offset 4 is 4611686018427387904 (0x4000, block=1125899906842624/0x4) Are you sure ? Regarding the btrfs-progs suite, I looked at the btrfs_read_dev_super(): [..] for (i = 0; i BTRFS_SUPER_MIRROR_MAX; i++) { bytenr = btrfs_sb_offset(i); ret = pread64(fd, buf, sizeof(buf), bytenr); Where BTRFS_SUPER_MIRROR_MAX is 3. Regarding the kernel code, I looked at several function which call btrfs_sb_offset(); everywhere there is an upper limit of the superblock numbero which is BTRFS_SUPER_MIRROR_MAX, which is still 3. Moreover I performed the following test: $ ls -lh 7tb-filesystem.img -rw-r--r-- 1 ghigo ghigo 7.1E Jan 7 18:49 7eb-filesystem.img $ /sbin/mkfs.btrfs 7eb-filesystem.img $ cat extract-sign.py import os BTRFS_SUPER_MIRROR_SHIFT = 12 BTRFS_SUPER_INFO_OFFSET = (64*1024) def btrfs_sb_offset(mirror): start = 16*1024 if(mirror): return start (BTRFS_SUPER_MIRROR_SHIFT * mirror) return BTRFS_SUPER_INFO_OFFSET f = open(7eb-filesystem.img,r) for i in range(5): pos = btrfs_sb_offset(i)+64 f.seek(pos) sign = f.read(8) print Superblock #%d - %20d - '%s'%(i,pos,sign) $ python extract-sign.py Superblock #0 -65600 - '_BHRfS_M' Superblock #1 - 67108928 - '_BHRfS_M' Superblock #2 - 274877907008 - '_BHRfS_M' Superblock #3 - 1125899906842688 - '' Superblock #4 - 4611686018427387968 - '' To me it seems that in a 7TB filesystem there is only 3 superblocks. If the first superblock is valid except that the magic field is zeroed, btrfs skips the check of the other superblocks. If the first superblock is fully invalid, btrfs checks for the other superblock. So zeroing the first superblock magic field at the beginning seems that the filesystem is wiped. But when the first superblock is overwritten (e.g. by another filesystem), then the other two superblocks may be considered valid, and the filesystem may resurrect. And for that purpose all the superblock copies should be taken into account, regardless of the tricks that btrfs_mount applies. david -- gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Btrfs: wipe all the superblock [redhat bugzilla 889888]
Hi hugo On 01/07/2013 07:24 PM, Hugo Mills wrote: [...] print Superblock #%d - %20d - '%s'%(i,pos,sign) $ python extract-sign.py Superblock #0 -65600 - '_BHRfS_M' 64 KiB OK, (above) Superblock #1 - 67108928 - '_BHRfS_M' 256 MiB above is 64M, below is 256M ! Superblock #2 - 274877907008 - '_BHRfS_M' 1 TiB Superblock #3 - 1125899906842688 - '' 4 PiB Superblock #4 - 4611686018427387968 - '' 16 EiB Not reachable in linux. There is a VFS limits to 8EB. To me it seems that in a 7TB filesystem there is only 3 superblocks. That would be as expected. How many on a 5 PiB filesystem, though? Or a 20 EiB one? I wrote 7TB, but I meant 8EB. I first tried a test with 7TB and I wrote the email, but then I retested with 8EB and I corrected the email, forgetting some 7t instead of 8e... Anyway, if BTRFS would allow more than three super-block, in the 7TB case, the super-block at 1TB would appeared Hugo. -- gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: /boot as a btrfs subvolume
On Jan 7, 2013, at 8:42 AM, Gene Czarcinski g...@czarc.net wrote: On 01/06/2013 09:00 PM, Chris Murphy wrote: On Jan 6, 2013, at 1:05 PM, Gene Czarcinski g...@czarc.net wrote: On 01/06/2013 02:17 PM, Swâmi Petaramesh wrote: Le 06/01/2013 20:11, Chris Murphy a écrit : If you use UUID, and you use subvol=, and you don't rename/move your subvolume, it's perfectly safe. Nevertheless, GRUB becoming subvolid aware seems like a good idea to me, but I have no idea what's involved in that. I actually run several machines on which I have /boot in a separate BTRFS subvol, without any issue. I have a multiboot between several different distros (typically Ubuntu, Mint, LMDE, Bodhi... All Ubuntu derivatives except for LMDE which is Debian-based...) sharing the same BTRFS container and using different subvols i.e. UBUNTU/@boot, LMDE/@boot etc... Works just great. I assume you have a grub partition (or its equivalent) with a grub.cfg file having menuentry definitions [pointing to the different grub.cfg file for each system ... that seems to work well (at least for me). Currently, os-prober does not support btrfs. I have taken a little look at the grub2 source code and there is some mention of both btrfs and zfs (and also btrfs subvolumes) in the changelogs. However, it is not clear to me (and I have not had the time yet) to explore exactly what the source code is doing or not doing. Well, at least with the f18 version of GRUB 2 2.00, whether alternative Btrfs bootable systems are mounted or not, -mkconfig isn't searching/finding for the /etc/fstab and /etc/default/grub of the other system like it appears to do with other file systems. I don't get any additional entries other than the currently booted Btrfs system. So it looks like I'd need to manually add a configfile menu entry, pointing to each Btrfs bootable system. Chainloading from one grub to another is not useful. Better if they're all on the same GRUB, and use configfile. There appear to be two situations where you have multiple software systems installed on the same hardware (real or virtual) -- root on a btrfs subvolume or /boot installed on an LV. I'm unclear on the use case. If /boot is on its own LV, why not just use ext4? If Btrfs, then I'd expect boot to be included in one Btrfs volume, along with home and rootfs. If that's on an LV, the use cases I'm envisioning for that the boot loader wouldn't be aware of the LV (e.g. VM, iSCSI, etc). Since developing those patches, I have had second thoughts about how multiboot should be done and now believe that it should involve a grub partition with a simple grub.cfg file and os-prober disabled. The simple grub.cfg file had menuentry definitions which point to the real grub.cfg file for each system. It's messy without agreement on how to boot all distributions. I've also thought of this primary/public grub.cfg, and secondary/private grub.cfg. The first forwards to the grub.cfg for each distribution using configfile or legacyconfigfile. However, that primary GRUB instance necessarily would need to contain the entry for Windows, etc. So each distribution's grub-mkconfig would need to know to write out a distribution specific grub.cfg which is based on os-prober disabled; but then also be capable of updating the primary grub.cfg - not merely write out a new one. And then if a distribution is deleted from the system, how is this discovered, and the old entry in the primary grub.cfg removed? Messy. I do not actually use a grub partition but, instead, a minimal system with grub2 and installation into the MBR. grub's core.img can accept a baked in grub.cfg Chris Murphy-- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: /boot as a btrfs subvolume
On Jan 7, 2013, at 10:48 AM, Swâmi Petaramesh sw...@petaramesh.org wrote: Le 06/01/2013 21:05, Gene Czarcinski a écrit : I assume you have a grub partition (or its equivalent) with a grub.cfg file having menuentry definitions [pointing to the different grub.cfg file for each system ... that seems to work well (at least for me). Currently, os-prober does not support btrfs. I don't have any specific grub partition, every Linux install has its own grub install in its own /boot BTRFS subvolume. Grub partition, as I'm taking it, means: MBR gap on MBR layouts; and BIOS Boot partition (GUID 21686148-6449-6E6F-744E-656564454649) on GPT layouts. In any case, MBR gap, dedicated GPT partition, or grub.efi application, it's possible to bake a grub.cfg directly into the core.img. Technically using configfile is not chain loading, you're just having the same instance of GRUB load an alternate config. Chain loading GRUB implies loading a new binary (boot loader) to replace the first instance of GRUB. Using configfile has the caveat that the GRUB using configfile may be of a different version than the GRUB package that produced the grub.cfg being configfile loaded. And as the syntax of grub.cfg has been changing quite a bit it's possible to have related failures. Chris Murphy-- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: /boot as a btrfs subvolume
On 01/07/2013 01:42 PM, Chris Murphy wrote: On Jan 7, 2013, at 8:42 AM, Gene Czarcinski g...@czarc.net wrote: On 01/06/2013 09:00 PM, Chris Murphy wrote: On Jan 6, 2013, at 1:05 PM, Gene Czarcinski g...@czarc.net wrote: On 01/06/2013 02:17 PM, Swâmi Petaramesh wrote: Le 06/01/2013 20:11, Chris Murphy a écrit : If you use UUID, and you use subvol=, and you don't rename/move your subvolume, it's perfectly safe. Nevertheless, GRUB becoming subvolid aware seems like a good idea to me, but I have no idea what's involved in that. I actually run several machines on which I have /boot in a separate BTRFS subvol, without any issue. I have a multiboot between several different distros (typically Ubuntu, Mint, LMDE, Bodhi... All Ubuntu derivatives except for LMDE which is Debian-based...) sharing the same BTRFS container and using different subvols i.e. UBUNTU/@boot, LMDE/@boot etc... Works just great. I assume you have a grub partition (or its equivalent) with a grub.cfg file having menuentry definitions [pointing to the different grub.cfg file for each system ... that seems to work well (at least for me). Currently, os-prober does not support btrfs. I have taken a little look at the grub2 source code and there is some mention of both btrfs and zfs (and also btrfs subvolumes) in the changelogs. However, it is not clear to me (and I have not had the time yet) to explore exactly what the source code is doing or not doing. Well, at least with the f18 version of GRUB 2 2.00, whether alternative Btrfs bootable systems are mounted or not, -mkconfig isn't searching/finding for the /etc/fstab and /etc/default/grub of the other system like it appears to do with other file systems. I don't get any additional entries other than the currently booted Btrfs system. So it looks like I'd need to manually add a configfile menu entry, pointing to each Btrfs bootable system. Chainloading from one grub to another is not useful. Better if they're all on the same GRUB, and use configfile. There appear to be two situations where you have multiple software systems installed on the same hardware (real or virtual) -- root on a btrfs subvolume or /boot installed on an LV. I'm unclear on the use case. If /boot is on its own LV, why not just use ext4? If Btrfs, then I'd expect boot to be included in one Btrfs volume, along with home and rootfs. If that's on an LV, the use cases I'm envisioning for that the boot loader wouldn't be aware of the LV (e.g. VM, iSCSI, etc). Oops. It is me that was not clear. Case 1: You have multiple systems installed on one piece of real or virtual hardware. These could be different releases or even different distributions. At least one of these systems has root installed on a btrfs subvolume. When grub2-mkconfig is run on that system everything is OK. Now you are on a different system and you run grub2-mkconfig with os-prober enabled so it will look for other systems that could be booted. It will not see the system with root on the btrfs subvolume. That is because os-prober does not understand btrfs. Case2: This is similar to the above situation except that you have /boot installed in an LV rather than a regular partition with the LV formated with ext4. Similar to the above, if, from a different installed system you run grub2-mkconfig with os-prober enabled, the system with the /boot in the LV will not be seen. I have not looked into this one so it may be os-prober or grub2 itself. Seeing this convinced me to avoid using os-prober and simply have a boot director with simple menuentry definitions pointing to the various grub.cfg files for each system. Since developing those patches, I have had second thoughts about how multiboot should be done and now believe that it should involve a grub partition with a simple grub.cfg file and os-prober disabled. The simple grub.cfg file had menuentry definitions which point to the real grub.cfg file for each system. It's messy without agreement on how to boot all distributions. I've also thought of this primary/public grub.cfg, and secondary/private grub.cfg. The first forwards to the grub.cfg for each distribution using configfile or legacyconfigfile. However, that primary GRUB instance necessarily would need to contain the entry for Windows, etc. So each distribution's grub-mkconfig would need to know to write out a distribution specific grub.cfg which is based on os-prober disabled; but then also be capable of updating the primary grub.cfg - not merely write out a new one. And then if a distribution is deleted from the system, how is this discovered, and the old entry in the primary grub.cfg removed? Messy. The hardware systems have four or five software systems on each. It is not so much of deleting an distribution as much as just ignoring it. Disk space really is pretty inexpensive these days. I do not actually use a grub partition but, instead, a minimal system with grub2 and
[PATCH] btrfs: add no file data flag to btrfs send ioctl
This patch adds the flag, BTRFS_SEND_FLAG_NO_FILE_DATA to the btrfs send ioctl code. When this flag is set, the btrfs send code will never write file data into the stream (thus also avoiding expensive reads of that data in the first place). BTRFS_SEND_C_UPDATE_EXTENT commands will be sent (instead of BTRFS_SEND_C_WRITE) with an offset, length pair indicating the extent in question. This patch does not affect the operation of BTRFS_SEND_C_CLONE commands - they will continue to be sent when a search finds an appropriate extent to clone from. Signed-off-by: Mark Fasheh mfas...@suse.de --- fs/btrfs/ioctl.h |7 +++ fs/btrfs/send.c | 45 + fs/btrfs/send.h |1 + 3 files changed, 49 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/ioctl.h b/fs/btrfs/ioctl.h index 731e287..1f6cfdd 100644 --- a/fs/btrfs/ioctl.h +++ b/fs/btrfs/ioctl.h @@ -363,6 +363,13 @@ struct btrfs_ioctl_received_subvol_args { __u64 reserved[16]; /* in */ }; +/* + * Caller doesn't want file data in the send stream, even if the + * search of clone sources doesn't find an extent. UPDATE_EXTENT + * commands will be sent instead of WRITE commands. + */ +#define BTRFS_SEND_FLAG_NO_FILE_DATA 0x1 + struct btrfs_ioctl_send_args { __s64 send_fd; /* in */ __u64 clone_sources_count; /* in */ diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index e78b297..d725536 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -85,6 +85,7 @@ struct send_ctx { u32 send_max_size; u64 total_send_size; u64 cmd_send_size[BTRFS_SEND_C_MAX + 1]; + u64 flags; /* 'flags' member of btrfs_ioctl_send_args is u64 */ struct vfsmount *mnt; @@ -3707,6 +3708,39 @@ out: return ret; } +/* + * Send an update extent command to user space. + */ +static int send_update_extent(struct send_ctx *sctx, + u64 offset, u32 len) +{ + int ret = 0; + struct fs_path *p; + + p = fs_path_alloc(sctx); + if (!p) + return -ENOMEM; + + ret = begin_cmd(sctx, BTRFS_SEND_C_UPDATE_EXTENT); + if (ret 0) + goto out; + + ret = get_cur_path(sctx, sctx-cur_ino, sctx-cur_inode_gen, p); + if (ret 0) + goto out; + + TLV_PUT_PATH(sctx, BTRFS_SEND_A_PATH, p); + TLV_PUT_U64(sctx, BTRFS_SEND_A_FILE_OFFSET, offset); + TLV_PUT_U64(sctx, BTRFS_SEND_A_SIZE, len); + + ret = send_cmd(sctx); + +tlv_put_failure: +out: + fs_path_free(sctx, p); + return ret; +} + static int send_write_or_clone(struct send_ctx *sctx, struct btrfs_path *path, struct btrfs_key *key, @@ -3742,7 +3776,11 @@ static int send_write_or_clone(struct send_ctx *sctx, goto out; } - if (!clone_root) { + if (clone_root) { + ret = send_clone(sctx, offset, len, clone_root); + } else if (sctx-flags BTRFS_SEND_FLAG_NO_FILE_DATA) { + ret = send_update_extent(sctx, offset, len); + } else { while (pos len) { l = len - pos; if (l BTRFS_SEND_READ_SIZE) @@ -3755,10 +3793,7 @@ static int send_write_or_clone(struct send_ctx *sctx, pos += ret; } ret = 0; - } else { - ret = send_clone(sctx, offset, len, clone_root); } - out: return ret; } @@ -4570,6 +4605,8 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_) INIT_RADIX_TREE(sctx-name_cache, GFP_NOFS); INIT_LIST_HEAD(sctx-name_cache_list); + sctx-flags = arg-flags; + sctx-send_filp = fget(arg-send_fd); if (IS_ERR(sctx-send_filp)) { ret = PTR_ERR(sctx-send_filp); diff --git a/fs/btrfs/send.h b/fs/btrfs/send.h index 1bf4f32..8bb18f7 100644 --- a/fs/btrfs/send.h +++ b/fs/btrfs/send.h @@ -86,6 +86,7 @@ enum btrfs_send_cmd { BTRFS_SEND_C_UTIMES, BTRFS_SEND_C_END, + BTRFS_SEND_C_UPDATE_EXTENT, __BTRFS_SEND_C_MAX, }; #define BTRFS_SEND_C_MAX (__BTRFS_SEND_C_MAX - 1) -- 1.7.7 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Btrfs: add orphan before truncating pagecache
Running xfstests 83 in a loop would sometimes fail the fsck. This happens because if we invalidate a page that already has an ordered extent setup for it we will complete the ordered extent ourselves, assuming that the truncate will clean everything up. The problem with this is there is plenty of time for the truncate to fail after we've done this work. So to fix this we need to add the orphan item first to make sure the cleanup gets done properly, and then we can truncate the pagecache and all that stuff and be safe. This fixes the btrfsck failures I was seeing while running 83 in a loop. Thanks, Signed-off-by: Josef Bacik jba...@fusionio.com --- fs/btrfs/inode.c | 45 - 1 files changed, 36 insertions(+), 9 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index d3ce93e..20138b6 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2484,6 +2484,18 @@ int btrfs_orphan_cleanup(struct btrfs_root *root) continue; } nr_truncate++; + + /* 1 for the orphan item deletion. */ + trans = btrfs_start_transaction(root, 1); + if (IS_ERR(trans)) { + ret = PTR_ERR(trans); + goto out; + } + ret = btrfs_orphan_add(trans, inode); + btrfs_end_transaction(trans, root); + if (ret) + goto out; + ret = btrfs_truncate(inode); } else { nr_unlink++; @@ -3789,6 +3801,29 @@ static int btrfs_setsize(struct inode *inode, loff_t newsize) set_bit(BTRFS_INODE_ORDERED_DATA_CLOSE, BTRFS_I(inode)-runtime_flags); + /* +* 1 for the orphan item we're going to add +* 1 for the orphan item deletion. +*/ + trans = btrfs_start_transaction(root, 2); + if (IS_ERR(trans)) + return PTR_ERR(trans); + + /* +* We need to do this in case we fail at _any_ point during the +* actual truncate. Once we do the truncate_setsize we could +* invalidate pages which forces any outstanding ordered io to +* be instantly completed which will give us extents that need +* to be truncated. If we fail to get an orphan inode down we +* could have left over extents that were never meant to live, +* so we need to garuntee from this point on that everything +* will be consistent. +*/ + ret = btrfs_orphan_add(trans, inode); + btrfs_end_transaction(trans, root); + if (ret) + return ret; + /* we don't support swapfiles, so vmtruncate shouldn't fail */ truncate_setsize(inode, newsize); ret = btrfs_truncate(inode); @@ -6939,11 +6974,9 @@ static int btrfs_truncate(struct inode *inode) /* * 1 for the truncate slack space -* 1 for the orphan item we're going to add -* 1 for the orphan item deletion * 1 for updating the inode. */ - trans = btrfs_start_transaction(root, 4); + trans = btrfs_start_transaction(root, 2); if (IS_ERR(trans)) { err = PTR_ERR(trans); goto out; @@ -6954,12 +6987,6 @@ static int btrfs_truncate(struct inode *inode) min_size); BUG_ON(ret); - ret = btrfs_orphan_add(trans, inode); - if (ret) { - btrfs_end_transaction(trans, root); - goto out; - } - /* * setattr is responsible for setting the ordered_data_close flag, * but that is only tested during the last file release. That -- 1.7.7.6 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] btrfs: add no file data flag to btrfs send ioctl
On Mon, Jan 07, 2013 at 01:51:19PM -0800, Mark Fasheh wrote: +#define BTRFS_SEND_FLAG_NO_FILE_DATA 0x1 + + sctx-flags = arg-flags; + For compatibility reasons, you should mask the user input value and only allow the supported flag(s). david -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: /boot as a btrfs subvolume
On Jan 7, 2013, at 2:41 PM, Gene Czarcinski g...@czarc.net wrote: And then if a distribution is deleted from the system, how is this discovered, and the old entry in the primary grub.cfg removed? Messy. The hardware systems have four or five software systems on each. It is not so much of deleting an distribution as much as just ignoring it. Disk space really is pretty inexpensive these days. The case of natively booting/rebooting different OS's/distributions on BIOS is difficult. For VM, I think maybe a simpler approach like syslinux is better, which understands Btrfs I believe. And for UEFI, I think a simpler approach is something like rEFInd or gummiboot, as boot manager, and then for various linux distributions to be built with EFI_STUB. I do not actually use a grub partition but, instead, a minimal system with grub2 and installation into the MBR. grub's core.img can accept a baked in grub.cfg If this was a true production situation then this would be a good choice but I am dealing with more of a laboratory environment with things that change a lot. Yeah currently it's difficult to do multiple OS native booting. Chris Murphy-- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] btrfs: add no file data flag to btrfs send ioctl
On Mon, Jan 07, 2013 at 11:01:17PM +0100, David Sterba wrote: On Mon, Jan 07, 2013 at 01:51:19PM -0800, Mark Fasheh wrote: +#define BTRFS_SEND_FLAG_NO_FILE_DATA 0x1 + + sctx-flags = arg-flags; + For compatibility reasons, you should mask the user input value and only allow the supported flag(s). Fixed, and tested. Thanks for the review David! --Mark -- Mark Fasheh From 601b60fa3fa9e319c11f91fd317dda3fac43b955 Mon Sep 17 00:00:00 2001 btrfs: add no file data flag to btrfs send ioctl This patch adds the flag, BTRFS_SEND_FLAG_NO_FILE_DATA to the btrfs send ioctl code. When this flag is set, the btrfs send code will never write file data into the stream (thus also avoiding expensive reads of that data in the first place). BTRFS_SEND_C_UPDATE_EXTENT commands will be sent (instead of BTRFS_SEND_C_WRITE) with an offset, length pair indicating the extent in question. This patch does not affect the operation of BTRFS_SEND_C_CLONE commands - they will continue to be sent when a search finds an appropriate extent to clone from. Signed-off-by: Mark Fasheh mfas...@suse.de --- fs/btrfs/ioctl.h |7 +++ fs/btrfs/send.c | 48 fs/btrfs/send.h |1 + 3 files changed, 52 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/ioctl.h b/fs/btrfs/ioctl.h index 731e287..1f6cfdd 100644 --- a/fs/btrfs/ioctl.h +++ b/fs/btrfs/ioctl.h @@ -363,6 +363,13 @@ struct btrfs_ioctl_received_subvol_args { __u64 reserved[16]; /* in */ }; +/* + * Caller doesn't want file data in the send stream, even if the + * search of clone sources doesn't find an extent. UPDATE_EXTENT + * commands will be sent instead of WRITE commands. + */ +#define BTRFS_SEND_FLAG_NO_FILE_DATA 0x1 + struct btrfs_ioctl_send_args { __s64 send_fd; /* in */ __u64 clone_sources_count; /* in */ diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index e78b297..8d0c6b4 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -85,6 +85,7 @@ struct send_ctx { u32 send_max_size; u64 total_send_size; u64 cmd_send_size[BTRFS_SEND_C_MAX + 1]; + u64 flags; /* 'flags' member of btrfs_ioctl_send_args is u64 */ struct vfsmount *mnt; @@ -3707,6 +3708,39 @@ out: return ret; } +/* + * Send an update extent command to user space. + */ +static int send_update_extent(struct send_ctx *sctx, + u64 offset, u32 len) +{ + int ret = 0; + struct fs_path *p; + + p = fs_path_alloc(sctx); + if (!p) + return -ENOMEM; + + ret = begin_cmd(sctx, BTRFS_SEND_C_UPDATE_EXTENT); + if (ret 0) + goto out; + + ret = get_cur_path(sctx, sctx-cur_ino, sctx-cur_inode_gen, p); + if (ret 0) + goto out; + + TLV_PUT_PATH(sctx, BTRFS_SEND_A_PATH, p); + TLV_PUT_U64(sctx, BTRFS_SEND_A_FILE_OFFSET, offset); + TLV_PUT_U64(sctx, BTRFS_SEND_A_SIZE, len); + + ret = send_cmd(sctx); + +tlv_put_failure: +out: + fs_path_free(sctx, p); + return ret; +} + static int send_write_or_clone(struct send_ctx *sctx, struct btrfs_path *path, struct btrfs_key *key, @@ -3742,7 +3776,11 @@ static int send_write_or_clone(struct send_ctx *sctx, goto out; } - if (!clone_root) { + if (clone_root) { + ret = send_clone(sctx, offset, len, clone_root); + } else if (sctx-flags BTRFS_SEND_FLAG_NO_FILE_DATA) { + ret = send_update_extent(sctx, offset, len); + } else { while (pos len) { l = len - pos; if (l BTRFS_SEND_READ_SIZE) @@ -3755,10 +3793,7 @@ static int send_write_or_clone(struct send_ctx *sctx, pos += ret; } ret = 0; - } else { - ret = send_clone(sctx, offset, len, clone_root); } - out: return ret; } @@ -4570,6 +4605,11 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_) INIT_RADIX_TREE(sctx-name_cache, GFP_NOFS); INIT_LIST_HEAD(sctx-name_cache_list); + if (arg-flags ~BTRFS_SEND_FLAG_NO_FILE_DATA) + return -EINVAL; + + sctx-flags = arg-flags; + sctx-send_filp = fget(arg-send_fd); if (IS_ERR(sctx-send_filp)) { ret = PTR_ERR(sctx-send_filp); diff --git a/fs/btrfs/send.h b/fs/btrfs/send.h index 1bf4f32..8bb18f7 100644 --- a/fs/btrfs/send.h +++ b/fs/btrfs/send.h @@ -86,6 +86,7 @@ enum btrfs_send_cmd { BTRFS_SEND_C_UTIMES, BTRFS_SEND_C_END, + BTRFS_SEND_C_UPDATE_EXTENT, __BTRFS_SEND_C_MAX, }; #define BTRFS_SEND_C_MAX (__BTRFS_SEND_C_MAX - 1) -- 1.7.7 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to
Re: [PATCH] btrfs: add no file data flag to btrfs send ioctl
(2013/01/08 8:33), Mark Fasheh wrote: On Mon, Jan 07, 2013 at 11:01:17PM +0100, David Sterba wrote: On Mon, Jan 07, 2013 at 01:51:19PM -0800, Mark Fasheh wrote: +#define BTRFS_SEND_FLAG_NO_FILE_DATA 0x1 + + sctx-flags = arg-flags; + For compatibility reasons, you should mask the user input value and only allow the supported flag(s). Fixed, and tested. Thanks for the review David! --Mark -- Mark Fasheh From 601b60fa3fa9e319c11f91fd317dda3fac43b955 Mon Sep 17 00:00:00 2001 btrfs: add no file data flag to btrfs send ioctl This patch adds the flag, BTRFS_SEND_FLAG_NO_FILE_DATA to the btrfs send ioctl code. When this flag is set, the btrfs send code will never write file data into the stream (thus also avoiding expensive reads of that data in the first place). BTRFS_SEND_C_UPDATE_EXTENT commands will be sent (instead of BTRFS_SEND_C_WRITE) with an offset, length pair indicating the extent in question. This patch does not affect the operation of BTRFS_SEND_C_CLONE commands - they will continue to be sent when a search finds an appropriate extent to clone from. Signed-off-by: Mark Fasheh mfas...@suse.de --- fs/btrfs/ioctl.h |7 +++ fs/btrfs/send.c | 48 fs/btrfs/send.h |1 + 3 files changed, 52 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/ioctl.h b/fs/btrfs/ioctl.h index 731e287..1f6cfdd 100644 --- a/fs/btrfs/ioctl.h +++ b/fs/btrfs/ioctl.h @@ -363,6 +363,13 @@ struct btrfs_ioctl_received_subvol_args { __u64 reserved[16]; /* in */ }; +/* + * Caller doesn't want file data in the send stream, even if the + * search of clone sources doesn't find an extent. UPDATE_EXTENT + * commands will be sent instead of WRITE commands. + */ +#define BTRFS_SEND_FLAG_NO_FILE_DATA 0x1 + struct btrfs_ioctl_send_args { __s64 send_fd; /* in */ __u64 clone_sources_count; /* in */ diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index e78b297..8d0c6b4 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -85,6 +85,7 @@ struct send_ctx { u32 send_max_size; u64 total_send_size; u64 cmd_send_size[BTRFS_SEND_C_MAX + 1]; + u64 flags; /* 'flags' member of btrfs_ioctl_send_args is u64 */ struct vfsmount *mnt; @@ -3707,6 +3708,39 @@ out: return ret; } +/* + * Send an update extent command to user space. + */ +static int send_update_extent(struct send_ctx *sctx, + u64 offset, u32 len) +{ + int ret = 0; + struct fs_path *p; + + p = fs_path_alloc(sctx); + if (!p) + return -ENOMEM; + + ret = begin_cmd(sctx, BTRFS_SEND_C_UPDATE_EXTENT); + if (ret 0) + goto out; + + ret = get_cur_path(sctx, sctx-cur_ino, sctx-cur_inode_gen, p); + if (ret 0) + goto out; + + TLV_PUT_PATH(sctx, BTRFS_SEND_A_PATH, p); + TLV_PUT_U64(sctx, BTRFS_SEND_A_FILE_OFFSET, offset); + TLV_PUT_U64(sctx, BTRFS_SEND_A_SIZE, len); + + ret = send_cmd(sctx); + +tlv_put_failure: +out: + fs_path_free(sctx, p); + return ret; +} + static int send_write_or_clone(struct send_ctx *sctx, struct btrfs_path *path, struct btrfs_key *key, @@ -3742,7 +3776,11 @@ static int send_write_or_clone(struct send_ctx *sctx, goto out; } - if (!clone_root) { + if (clone_root) { + ret = send_clone(sctx, offset, len, clone_root); + } else if (sctx-flags BTRFS_SEND_FLAG_NO_FILE_DATA) { + ret = send_update_extent(sctx, offset, len); + } else { while (pos len) { l = len - pos; if (l BTRFS_SEND_READ_SIZE) @@ -3755,10 +3793,7 @@ static int send_write_or_clone(struct send_ctx *sctx, pos += ret; } ret = 0; - } else { - ret = send_clone(sctx, offset, len, clone_root); } - out: return ret; } @@ -4570,6 +4605,11 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_) INIT_RADIX_TREE(sctx-name_cache, GFP_NOFS); INIT_LIST_HEAD(sctx-name_cache_list); + if (arg-flags ~BTRFS_SEND_FLAG_NO_FILE_DATA) + return -EINVAL; I like #define BTRFS_SEND_FLAGS(BTRFS_SEND_FLAG_NO_FILE_DATA | ...) if (arg-flags ~BTRFS_SEND_FLAGS) return -EINVAL; even if current flag is only one. Thanks, Tsutomu + + sctx-flags = arg-flags; + sctx-send_filp = fget(arg-send_fd); if (IS_ERR(sctx-send_filp)) { ret = PTR_ERR(sctx-send_filp); diff --git a/fs/btrfs/send.h b/fs/btrfs/send.h index 1bf4f32..8bb18f7 100644 --- a/fs/btrfs/send.h +++ b/fs/btrfs/send.h @@ -86,6 +86,7 @@ enum btrfs_send_cmd { BTRFS_SEND_C_UTIMES, BTRFS_SEND_C_END,
chattr +C vs. btrfs subvolume snapshot
What happens if you set an individual file inside a subvolume as nocow (chattr +C) and then take a snapshot of that subvolume and modify the file in both? Will btrfs now ignore the nocow attribute completely or will it do as few copies as possible? (I'd love to know if it's possible to visualize the fragmentation of a single file.) -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Btrfs: fix off-by-one in lseek
On Mon, Jan 07, 2013 at 05:20:50PM +0100, David Sterba wrote: On Mon, Jan 07, 2013 at 11:53:08AM +0800, Liu Bo wrote: Lock end is inclusive. Signed-off-by: Liu Bo bo.li@oracle.com --- fs/btrfs/file.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 77061bf..1e16b6d 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2241,6 +2241,7 @@ static int find_desired_extent(struct inode *inode, loff_t *offset, int whence) if (lockend = lockstart) lockend = lockstart + root-sectorsize; + lockend--; len = lockend - lockstart + 1; len = max_t(u64, len, root-sectorsize); Fix looks ok. I think this should be caught at runtime as well, the number of ways how the lock start and end are passed is not small and it need not be always possible to catch it from reading sources. The range is inclusive, so it's 'lock end % 2 == 1' right? (in the bit manipulating primitives). Hmm, not always here, lockend = inode-i_size - 1, so lockend % 2 == 1 may not be true. But this check can worked in those places where we've rounded the range up to PAGE_SIZE :) thanks, liubo -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: chattr +C vs. btrfs subvolume snapshot
On Tue, Jan 08, 2013 at 12:37:11AM +, Jun Lion wrote: What happens if you set an individual file inside a subvolume as nocow (chattr +C) and then take a snapshot of that subvolume and modify the file in both? Will btrfs now ignore the nocow attribute completely or will it do as few copies as possible? (I'd love to know if it's possible to visualize the fragmentation of a single file.) Well, btrfs nearly puts everything with a kind of timestamp, generation, which stands for transaction id. For your case, the NOCOW file is created just before taking the snapshot, so btrfs thinks of it being shared, and for shared parts, we must COW it to keep everything right when trying to modify them. And for the fragmentation, do you mean 'filefrag'? thanks, liubo -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Btrfs fs not mounting or being identified after power loss.
Hello, The power to my computer was suddenly cut, one of my btrfs fs (not the root) has somewhat disappeared, I can't mount it or use any of the available btrfs tools on it. Here's some command output: [root@archpc ~]# fdisk -l /dev/sdb Disk /dev/sdb: 808.9 GB, 80614912 bytes, 1579860576 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x000db8c9 Device Boot Start End Blocks Id System /dev/sdb1 63 1567571967 783785952+ 83 Linux /dev/sdb4 1567571968 1579859967 6144000 82 Linux swap / Solaris /dev/sdb1 is the lost btrfs fs. [root@archpc ~]# btrfs-find-root -v /dev/sdb1 No valid Btrfs found on /dev/sdb1 [root@archpc ~]# btrfsck /dev/sdb1 No valid Btrfs found on /dev/sdb1 [root@archpc ~]# mount /dev/sdb1 /mnt/restore/ mount: wrong fs type, bad option, bad superblock on /dev/sdb1, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg | tail or so [ 1767.655429] JBD2: no valid journal superblock found [ 1767.655432] EXT4-fs (sdb1): error loading journal If you need anymore information please ask. Thanks. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Btrfs: replace simple_strtoull() with kstrtoull() in btrfs_ioctl_resize()
On 01/08/2013 12:03 AM, David Sterba wrote: On Sun, Jan 06, 2013 at 11:53:41AM +0800, Jeff Liu wrote: -devid = simple_strtoull(devstr, end, 10); +ret = kstrtoull(devstr, 10, devid); +if (ret) { +pr_err(btrfs: resizer unable to parse device %s\n, +devstr); Code looks ok, I would prefer the message error text to say: pr_err(btrfs: resize unable to parse device id %s\n, devstr); but that's maybe just me :) Thanks for the review, yours is better :) The revised patch was shown as following: simple_strtoull() is obsolete, use kstrtoull() instead. Signed-off-by: Jie Liu jeff@oracle.com Cc: David Sterba dste...@suse.cz --- fs/btrfs/ioctl.c |9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 8fcf9a5..d9045eb 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -1331,11 +1331,16 @@ static noinline int btrfs_ioctl_resize(struct btrfs_root *root, sizestr = vol_args-name; devstr = strchr(sizestr, ':'); if (devstr) { - char *end; sizestr = devstr + 1; *devstr = '\0'; devstr = vol_args-name; - devid = simple_strtoull(devstr, end, 10); + ret = kstrtoull(devstr, 10, devid); + if (ret) { + pr_err(btrfs: resizer unable to parse device id %s\n, + devstr); + ret = -EINVAL; + goto out_free; + } printk(KERN_INFO btrfs: resizing devid %llu\n, (unsigned long long)devid); } -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH v9 1/2] Btrfs: Add a new ioctl to get the label of a mounted file system
Hi Goffredo, On 01/08/2013 01:32 AM, Goffredo Baroncelli wrote: Hi Jeff, On 01/07/2013 07:24 AM, Jeff Liu wrote: Hi, On 01/07/2013 02:44 AM, Goffredo Baroncelli wrote: Hi Jeff, On 01/05/2013 03:48 AM, Jeff Liu wrote: Add a new ioctl(2) BTRFS_IOC_GET_FSLABLE, so that we can get the label upon a mounted filesystem. Signed-off-by: Jie Liu jeff@oracle.com Signed-off-by: Anand Jain anand.j...@oracle.com Cc: Miao Xie mi...@cn.fujitsu.com Cc: Goffredo Baroncelli kreij...@inwind.it Cc: David Sterba dste...@suse.cz --- fs/btrfs/ioctl.c | 21 + fs/btrfs/ioctl.h |2 ++ 2 files changed, 23 insertions(+) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 8fcf9a5..ef2f55a 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -3699,6 +3699,25 @@ out: return ret; } May be that it was already discussed, and I am missing something, but if we check the label length we should terminate the string with a zero in case this is too long... However this is a minor bug, please push this patch forward.. I don't think so. Why we need to terminate the string with a zero? We have already truncated the label string up to maximum 255 bytes if it was too long. Consider the normal conditions, i.e. the label string is less than 256, we just copy it back to the user without a terminated NUL. Sorry I don't understand the reason to terminate the string up to 255 chars without adding a zero. Or we copy all the buffer (BTRFS_LABEL_SIZE characters without furthers checks) or we copy BTRFS_LABEL_SIZE-1 characters, adding a zero at the end. For normal cases, there should already has a NUL at the end of the input label string when performing btrfs_ioc_set_fslabel(), and we have also added a NUL for unmounted label set in btrfs-progs. In both cases, put it again on btrfs_ioc_get_fslabel() is redundant IMO. If we ran into a too long label, return the first 255 bytes is fine for both mounted and unmounted get_label() routines in btrfs-progs since I did memset(label, 0, sizeof(label)) at first for fetching mounted fs label. Also, I have not found any existing kernel code does similar things by adding a NUL to end up a char array, could you show me an example? Or we copy all the buffer(BTRFS_LABEL_SIZE) characters without furthers checks If an array is capable of N chars, we can only full it with N - 1 chars. Thanks, -Jeff +static int btrfs_ioctl_get_fslabel(struct file *file, void __user *arg) +{ + struct btrfs_root *root = BTRFS_I(fdentry(file)-d_inode)-root; + const char *label = root-fs_info-super_copy-label; + size_t len = strnlen(label, BTRFS_LABEL_SIZE); + int ret; + int label_is_too_long = 0; bool label_is_too_long = false; + + if (len == BTRFS_LABEL_SIZE) { + pr_warn(btrfs: label is too long, return the first %zu bytes\n, + --len); + label_is_too_long = 1; label_is_too_long = true; + } + + mutex_lock(root-fs_info-volume_mutex); + ret = copy_to_user(arg, label, len); + if (!ret label_is_too_long) + ret = copy_to_user(arg+BTRFS_LABEL_SIZE, , 1); Hmm? ret = copy_to_user(arg + len, , 1); My idea was to put a zero at the end of the string, but I did a mistake with the cutpaste.. sorry :-) + mutex_unlock(root-fs_info-volume_mutex); If you or anyone has objection to this patch, please just post yours, I won't follow it up again. Sorry I don't want to bother anybody, I know that reviewing a patch nine times is a very havvy work. I (and I think more other peoples) really appreciate your efforts. BR G.Baroncelli + + return ret ? -EFAULT : 0; +} + long btrfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { @@ -3797,6 +3816,8 @@ long btrfs_ioctl(struct file *file, unsigned int return btrfs_ioctl_qgroup_create(root, argp); case BTRFS_IOC_QGROUP_LIMIT: return btrfs_ioctl_qgroup_limit(root, argp); + case BTRFS_IOC_GET_FSLABEL: + return btrfs_ioctl_get_fslabel(file, argp); } return -ENOTTY; diff --git a/fs/btrfs/ioctl.h b/fs/btrfs/ioctl.h index 731e287..5b2cbef 100644 --- a/fs/btrfs/ioctl.h +++ b/fs/btrfs/ioctl.h @@ -451,6 +451,8 @@ struct btrfs_ioctl_send_args { struct btrfs_ioctl_qgroup_create_args) #define BTRFS_IOC_QGROUP_LIMIT _IOR(BTRFS_IOCTL_MAGIC, 43, \ struct btrfs_ioctl_qgroup_limit_args) +#define BTRFS_IOC_GET_FSLABEL _IOR(BTRFS_IOCTL_MAGIC, 49, \ + char[BTRFS_LABEL_SIZE]) #define BTRFS_IOC_GET_DEV_STATS _IOWR(BTRFS_IOCTL_MAGIC, 52, \ struct btrfs_ioctl_get_dev_stats) #endif -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html