[PATCH] Btrfs: fix a bug when llseek for delalloc bytes behind prealloc extents

2013-01-07 Thread Liu Bo
xfstests case 285 complains.

It it because btrfs did not try to find unwritten delalloc
bytes(only dirty pages, not yet writeback) behind prealloc
extents, it ends up finding nothing while we're with SEEK_DATA.

Signed-off-by: Liu Bo bo.li@oracle.com
---
 fs/btrfs/file.c  |9 ++---
 fs/btrfs/inode.c |   11 ---
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 1e16b6d..f76b1fd 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2308,9 +2308,12 @@ static int find_desired_extent(struct inode *inode, 
loff_t *offset, int whence)
}
}
 
-   *offset = start;
-   free_extent_map(em);
-   break;
+   if (!test_bit(EXTENT_FLAG_PREALLOC,
+ em-flags)) {
+   *offset = start;
+   free_extent_map(em);
+   break;
+   }
}
}
 
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 16d9e8e..7ac41aa 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5572,10 +5572,13 @@ struct extent_map *btrfs_get_extent_fiemap(struct inode 
*inode, struct page *pag
return em;
if (em) {
/*
-* if our em maps to a hole, there might
-* actually be delalloc bytes behind it
+* if our em maps to
+* -  a hole or
+* -  a pre-alloc extent,
+* there might actually be delalloc bytes behind it.
 */
-   if (em-block_start != EXTENT_MAP_HOLE)
+   if (em-block_start != EXTENT_MAP_HOLE 
+   !test_bit(EXTENT_FLAG_PREALLOC, em-flags))
return em;
else
hole_em = em;
@@ -5657,6 +5660,8 @@ struct extent_map *btrfs_get_extent_fiemap(struct inode 
*inode, struct page *pag
 */
em-block_start = hole_em-block_start;
em-block_len = hole_len;
+   if (test_bit(EXTENT_FLAG_PREALLOC, hole_em-flags))
+   set_bit(EXTENT_FLAG_PREALLOC, em-flags);
} else {
em-start = range_start;
em-len = found;
-- 
1.7.7.6

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [bug] oops in linux-3.8-rc2

2013-01-07 Thread David Sterba
Hi,

On Mon, Jan 07, 2013 at 01:51:21AM +0100, Tom Gundersen wrote:
 I ran into a bug today with 3.8-rc2 [0], following an make
 modules_install  make install of the kernel.
 
 I have been using the same kernel without problems for some days, the
 only change immediately before the oops was that I switched from plain
 partitions to using multi-device and subvolumes.
 
 [0]: https://plus.google.com/114015603831160344127/posts/fzEYnF64NkZ

BUG at mm/page-writeback.c:2164

2160 int clear_page_dirty_for_io(struct page *page)
2161 {
2162 struct address_space *mapping = page_mapping(page);
2163
2164 BUG_ON(!PageLocked(page));

btrfs on the stack is only __btrfs_buffered_write +
btrfs_file_aio_write, ie. just a buffered write, originating from
pwrite64.

The bug_on condition is clear, the question is why the page is not
locked.


david
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [bug] oops in linux-3.8-rc2

2013-01-07 Thread Chris Mason
On Mon, Jan 07, 2013 at 04:10:37AM -0700, David Sterba wrote:
 Hi,
 
 On Mon, Jan 07, 2013 at 01:51:21AM +0100, Tom Gundersen wrote:
  I ran into a bug today with 3.8-rc2 [0], following an make
  modules_install  make install of the kernel.
  
  I have been using the same kernel without problems for some days, the
  only change immediately before the oops was that I switched from plain
  partitions to using multi-device and subvolumes.
  
  [0]: https://plus.google.com/114015603831160344127/posts/fzEYnF64NkZ
 
 BUG at mm/page-writeback.c:2164
 
 2160 int clear_page_dirty_for_io(struct page *page)
 2161 {
 2162 struct address_space *mapping = page_mapping(page);
 2163
 2164 BUG_ON(!PageLocked(page));
 
 btrfs on the stack is only __btrfs_buffered_write +
 btrfs_file_aio_write, ie. just a buffered write, originating from
 pwrite64.
 
 The bug_on condition is clear, the question is why the page is not
 locked.

The big question is, can you make it happen again?

-chris
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Open for contribution towards xfstests for btrfs

2013-01-07 Thread Liu Bo
On Mon, Jan 07, 2013 at 03:48:43PM +0530, Kiran Patil wrote:
 Hello,
 
 We have a team of 5 students who would like to contribute to btrfs
 filesystem testing using xfstests.
 
 Is there space for them to contribute?

xfstests is lacking of testcases targetting for btrfs send/recieve,
you may be interested in it.

thanks,
liubo

 
 If yes, to whom do they need to keep in touch for guidance.
 
 Thank you,
 Kiran Patil (Director)
 Green Turtles Technologies, INDIA.
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Open for contribution towards xfstests for btrfs

2013-01-07 Thread Arne Jansen
On 07.01.2013 14:01, Liu Bo wrote:
 On Mon, Jan 07, 2013 at 03:48:43PM +0530, Kiran Patil wrote:
 Hello,

 We have a team of 5 students who would like to contribute to btrfs
 filesystem testing using xfstests.

 Is there space for them to contribute?
 
 xfstests is lacking of testcases targetting for btrfs send/recieve,
 you may be interested in it.

There is even a basis for that already, we have a testsuite for the
zfs send - btrfs receive feature at

git://git.kernel.org/pub/scm/linux/kernel/git/arne/far-progs.git

see test.pl. It can also already be used to test btrfs send -
btrfs receive.

-Arne

 
 thanks,
 liubo
 

 If yes, to whom do they need to keep in touch for guidance.

 Thank you,
 Kiran Patil (Director)
 Green Turtles Technologies, INDIA.
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Open for contribution towards xfstests for btrfs

2013-01-07 Thread David Sterba
On Mon, Jan 07, 2013 at 03:48:43PM +0530, Kiran Patil wrote:
 We have a team of 5 students who would like to contribute to btrfs
 filesystem testing using xfstests.

By any chance, are these the same people behind
http://sourceforge.net/projects/btrfstest/ ?

 Is there space for them to contribute?

Sure. As an example of where to start, look at a some patches from Miao
Xie with reproducers for several bugs. This IMHO is a smooth path to
start with xfstests, the task is to embody the test itself into the
teitesinfrastructure.

The more advanced tasks:
* look at commits in btrfs and try to write a specific test for them
  (depends on the nature of the bug, the difficulty ranges from easy to
  impossible)
* write test coverage for a particular feature (eg. for Jeff Liu's label
  setting patches)
* several btrfs specific xfstests were sent to the mailinglists over
  time, I'm sure not all of them were merged

 If yes, to whom do they need to keep in touch for guidance.

You can learn a lot from existing xfstests and review comments of
proposed tests.  Feel free to ask in the respective mailinglists or IRC
channels.


david
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: /boot as a btrfs subvolume

2013-01-07 Thread Gene Czarcinski

On 01/06/2013 09:00 PM, Chris Murphy wrote:

On Jan 6, 2013, at 1:05 PM, Gene Czarcinski g...@czarc.net wrote:


On 01/06/2013 02:17 PM, Swâmi Petaramesh wrote:

Le 06/01/2013 20:11, Chris Murphy a écrit :

If you use UUID, and you use subvol=, and you don't rename/move your subvolume, 
it's perfectly safe. Nevertheless, GRUB becoming subvolid aware seems like a 
good idea to me, but I have no idea what's involved in that.

I actually run several machines on which I have /boot in a separate
BTRFS subvol, without any issue. I have a multiboot between several
different distros (typically Ubuntu, Mint, LMDE, Bodhi... All Ubuntu
derivatives except for LMDE which is Debian-based...) sharing the same
BTRFS container and using different subvols i.e. UBUNTU/@boot,
LMDE/@boot etc...

Works just great.


I assume you have a grub partition (or its equivalent) with a grub.cfg file 
having menuentry definitions [pointing to the different grub.cfg file for each system ... 
that seems to work well (at least for me).  Currently, os-prober does not support btrfs.

I have taken a little look at the grub2 source code and there is some mention 
of both btrfs and zfs (and also btrfs subvolumes) in the changelogs.  However, 
it is not clear to me (and I have not had the time yet) to explore exactly what 
the source code is doing or not doing.

Well, at least with the f18 version of GRUB 2 2.00, whether alternative Btrfs 
bootable systems are mounted or not, -mkconfig isn't searching/finding for the 
/etc/fstab and /etc/default/grub of the other system like it appears to do with 
other file systems. I don't get any additional entries other than the currently 
booted Btrfs system.

So it looks like I'd need to manually add a configfile menu entry, pointing to 
each Btrfs bootable system. Chainloading from one grub to another is not 
useful. Better if they're all on the same GRUB, and use configfile.

There appear to be two situations where you have multiple software 
systems installed on the same hardware (real or virtual) -- root on a 
btrfs subvolume or /boot installed on an LV.


I worked up a patch to os-prober and a small update to grub2 which 
addresses the root on a btrfs subvolume -- see 
https://bugzilla.redhat.com/show_bug.cgi?id=888341


I have not looked into the /boot in a LV so it may involve os-prober 
also and I am also not sure that it will work with /boot on a btrfs 
subvolume.


Since developing those patches, I have had second thoughts about how 
multiboot should be done and now believe that it should involve a grub 
partition with a simple grub.cfg file and os-prober disabled. The simple 
grub.cfg file had menuentry definitions which point to the real 
grub.cfg file for each system.


One of these simple definitions would look like:

menuentry F18 on BTRFS {
 insmod  part_msdos
 insmod  btrfs
 set  root='hd0,msdos5'
 configfile  /boot1/grub2/grub.cfg
}

where /boot is on btrfs subvol boot1

I do not actually use a grub partition but, instead, a minimal system 
with grub2 and installation into the MBR.


Gene
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: replace simple_strtoull() with kstrtoull() in btrfs_ioctl_resize()

2013-01-07 Thread David Sterba
On Sun, Jan 06, 2013 at 11:53:41AM +0800, Jeff Liu wrote:
 - devid = simple_strtoull(devstr, end, 10);
 + ret = kstrtoull(devstr, 10, devid);
 + if (ret) {
 + pr_err(btrfs: resizer unable to parse device %s\n,
 + devstr);

Code looks ok, I would prefer the message error text to say:

pr_err(btrfs: resize unable to parse device id %s\n,
devstr);

but that's maybe just me :)
david
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: fix off-by-one in lseek

2013-01-07 Thread David Sterba
On Mon, Jan 07, 2013 at 11:53:08AM +0800, Liu Bo wrote:
 Lock end is inclusive.
 
 Signed-off-by: Liu Bo bo.li@oracle.com
 ---
  fs/btrfs/file.c |1 +
  1 files changed, 1 insertions(+), 0 deletions(-)
 
 diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
 index 77061bf..1e16b6d 100644
 --- a/fs/btrfs/file.c
 +++ b/fs/btrfs/file.c
 @@ -2241,6 +2241,7 @@ static int find_desired_extent(struct inode *inode, 
 loff_t *offset, int whence)
   if (lockend = lockstart)
   lockend = lockstart + root-sectorsize;
  
 + lockend--;
   len = lockend - lockstart + 1;
  
   len = max_t(u64, len, root-sectorsize);

Fix looks ok. I think this should be caught at runtime as well, the
number of ways how the lock start and end are passed is not small and it
need not be always possible to catch it from reading sources. The range
is inclusive, so it's 'lock end % 2 == 1' right? (in the bit
manipulating primitives).

david
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Btrfs: wipe all the superblock [redhat bugzilla 889888]

2013-01-07 Thread David Sterba
On Sun, Jan 06, 2013 at 07:28:55PM +0100, Goffredo Baroncelli wrote:
 Currently wipefs doesn't clear all the superblock of btrfs. Only the first 
 one is cleared.
 
 Btrfs has three superblocks. The first one is placed at 64KB, the second 
 one at 64MB, the third one at 256GB.

It can have as much as 4 superblock backup copies:

Superblock offset 0 is 65536 (0x1, block=16/0x10)
Superblock offset 1 is 67108864 (0x400, block=16384/0x4000)
Superblock offset 2 is 274877906944 (0x40, block=67108864/0x400)
Superblock offset 3 is 1125899906842624 (0x4, 
block=274877906944/0x40)
Superblock offset 4 is 4611686018427387904 (0x4000, 
block=1125899906842624/0x4)

 If the first superblock is valid except that the magic field is zeroed,
 btrfs skips the check of the other superblocks.
 If the first superblock is fully invalid, btrfs checks for the other
 superblock.
 
 So zeroing the first superblock magic field at the beginning seems
 that the filesystem is wiped. But when the first superblock is overwritten
 (e.g. by another filesystem), then the other two superblocks may be considered
 valid, and the filesystem may resurrect.

And for that purpose all the superblock copies should be taken into
account, regardless of the tricks that btrfs_mount applies.


david
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Open for contribution towards xfstests for btrfs

2013-01-07 Thread Kiran Patil
On Mon, Jan 7, 2013 at 9:28 PM, Eric Sandeen sand...@redhat.com wrote:
 On 1/7/13 4:18 AM, Kiran Patil wrote:
 Hello,

 We have a team of 5 students who would like to contribute to btrfs
 filesystem testing using xfstests.

 Is there space for them to contribute?

 If yes, to whom do they need to keep in touch for guidance.

 (cc: xfs list)

 Are you thinking of writing new tests, or doing more formal testing
 with the existing testsuite?  Both would be useful.

 In particular, any time there's a bugfix, a regression test is helpful.
 Generic new tests of btrfs functionality would be good, too.

 I would suggest starting with a couple new tests, and cc: the xfs list
 for review.  In the course of the review you'll get a sense of the various
 conventions used in xfstests, etc.

 Simply running xfstests over various btrfs configurations would also
 be useful, to be sure any regressions caught by existing tests don't go
 unnoticed.

 thanks,
 -Eric

Dear Eric and David,

Thanks a lot for your inputs.

We will work on them and soon we will start our work.

Thank you,

-Kiran
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [bug] csum mismatches and failed xfstests with 3.8-rc1 and rc2

2013-01-07 Thread David Sterba
with top commit 5f243b9b46a22e5790dbbc36f574c2417af49a41 (something post
-rc2) I see more checksum errors

$ dmesg|grep csum|wc -l
100

and they appeared in a period of like last 30 minutes.  previous test rounds 
were
clean, and I can see that the same test sequenece run 3 time in a row with the
same mount parameters without a mkfs between runs. Test messages seem to be
related to no-space conditions.

MKFS_OPTIONS  -- /dev/sda9
MOUNT_OPTIONS -- -o space_cache,noatime /dev/sda9 /mnt/a2

091 41s ... [17:18:32] [17:18:33] [failed, exit status 1] - output mismatch 
(see 091.out.bad)
--- 091.out 2011-11-01 10:31:12.0 +0100
+++ 091.out.bad 2013-01-07 17:18:33.0 +0100
@@ -1,7 +1,6 @@
 QA output created by 091
 fsx -N 1 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 1 -o 8192 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 1 -o 32768 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 1 -o 8192 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 1 -o 32768 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 1 -o 128000 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z -W
+./091: line 46: 11355 Segmentation fault  $here/ltp/fsx $args 
$TEST_DIR/junk  $seq.full 21
+fsx -N 1 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
+mapped writes DISABLED
+truncating to largest ever: 0x12a00


133 220s ...[17:28:51] [17:31:13] - output mismatch (see 133.out.bad)
--- 133.out 2011-02-11 11:42:31.0 +0100
+++ 133.out.bad 2013-01-07 17:31:13.0 +0100
@@ -2,4 +2,5 @@
 Buffered writer, buffered reader
 Direct writer, buffered reader
 Buffered writer, direct reader
+pread64: Input/output error
 Direct writer, direct reader


240 1s ...  [17:40:51] [17:40:51] [failed, exit status 11] - output 
mismatch (see 240.out.bad)
--- 240.out 2011-08-10 17:17:23.0 +0200
+++ 240.out.bad 2013-01-07 17:40:51.0 +0100
@@ -1,2 +1,4 @@
 QA output created by 240
 Silence is golden.
+AIO write offset 33280 expected 4096 got -5
+short read() at offset 29184


247 41s ... [17:42:16] [17:42:36] [failed, exit status 1] - output mismatch 
(see 247.out.bad)
--- 247.out 2011-08-10 17:17:23.0 +0200
+++ 247.out.bad 2013-01-07 17:42:36.0 +0100
@@ -1,2 +1,2 @@
 QA output created by 247
-Silence is golden.
+Bus error


263 237s ...[17:45:00] [17:45:01] [failed, exit status 1] - output mismatch 
(see 263.out.bad)
--- 263.out 2011-11-01 10:31:12.0 +0100
+++ 263.out.bad 2013-01-07 17:45:01.0 +0100
@@ -1,3 +1,100 @@
 QA output created by 263
 fsx -N 1 -o 8192 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z
-fsx -N 1 -o 128000 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z
+fsx -N 1 -o 8192 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z
+truncating to largest ever: 0x12a00
+truncating to largest ever: 0x75400
+dowrite: write: Input/output error
+LOG DUMP (91 total operations):
+1(  1 mod 256): SKIPPED (no operation)
+2(  2 mod 256): MAPWRITE 0x62600 thru 0x626e8  (0xe9 bytes)
+3(  3 mod 256): FALLOC   0x2e0f2 thru 0x2f438  (0x1346 bytes) INTERIOR
+4(  4 mod 256): TRUNCATE DOWN  from 0x626e9 to 0x12a00
+5(  5 mod 256): MAPREAD  0x0 thru 0x90e(0x90f bytes)
+6(  6 mod 256): FALLOC   0x7048 thru 0x80ca(0x1082 bytes) INTERIOR
+7(  7 mod 256): WRITE0x5ea00 thru 0x5f3ff  (0xa00 bytes) HOLE
+8(  8 mod 256): SKIPPED (no operation)
+9(  9 mod 256): FALLOC   0x4957f thru 0x4af1f  (0x19a0 bytes) INTERIOR
+10( 10 mod 256): SKIPPED (no operation)
+11( 11 mod 256): WRITE0x10a00 thru 0x127ff (0x1e00 bytes)
+12( 12 mod 256): MAPWRITE 0x53800 thru 0x54384 (0xb85 bytes)
+13( 13 mod 256): MAPWRITE 0x5ae00 thru 0x5c8a9 (0x1aaa bytes)
+14( 14 mod 256): READ 0x4a000 thru 0x4afff (0x1000 bytes)
+15( 15 mod 256): SKIPPED (no operation)
+16( 16 mod 256): MAPREAD  0x36000 thru 0x36acf (0xad0 bytes)
+17( 17 mod 256): SKIPPED (no operation)
+18( 18 mod 256): READ 0x12000 thru 0x12fff (0x1000 bytes)
+19( 19 mod 256): MAPWRITE 0x17600 thru 0x17b42 (0x543 bytes)
+20( 20 mod 256): READ 0x29000 thru 0x29fff (0x1000 bytes)
+21( 21 mod 256): FALLOC   0xea89 thru 0x1023e  (0x17b5 bytes) INTERIOR
+22( 22 mod 256): FALLOC   0x569aa thru 0x5847f (0x1ad5 bytes) INTERIOR
+23( 23 mod 256): WRITE0x35c00 thru 0x369ff (0xe00 bytes)
+24( 24 mod 256): SKIPPED (no operation)
+25( 25 mod 256): SKIPPED (no operation)
+26( 26 mod 256): MAPREAD  0x32000 thru 0x32188 (0x189 bytes)
+27( 27 mod 256): MAPREAD  0x3f000 thru 0x40e43 (0x1e44 bytes)
+28( 28 mod 256): WRITE0x6f600 thru 0x705ff (0x1000 bytes) HOLE
+29( 29 mod 256): MAPREAD  0x5b000 thru 0x5c99e (0x199f bytes)
+30( 30 mod 256): TRUNCATE UP   from 0x70600 to 0x75400
+31( 31 mod 256): MAPREAD  0x4000 thru 0x54d5   (0x14d6 bytes)
+32( 32 mod 256): SKIPPED (no operation)
+33( 33 mod 256): FALLOC   0x31d49 thru 0x32872 (0xb29 bytes) INTERIOR
+34( 34 mod 256): FALLOC   0x2bbb3 thru 0x2bca7 (0xf4 bytes) INTERIOR
+35( 35 mod 256): MAPREAD  0x68000 thru 0x68d8e (0xd8f bytes)
+36( 36 mod 

Re: [bug] csum mismatches and failed xfstests with 3.8-rc1 and rc2

2013-01-07 Thread David Sterba
On Mon, Jan 07, 2013 at 06:03:51PM +0100, David Sterba wrote:
 with top commit 5f243b9b46a22e5790dbbc36f574c2417af49a41 (something post
 -rc2) I see more checksum errors
 
 $ dmesg|grep csum|wc -l
 100

more of dmesg:

[15303.739076] btrfs csum failed ino 63791 off 368640 csum 3994424334 private 
783210346
[15303.748113] [ cut here ]
[15303.752052] kernel BUG at mm/page-writeback.c:2164!
[15303.752052] invalid opcode:  [#1] SMP 
[15303.752052] Modules linked in: dm_crypt loop btrfs
[15303.752052] CPU 0 
[15303.752052] Pid: 11355, comm: fsx Not tainted 3.8.0-rc2-default+ #228 Intel 
Corporation Santa Rosa platform/Matanzas
[15303.752052] RIP: 0010:[81118239]  [81118239] 
clear_page_dirty_for_io+0x119/0x130
[15303.752052] RSP: 0018:88002024fb28  EFLAGS: 00010246
[15303.752052] RAX: 4802 RBX: ea00013c68f0 RCX: 
[15303.752052] RDX: 0011 RSI: 81151320 RDI: ea00013c68f0
[15303.752052] RBP: 88002024fb38 R08:  R09: 88004fa65640
[15303.752052] R10:  R11:  R12: 88006dd6e780
[15303.752052] R13:  R14: 880063f4fac0 R15: 0001
[15303.752052] FS:  7f7067547700() GS:88007d80() 
knlGS:
[15303.752052] CS:  0010 DS:  ES:  CR0: 8005003b
[15303.752052] CR2: 7f7066d2cb10 CR3: 1dafe000 CR4: 07f0
[15303.752052] DR0:  DR1:  DR2: 
[15303.752052] DR3:  DR6: 0ff0 DR7: 0400
[15303.752052] Process fsx (pid: 11355, threadinfo 88002024e000, task 
880058334440)
[15303.752052] Stack:
[15303.752052]  880063f4fac0 0001 88002024fc08 
a0044092
[15303.752052]  88002024fbd0 8850 00022000 
1000819572cb
[15303.752052]   0005b000 797903b8 
0005ae00
[15303.752052] Call Trace:
[15303.752052]  [a0044092] prepare_pages+0x302/0x3a0 [btrfs]
[15303.752052]  [a00455ed] __btrfs_buffered_write+0x1ad/0x370 [btrfs]
[15303.752052]  [a0045cab] btrfs_file_aio_write+0x4fb/0x5e0 [btrfs]
[15303.752052]  [8115e04a] do_sync_write+0xaa/0xf0
[15303.752052]  [8115e74e] vfs_write+0xce/0x190
[15303.752052]  [8115eab2] sys_write+0x62/0xb0
[15303.752052]  [8139a2de] ? trace_hardirqs_on_thunk+0x3a/0x3f
[15303.752052]  [81960699] system_call_fastpath+0x16/0x1b
[15303.752052] Code: 00 00 48 89 df e8 38 fb ff ff e9 73 ff ff ff 0f 1f 00 48 
89 df 57 9d 66 66 90 66 90 e8 51 e5 f8 ff b8 01 00 00 00 e9 3a ff ff ff 0f 0b 
49 c7 c4 20 7f e4 81 e9 07 ff ff ff 66 0f 1f 84 00 00 00 
[15303.752052] RIP  [81118239] clear_page_dirty_for_io+0x119/0x130
[15303.752052]  RSP 88002024fb28
[15304.049369] ---[ end trace e69c5ab147a5d842 ]---
[15304.358163] [ cut here ]
[15304.364297] WARNING: at fs/btrfs/inode.c:7123 
btrfs_destroy_inode+0x23e/0x2e0 [btrfs]()
[15304.373804] Hardware name: Santa Rosa platform
[15304.373809] Modules linked in: dm_crypt loop btrfs
[15304.373813] Pid: 11387, comm: umount Tainted: G  D  
3.8.0-rc2-default+ #228
[15304.373814] Call Trace:
[15304.373822]  [8104c6bf] warn_slowpath_common+0x7f/0xc0
[15304.373831]  [8104c71a] warn_slowpath_null+0x1a/0x20
[15304.373851]  [a00435be] btrfs_destroy_inode+0x23e/0x2e0 [btrfs]
[15304.373857]  [81178a4c] destroy_inode+0x3c/0x70
[15304.373862]  [81178ba6] evict+0x126/0x1c0
[15304.373867]  [81178c8f] dispose_list+0x4f/0x60
[15304.373873]  [811799f4] evict_inodes+0x114/0x130
[15304.373879]  [81160903] generic_shutdown_super+0x53/0xf0
[15304.373886]  [81160a36] kill_anon_super+0x16/0x30
[15304.373898]  [a0004d9a] btrfs_kill_super+0x1a/0x90 [btrfs]
[15304.373903]  [81161902] ? deactivate_super+0x42/0x70
[15304.373909]  [81160c8d] deactivate_locked_super+0x3d/0x90
[15304.373914]  [8116190a] deactivate_super+0x4a/0x70
[15304.373920]  [8117dc90] mntput_no_expire+0x100/0x160
[15304.373925]  [8117ecd1] sys_umount+0x71/0x3c0
[15304.373931]  [81960699] system_call_fastpath+0x16/0x1b
[15304.373934] ---[ end trace e69c5ab147a5d843 ]---
[15304.373937] [ cut here ]
[15304.373954] WARNING: at fs/btrfs/inode.c:7124 
btrfs_destroy_inode+0x2c6/0x2e0 [btrfs]()
[15304.373955] Hardware name: Santa Rosa platform
[15304.373963] Modules linked in: dm_crypt loop btrfs
[15304.373968] Pid: 11387, comm: umount Tainted: G  D W
3.8.0-rc2-default+ #228
[15304.373969] Call Trace:
[15304.373974]  [8104c6bf] warn_slowpath_common+0x7f/0xc0
[15304.373979]  [8104c71a] warn_slowpath_null+0x1a/0x20
[15304.373998]  [a0043646] btrfs_destroy_inode+0x2c6/0x2e0 [btrfs]
[15304.374003]  [81178a4c] destroy_inode+0x3c/0x70

Re: Open for contribution towards xfstests for btrfs

2013-01-07 Thread Kiran Patil
On Mon, Jan 7, 2013 at 10:31 PM, Kiran Patil kirantpa...@gmail.com wrote:
 On Mon, Jan 7, 2013 at 9:28 PM, Eric Sandeen sand...@redhat.com wrote:
 On 1/7/13 4:18 AM, Kiran Patil wrote:
 Hello,

 We have a team of 5 students who would like to contribute to btrfs
 filesystem testing using xfstests.

 Is there space for them to contribute?

 If yes, to whom do they need to keep in touch for guidance.

 (cc: xfs list)

 Are you thinking of writing new tests, or doing more formal testing
 with the existing testsuite?  Both would be useful.

 In particular, any time there's a bugfix, a regression test is helpful.
 Generic new tests of btrfs functionality would be good, too.

 I would suggest starting with a couple new tests, and cc: the xfs list
 for review.  In the course of the review you'll get a sense of the various
 conventions used in xfstests, etc.

 Simply running xfstests over various btrfs configurations would also
 be useful, to be sure any regressions caught by existing tests don't go
 unnoticed.

 thanks,
 -Eric

 Dear Eric and David,

 Thanks a lot for your inputs.

 We will work on them and soon we will start our work.

 Thank you,

 -Kiran

Sorry, I forgot to say thanks to Liu and Arne.

Thank you Liu and Arne for your suggestions.

-Kiran
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Online Deduplication for Btrfs (Master's thesis)

2013-01-07 Thread Martin Křížek
On Tue, Dec 18, 2012 at 2:31 AM, Chris Mason chris.ma...@fusionio.com wrote:
 On Mon, Dec 17, 2012 at 06:33:24AM -0700, Alexander Block wrote:
 I did some research on deduplication in the past and there are some
 problems that you will face. I'll try to list some of them (for sure
 not all).

 Thanks Alexander for writing all of this up.  There are a lot of great
 points here, but I'll summarize with:

 [ many challenges to online dedup ]

 [ offline dedup is the best way ]

 So, the big problem with offline dedup is you're suddenly read bound.  I
 don't disagree that offline makes a lot of the dedup problems easier,
 and Alexander describes a very interesting system here.

 I've tried to avoid features that rely on scanning though, just because
 idle disk time may not really exist.  But with scrub, we have the scan
 as a feature, and it may make a lot of sense to leverage that.

 online dedup has a different set of tradeoffs, but as Alexander says the
 hard part really is the data structure to index the hashes.  I think
 there are a few different options here, including changing the file
 extent pointers to point to a sha instead of a logical disk offset.


I am not sure if I am following the approach with replacing the
pointers, could you please explain?

I agree that the data structure is the key part in designing the feature.

 So, part of my answer really depends on where you want to go with your
 thesis.  I expect the data structure work for efficient hash lookup is
 going to be closer to what your course work requires?

No, I don't think it is really necessary that the data structure
should be close to my course work. What is your opinion on the most
suitable data structure to index the hashes?

Thanks!
Martin
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Open for contribution towards xfstests for btrfs

2013-01-07 Thread Eric Sandeen
On 1/7/13 9:31 AM, David Sterba wrote:
 On Mon, Jan 07, 2013 at 03:48:43PM +0530, Kiran Patil wrote:
 We have a team of 5 students who would like to contribute to btrfs
 filesystem testing using xfstests.
 
 By any chance, are these the same people behind
 http://sourceforge.net/projects/btrfstest/ ?

If so, you might want to consider that

http://www.tuxera.com/community/posix-test-suite/

probably already covers some of the tests that btrfstest is
(re)writing.  i.e.:

+/* Creating Link on non-existing prefix path should return ENOTDIR

and

+/* Creating link on ENAMETOOLONG if either file name exceeds maximum characters

are almost certainly covered in the above posix test suite.

FWIW, that testsuite has a few failures on btrfs (not all unique to btrfs, 
IIRC):

Test Summary Report
---
/root/pjd-fstest/pjd-fstest-20090130-RC/tests/chown/00.t   (Wstat: 0 Tests: 171 
Failed: 6)
  Failed tests:  84, 88, 141, 145, 149, 153
/root/pjd-fstest/pjd-fstest-20090130-RC/tests/truncate/00.t (Wstat: 0 Tests: 21 
Failed: 1)
  Failed test:  15
/root/pjd-fstest/pjd-fstest-20090130-RC/tests/xacl/00.t(Wstat: 0 Tests: 45 
Failed: 1)
  Failed test:  45

-Eric



--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Btrfs: wipe all the superblock [redhat bugzilla 889888]

2013-01-07 Thread Goffredo Baroncelli
Hi David,

On 01/07/2013 05:33 PM, David Sterba wrote:
 On Sun, Jan 06, 2013 at 07:28:55PM +0100, Goffredo Baroncelli wrote:
 Currently wipefs doesn't clear all the superblock of btrfs. Only the first 
 one is cleared.

 Btrfs has three superblocks. The first one is placed at 64KB, the second 
 one at 64MB, the third one at 256GB.
 
 It can have as much as 4 superblock backup copies:
 
 Superblock offset 0 is 65536 (0x1, block=16/0x10)
 Superblock offset 1 is 67108864 (0x400, block=16384/0x4000)
 Superblock offset 2 is 274877906944 (0x40, block=67108864/0x400)
 Superblock offset 3 is 1125899906842624 (0x4, 
 block=274877906944/0x40)
 Superblock offset 4 is 4611686018427387904 (0x4000, 
 block=1125899906842624/0x4)

Are you sure ?

Regarding the btrfs-progs suite, I looked at the btrfs_read_dev_super():
[..]
for (i = 0; i  BTRFS_SUPER_MIRROR_MAX; i++) {
bytenr = btrfs_sb_offset(i);
ret = pread64(fd, buf, sizeof(buf), bytenr);

Where BTRFS_SUPER_MIRROR_MAX is 3.

Regarding the kernel code, I looked at several function which call
btrfs_sb_offset(); everywhere there is an upper limit of the superblock
numbero which is BTRFS_SUPER_MIRROR_MAX, which is still 3.

Moreover I performed the following test:

$ ls -lh 7tb-filesystem.img
-rw-r--r-- 1 ghigo ghigo 7.1E Jan  7 18:49 7eb-filesystem.img
$ /sbin/mkfs.btrfs 7eb-filesystem.img

$ cat extract-sign.py
import os

BTRFS_SUPER_MIRROR_SHIFT = 12
BTRFS_SUPER_INFO_OFFSET = (64*1024)

def btrfs_sb_offset(mirror):
start = 16*1024
if(mirror):
return start  (BTRFS_SUPER_MIRROR_SHIFT * mirror)
return BTRFS_SUPER_INFO_OFFSET

f = open(7eb-filesystem.img,r)
for i in range(5):
pos = btrfs_sb_offset(i)+64
f.seek(pos)
sign = f.read(8)

print Superblock #%d - %20d - '%s'%(i,pos,sign)

$ python extract-sign.py
Superblock #0 -65600 - '_BHRfS_M'
Superblock #1 - 67108928 - '_BHRfS_M'
Superblock #2 - 274877907008 - '_BHRfS_M'
Superblock #3 - 1125899906842688 - ''
Superblock #4 -  4611686018427387968 - ''


To me it seems that in a 7TB filesystem there is only 3 superblocks.


 
 If the first superblock is valid except that the magic field is zeroed,
 btrfs skips the check of the other superblocks.
 If the first superblock is fully invalid, btrfs checks for the other
 superblock.

 So zeroing the first superblock magic field at the beginning seems
 that the filesystem is wiped. But when the first superblock is overwritten
 (e.g. by another filesystem), then the other two superblocks may be 
 considered
 valid, and the filesystem may resurrect.
 
 And for that purpose all the superblock copies should be taken into
 account, regardless of the tricks that btrfs_mount applies.
 
 
 david
 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Btrfs: wipe all the superblock [redhat bugzilla 889888]

2013-01-07 Thread Goffredo Baroncelli
Hi hugo

On 01/07/2013 07:24 PM, Hugo Mills wrote:
[...]

 print Superblock #%d - %20d - '%s'%(i,pos,sign)

 $ python extract-sign.py
 Superblock #0 -65600 - '_BHRfS_M'
 
 64 KiB

OK, (above)
 
 Superblock #1 - 67108928 - '_BHRfS_M'
 
 256 MiB

 above is 64M, below is 256M !
 
 Superblock #2 - 274877907008 - '_BHRfS_M'
 
 1 TiB
 
 Superblock #3 - 1125899906842688 - ''
 
 4 PiB
 
 Superblock #4 -  4611686018427387968 - ''
 
 16 EiB

Not reachable in linux. There is a VFS limits to 8EB.
 
 
 To me it seems that in a 7TB filesystem there is only 3 superblocks.
 
That would be as expected. How many on a 5 PiB filesystem, though?
 Or a 20 EiB one?

I wrote 7TB, but I meant 8EB. I first tried a test with 7TB and I wrote
the email, but then I retested with 8EB and I corrected the email,
forgetting some 7t instead of 8e...

Anyway, if BTRFS would allow more than three super-block,  in the 7TB
case, the super-block at 1TB would appeared
 
Hugo.
 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: /boot as a btrfs subvolume

2013-01-07 Thread Chris Murphy

On Jan 7, 2013, at 8:42 AM, Gene Czarcinski g...@czarc.net wrote:

 On 01/06/2013 09:00 PM, Chris Murphy wrote:
 On Jan 6, 2013, at 1:05 PM, Gene Czarcinski g...@czarc.net wrote:
 
 On 01/06/2013 02:17 PM, Swâmi Petaramesh wrote:
 Le 06/01/2013 20:11, Chris Murphy a écrit :
 If you use UUID, and you use subvol=, and you don't rename/move your 
 subvolume, it's perfectly safe. Nevertheless, GRUB becoming subvolid 
 aware seems like a good idea to me, but I have no idea what's involved in 
 that.
 I actually run several machines on which I have /boot in a separate
 BTRFS subvol, without any issue. I have a multiboot between several
 different distros (typically Ubuntu, Mint, LMDE, Bodhi... All Ubuntu
 derivatives except for LMDE which is Debian-based...) sharing the same
 BTRFS container and using different subvols i.e. UBUNTU/@boot,
 LMDE/@boot etc...
 
 Works just great.
 
 I assume you have a grub partition (or its equivalent) with a grub.cfg 
 file having menuentry definitions [pointing to the different grub.cfg file 
 for each system ... that seems to work well (at least for me).  Currently, 
 os-prober does not support btrfs.
 
 I have taken a little look at the grub2 source code and there is some 
 mention of both btrfs and zfs (and also btrfs subvolumes) in the 
 changelogs.  However, it is not clear to me (and I have not had the time 
 yet) to explore exactly what the source code is doing or not doing.
 Well, at least with the f18 version of GRUB 2 2.00, whether alternative 
 Btrfs bootable systems are mounted or not, -mkconfig isn't searching/finding 
 for the /etc/fstab and /etc/default/grub of the other system like it appears 
 to do with other file systems. I don't get any additional entries other than 
 the currently booted Btrfs system.
 
 So it looks like I'd need to manually add a configfile menu entry, pointing 
 to each Btrfs bootable system. Chainloading from one grub to another is not 
 useful. Better if they're all on the same GRUB, and use configfile.
 
 There appear to be two situations where you have multiple software systems 
 installed on the same hardware (real or virtual) -- root on a btrfs subvolume 
 or /boot installed on an LV.

I'm unclear on the use case. If /boot is on its own LV, why not just use ext4? 
If Btrfs, then I'd expect boot to be included in one Btrfs volume, along with 
home and rootfs. If that's on an LV, the use cases I'm envisioning for that the 
boot loader wouldn't be aware of the LV (e.g. VM, iSCSI, etc).

 Since developing those patches, I have had second thoughts about how 
 multiboot should be done and now believe that it should involve a grub 
 partition with a simple grub.cfg file and os-prober disabled. The simple 
 grub.cfg file had menuentry definitions which point to the real grub.cfg 
 file for each system.

It's messy without agreement on how to boot all distributions. I've also 
thought of this primary/public grub.cfg, and secondary/private grub.cfg. The 
first forwards to the grub.cfg for each distribution using configfile or 
legacyconfigfile. However, that primary GRUB instance necessarily would need to 
contain the entry for Windows, etc.

So each distribution's grub-mkconfig would need to know to write out a 
distribution specific grub.cfg which is based on os-prober disabled; but then 
also be capable of updating the primary grub.cfg - not merely write out a new 
one.

And then if a distribution is deleted from the system, how is this discovered, 
and the old entry in the primary grub.cfg removed? Messy.

 I do not actually use a grub partition but, instead, a minimal system with 
 grub2 and installation into the MBR.

grub's core.img can accept a baked in grub.cfg


Chris Murphy--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: /boot as a btrfs subvolume

2013-01-07 Thread Chris Murphy

On Jan 7, 2013, at 10:48 AM, Swâmi Petaramesh sw...@petaramesh.org wrote:

 Le 06/01/2013 21:05, Gene Czarcinski a écrit :
 I assume you have a grub partition (or its equivalent) with a
 grub.cfg file having menuentry definitions [pointing to the different
 grub.cfg file for each system ... that seems to work well (at least
 for me).  Currently, os-prober does not support btrfs. 
 
 I don't have any specific grub partition, every Linux install has its
 own grub install in its own /boot BTRFS subvolume.

Grub partition, as I'm taking it, means: MBR gap on MBR layouts; and BIOS Boot 
partition (GUID 21686148-6449-6E6F-744E-656564454649) on GPT layouts.

In any case, MBR gap, dedicated GPT partition, or grub.efi application, it's 
possible to bake a grub.cfg directly into the core.img.

Technically using configfile is not chain loading, you're just having the same 
instance of GRUB load an alternate config. Chain loading GRUB implies loading a 
new binary (boot loader) to replace the first instance of GRUB. Using 
configfile has the caveat that the GRUB using configfile may be of a different 
version than the GRUB package that produced the grub.cfg being configfile 
loaded. And as the syntax of grub.cfg has been changing quite a bit it's 
possible to have related failures.

Chris Murphy--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: /boot as a btrfs subvolume

2013-01-07 Thread Gene Czarcinski

On 01/07/2013 01:42 PM, Chris Murphy wrote:

On Jan 7, 2013, at 8:42 AM, Gene Czarcinski g...@czarc.net wrote:


On 01/06/2013 09:00 PM, Chris Murphy wrote:

On Jan 6, 2013, at 1:05 PM, Gene Czarcinski g...@czarc.net wrote:


On 01/06/2013 02:17 PM, Swâmi Petaramesh wrote:

Le 06/01/2013 20:11, Chris Murphy a écrit :

If you use UUID, and you use subvol=, and you don't rename/move your subvolume, 
it's perfectly safe. Nevertheless, GRUB becoming subvolid aware seems like a 
good idea to me, but I have no idea what's involved in that.

I actually run several machines on which I have /boot in a separate
BTRFS subvol, without any issue. I have a multiboot between several
different distros (typically Ubuntu, Mint, LMDE, Bodhi... All Ubuntu
derivatives except for LMDE which is Debian-based...) sharing the same
BTRFS container and using different subvols i.e. UBUNTU/@boot,
LMDE/@boot etc...

Works just great.


I assume you have a grub partition (or its equivalent) with a grub.cfg file 
having menuentry definitions [pointing to the different grub.cfg file for each system ... 
that seems to work well (at least for me).  Currently, os-prober does not support btrfs.

I have taken a little look at the grub2 source code and there is some mention 
of both btrfs and zfs (and also btrfs subvolumes) in the changelogs.  However, 
it is not clear to me (and I have not had the time yet) to explore exactly what 
the source code is doing or not doing.

Well, at least with the f18 version of GRUB 2 2.00, whether alternative Btrfs 
bootable systems are mounted or not, -mkconfig isn't searching/finding for the 
/etc/fstab and /etc/default/grub of the other system like it appears to do with 
other file systems. I don't get any additional entries other than the currently 
booted Btrfs system.

So it looks like I'd need to manually add a configfile menu entry, pointing to 
each Btrfs bootable system. Chainloading from one grub to another is not 
useful. Better if they're all on the same GRUB, and use configfile.


There appear to be two situations where you have multiple software systems 
installed on the same hardware (real or virtual) -- root on a btrfs subvolume 
or /boot installed on an LV.

I'm unclear on the use case. If /boot is on its own LV, why not just use ext4? 
If Btrfs, then I'd expect boot to be included in one Btrfs volume, along with 
home and rootfs. If that's on an LV, the use cases I'm envisioning for that the 
boot loader wouldn't be aware of the LV (e.g. VM, iSCSI, etc).

Oops.  It is me that was not clear.

Case 1: You have multiple systems installed on one piece of real or 
virtual hardware.  These could be different releases or even different 
distributions.  At least one of these systems has root installed on a 
btrfs subvolume.  When grub2-mkconfig is run on that system everything 
is OK.  Now you are on a different system and you run grub2-mkconfig 
with os-prober enabled so it will look for other systems that could be 
booted.  It will not see the system with root on the btrfs subvolume.  
That is because os-prober does not understand btrfs.


Case2: This is similar to the above situation except that you have /boot 
installed in an LV rather than a regular partition with the LV formated 
with ext4.  Similar to the above, if, from a different installed system 
you run grub2-mkconfig with os-prober enabled, the system with the /boot 
in the LV will not be seen.  I have not looked into this one so it may 
be os-prober or grub2 itself.


Seeing this convinced me to avoid using os-prober and simply have a 
boot director with simple menuentry definitions pointing to the 
various grub.cfg files for each system.



Since developing those patches, I have had second thoughts about how multiboot should be 
done and now believe that it should involve a grub partition with a simple grub.cfg file 
and os-prober disabled. The simple grub.cfg file had menuentry definitions which point to 
the real grub.cfg file for each system.

It's messy without agreement on how to boot all distributions. I've also thought of this 
primary/public grub.cfg, and secondary/private grub.cfg. The first forwards 
to the grub.cfg for each distribution using configfile or legacyconfigfile. However, that 
primary GRUB instance necessarily would need to contain the entry for Windows, etc.

So each distribution's grub-mkconfig would need to know to write out a 
distribution specific grub.cfg which is based on os-prober disabled; but then 
also be capable of updating the primary grub.cfg - not merely write out a new 
one.

And then if a distribution is deleted from the system, how is this discovered, 
and the old entry in the primary grub.cfg removed? Messy.
The hardware systems have four or five software systems on each.  It is 
not so much of deleting an distribution as much as just ignoring it.  
Disk space really is pretty inexpensive these days.



I do not actually use a grub partition but, instead, a minimal system with 
grub2 and 

[PATCH] btrfs: add no file data flag to btrfs send ioctl

2013-01-07 Thread Mark Fasheh
This patch adds the flag, BTRFS_SEND_FLAG_NO_FILE_DATA to the btrfs send
ioctl code. When this flag is set, the btrfs send code will never write file
data into the stream (thus also avoiding expensive reads of that data in the
first place). BTRFS_SEND_C_UPDATE_EXTENT commands will be sent (instead of
BTRFS_SEND_C_WRITE) with an offset, length pair indicating the extent in
question.

This patch does not affect the operation of BTRFS_SEND_C_CLONE commands -
they will continue to be sent when a search finds an appropriate extent to
clone from.

Signed-off-by: Mark Fasheh mfas...@suse.de
---
 fs/btrfs/ioctl.h |7 +++
 fs/btrfs/send.c  |   45 +
 fs/btrfs/send.h  |1 +
 3 files changed, 49 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ioctl.h b/fs/btrfs/ioctl.h
index 731e287..1f6cfdd 100644
--- a/fs/btrfs/ioctl.h
+++ b/fs/btrfs/ioctl.h
@@ -363,6 +363,13 @@ struct btrfs_ioctl_received_subvol_args {
__u64   reserved[16];   /* in */
 };
 
+/*
+ * Caller doesn't want file data in the send stream, even if the
+ * search of clone sources doesn't find an extent. UPDATE_EXTENT
+ * commands will be sent instead of WRITE commands.
+ */
+#define BTRFS_SEND_FLAG_NO_FILE_DATA 0x1
+
 struct btrfs_ioctl_send_args {
__s64 send_fd;  /* in */
__u64 clone_sources_count;  /* in */
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index e78b297..d725536 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -85,6 +85,7 @@ struct send_ctx {
u32 send_max_size;
u64 total_send_size;
u64 cmd_send_size[BTRFS_SEND_C_MAX + 1];
+   u64 flags;  /* 'flags' member of btrfs_ioctl_send_args is u64 */
 
struct vfsmount *mnt;
 
@@ -3707,6 +3708,39 @@ out:
return ret;
 }
 
+/*
+ * Send an update extent command to user space.
+ */
+static int send_update_extent(struct send_ctx *sctx,
+ u64 offset, u32 len)
+{
+   int ret = 0;
+   struct fs_path *p;
+
+   p = fs_path_alloc(sctx);
+   if (!p)
+   return -ENOMEM;
+
+   ret = begin_cmd(sctx, BTRFS_SEND_C_UPDATE_EXTENT);
+   if (ret  0)
+   goto out;
+
+   ret = get_cur_path(sctx, sctx-cur_ino, sctx-cur_inode_gen, p);
+   if (ret  0)
+   goto out;
+
+   TLV_PUT_PATH(sctx, BTRFS_SEND_A_PATH, p);
+   TLV_PUT_U64(sctx, BTRFS_SEND_A_FILE_OFFSET, offset);
+   TLV_PUT_U64(sctx, BTRFS_SEND_A_SIZE, len);
+
+   ret = send_cmd(sctx);
+
+tlv_put_failure:
+out:
+   fs_path_free(sctx, p);
+   return ret;
+}
+
 static int send_write_or_clone(struct send_ctx *sctx,
   struct btrfs_path *path,
   struct btrfs_key *key,
@@ -3742,7 +3776,11 @@ static int send_write_or_clone(struct send_ctx *sctx,
goto out;
}
 
-   if (!clone_root) {
+   if (clone_root) {
+   ret = send_clone(sctx, offset, len, clone_root);
+   } else if (sctx-flags  BTRFS_SEND_FLAG_NO_FILE_DATA) {
+   ret = send_update_extent(sctx, offset, len);
+   } else {
while (pos  len) {
l = len - pos;
if (l  BTRFS_SEND_READ_SIZE)
@@ -3755,10 +3793,7 @@ static int send_write_or_clone(struct send_ctx *sctx,
pos += ret;
}
ret = 0;
-   } else {
-   ret = send_clone(sctx, offset, len, clone_root);
}
-
 out:
return ret;
 }
@@ -4570,6 +4605,8 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user 
*arg_)
INIT_RADIX_TREE(sctx-name_cache, GFP_NOFS);
INIT_LIST_HEAD(sctx-name_cache_list);
 
+   sctx-flags = arg-flags;
+
sctx-send_filp = fget(arg-send_fd);
if (IS_ERR(sctx-send_filp)) {
ret = PTR_ERR(sctx-send_filp);
diff --git a/fs/btrfs/send.h b/fs/btrfs/send.h
index 1bf4f32..8bb18f7 100644
--- a/fs/btrfs/send.h
+++ b/fs/btrfs/send.h
@@ -86,6 +86,7 @@ enum btrfs_send_cmd {
BTRFS_SEND_C_UTIMES,
 
BTRFS_SEND_C_END,
+   BTRFS_SEND_C_UPDATE_EXTENT,
__BTRFS_SEND_C_MAX,
 };
 #define BTRFS_SEND_C_MAX (__BTRFS_SEND_C_MAX - 1)
-- 
1.7.7

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Btrfs: add orphan before truncating pagecache

2013-01-07 Thread Josef Bacik
Running xfstests 83 in a loop would sometimes fail the fsck.  This happens
because if we invalidate a page that already has an ordered extent setup for
it we will complete the ordered extent ourselves, assuming that the truncate
will clean everything up.  The problem with this is there is plenty of time
for the truncate to fail after we've done this work.  So to fix this we need
to add the orphan item first to make sure the cleanup gets done properly,
and then we can truncate the pagecache and all that stuff and be safe.  This
fixes the btrfsck failures I was seeing while running 83 in a loop.  Thanks,

Signed-off-by: Josef Bacik jba...@fusionio.com
---
 fs/btrfs/inode.c |   45 -
 1 files changed, 36 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d3ce93e..20138b6 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2484,6 +2484,18 @@ int btrfs_orphan_cleanup(struct btrfs_root *root)
continue;
}
nr_truncate++;
+
+   /* 1 for the orphan item deletion. */
+   trans = btrfs_start_transaction(root, 1);
+   if (IS_ERR(trans)) {
+   ret = PTR_ERR(trans);
+   goto out;
+   }
+   ret = btrfs_orphan_add(trans, inode);
+   btrfs_end_transaction(trans, root);
+   if (ret)
+   goto out;
+
ret = btrfs_truncate(inode);
} else {
nr_unlink++;
@@ -3789,6 +3801,29 @@ static int btrfs_setsize(struct inode *inode, loff_t 
newsize)
set_bit(BTRFS_INODE_ORDERED_DATA_CLOSE,
BTRFS_I(inode)-runtime_flags);
 
+   /*
+* 1 for the orphan item we're going to add
+* 1 for the orphan item deletion.
+*/
+   trans = btrfs_start_transaction(root, 2);
+   if (IS_ERR(trans))
+   return PTR_ERR(trans);
+
+   /*
+* We need to do this in case we fail at _any_ point during the
+* actual truncate.  Once we do the truncate_setsize we could
+* invalidate pages which forces any outstanding ordered io to
+* be instantly completed which will give us extents that need
+* to be truncated.  If we fail to get an orphan inode down we
+* could have left over extents that were never meant to live,
+* so we need to garuntee from this point on that everything
+* will be consistent.
+*/
+   ret = btrfs_orphan_add(trans, inode);
+   btrfs_end_transaction(trans, root);
+   if (ret)
+   return ret;
+
/* we don't support swapfiles, so vmtruncate shouldn't fail */
truncate_setsize(inode, newsize);
ret = btrfs_truncate(inode);
@@ -6939,11 +6974,9 @@ static int btrfs_truncate(struct inode *inode)
 
/*
 * 1 for the truncate slack space
-* 1 for the orphan item we're going to add
-* 1 for the orphan item deletion
 * 1 for updating the inode.
 */
-   trans = btrfs_start_transaction(root, 4);
+   trans = btrfs_start_transaction(root, 2);
if (IS_ERR(trans)) {
err = PTR_ERR(trans);
goto out;
@@ -6954,12 +6987,6 @@ static int btrfs_truncate(struct inode *inode)
  min_size);
BUG_ON(ret);
 
-   ret = btrfs_orphan_add(trans, inode);
-   if (ret) {
-   btrfs_end_transaction(trans, root);
-   goto out;
-   }
-
/*
 * setattr is responsible for setting the ordered_data_close flag,
 * but that is only tested during the last file release.  That
-- 
1.7.7.6

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] btrfs: add no file data flag to btrfs send ioctl

2013-01-07 Thread David Sterba
On Mon, Jan 07, 2013 at 01:51:19PM -0800, Mark Fasheh wrote:
 +#define BTRFS_SEND_FLAG_NO_FILE_DATA 0x1
 +

 + sctx-flags = arg-flags;
 +

For compatibility reasons, you should mask the user input value and only
allow the supported flag(s).

david
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: /boot as a btrfs subvolume

2013-01-07 Thread Chris Murphy

On Jan 7, 2013, at 2:41 PM, Gene Czarcinski g...@czarc.net wrote:
 
 And then if a distribution is deleted from the system, how is this 
 discovered, and the old entry in the primary grub.cfg removed? Messy.
 The hardware systems have four or five software systems on each.  It is not 
 so much of deleting an distribution as much as just ignoring it.  Disk space 
 really is pretty inexpensive these days.

The case of natively booting/rebooting different OS's/distributions on BIOS is 
difficult. For VM, I think maybe a simpler approach like syslinux is better, 
which understands Btrfs I believe. And for UEFI, I think a simpler approach is 
something like rEFInd or gummiboot, as boot manager, and then for various linux 
distributions to be built with EFI_STUB.

 
 I do not actually use a grub partition but, instead, a minimal system with 
 grub2 and installation into the MBR.
 grub's core.img can accept a baked in grub.cfg
 If this was a true production situation then this would be a good choice but 
 I am dealing with more of a laboratory environment with things that change a 
 lot.

Yeah currently it's difficult to do multiple OS native booting.

Chris Murphy--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] btrfs: add no file data flag to btrfs send ioctl

2013-01-07 Thread Mark Fasheh
On Mon, Jan 07, 2013 at 11:01:17PM +0100, David Sterba wrote:
 On Mon, Jan 07, 2013 at 01:51:19PM -0800, Mark Fasheh wrote:
  +#define BTRFS_SEND_FLAG_NO_FILE_DATA 0x1
  +
 
  +   sctx-flags = arg-flags;
  +
 
 For compatibility reasons, you should mask the user input value and only
 allow the supported flag(s).

Fixed, and tested. Thanks for the review David!
--Mark

--
Mark Fasheh


From 601b60fa3fa9e319c11f91fd317dda3fac43b955 Mon Sep 17 00:00:00 2001

btrfs: add no file data flag to btrfs send ioctl

This patch adds the flag, BTRFS_SEND_FLAG_NO_FILE_DATA to the btrfs send
ioctl code. When this flag is set, the btrfs send code will never write file
data into the stream (thus also avoiding expensive reads of that data in the
first place). BTRFS_SEND_C_UPDATE_EXTENT commands will be sent (instead of
BTRFS_SEND_C_WRITE) with an offset, length pair indicating the extent in
question.

This patch does not affect the operation of BTRFS_SEND_C_CLONE commands -
they will continue to be sent when a search finds an appropriate extent to
clone from.

Signed-off-by: Mark Fasheh mfas...@suse.de
---
 fs/btrfs/ioctl.h |7 +++
 fs/btrfs/send.c  |   48 
 fs/btrfs/send.h  |1 +
 3 files changed, 52 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ioctl.h b/fs/btrfs/ioctl.h
index 731e287..1f6cfdd 100644
--- a/fs/btrfs/ioctl.h
+++ b/fs/btrfs/ioctl.h
@@ -363,6 +363,13 @@ struct btrfs_ioctl_received_subvol_args {
__u64   reserved[16];   /* in */
 };
 
+/*
+ * Caller doesn't want file data in the send stream, even if the
+ * search of clone sources doesn't find an extent. UPDATE_EXTENT
+ * commands will be sent instead of WRITE commands.
+ */
+#define BTRFS_SEND_FLAG_NO_FILE_DATA 0x1
+
 struct btrfs_ioctl_send_args {
__s64 send_fd;  /* in */
__u64 clone_sources_count;  /* in */
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index e78b297..8d0c6b4 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -85,6 +85,7 @@ struct send_ctx {
u32 send_max_size;
u64 total_send_size;
u64 cmd_send_size[BTRFS_SEND_C_MAX + 1];
+   u64 flags;  /* 'flags' member of btrfs_ioctl_send_args is u64 */
 
struct vfsmount *mnt;
 
@@ -3707,6 +3708,39 @@ out:
return ret;
 }
 
+/*
+ * Send an update extent command to user space.
+ */
+static int send_update_extent(struct send_ctx *sctx,
+ u64 offset, u32 len)
+{
+   int ret = 0;
+   struct fs_path *p;
+
+   p = fs_path_alloc(sctx);
+   if (!p)
+   return -ENOMEM;
+
+   ret = begin_cmd(sctx, BTRFS_SEND_C_UPDATE_EXTENT);
+   if (ret  0)
+   goto out;
+
+   ret = get_cur_path(sctx, sctx-cur_ino, sctx-cur_inode_gen, p);
+   if (ret  0)
+   goto out;
+
+   TLV_PUT_PATH(sctx, BTRFS_SEND_A_PATH, p);
+   TLV_PUT_U64(sctx, BTRFS_SEND_A_FILE_OFFSET, offset);
+   TLV_PUT_U64(sctx, BTRFS_SEND_A_SIZE, len);
+
+   ret = send_cmd(sctx);
+
+tlv_put_failure:
+out:
+   fs_path_free(sctx, p);
+   return ret;
+}
+
 static int send_write_or_clone(struct send_ctx *sctx,
   struct btrfs_path *path,
   struct btrfs_key *key,
@@ -3742,7 +3776,11 @@ static int send_write_or_clone(struct send_ctx *sctx,
goto out;
}
 
-   if (!clone_root) {
+   if (clone_root) {
+   ret = send_clone(sctx, offset, len, clone_root);
+   } else if (sctx-flags  BTRFS_SEND_FLAG_NO_FILE_DATA) {
+   ret = send_update_extent(sctx, offset, len);
+   } else {
while (pos  len) {
l = len - pos;
if (l  BTRFS_SEND_READ_SIZE)
@@ -3755,10 +3793,7 @@ static int send_write_or_clone(struct send_ctx *sctx,
pos += ret;
}
ret = 0;
-   } else {
-   ret = send_clone(sctx, offset, len, clone_root);
}
-
 out:
return ret;
 }
@@ -4570,6 +4605,11 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user 
*arg_)
INIT_RADIX_TREE(sctx-name_cache, GFP_NOFS);
INIT_LIST_HEAD(sctx-name_cache_list);
 
+   if (arg-flags  ~BTRFS_SEND_FLAG_NO_FILE_DATA)
+   return -EINVAL;
+
+   sctx-flags = arg-flags;
+
sctx-send_filp = fget(arg-send_fd);
if (IS_ERR(sctx-send_filp)) {
ret = PTR_ERR(sctx-send_filp);
diff --git a/fs/btrfs/send.h b/fs/btrfs/send.h
index 1bf4f32..8bb18f7 100644
--- a/fs/btrfs/send.h
+++ b/fs/btrfs/send.h
@@ -86,6 +86,7 @@ enum btrfs_send_cmd {
BTRFS_SEND_C_UTIMES,
 
BTRFS_SEND_C_END,
+   BTRFS_SEND_C_UPDATE_EXTENT,
__BTRFS_SEND_C_MAX,
 };
 #define BTRFS_SEND_C_MAX (__BTRFS_SEND_C_MAX - 1)
-- 
1.7.7

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to 

Re: [PATCH] btrfs: add no file data flag to btrfs send ioctl

2013-01-07 Thread Tsutomu Itoh

(2013/01/08 8:33), Mark Fasheh wrote:

On Mon, Jan 07, 2013 at 11:01:17PM +0100, David Sterba wrote:

On Mon, Jan 07, 2013 at 01:51:19PM -0800, Mark Fasheh wrote:

+#define BTRFS_SEND_FLAG_NO_FILE_DATA 0x1
+



+   sctx-flags = arg-flags;
+


For compatibility reasons, you should mask the user input value and only
allow the supported flag(s).


Fixed, and tested. Thanks for the review David!
--Mark

--
Mark Fasheh



From 601b60fa3fa9e319c11f91fd317dda3fac43b955 Mon Sep 17 00:00:00 2001


btrfs: add no file data flag to btrfs send ioctl

This patch adds the flag, BTRFS_SEND_FLAG_NO_FILE_DATA to the btrfs send
ioctl code. When this flag is set, the btrfs send code will never write file
data into the stream (thus also avoiding expensive reads of that data in the
first place). BTRFS_SEND_C_UPDATE_EXTENT commands will be sent (instead of
BTRFS_SEND_C_WRITE) with an offset, length pair indicating the extent in
question.

This patch does not affect the operation of BTRFS_SEND_C_CLONE commands -
they will continue to be sent when a search finds an appropriate extent to
clone from.

Signed-off-by: Mark Fasheh mfas...@suse.de
---
  fs/btrfs/ioctl.h |7 +++
  fs/btrfs/send.c  |   48 
  fs/btrfs/send.h  |1 +
  3 files changed, 52 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ioctl.h b/fs/btrfs/ioctl.h
index 731e287..1f6cfdd 100644
--- a/fs/btrfs/ioctl.h
+++ b/fs/btrfs/ioctl.h
@@ -363,6 +363,13 @@ struct btrfs_ioctl_received_subvol_args {
__u64   reserved[16];   /* in */
  };

+/*
+ * Caller doesn't want file data in the send stream, even if the
+ * search of clone sources doesn't find an extent. UPDATE_EXTENT
+ * commands will be sent instead of WRITE commands.
+ */
+#define BTRFS_SEND_FLAG_NO_FILE_DATA 0x1
+
  struct btrfs_ioctl_send_args {
__s64 send_fd;  /* in */
__u64 clone_sources_count;  /* in */
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index e78b297..8d0c6b4 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -85,6 +85,7 @@ struct send_ctx {
u32 send_max_size;
u64 total_send_size;
u64 cmd_send_size[BTRFS_SEND_C_MAX + 1];
+   u64 flags;  /* 'flags' member of btrfs_ioctl_send_args is u64 */

struct vfsmount *mnt;

@@ -3707,6 +3708,39 @@ out:
return ret;
  }

+/*
+ * Send an update extent command to user space.
+ */
+static int send_update_extent(struct send_ctx *sctx,
+ u64 offset, u32 len)
+{
+   int ret = 0;
+   struct fs_path *p;
+
+   p = fs_path_alloc(sctx);
+   if (!p)
+   return -ENOMEM;
+
+   ret = begin_cmd(sctx, BTRFS_SEND_C_UPDATE_EXTENT);
+   if (ret  0)
+   goto out;
+
+   ret = get_cur_path(sctx, sctx-cur_ino, sctx-cur_inode_gen, p);
+   if (ret  0)
+   goto out;
+
+   TLV_PUT_PATH(sctx, BTRFS_SEND_A_PATH, p);
+   TLV_PUT_U64(sctx, BTRFS_SEND_A_FILE_OFFSET, offset);
+   TLV_PUT_U64(sctx, BTRFS_SEND_A_SIZE, len);
+
+   ret = send_cmd(sctx);
+
+tlv_put_failure:
+out:
+   fs_path_free(sctx, p);
+   return ret;
+}
+
  static int send_write_or_clone(struct send_ctx *sctx,
   struct btrfs_path *path,
   struct btrfs_key *key,
@@ -3742,7 +3776,11 @@ static int send_write_or_clone(struct send_ctx *sctx,
goto out;
}

-   if (!clone_root) {
+   if (clone_root) {
+   ret = send_clone(sctx, offset, len, clone_root);
+   } else if (sctx-flags  BTRFS_SEND_FLAG_NO_FILE_DATA) {
+   ret = send_update_extent(sctx, offset, len);
+   } else {
while (pos  len) {
l = len - pos;
if (l  BTRFS_SEND_READ_SIZE)
@@ -3755,10 +3793,7 @@ static int send_write_or_clone(struct send_ctx *sctx,
pos += ret;
}
ret = 0;
-   } else {
-   ret = send_clone(sctx, offset, len, clone_root);
}
-
  out:
return ret;
  }
@@ -4570,6 +4605,11 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user 
*arg_)
INIT_RADIX_TREE(sctx-name_cache, GFP_NOFS);
INIT_LIST_HEAD(sctx-name_cache_list);




+   if (arg-flags  ~BTRFS_SEND_FLAG_NO_FILE_DATA)
+   return -EINVAL;


I like

#define BTRFS_SEND_FLAGS(BTRFS_SEND_FLAG_NO_FILE_DATA | ...)
if (arg-flags  ~BTRFS_SEND_FLAGS)
return -EINVAL;

even if current flag is only one.

Thanks,
Tsutomu


+
+   sctx-flags = arg-flags;
+
sctx-send_filp = fget(arg-send_fd);
if (IS_ERR(sctx-send_filp)) {
ret = PTR_ERR(sctx-send_filp);
diff --git a/fs/btrfs/send.h b/fs/btrfs/send.h
index 1bf4f32..8bb18f7 100644
--- a/fs/btrfs/send.h
+++ b/fs/btrfs/send.h
@@ -86,6 +86,7 @@ enum btrfs_send_cmd {
BTRFS_SEND_C_UTIMES,

BTRFS_SEND_C_END,

chattr +C vs. btrfs subvolume snapshot

2013-01-07 Thread Jun Lion
What happens if you set an individual file inside a subvolume as nocow
(chattr +C) and then take a snapshot of that subvolume and modify the
file in both?

Will btrfs now ignore the nocow attribute completely or will it do as
few copies as possible? (I'd love to know if it's possible to visualize
the fragmentation of a single file.)
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: fix off-by-one in lseek

2013-01-07 Thread Liu Bo
On Mon, Jan 07, 2013 at 05:20:50PM +0100, David Sterba wrote:
 On Mon, Jan 07, 2013 at 11:53:08AM +0800, Liu Bo wrote:
  Lock end is inclusive.
  
  Signed-off-by: Liu Bo bo.li@oracle.com
  ---
   fs/btrfs/file.c |1 +
   1 files changed, 1 insertions(+), 0 deletions(-)
  
  diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
  index 77061bf..1e16b6d 100644
  --- a/fs/btrfs/file.c
  +++ b/fs/btrfs/file.c
  @@ -2241,6 +2241,7 @@ static int find_desired_extent(struct inode *inode, 
  loff_t *offset, int whence)
  if (lockend = lockstart)
  lockend = lockstart + root-sectorsize;
   
  +   lockend--;
  len = lockend - lockstart + 1;
   
  len = max_t(u64, len, root-sectorsize);
 
 Fix looks ok. I think this should be caught at runtime as well, the
 number of ways how the lock start and end are passed is not small and it
 need not be always possible to catch it from reading sources. The range
 is inclusive, so it's 'lock end % 2 == 1' right? (in the bit
 manipulating primitives).

Hmm, not always here, lockend = inode-i_size - 1, so lockend % 2 == 1 may not
be true.

But this check can worked in those places where we've rounded the range up to
PAGE_SIZE :)

thanks,
liubo
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: chattr +C vs. btrfs subvolume snapshot

2013-01-07 Thread Liu Bo
On Tue, Jan 08, 2013 at 12:37:11AM +, Jun Lion wrote:
 What happens if you set an individual file inside a subvolume as nocow
 (chattr +C) and then take a snapshot of that subvolume and modify the
 file in both?
 
 Will btrfs now ignore the nocow attribute completely or will it do as
 few copies as possible? (I'd love to know if it's possible to visualize
 the fragmentation of a single file.)

Well, btrfs nearly puts everything with a kind of timestamp,
generation, which stands for transaction id.

For your case, the NOCOW file is created just before taking the
snapshot, so btrfs thinks of it being shared, and for shared parts, we
must COW it to keep everything right when trying to modify them.

And for the fragmentation, do you mean 'filefrag'?

thanks,
liubo
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Btrfs fs not mounting or being identified after power loss.

2013-01-07 Thread Jordan Windsor
Hello,
The power to my computer was suddenly cut, one of my btrfs fs (not the
root) has somewhat disappeared, I can't mount it or use any of the
available btrfs tools on it.
Here's some command output:

[root@archpc ~]# fdisk -l /dev/sdb

Disk /dev/sdb: 808.9 GB, 80614912 bytes, 1579860576 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000db8c9

   Device Boot  Start End  Blocks   Id  System
/dev/sdb1  63  1567571967   783785952+  83  Linux
/dev/sdb4  1567571968  1579859967 6144000   82  Linux swap / Solaris


/dev/sdb1 is the lost btrfs fs.

[root@archpc ~]# btrfs-find-root -v /dev/sdb1
No valid Btrfs found on /dev/sdb1

[root@archpc ~]# btrfsck /dev/sdb1
No valid Btrfs found on /dev/sdb1

[root@archpc ~]# mount /dev/sdb1 /mnt/restore/
mount: wrong fs type, bad option, bad superblock on /dev/sdb1,
   missing codepage or helper program, or other error
   In some cases useful info is found in syslog - try
   dmesg | tail or so

[ 1767.655429] JBD2: no valid journal superblock found
[ 1767.655432] EXT4-fs (sdb1): error loading journal


If you need anymore information please ask.
Thanks.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: replace simple_strtoull() with kstrtoull() in btrfs_ioctl_resize()

2013-01-07 Thread Jeff Liu

On 01/08/2013 12:03 AM, David Sterba wrote:
 On Sun, Jan 06, 2013 at 11:53:41AM +0800, Jeff Liu wrote:
 -devid = simple_strtoull(devstr, end, 10);
 +ret = kstrtoull(devstr, 10, devid);
 +if (ret) {
 +pr_err(btrfs: resizer unable to parse device %s\n,
 +devstr);
 
 Code looks ok, I would prefer the message error text to say:
 
   pr_err(btrfs: resize unable to parse device id %s\n,
   devstr);
 
 but that's maybe just me :)
Thanks for the review, yours is better :)
The revised patch was shown as following:


simple_strtoull() is obsolete, use kstrtoull() instead.

Signed-off-by: Jie Liu jeff@oracle.com
Cc: David Sterba dste...@suse.cz

---
 fs/btrfs/ioctl.c |9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 8fcf9a5..d9045eb 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1331,11 +1331,16 @@ static noinline int btrfs_ioctl_resize(struct 
btrfs_root *root,
sizestr = vol_args-name;
devstr = strchr(sizestr, ':');
if (devstr) {
-   char *end;
sizestr = devstr + 1;
*devstr = '\0';
devstr = vol_args-name;
-   devid = simple_strtoull(devstr, end, 10);
+   ret = kstrtoull(devstr, 10, devid);
+   if (ret) {
+   pr_err(btrfs: resizer unable to parse device id %s\n,
+   devstr);
+   ret = -EINVAL;
+   goto out_free;
+   }
printk(KERN_INFO btrfs: resizing devid %llu\n,
   (unsigned long long)devid);
}
-- 
1.7.9.5

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v9 1/2] Btrfs: Add a new ioctl to get the label of a mounted file system

2013-01-07 Thread Jeff Liu
Hi Goffredo,

On 01/08/2013 01:32 AM, Goffredo Baroncelli wrote:
 Hi Jeff,
 
 On 01/07/2013 07:24 AM, Jeff Liu wrote:
 Hi,
 On 01/07/2013 02:44 AM, Goffredo Baroncelli wrote:
 Hi Jeff,

 On 01/05/2013 03:48 AM, Jeff Liu wrote:
 Add a new ioctl(2) BTRFS_IOC_GET_FSLABLE, so that we can get the label 
 upon a mounted filesystem.

 Signed-off-by: Jie Liu jeff@oracle.com
 Signed-off-by: Anand Jain anand.j...@oracle.com
 Cc: Miao Xie mi...@cn.fujitsu.com
 Cc: Goffredo Baroncelli kreij...@inwind.it
 Cc: David Sterba dste...@suse.cz

 ---
  fs/btrfs/ioctl.c |   21 +
  fs/btrfs/ioctl.h |2 ++
  2 files changed, 23 insertions(+)

 diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
 index 8fcf9a5..ef2f55a 100644
 --- a/fs/btrfs/ioctl.c
 +++ b/fs/btrfs/ioctl.c
 @@ -3699,6 +3699,25 @@ out:
return ret;
  }
  

 May be that it was already discussed, and I am missing something, but if
 we check the label length we should terminate the string with a zero in
 case this is too long...

 However this is a minor bug, please push this patch forward..
 I don't think so.
 Why we need to terminate the string with a zero?
 We have already truncated the label string up to maximum 255 bytes if it
 was too long.

 Consider the normal conditions, i.e. the label string is less than 256,
 we just copy it back to the user without a terminated NUL.
 
 Sorry I don't understand the reason to terminate the string up to 255
 chars without adding a zero. Or we copy all the buffer (BTRFS_LABEL_SIZE
 characters without furthers checks) or we copy BTRFS_LABEL_SIZE-1
 characters, adding a zero at the end.
For normal cases, there should already has a NUL at the end of the input
label string when performing btrfs_ioc_set_fslabel(), and we have also
added a NUL for unmounted label set in btrfs-progs.  In both cases, put
it again on btrfs_ioc_get_fslabel() is redundant IMO.

If we ran into a too long label, return the first 255 bytes is fine for
both mounted and unmounted get_label() routines in btrfs-progs since I
did memset(label, 0, sizeof(label)) at first for fetching mounted fs label.

Also, I have not found any existing kernel code does similar things by
adding a NUL to end up a char array, could you show me an example?


Or we copy all the buffer(BTRFS_LABEL_SIZE) characters without
furthers checks

If an array is capable of N chars, we can only full it with N - 1 chars.

Thanks,
-Jeff


 +static int btrfs_ioctl_get_fslabel(struct file *file, void __user *arg)
 +{
 +  struct btrfs_root *root = BTRFS_I(fdentry(file)-d_inode)-root;
 +  const char *label = root-fs_info-super_copy-label;
 +  size_t len = strnlen(label, BTRFS_LABEL_SIZE);
 +  int ret;
 +   int label_is_too_long = 0;
bool label_is_too_long = false;
 +
 +  if (len == BTRFS_LABEL_SIZE) {
 +  pr_warn(btrfs: label is too long, return the first %zu 
 bytes\n,
 +  --len);

 +   label_is_too_long = 1;
label_is_too_long = true;
 +  }
 +
 +  mutex_lock(root-fs_info-volume_mutex);
 +  ret = copy_to_user(arg, label, len);
 +   if (!ret  label_is_too_long)
 +   ret = copy_to_user(arg+BTRFS_LABEL_SIZE, , 1);
 Hmm? ret = copy_to_user(arg + len, , 1);

 My idea was to put a zero at the end of the string, but I did a mistake
 with the cutpaste.. sorry :-)
 
 +  mutex_unlock(root-fs_info-volume_mutex);
 If you or anyone has objection to this patch, please just post yours, I
 won't follow it up again.
 
 Sorry I don't want to bother anybody, I know that reviewing a patch nine
 times is a very havvy work. I (and I think more other peoples) really
 appreciate your efforts.
 
 BR
 G.Baroncelli
 
 +
 +  return ret ? -EFAULT : 0;
 +}
 +
  long btrfs_ioctl(struct file *file, unsigned int
cmd, unsigned long arg)
  {
 @@ -3797,6 +3816,8 @@ long btrfs_ioctl(struct file *file, unsigned int
return btrfs_ioctl_qgroup_create(root, argp);
case BTRFS_IOC_QGROUP_LIMIT:
return btrfs_ioctl_qgroup_limit(root, argp);
 +  case BTRFS_IOC_GET_FSLABEL:
 +  return btrfs_ioctl_get_fslabel(file, argp);
}
  
return -ENOTTY;
 diff --git a/fs/btrfs/ioctl.h b/fs/btrfs/ioctl.h
 index 731e287..5b2cbef 100644
 --- a/fs/btrfs/ioctl.h
 +++ b/fs/btrfs/ioctl.h
 @@ -451,6 +451,8 @@ struct btrfs_ioctl_send_args {
   struct btrfs_ioctl_qgroup_create_args)
  #define BTRFS_IOC_QGROUP_LIMIT _IOR(BTRFS_IOCTL_MAGIC, 43, \
   struct btrfs_ioctl_qgroup_limit_args)
 +#define BTRFS_IOC_GET_FSLABEL _IOR(BTRFS_IOCTL_MAGIC, 49, \
 + char[BTRFS_LABEL_SIZE])
  #define BTRFS_IOC_GET_DEV_STATS _IOWR(BTRFS_IOCTL_MAGIC, 52, \
  struct btrfs_ioctl_get_dev_stats)
  #endif




 
 

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html