Re: [RFC PATCH 0/5 v3] Btrfs: Add readonly support to replace BUG_ON phrase

2010-12-15 Thread liubo
Hi, chris,

Is there any comment on this Forced readonly mounts on errors patchset?

thanks,
Liu Bo

On 12/03/2010 04:15 PM, liubo wrote:
 Btrfs has a number of BUG_ON()s, which may lead btrfs to unpleasant panic.
 Meanwhile, they are very ugly and should be handled more propriately.
 
 There are mainly two ways to deal with these BUG_ON()s.
 
 1. For those errors which can be handled well by callers, we just return their
 error number to callers.
 
 2. For others, We can force the filesystem readonly when it hits errors, which
  is what this patchset has done. Replaced BUG_ON() with the interface provided
  in this patchset, we will get error infomation via dmesg. Since btrfs is now 
 readonly, we can save our data safely and umount it, then a btrfsck is 
 recommended.
 
 By these ways, we can protect our filesystem from panic caused by those 
 BUG_ONs.
 
 We still need a incompat flag to make old kernels happy.
 
 This patchset needs more test.
 
 v2-v3:
 - since btrfs may do log replay after crash, even it is mounted as readonly,
   and we have add a readonly check at start transaction time, it needs to set
   and to restore readonly flags around log replay.
 
 v1-v2:
 - in order to avoid deadlock thing, move write super stuff from error handle
   path to unmount time.
 - remove BTRFS_SUPER_FLAG_VALID, just use BTRFS_SUPER_FLAG_ERROR to make it
   simple.
 - add MS_RDONLY check at start of a transaction instead of commit transaction.
 ---
  fs/btrfs/ctree.h   |   19 ++
  fs/btrfs/disk-io.c |   56 +-
  fs/btrfs/super.c   |   88 
 
  fs/btrfs/transaction.c |3 ++
  4 files changed, 164 insertions(+), 2 deletions(-)
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 2/5 v3] Btrfs: avoid transaction stuff when btrfs is readonly

2010-12-15 Thread Yan, Zheng
On Fri, Dec 3, 2010 at 4:16 PM, liubo liubo2...@cn.fujitsu.com wrote:
 When the filesystem is readonly, avoid transaction stuff by checking MS_RDONLY
 at start transaction time.

 Signed-off-by: Liu Bo liubo2...@cn.fujitsu.com
 ---
  fs/btrfs/transaction.c |    3 +++
  1 files changed, 3 insertions(+), 0 deletions(-)

 diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
 index 1fffbc0..14a597d 100644
 --- a/fs/btrfs/transaction.c
 +++ b/fs/btrfs/transaction.c
 @@ -181,6 +181,9 @@ static struct btrfs_trans_handle 
 *start_transaction(struct btrfs_root *root,
        struct btrfs_trans_handle *h;
        struct btrfs_transaction *cur_trans;
        int ret;
 +
 +       if (root-fs_info-sb-s_flags  MS_RDONLY)
 +               return ERR_PTR(-EROFS);
  again:
        h = kmem_cache_alloc(btrfs_trans_handle_cachep, GFP_NOFS);
        if (!h)

There are cases that we need to start transaction when MS_RDONLY flag is set.
For example, remount FS into read-only mode and log replay.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 2/5 v3] Btrfs: avoid transaction stuff when btrfs is readonly

2010-12-15 Thread liubo
On 12/15/2010 04:45 PM, Yan, Zheng wrote:
 On Fri, Dec 3, 2010 at 4:16 PM, liubo liubo2...@cn.fujitsu.com wrote:
 When the filesystem is readonly, avoid transaction stuff by checking 
 MS_RDONLY
 at start transaction time.

 Signed-off-by: Liu Bo liubo2...@cn.fujitsu.com
 ---
  fs/btrfs/transaction.c |3 +++
  1 files changed, 3 insertions(+), 0 deletions(-)

 diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
 index 1fffbc0..14a597d 100644
 --- a/fs/btrfs/transaction.c
 +++ b/fs/btrfs/transaction.c
 @@ -181,6 +181,9 @@ static struct btrfs_trans_handle 
 *start_transaction(struct btrfs_root *root,
struct btrfs_trans_handle *h;
struct btrfs_transaction *cur_trans;
int ret;
 +
 +   if (root-fs_info-sb-s_flags  MS_RDONLY)
 +   return ERR_PTR(-EROFS);
  again:
h = kmem_cache_alloc(btrfs_trans_handle_cachep, GFP_NOFS);
if (!h)
 
 There are cases that we need to start transaction when MS_RDONLY flag is set.
 For example, remount FS into read-only mode and log replay.

However, is it weird to make changes to disk as fs is in readonly state?
IMO, btrfs needs to limit the use of these disk-change while readonly cases,
as it is not what readonly means.

Since it has been here, we can bypass readonly in those cases(as I did in the 
5th patch):

...
flags = sb-s_flags;
if (sb-s_flags  MS_RDONLY)
sb-s_flags = ~MS_RDONLY

remount()

sb-s_flags = flags;
...

thanks,
Liu Bo

 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fsck, parent transid verify failed

2010-12-15 Thread Tommy Jonsson
 Excerpts from Tommy Jonsson's message of 2010-12-08 15:07:58 -0500:
  Build the latest tools, then:
 
  btrfsck -s 1 /dev/xxx
  btrfsck -s 2 /dev/xxx
 
  If either of these work we have an easy way to get it mounted.  Just let
  me know.
 
  -chris
 

  $ btrfsck -s 1 /dev/sda
  using SB copy 1, bytenr 67108864
  parent transid verify failed on 2721514774528 wanted 39651 found 39649
  btrfsck: disk-io.c:739: open_ctree_fd: Assertion `!(!tree_root-node)' 
  failed.
  Aborted
 
  $ btrfsck -s 2 /dev/sda
  using SB copy 2, bytenr 274877906944
  parent transid verify failed on 2721514774528 wanted 39651 found 39649
  btrfsck: disk-io.c:739: open_ctree_fd: Assertion `!(!tree_root-node)' 
  failed.
  Aborted
 
  Tried btrfs-debug-tree /dev/sda and btrfs-debug-tree -e /dev/sda :
  parent transid verify failed on 2721514774528 wanted 39651 found 39649
  btrfs-debug-tree: disk-io.c:739: open_ctree_fd: Assertion
  `!(!tree_root-node)' failed.
 
  dmesg said:
  [268375.903581] device fsid 734a485d12c77872-9b0b5aa408670db4 devid 2 
  transid 39650 /dev/sdd
  [268375.904241] device fsid 734a485d12c77872-9b0b5aa408670db4 devid 1 
  transid 39651 /dev/sdc
  [268375.904526] device fsid 734a485d12c77872-9b0b5aa408670db4 devid 3 
  transid 39651 /dev/sda

 Sorry to be a bother, but do you have any other suggestions ?

 Not a bother at all, I'm polishing off a version of fsck that I hope
 will be able to construct a good tree for you.  It's my main priority
 right now and I hope to have something ready early Monday.

 -chris


Hi Chris.
Thanks for all your help. Any progress on the fsck ?
I pulled the latest btrfs-progs-unstable and recompiled, same output
from all the commands (btrfsck -s / btrfs-debug-tree).

-tommy
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


How to invoke the discard operation over btrfs

2010-12-15 Thread gao bo
Hi all,
Is there anybody know how to invoke the discard operation over btrfs
filesystem?  Does delete a file by rm will trigger the discard
operation?
If I want to trigger discard operation,how should i do? if there any
pre-requires? such as the device must be ssd, btrfs should be mounted
with -o discard.

Any suggestion is appreciated.

Thanks a lot!
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 2/5 v3] Btrfs: avoid transaction stuff when btrfs is readonly

2010-12-15 Thread Chris Mason
Excerpts from liubo's message of 2010-12-15 04:12:14 -0500:
 On 12/15/2010 04:45 PM, Yan, Zheng wrote:
  On Fri, Dec 3, 2010 at 4:16 PM, liubo liubo2...@cn.fujitsu.com wrote:
  When the filesystem is readonly, avoid transaction stuff by checking 
  MS_RDONLY
  at start transaction time.
 
  Signed-off-by: Liu Bo liubo2...@cn.fujitsu.com
  ---
   fs/btrfs/transaction.c |3 +++
   1 files changed, 3 insertions(+), 0 deletions(-)
 
  diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
  index 1fffbc0..14a597d 100644
  --- a/fs/btrfs/transaction.c
  +++ b/fs/btrfs/transaction.c
  @@ -181,6 +181,9 @@ static struct btrfs_trans_handle 
  *start_transaction(struct btrfs_root *root,
 struct btrfs_trans_handle *h;
 struct btrfs_transaction *cur_trans;
 int ret;
  +
  +   if (root-fs_info-sb-s_flags  MS_RDONLY)
  +   return ERR_PTR(-EROFS);
   again:
 h = kmem_cache_alloc(btrfs_trans_handle_cachep, GFP_NOFS);
 if (!h)
  
  There are cases that we need to start transaction when MS_RDONLY flag is 
  set.
  For example, remount FS into read-only mode and log replay.
 
 However, is it weird to make changes to disk as fs is in readonly state?
 IMO, btrfs needs to limit the use of these disk-change while readonly cases,
 as it is not what readonly means.

reiserfs and ext3 at least have always done this.  Log replay is
required even when the FS is readonly.

 
 Since it has been here, we can bypass readonly in those cases(as I did in the 
 5th patch):
 
 ...
 flags = sb-s_flags;
 if (sb-s_flags  MS_RDONLY)
 sb-s_flags = ~MS_RDONLY

I think we should have a dedicated flag to reflect a filesystem that is
forced readonly, and check that flag instead.

-chris
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


What do the arguments of btrfs filesystem defragment do?

2010-12-15 Thread Erik Logtenberg
Hi,

I would like to know that the arguments of btrfs filesystem defragment
do. According to the built-in help page, the invocation is as follows:

 btrfs filesystem defragment [-vcf] [-s start] [-l len] [-t
 size] file|dir [file|dir...]
 Defragment a file or a directory.

Unfortunately I can't find any documentation on the meaning of these
arguments. The man page doesn't list these arguments:

   filesystem defragment file|dir [file|dir...]
  Defragment files and/or directories.

Also the online documentation [1] is identical to the man page.

[1] https://btrfs.wiki.kernel.org/index.php/Btrfs(command)

On a somewhat related note I did find another page on the wiki [2] that
explains a bit on the defragmenting subject, more specifically it
mentions this rather important caveat:

 Caveat: Defragmenting a file which has a COW copy (either a snapshot
 copy or one made with bcp or cp --reflinks) will produce two unrelated
 files. If you defragment a subvolume that has a snapshot, you will
 roughly double the disk usage, as the snapshot files are no longer COW
 images of the originals.

[2] https://btrfs.wiki.kernel.org/index.php/Problem_FAQ

From what I've heard on IRC this is still the case in current versions,
but the Btrfs(command) documentation contains no mention of this.

I hope someone can shed some light on these subjects.

Kind regards,

Erik.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 2/5 v3] Btrfs: avoid transaction stuff when btrfs is readonly

2010-12-15 Thread Josef Bacik
On Wed, Dec 15, 2010 at 11:03:46AM -0500, Chris Mason wrote:
 Excerpts from liubo's message of 2010-12-15 04:12:14 -0500:
  On 12/15/2010 04:45 PM, Yan, Zheng wrote:
   On Fri, Dec 3, 2010 at 4:16 PM, liubo liubo2...@cn.fujitsu.com wrote:
   When the filesystem is readonly, avoid transaction stuff by checking 
   MS_RDONLY
   at start transaction time.
  
   Signed-off-by: Liu Bo liubo2...@cn.fujitsu.com
   ---
fs/btrfs/transaction.c |3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
  
   diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
   index 1fffbc0..14a597d 100644
   --- a/fs/btrfs/transaction.c
   +++ b/fs/btrfs/transaction.c
   @@ -181,6 +181,9 @@ static struct btrfs_trans_handle 
   *start_transaction(struct btrfs_root *root,
  struct btrfs_trans_handle *h;
  struct btrfs_transaction *cur_trans;
  int ret;
   +
   +   if (root-fs_info-sb-s_flags  MS_RDONLY)
   +   return ERR_PTR(-EROFS);
again:
  h = kmem_cache_alloc(btrfs_trans_handle_cachep, GFP_NOFS);
  if (!h)
   
   There are cases that we need to start transaction when MS_RDONLY flag is 
   set.
   For example, remount FS into read-only mode and log replay.
  
  However, is it weird to make changes to disk as fs is in readonly state?
  IMO, btrfs needs to limit the use of these disk-change while readonly 
  cases,
  as it is not what readonly means.
 
 reiserfs and ext3 at least have always done this.  Log replay is
 required even when the FS is readonly.


Just make sure the underlying disk isn't read only, we had problems with this on
ext3 a bit ago.  Thanks,

Josef 
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bug in the design of the tree search ioctl API ? [was Re: [PATCH 1/3] Btrfs: Really return keys within specified range]

2010-12-15 Thread Chris Mason
Excerpts from Li Zefan's message of 2010-12-14 22:33:33 -0500:
  Suppose to have the following sequence keys  [objectid, type, offset]:
  
  [...]
  1)[300, BTRFS_EXTENT_DATA_KEY, xx]
  2)[300, BTRFS_INODE_ITEM_KEY, xx]
  3)[300, BTRFS_XATTR_ITEM_KEY, xx]
  4)[301, BTRFS_EXTENT_DATA_KEY, xx]
  5)[301, BTRFS_INODE_ITEM_KEY, xx]
  7)[30200, BTRFS_EXTENT_DATA_KEY, xx]
  8)[30200, BTRFS_INODE_ITEM_KEY, xx]
  9)[30200, BTRFS_XATTR_ITEM_KEY, xx]
  [...]
  
  
  Suppose that the buffer is filled between the item 2 and 3. We should 
  restart 
  the search, but how set the min_* key ? Try the following hypothesis
  
  h1) objectid++, type = 0 - In the next search the key 3 would be skipped
  h2) objectid asis, type ++, - in the next search the key 4 would be skipped
  h3) objectid asis, type = 0 - in the next search the key 1,2,3 would be 
 
 h4) objectid asis, type asis, offset++ - we should get the correct result.

This is the right answer ;).  The problem is that even though our key has
3 distinct parts, and the API makes it look like you have very fine
grained control over those three parts, you have to remember to reset
them as you iterate between objectids.  It isn't a obvious as it should
be.

The current API is a very raw export of how we do the searches in the
kernel too.  You can do pretty much anything with it, but we pay with
complexity.

-chris
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: btrfs hang (deadlock?) when trying to create a ext4 filesystem in KVM guest

2010-12-15 Thread C Anthony Risinger
On Thu, Oct 28, 2010 at 1:29 AM, Tomasz Chmielewski man...@wpkg.org wrote:
 On 28.10.2010 00:55, Chris Mason wrote:
 On Wed, Oct 27, 2010 at 03:29:38PM +0200, Tomasz Chmielewski wrote:
 There are a couple of problems when running KVM guests with images stored 
 on btrfs filesystem.

 One of them is inability to create a filesystem (i.e. ext4) in the guest:

 - btrfs filesystem on the host was mounted with noatime,compress-force
 - guest was using a 50 GB sparse file,
 - attempt to create a ext4 filesystem within the guest does not succeed 
 (hangs); host prints below messages in dmesg - some deadlock in btrfs?

 kernel: 2.6.36
 qemu-kvm: 0.13.0

 Is this the full dmesg output?  I think there are other messages hiding
 in there.

 There were indeed bad ordered accounting left (see below), I think they are 
 coming from btrfs?


 Is this a single disk btrfs?

 Yes.



 [ 8072.773053] device fsid 1142843480ad2d13-4bdc742fd9b1f7b0 devid 1 transid 
 1508 /dev/sdb4
 [ 8072.773674] btrfs: forcing compression
 [ 8122.052221] device tap0 entered promiscuous mode
 [ 8122.052245] br0: port 2(tap0) entering learning state
 [ 8122.052248] br0: port 2(tap0) entering learning state
 [ 8122.451587] br0: port 2(tap0) entering learning state
 [ 8122.543477] br0: port 2(tap0) entering disabled state
 [ 8122.609645] device tap0 left promiscuous mode
 [ 8122.609650] br0: port 2(tap0) entering disabled state
 [ 8131.325647] EXT4-fs (md4): recovery complete
 [ 8131.325809] EXT4-fs (md4): mounted filesystem with ordered data mode. 
 Opts: (null)
 [ 8133.392100] device tap0 entered promiscuous mode
 [ 8133.392127] br0: port 2(tap0) entering learning state
 [ 8133.392131] br0: port 2(tap0) entering learning state
 [ 8134.106594] kvm: 5004: cpu0 unhandled wrmsr: 0x198 data 0
 [ 8134.106618] kvm: 5004: cpu1 unhandled wrmsr: 0x198 data 0
 [ 8143.460927] tap0: no IPv6 routers present
 [ 8148.359485] br0: port 2(tap0) entering forwarding state
 [ 8309.103502] bad ordered accounting left 65536 size 385024
 [ 8309.106206] bad ordered accounting left 65536 size 385024
 [ 8309.108915] bad ordered accounting left 65536 size 385024
 [ 8309.111630] bad ordered accounting left 36864 size 385024
 [ 8501.965625] INFO: task qemu-system-x86:5148 blocked for more than 120 
 seconds.
 [ 8501.965629] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables 
 this message.
 [ 8501.965632] qemu-system-x D 880001e14cc0     0  5148   4924 0x
 [ 8501.965638]  880223bc3b38 0086 880223bc3fd8 
 00014cc0
 [ 8501.965642]  00014cc0 880223bc3fd8 00014cc0 
 880223bc3fd8
 [ 8501.965647]  00014cc0 880221965f18 880221965f20 
 880221965b80
 [ 8501.965651] Call Trace:
 [ 8501.965678]  [a024c52c] btrfs_start_ordered_extent+0x6c/0xb0 
 [btrfs]
 [ 8501.965685]  [81083200] ? autoremove_wake_function+0x0/0x40
 [ 8501.965701]  [a024d1c2] btrfs_wait_ordered_range+0xd2/0x160 
 [btrfs]
 [ 8501.965716]  [a0240059] btrfs_file_aio_write+0x269/0x990 [btrfs]
 [ 8501.965721]  [8105ca94] ? try_to_wake_up+0xf4/0x3f0
 [ 8501.965726]  [81168119] ? __pollwake+0x49/0x50
 [ 8501.965730]  [8105cd90] ? default_wake_function+0x0/0x20
 [ 8501.965733]  [8105ca94] ? try_to_wake_up+0xf4/0x3f0
 [ 8501.965737]  [8116813b] ? pollwake+0x1b/0x20
 [ 8501.965752]  [a023fdf0] ? btrfs_file_aio_write+0x0/0x990 [btrfs]
 [ 8501.965761]  [8115664b] do_sync_readv_writev+0xcb/0x110
 [ 8501.965769]  [81294d98] ? apparmor_file_permission+0x18/0x20
 [ 8501.965776]  [8126356e] ? security_file_permission+0x1e/0x80
 [ 8501.965781]  [811576e0] do_readv_writev+0xd0/0x1d0
 [ 8501.965787]  [81076d72] ? kill_something_info+0x42/0x130
 [ 8501.965793]  [81076ee0] ? sys_kill+0x80/0x90
 [ 8501.965798]  [8115781e] vfs_writev+0x3e/0x60
 [ 8501.965802]  [811578e7] sys_pwritev+0xa7/0xc0
 [ 8501.965806]  [8100b0f2] system_call_fastpath+0x16/0x1b
 [ 8501.965810] INFO: task qemu-system-x86:5150 blocked for more than 120 
 seconds.
 [ 8501.965812] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables 
 this message.
 [ 8501.965814] qemu-system-x D 880001e94cc0     0  5150   4924 0x
 [ 8501.965819]  8801aba4bb38 0086 8801aba4bfd8 
 00014cc0
 [ 8501.965823]  00014cc0 8801aba4bfd8 00014cc0 
 8801aba4bfd8
 [ 8501.965827]  00014cc0 880226d14838 880226d14840 
 880226d144a0
 [ 8501.965832] Call Trace:
 [ 8501.965847]  [a024c52c] btrfs_start_ordered_extent+0x6c/0xb0 
 [btrfs]
 [ 8501.965852]  [81083200] ? autoremove_wake_function+0x0/0x40
 [ 8501.965867]  [a024d1c2] btrfs_wait_ordered_range+0xd2/0x160 
 [btrfs]
 [ 8501.965883]  [a0240059] btrfs_file_aio_write+0x269/0x990 [btrfs]
 [ 8501.965887]  [8105ca94] ? try_to_wake_up+0xf4/0x3f0
 [ 8501.965891]  [81168119] ? 

Re: Bug in the design of the tree search ioctl API ? [was Re: [PATCH 1/3] Btrfs: Really return keys within specified range]

2010-12-15 Thread Goffredo Baroncelli
On Wednesday, 15 December, 2010, Chris Mason wrote:
 Excerpts from Li Zefan's message of 2010-12-14 22:33:33 -0500:
   Suppose to have the following sequence keys  [objectid, type, offset]:
   
   [...]
   1)[300, BTRFS_EXTENT_DATA_KEY, xx]
   2)[300, BTRFS_INODE_ITEM_KEY, xx]
   3)[300, BTRFS_XATTR_ITEM_KEY, xx]
   4)[301, BTRFS_EXTENT_DATA_KEY, xx]
   5)[301, BTRFS_INODE_ITEM_KEY, xx]
   7)[30200, BTRFS_EXTENT_DATA_KEY, xx]
   8)[30200, BTRFS_INODE_ITEM_KEY, xx]
   9)[30200, BTRFS_XATTR_ITEM_KEY, xx]
   [...]
   
   
   Suppose that the buffer is filled between the item 2 and 3. We should 
restart 
   the search, but how set the min_* key ? Try the following hypothesis
   
   h1) objectid++, type = 0 - In the next search the key 3 would be 
skipped
   h2) objectid asis, type ++, - in the next search the key 4 would be 
skipped
   h3) objectid asis, type = 0 - in the next search the key 1,2,3 would be 
  
  h4) objectid asis, type asis, offset++ - we should get the correct 
result.
 
 This is the right answer ;).  The problem is that even though our key has
 3 distinct parts, and the API makes it look like you have very fine
 grained control over those three parts, you have to remember to reset
 them as you iterate between objectids.  It isn't a obvious as it should
 be.
 
 The current API is a very raw export of how we do the searches in the
 kernel too.  You can do pretty much anything with it, but we pay with
 complexity.

Hi Chris,

I am a bit confused about your answer.

The actual API is a bit confused (or almost not obvious). An application in 
order to work properly has to make some adjustment to the min_* fields AND 
filter the results (because if we tweak with the min_* field, not useful data 
is returned). Moreover this  means that we move between user-space-kernel-
space a lot of unused data (un-necessary context switch).

On the basis of your answer, it seems that this is ok (please don't consider 
only the case of listing the subvolumes which is a very simple cases). And 
nothing have to do.

Instead I suggest to (see my email [PATCH] BTRFS_IOC_TREE_SEARCH: store and 
use the last key found):
- leave the min_* and max_* fields to act only as filter
- add three more fields start_* as start point for the search 
- make some small modification to the kernel code to track in the start_* 
fields the last key found

pro:
- we pass to userspace only useful data
- we simplify a lot the userspace application, because they don't have to 
update the min_* fields (they will work in a obvious way :-) )
- we can use the ordered property of a btree structure to perform efficient 
data lookup (if we reset to zero the min_* fields we lookup un-necessary data)
- we have the same functionality of the old API

cons:
- we need a modification ( :-) )

May be that I missed some point, but I don't see any advantage to continue to 
support the actual API. Of course, that doesn't means that we can remove the 
old API ignoring the backward compatibility. 
But I think that there are sufficient pros to develop a new API

Please be patient: my english is very bad; I am not trying to blame anybody; I 
want only a perfect fs (TM) :-)

 
 -chris
 
 


-- 
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) kreij...@inwind.it
Key fingerprint = 4769 7E51 5293 D36C 814E  C054 BF04 F161 3DC5 0512
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bug in the design of the tree search ioctl API ? [was Re: [PATCH 1/3] Btrfs: Really return keys within specified range]

2010-12-15 Thread Goffredo Baroncelli
On Wednesday, 15 December, 2010, Li Zefan wrote:
 Goffredo Baroncelli wrote:
  On Wednesday, 15 December, 2010, Li Zefan wrote:
  h4) objectid asis, type asis, offset++ - we should get the correct 
result.
  
  This fix the problem of the missing subvolume. But for the other case 
  (searching for more than one type) the problem still here.
  
 
 I don't think so. And the above h4 has showed how we search for more
 than one type.
 
 The generic userland code for next search is:
 
 /* this is in essence the same as how we advance key in kernel code */
 if (sk-min_offset  (u64)-1  sk-min_offset  sk-max_offset)
   sk-min_offset++;
 else if (sk-min_type  (u8)-1  sk-min_type  sk-max_type) {
   sk-min_offset = 0;
   sk-min_type++;
 } else if (sk-min_objectid  (u64)-1  sk-min_objectid  sk-
max_objectid){
   sk-min_offset = 0;
   sk-min_type = 0;

Sorry but if you reset the sk-min_type to 0, this means that the min_type 
lost its purpose (act as lover bound of the acceptance criteria).

   sk-min_objectid++;
 } else
   break;
 
 ioctl(...);
 
 for (i = 0; i  nr_items; i++) {
   if (!filter(items[i]))
   continue;

So you are suggesting: Move all tree items from kernel to user space and 
filter it in userspace ?. This mean a lot of un-needed kernel-space - 
userspace transition...

Sorry I don't understand if we are talking about a workaround or a solution.

   /* process this item */
   ...
 }
 
  because the current ioctl uses min_{x,y,z} and max_{x,y,z} as start_key 
and
  end_key, and it returns all keys that falls in [start_key, end_key].
 
  So this btrfs-progs patch should fix missing subvolumes in the output of
  subvolume list:
 
  diff --git a/btrfs-list.c b/btrfs-list.c
  index 93766a8..1b9ea45 100644
  --- a/btrfs-list.c
  +++ b/btrfs-list.c
  @@ -620,7 +620,10 @@ int list_subvols(int fd)
  /* this iteration is done, step forward one root for the 
  next
   * ioctl
   */
  -   if (sk-min_objectid  (u64)-1) {
  +   if (sk-min_type  BTRFS_ROOT_BACKREF_KEY) {
  +   sk-min_type = BTRFS_ROOT_BACKREF_KEY;
  +   sk-min_offset = 0;
  +   } else  if (sk-min_objectid  (u64)-1) {
  sk-min_objectid++;
  sk-min_type = BTRFS_ROOT_BACKREF_KEY;
  sk-min_offset = 0;
 
  
  
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 


-- 
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) kreij...@inwind.it
Key fingerprint = 4769 7E51 5293 D36C 814E  C054 BF04 F161 3DC5 0512
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bug in the design of the tree search ioctl API ? [was Re: [PATCH 1/3] Btrfs: Really return keys within specified range]

2010-12-15 Thread Chris Mason
Excerpts from Goffredo Baroncelli's message of 2010-12-15 13:42:23 -0500:
 On Wednesday, 15 December, 2010, Chris Mason wrote:
  Excerpts from Li Zefan's message of 2010-12-14 22:33:33 -0500:
Suppose to have the following sequence keys  [objectid, type, offset]:

[...]
1)[300, BTRFS_EXTENT_DATA_KEY, xx]
2)[300, BTRFS_INODE_ITEM_KEY, xx]
3)[300, BTRFS_XATTR_ITEM_KEY, xx]
4)[301, BTRFS_EXTENT_DATA_KEY, xx]
5)[301, BTRFS_INODE_ITEM_KEY, xx]
7)[30200, BTRFS_EXTENT_DATA_KEY, xx]
8)[30200, BTRFS_INODE_ITEM_KEY, xx]
9)[30200, BTRFS_XATTR_ITEM_KEY, xx]
[...]


Suppose that the buffer is filled between the item 2 and 3. We should 
 restart 
the search, but how set the min_* key ? Try the following hypothesis

h1) objectid++, type = 0 - In the next search the key 3 would be 
 skipped
h2) objectid asis, type ++, - in the next search the key 4 would be 
 skipped
h3) objectid asis, type = 0 - in the next search the key 1,2,3 would 
be 
   
   h4) objectid asis, type asis, offset++ - we should get the correct 
 result.
  
  This is the right answer ;).  The problem is that even though our key has
  3 distinct parts, and the API makes it look like you have very fine
  grained control over those three parts, you have to remember to reset
  them as you iterate between objectids.  It isn't a obvious as it should
  be.
  
  The current API is a very raw export of how we do the searches in the
  kernel too.  You can do pretty much anything with it, but we pay with
  complexity.
 
 Hi Chris,
 
 I am a bit confused about your answer.
 
 The actual API is a bit confused (or almost not obvious). An application in 
 order to work properly has to make some adjustment to the min_* fields AND 
 filter the results (because if we tweak with the min_* field, not useful data 
 is returned). Moreover this  means that we move between user-space-kernel-
 space a lot of unused data (un-necessary context switch).
 
 On the basis of your answer, it seems that this is ok (please don't consider 
 only the case of listing the subvolumes which is a very simple cases). And 
 nothing have to do.

Well, it's ok in that I wanted the API to be very close to the way
searches are done in the kernel.  I'll definitely agree it isn't
perfect, especially as we hop between objectids or types.

But I don't want to extend it just yet, mostly because we don't have new
applications making use of it.  I'd rather couple any new apis with new
applications that we haven't yet thought of.

Thanks a lot for the time you're spending on review and looking at this.
If you have killer apps that can really make use of new APIs, I'm happy
to start reworking things.

-chris
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: btrfs hang (deadlock?) when trying to create a ext4 filesystem in KVM guest

2010-12-15 Thread Chris Mason
Excerpts from C Anthony Risinger's message of 2010-12-15 11:46:11 -0500:
 On Thu, Oct 28, 2010 at 1:29 AM, Tomasz Chmielewski man...@wpkg.org wrote:
  On 28.10.2010 00:55, Chris Mason wrote:
  On Wed, Oct 27, 2010 at 03:29:38PM +0200, Tomasz Chmielewski wrote:
  There are a couple of problems when running KVM guests with images stored 
  on btrfs filesystem.
 
  One of them is inability to create a filesystem (i.e. ext4) in the guest:
 
  - btrfs filesystem on the host was mounted with noatime,compress-force
  - guest was using a 50 GB sparse file,
  - attempt to create a ext4 filesystem within the guest does not succeed 
  (hangs); host prints below messages in dmesg - some deadlock in btrfs?
 
  kernel: 2.6.36
  qemu-kvm: 0.13.0
 
  Is this the full dmesg output?  I think there are other messages hiding
  in there.
 
  There were indeed bad ordered accounting left (see below), I think they 
  are coming from btrfs?
 
 
  Is this a single disk btrfs?
 
  Yes.
 
 
 
  [ 8072.773053] device fsid 1142843480ad2d13-4bdc742fd9b1f7b0 devid 1 
  transid 1508 /dev/sdb4
  [ 8072.773674] btrfs: forcing compression
  [ 8122.052221] device tap0 entered promiscuous mode
  [ 8122.052245] br0: port 2(tap0) entering learning state
  [ 8122.052248] br0: port 2(tap0) entering learning state
  [ 8122.451587] br0: port 2(tap0) entering learning state
  [ 8122.543477] br0: port 2(tap0) entering disabled state
  [ 8122.609645] device tap0 left promiscuous mode
  [ 8122.609650] br0: port 2(tap0) entering disabled state
  [ 8131.325647] EXT4-fs (md4): recovery complete
  [ 8131.325809] EXT4-fs (md4): mounted filesystem with ordered data mode. 
  Opts: (null)
  [ 8133.392100] device tap0 entered promiscuous mode
  [ 8133.392127] br0: port 2(tap0) entering learning state
  [ 8133.392131] br0: port 2(tap0) entering learning state
  [ 8134.106594] kvm: 5004: cpu0 unhandled wrmsr: 0x198 data 0
  [ 8134.106618] kvm: 5004: cpu1 unhandled wrmsr: 0x198 data 0
  [ 8143.460927] tap0: no IPv6 routers present
  [ 8148.359485] br0: port 2(tap0) entering forwarding state
  [ 8309.103502] bad ordered accounting left 65536 size 385024
  [ 8309.106206] bad ordered accounting left 65536 size 385024
  [ 8309.108915] bad ordered accounting left 65536 size 385024
  [ 8309.111630] bad ordered accounting left 36864 size 385024
  [ 8501.965625] INFO: task qemu-system-x86:5148 blocked for more than 120 
  seconds.
  [ 8501.965629] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables 
  this message.
  [ 8501.965632] qemu-system-x D 880001e14cc0     0  5148   4924 
  0x
  [ 8501.965638]  880223bc3b38 0086 880223bc3fd8 
  00014cc0
  [ 8501.965642]  00014cc0 880223bc3fd8 00014cc0 
  880223bc3fd8
  [ 8501.965647]  00014cc0 880221965f18 880221965f20 
  880221965b80
  [ 8501.965651] Call Trace:
  [ 8501.965678]  [a024c52c] btrfs_start_ordered_extent+0x6c/0xb0 
  [btrfs]
  [ 8501.965685]  [81083200] ? autoremove_wake_function+0x0/0x40
  [ 8501.965701]  [a024d1c2] btrfs_wait_ordered_range+0xd2/0x160 
  [btrfs]
  [ 8501.965716]  [a0240059] btrfs_file_aio_write+0x269/0x990 
  [btrfs]
  [ 8501.965721]  [8105ca94] ? try_to_wake_up+0xf4/0x3f0
  [ 8501.965726]  [81168119] ? __pollwake+0x49/0x50
  [ 8501.965730]  [8105cd90] ? default_wake_function+0x0/0x20
  [ 8501.965733]  [8105ca94] ? try_to_wake_up+0xf4/0x3f0
  [ 8501.965737]  [8116813b] ? pollwake+0x1b/0x20
  [ 8501.965752]  [a023fdf0] ? btrfs_file_aio_write+0x0/0x990 
  [btrfs]
  [ 8501.965761]  [8115664b] do_sync_readv_writev+0xcb/0x110
  [ 8501.965769]  [81294d98] ? apparmor_file_permission+0x18/0x20
  [ 8501.965776]  [8126356e] ? security_file_permission+0x1e/0x80
  [ 8501.965781]  [811576e0] do_readv_writev+0xd0/0x1d0
  [ 8501.965787]  [81076d72] ? kill_something_info+0x42/0x130
  [ 8501.965793]  [81076ee0] ? sys_kill+0x80/0x90
  [ 8501.965798]  [8115781e] vfs_writev+0x3e/0x60
  [ 8501.965802]  [811578e7] sys_pwritev+0xa7/0xc0
  [ 8501.965806]  [8100b0f2] system_call_fastpath+0x16/0x1b
  [ 8501.965810] INFO: task qemu-system-x86:5150 blocked for more than 120 
  seconds.
  [ 8501.965812] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables 
  this message.
  [ 8501.965814] qemu-system-x D 880001e94cc0     0  5150   4924 
  0x
  [ 8501.965819]  8801aba4bb38 0086 8801aba4bfd8 
  00014cc0
  [ 8501.965823]  00014cc0 8801aba4bfd8 00014cc0 
  8801aba4bfd8
  [ 8501.965827]  00014cc0 880226d14838 880226d14840 
  880226d144a0
  [ 8501.965832] Call Trace:
  [ 8501.965847]  [a024c52c] btrfs_start_ordered_extent+0x6c/0xb0 
  [btrfs]
  [ 8501.965852]  [81083200] ? autoremove_wake_function+0x0/0x40
  [ 8501.965867]  [a024d1c2] btrfs_wait_ordered_range+0xd2/0x160 
  [btrfs]
  [ 

Re: Bug in the design of the tree search ioctl API ? [was Re: [PATCH 1/3] Btrfs: Really return keys within specified range]

2010-12-15 Thread Goffredo Baroncelli
On Wednesday, 15 December, 2010, Chris Mason wrote:
[...]
  
  Hi Chris,
  
  I am a bit confused about your answer.
  
  The actual API is a bit confused (or almost not obvious). An application 
in 
  order to work properly has to make some adjustment to the min_* fields AND 
  filter the results (because if we tweak with the min_* field, not useful 
data 
  is returned). Moreover this  means that we move between user-space-
kernel-
  space a lot of unused data (un-necessary context switch).
  
  On the basis of your answer, it seems that this is ok (please don't 
consider 
  only the case of listing the subvolumes which is a very simple cases). And 
  nothing have to do.
 
 Well, it's ok in that I wanted the API to be very close to the way
 searches are done in the kernel.  I'll definitely agree it isn't
 perfect, especially as we hop between objectids or types.
 
 But I don't want to extend it just yet, mostly because we don't have new
 applications making use of it.  I'd rather couple any new apis with new
 applications that we haven't yet thought of.
 
 Thanks a lot for the time you're spending on review and looking at this.
 If you have killer apps that can really make use of new APIs, I'm happy
 to start reworking things.

Look at a my previous email about an enhancement of the find-new command 
([RFC] Improve btrfs subvolume find-new command - 11/12/2010). 
It would be sufficient ? (of course now on the basis of the last news of this 
API I know that this command is bugged :-( )

I can live with the current API ( tweaking the increasing of the min_* 
fields).. but think about another side of the question: now the only client of 
this API is the btrfs command (btrfs subvol list, which actually is broken) , 
and we can update this api with the minimum effort. Instead if we leave the 
current behavior in the future may appears an application which depends on it. 
So we may be obligated to maintain it ..



Goffredo
 
 -chris
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 


-- 
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) kreij...@inwind.it
Key fingerprint = 4769 7E51 5293 D36C 814E  C054 BF04 F161 3DC5 0512
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What do the arguments of btrfs filesystem defragment do?

2010-12-15 Thread Chris Mason
Excerpts from Erik Logtenberg's message of 2010-12-15 11:06:03 -0500:
 Hi,
 
 I would like to know that the arguments of btrfs filesystem defragment
 do. According to the built-in help page, the invocation is as follows:
 
  btrfs filesystem defragment [-vcf] [-s start] [-l len] [-t
  size] file|dir [file|dir...]
  Defragment a file or a directory.
 
 Unfortunately I can't find any documentation on the meaning of these
 arguments. The man page doesn't list these arguments:

-v is verbose.  It just prints the file name as it defrags
-c forces compression, even if you are mounted without compression on

-s is the starting byte you want to defrag in the file.  This is useful
for very large files that are only compressed in a particular spot

-l is the number of bytes you want to compress, would be paired with -s

-t is a threshold.  If the extent is bigger than this threshold, we
consider it defragged already.

-f starts the IO right away.

 
filesystem defragment file|dir [file|dir...]
   Defragment files and/or directories.
 
 Also the online documentation [1] is identical to the man page.
 
 [1] https://btrfs.wiki.kernel.org/index.php/Btrfs(command)
 
 On a somewhat related note I did find another page on the wiki [2] that
 explains a bit on the defragmenting subject, more specifically it
 mentions this rather important caveat:
 
  Caveat: Defragmenting a file which has a COW copy (either a snapshot
  copy or one made with bcp or cp --reflinks) will produce two unrelated
  files. If you defragment a subvolume that has a snapshot, you will
  roughly double the disk usage, as the snapshot files are no longer COW
  images of the originals.
 
 [2] https://btrfs.wiki.kernel.org/index.php/Problem_FAQ
 
 From what I've heard on IRC this is still the case in current versions,
 but the Btrfs(command) documentation contains no mention of this.

This is still true.

 
 I hope someone can shed some light on these subjects.
 

Hope this makes sense ;)

-chris
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What do the arguments of btrfs filesystem defragment do?

2010-12-15 Thread Goffredo Baroncelli
On Wednesday, 15 December, 2010, Erik Logtenberg wrote:
 Hi,
 
 I would like to know that the arguments of btrfs filesystem defragment
 do. According to the built-in help page, the invocation is as follows:
 
  btrfs filesystem defragment [-vcf] [-s start] [-l len] [-t
  size] file|dir [file|dir...]
  Defragment a file or a directory.
 
 Unfortunately I can't find any documentation on the meaning of these
 arguments. The man page doesn't list these arguments:

I loked at the code and I try to explain. But what I write below may be 
affected by error.

If the parameter is a directory, the subvolume where the dir is placed is 
defragged.

Instead if the parameter is a file, the switches have the following meaning:

-f - flush after defrag
-s - start point of defrag
-l - length of defrag range (if len is = 0, the rest of file is considered)
-c - force compress
-t - threshold for compression (if the size of file is bigger, compress the 
file ?)


-- 
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) kreij...@inwind.it
Key fingerprint = 4769 7E51 5293 D36C 814E  C054 BF04 F161 3DC5 0512
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What do the arguments of btrfs filesystem defragment do?

2010-12-15 Thread Erik Logtenberg
Chris, thank you very much for your explanation. Indeed this clears
things up a bit.

 Caveat: Defragmenting a file which has a COW copy (either a snapshot
 copy or one made with bcp or cp --reflinks) will produce two unrelated
 files. If you defragment a subvolume that has a snapshot, you will
 roughly double the disk usage, as the snapshot files are no longer COW
 images of the originals.

 [2] https://btrfs.wiki.kernel.org/index.php/Problem_FAQ

 From what I've heard on IRC this is still the case in current versions,
 but the Btrfs(command) documentation contains no mention of this.
 
 This is still true.

Is there a decent way to have btrfs compress already existing files
(that were written before compression was enabled) without hurting any
of the internal structures such as snapshots?
The goal is to increase free disk space and possibly performance, not to
expload disk usage by breaking COW relations. So given your reply, I
assume that defragmenting all files is not the right way (?)

Kind regards,

Erik.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-15 Thread Matt
On Mon, Dec 13, 2010 at 7:56 PM, Jon Nelson jnel...@jamponi.net wrote:
 On Sun, Dec 12, 2010 at 8:06 PM, Ted Ts'o ty...@mit.edu wrote:
 On Sun, Dec 12, 2010 at 07:11:28AM -0600, Jon Nelson wrote:
 I'm glad you've been able to reproduce the problem! If you should need
 any further assistance, please do not hesitate to ask.

 This patch seems to fix the problem for me.  (Unless the partition is
 mounted with mblk_io_submit.)

 Could you confirm that it fixes it for you as well?

 I believe I have applied the (relevant) inode.c changes to
 bd2d0210cf22f2bd0cef72eb97cf94fc7d31d8cc, rebuilt and begun testing.
 Now at 28 passes without error, I think I can say that the patch
 appears to resolve the issue.

 --
 Jon


Confirmed !

I'm running my box for 5+ hours right now with your patch applied in
addition to Andi's/Milan's patch
(http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-crypt-scale-to-multiple-CPUs.patch)
, Ted, and can't see any indications of corruptions so far (while
doing an emerge -e system) and doing everyday stuff.
My /home partition (with ext4) is also still intact [which of course
has a backup] so it seems to fix it for me, too

so the corruption I was seeing was similar in a way to that of Jon

You can add a
Tested-by: Matthias Bayer jackdac...@gmail.com

Thanks a lot to everyone for your support ! :)


I have a question though: the deactivation of multiple page-io
submission support most likely only would affect bigger systems or
also desktop systems (like mine) ?

Regards

Matt
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-15 Thread Andi Kleen
 I have a question though: the deactivation of multiple page-io
 submission support most likely only would affect bigger systems or
 also desktop systems (like mine) ?

I think this is not a final fix, just a workaround.
The problem with the other path still really needs to be tracked down.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What do the arguments of btrfs filesystem defragment do?

2010-12-15 Thread Chris Mason
Excerpts from Erik Logtenberg's message of 2010-12-15 14:14:15 -0500:
 Chris, thank you very much for your explanation. Indeed this clears
 things up a bit.
 
  Caveat: Defragmenting a file which has a COW copy (either a snapshot
  copy or one made with bcp or cp --reflinks) will produce two unrelated
  files. If you defragment a subvolume that has a snapshot, you will
  roughly double the disk usage, as the snapshot files are no longer COW
  images of the originals.
 
  [2] https://btrfs.wiki.kernel.org/index.php/Problem_FAQ
 
  From what I've heard on IRC this is still the case in current versions,
  but the Btrfs(command) documentation contains no mention of this.
  
  This is still true.
 
 Is there a decent way to have btrfs compress already existing files
 (that were written before compression was enabled) without hurting any
 of the internal structures such as snapshots?

I'm afraid not yet.  There is code for this in the btrfs balance
routines, but we haven't yet adapted it to the defragment command.

-chris
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-15 Thread Matt
On Wed, Dec 15, 2010 at 8:16 PM, Andi Kleen a...@firstfloor.org wrote:
 I have a question though: the deactivation of multiple page-io
 submission support most likely only would affect bigger systems or
 also desktop systems (like mine) ?

 I think this is not a final fix, just a workaround.
 The problem with the other path still really needs to be tracked down.

 -Andi

 --
 a...@linux.intel.com -- Speaking for myself only.


ok,

thanks for the clarification

Regards

Matt
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What do the arguments of btrfs filesystem defragment do?

2010-12-15 Thread Erik Logtenberg
 Is there a decent way to have btrfs compress already existing files
 (that were written before compression was enabled) without hurting any
 of the internal structures such as snapshots?
 
 I'm afraid not yet.  There is code for this in the btrfs balance
 routines, but we haven't yet adapted it to the defragment command.

Okay, just to be 100% sure, if I add -o compress to the mount options,
this will cause btrfs to (try to) compress newly written files, right?
Is this fully compatible with existing snapshots, or will compressing
files always (have a chance to) hurt COW relations?

The use case is a filesystem used for backups, which are rsynced
nightly, after which a new snapshot is made. After something like 45
days, the old snapshots are removed. I am assuming that this way, after
45 days all files will be compressed naturally, but this is only
beneficial if snapshots still fully work. If instead it results in
storing the compressed form of every file 45 times on disk, then it
won't help much.

Kind regards,

Erik.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-15 Thread Matt
On Wed, Dec 15, 2010 at 8:25 PM, Matt jackdac...@gmail.com wrote:
 On Wed, Dec 15, 2010 at 8:16 PM, Andi Kleen a...@firstfloor.org wrote:
 I have a question though: the deactivation of multiple page-io
 submission support most likely only would affect bigger systems or
 also desktop systems (like mine) ?

 I think this is not a final fix, just a workaround.
 The problem with the other path still really needs to be tracked down.

 -Andi

 --
 a...@linux.intel.com -- Speaking for myself only.


 ok,

 thanks for the clarification

 Regards

 Matt


Sorry to spam the mailing lists again

make that a
Reported-and-Tested-by: Matthias Bayer jackdac...@gmail.com

(hope that's the correct way to write it)

Regards

Matt
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What do the arguments of btrfs filesystem defragment do?

2010-12-15 Thread Chris Mason
Excerpts from Erik Logtenberg's message of 2010-12-15 14:26:49 -0500:
  Is there a decent way to have btrfs compress already existing files
  (that were written before compression was enabled) without hurting any
  of the internal structures such as snapshots?
  
  I'm afraid not yet.  There is code for this in the btrfs balance
  routines, but we haven't yet adapted it to the defragment command.
 
 Okay, just to be 100% sure, if I add -o compress to the mount options,
 this will cause btrfs to (try to) compress newly written files, right?

Yes

 Is this fully compatible with existing snapshots, or will compressing
 files always (have a chance to) hurt COW relations?

COW references (both via snapshotting and cp --reflink) are broken when
you change the file in either copy.  While the
files are the same, they are shared and the parts that change are
private to each copy.  If you have a reflink copy of a 1TB file and
change only 4K in the middle, the other 1TB-4KB are still shared.

This is true for both the default and with compression on.

 
 The use case is a filesystem used for backups, which are rsynced
 nightly, after which a new snapshot is made. After something like 45
 days, the old snapshots are removed. I am assuming that this way, after
 45 days all files will be compressed naturally, but this is only
 beneficial if snapshots still fully work. If instead it results in
 storing the compressed form of every file 45 times on disk, then it
 won't help much.

Yes, you'll end up with a fully compressed and fully shared setup after
45 days.

-chris
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] btrfs: fix memory leak on error path in btrfs_get_acl()

2010-12-15 Thread Mariusz Kozlowski
If posix_acl_from_xattr() fails we leak memory stored in 'value'.

Signed-off-by: Mariusz Kozlowski m...@lab.zgora.pl
---
 fs/btrfs/acl.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c
index d16..11c9561 100644
--- a/fs/btrfs/acl.c
+++ b/fs/btrfs/acl.c
@@ -61,6 +61,7 @@ static struct posix_acl *btrfs_get_acl(struct inode *inode, 
int type)
if (size  0) {
acl = posix_acl_from_xattr(value, size);
if (IS_ERR(acl))
+   kfree(value);
return acl;
set_cached_acl(inode, type, acl);
}
-- 
1.7.0.4

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


revised ioctl#21 btrfsprogs patch

2010-12-15 Thread David Nicol
the below is against 70c6c10134b502fa69955746554031939b85fb0c which is
the head of the next branch.
1: absolute-sized fields in the interface object
2: now getting error from errno instead of the return value of ioctl(2)


-- Forwarded message --
From: project user
Date: Wed, Dec 15, 2010 at 4:15 PM
Subject: btrfsprogs revised ioctl#21 patch


diff --git a/Makefile b/Makefile
index d65f6a2..b8ef1f2 100644
--- a/Makefile
+++ b/Makefile
@@ -37,12 +37,13 @@ all: version $(progs) manpages
 version:
       bash version.sh

-btrfs: $(objects) btrfs.o btrfs_cmds.o
-       gcc $(CFLAGS) -o btrfs btrfs.o btrfs_cmds.o \
+btrfs: $(objects) btrfs.o btrfs_cmds.o iso8601toms.o
+       gcc $(CFLAGS) -o btrfs btrfs.o btrfs_cmds.o iso8601toms.o \
               $(objects) $(LDFLAGS) $(LIBS)

-btrfsctl: $(objects) btrfsctl.o
-       gcc $(CFLAGS) -o btrfsctl btrfsctl.o $(objects) $(LDFLAGS) $(LIBS)
+btrfsctl: $(objects) btrfsctl.o iso8601toms.o
+       gcc $(CFLAGS) -o btrfsctl btrfsctl.o iso8601toms.o \
+               $(objects) $(LDFLAGS) $(LIBS)

 btrfs-vol: $(objects) btrfs-vol.o
       gcc $(CFLAGS) -o btrfs-vol btrfs-vol.o $(objects) $(LDFLAGS) $(LIBS)
diff --git a/btrfs.c b/btrfs.c
index 46314cf..6b3ebf6 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -29,6 +29,7 @@ struct Command {
       CommandFunction func;   /* function which implements the command */
       int     nargs;          /* if == 999, any number of arguments
                                  if = 0, number of arguments,
+                                  if  1000, 1000 more than the
_maximum_ number of arguments,
                                  if  0, _minimum_ number of arguments */
       char    *verb;          /* verb */
       char    *help;          /* help lines; form the 2nd onward they are
@@ -73,14 +74,19 @@ static struct Command commands[] = {
               Set the subvolume of the filesystem path which will
be mounted\n
               as default.
       },
-       { do_fssync, 1,
-         filesystem sync, path\n
-               Force a sync on the filesystem path.
+       { do_fssync, 1002,
+         filesystem sync, [-dq] path\n
+               Force a sync on the filesystem path, defaulting to '.'.
+       },
+       { do_wait4clean, 1002, /* require at most two args */
+         filesystem reclaim, [path [timeout]] \n
+               Wait for cleanup of deleted subvolumes. Path defaults to .\n
+               Timeout in seconds, or add m for minutes.\n
       },
       { do_resize, 2,
         filesystem resize, [+/-]newsize[gkm]|max filesystem\n
               Resize the file system. If 'max' is passed, the filesystem\n
-               will occupe all available space on the device.
+               will occupy all available space on the device.
       },
       { do_show_filesystem, 999,
         filesystem show, [uuid|label]\n
@@ -349,6 +355,15 @@ static int parse_args(int argc, char **argv,
               return -2;

       /* check the number of argument */
+
+       if(matchcmd-nargs  1000 ){
+          if ( matchcmd-nargs  (1000 + *nargs_)){
+               fprintf(stderr, ERROR: '%s' requires only %d or fewer
arg(s)\n,
+                       matchcmd-verb, matchcmd-nargs - 1000 );
+                       return -2;
+          }
+       } else {
+
       if (matchcmd-nargs  0  matchcmd-nargs  -*nargs_ ){
               fprintf(stderr, ERROR: '%s' requires minimum %d arg(s)\n,
                       matchcmd-verb, -matchcmd-nargs);
@@ -359,7 +374,7 @@ static int parse_args(int argc, char **argv,
                       matchcmd-verb, matchcmd-nargs);
                       return -2;
       }
-
+       }
        if (prepare_args( nargs_, args_, prgname, matchcmd )){
                fprintf(stderr, ERROR: not enough memory\\n);
               return -20;
diff --git a/btrfs_cmds.c b/btrfs_cmds.c
index 8031c58..89f1f22 100644
--- a/btrfs_cmds.c
+++ b/btrfs_cmds.c
@@ -507,15 +507,37 @@ int do_create_subvol(int argc, char **argv)
 int do_fssync(int argc, char **argv)
 {
       int fd, res;
-       char    *path = argv[1];
-
+        int cleanup=1, quiet=0;
+       char    *path = ., *p; /* [-dq] [path] */
+        struct btrfs_ioctl_cleaner_wait_args w4c_arg;
+
+        if (argc  1){
+           if (argv[1][0] == '-'){
+             for (p=argv[1][1]; *p; p++) switch (*p){
+                case 'q': quiet++; break;
+                case 'd': cleanup++; break;
+                default:
+                   fprintf(stderr,
+     ERROR: unknown flag '%c' in '%s'\n, *p,argv[1]);
+                  return 12;
+             };
+             if (argc  2)
+                 path = argv[2];
+
+           }else{
+              path = argv[1];
+
+           };
+
+        };
+    while (cleanup--){ /* cleanup starts at 1 */
       fd = open_file_or_dir(path);
       if (fd  0) {
               fprintf(stderr, ERROR: can't access to '%s'\n, path);
               return 12;
       }

-       printf(FSSync '%s'\n, path);
+       if (!quiet) 

Re: [PATCH] btrfs: fix memory leak on error path in btrfs_get_acl()

2010-12-15 Thread Miguel Ojeda
On Wed, Dec 15, 2010 at 10:57 PM, Mariusz Kozlowski m...@lab.zgora.pl wrote:
 If posix_acl_from_xattr() fails we leak memory stored in 'value'.

 Signed-off-by: Mariusz Kozlowski m...@lab.zgora.pl
 ---
  fs/btrfs/acl.c |    1 +
  1 files changed, 1 insertions(+), 0 deletions(-)

 diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c
 index d16..11c9561 100644
 --- a/fs/btrfs/acl.c
 +++ b/fs/btrfs/acl.c
 @@ -61,6 +61,7 @@ static struct posix_acl *btrfs_get_acl(struct inode *inode, 
 int type)
                if (size  0) {
                        acl = posix_acl_from_xattr(value, size);
                        if (IS_ERR(acl))
 +                               kfree(value);

Be careful with the evil { }

                                return acl;
                        set_cached_acl(inode, type, acl);
                }
 --
 1.7.0.4

 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bug in the design of the tree search ioctl API ? [was Re: [PATCH 1/3] Btrfs: Really return keys within specified range]

2010-12-15 Thread Li Zefan
02:48, Goffredo Baroncelli wrote:
 On Wednesday, 15 December, 2010, Li Zefan wrote:
 Goffredo Baroncelli wrote:
 On Wednesday, 15 December, 2010, Li Zefan wrote:
 h4) objectid asis, type asis, offset++ - we should get the correct 
 result.
 This fix the problem of the missing subvolume. But for the other case 
 (searching for more than one type) the problem still here.

 I don't think so. And the above h4 has showed how we search for more
 than one type.

 The generic userland code for next search is:

 /* this is in essence the same as how we advance key in kernel code */
 if (sk-min_offset  (u64)-1  sk-min_offset  sk-max_offset)
  sk-min_offset++;
 else if (sk-min_type  (u8)-1  sk-min_type  sk-max_type) {
  sk-min_offset = 0;
  sk-min_type++;
 } else if (sk-min_objectid  (u64)-1  sk-min_objectid  sk-
 max_objectid){
  sk-min_offset = 0;
  sk-min_type = 0;
 
 Sorry but if you reset the sk-min_type to 0, this means that the min_type 
 lost its purpose (act as lover bound of the acceptance criteria).
 

Yep, the changelog of Chris' commit has said that userland has to do this.

  sk-min_objectid++;
 } else
  break;

 ioctl(...);

 for (i = 0; i  nr_items; i++) {
  if (!filter(items[i]))
  continue;
 
 So you are suggesting: Move all tree items from kernel to user space and 
 filter it in userspace ?. This mean a lot of un-needed kernel-space - 
 userspace transition...
 

Right, but it's fine so far. I'm not suggesting anything, but explaining
how the ioctl is working.

 Sorry I don't understand if we are talking about a workaround or a solution.
 

As Chris said, it's not perfect but we can just live along with it, until
we find some killer app that requests us to improve/expand this ioctl.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 2/5 v3] Btrfs: avoid transaction stuff when btrfs is readonly

2010-12-15 Thread liubo
On 12/16/2010 12:03 AM, Chris Mason wrote:
 Excerpts from liubo's message of 2010-12-15 04:12:14 -0500:
 On 12/15/2010 04:45 PM, Yan, Zheng wrote:
 On Fri, Dec 3, 2010 at 4:16 PM, liubo liubo2...@cn.fujitsu.com wrote:
 When the filesystem is readonly, avoid transaction stuff by checking 
 MS_RDONLY
 at start transaction time.

 Signed-off-by: Liu Bo liubo2...@cn.fujitsu.com
 ---
  fs/btrfs/transaction.c |3 +++
  1 files changed, 3 insertions(+), 0 deletions(-)

 diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
 index 1fffbc0..14a597d 100644
 --- a/fs/btrfs/transaction.c
 +++ b/fs/btrfs/transaction.c
 @@ -181,6 +181,9 @@ static struct btrfs_trans_handle 
 *start_transaction(struct btrfs_root *root,
struct btrfs_trans_handle *h;
struct btrfs_transaction *cur_trans;
int ret;
 +
 +   if (root-fs_info-sb-s_flags  MS_RDONLY)
 +   return ERR_PTR(-EROFS);
  again:
h = kmem_cache_alloc(btrfs_trans_handle_cachep, GFP_NOFS);
if (!h)
 There are cases that we need to start transaction when MS_RDONLY flag is 
 set.
 For example, remount FS into read-only mode and log replay.
 However, is it weird to make changes to disk as fs is in readonly state?
 IMO, btrfs needs to limit the use of these disk-change while readonly 
 cases,
 as it is not what readonly means.
 
 reiserfs and ext3 at least have always done this.  Log replay is
 required even when the FS is readonly.
 

My concern is:
now we have a forced readonly FS, which is already broken, if we still write 
something to
disk, would it become more broken?

 Since it has been here, we can bypass readonly in those cases(as I did in 
 the 5th patch):

 ...
 flags = sb-s_flags;
 if (sb-s_flags  MS_RDONLY)
 sb-s_flags = ~MS_RDONLY
 
 I think we should have a dedicated flag to reflect a filesystem that is
 forced readonly, and check that flag instead.

OK, we did have fs_state, a dedicated flag.

thanks,
Liu Bo

 
 -chris
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch] fs: fix deadlocks in writeback_if_idle

2010-12-15 Thread Nick Piggin
On Tue, Nov 30, 2010 at 11:01:18AM +1100, Nick Piggin wrote:
 On Mon, Nov 29, 2010 at 02:26:03PM -0800, Andrew Morton wrote:
  On Thu, 25 Nov 2010 14:53:56 +1100
  Nick Piggin npig...@kernel.dk wrote:
   On Wed, Nov 24, 2010 at 02:10:28PM +0100, Jan Kara wrote:
   Well, for now, the easiest and simplest fix is my patch, I think. The
   objection is that we may not write out anything for the specified sb,
   but the current implementation provides no such guarantees at all
   anyway, so I don't think it's a big issue.
  
  Well yes.  We take something which will fail occasionally and with your
  patch replace it with something which will fail a bit more often.  Why
  don't we go all the way and do something which will fail *even more
  often*.  Namely, just delete the damn function in the hope that the
  resulting failures will provoke the ext4 crew into doing something sane
  this time?
 
 I just need it fixed because the deadlocks are constantly hanging my
 tests and/or switching off lockdep.
 
  
  Guys, this delalloc thing *sucks*.  And here we are just sticking new
  bandaids on top of the old bandaids.  And the btrfs approach isn't
  exactly a thing of glory, either.
  
  So...  nope.  I won't be applying Nick's patch.  Please fix this thing
  properly - you have a whole month!
 
 Testers have less. It would be better to fix it now and rip it out at
 the start of the next merge window if you're that way inclined :)

Is this going to be fixed in time, or shall we merge my patch for
2.6.37?

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] btrfs: fix memory leak on error path in btrfs_get_acl()

2010-12-15 Thread Mariusz Kozlowski
On Wed, Dec 15, 2010 at 11:49:39PM +0100, Miguel Ojeda wrote:
 On Wed, Dec 15, 2010 at 10:57 PM, Mariusz Kozlowski m...@lab.zgora.pl wrote:
  If posix_acl_from_xattr() fails we leak memory stored in 'value'.
 
  Signed-off-by: Mariusz Kozlowski m...@lab.zgora.pl
  ---
   fs/btrfs/acl.c |    1 +
   1 files changed, 1 insertions(+), 0 deletions(-)
 
  diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c
  index d16..11c9561 100644
  --- a/fs/btrfs/acl.c
  +++ b/fs/btrfs/acl.c
  @@ -61,6 +61,7 @@ static struct posix_acl *btrfs_get_acl(struct inode 
  *inode, int type)
                 if (size  0) {
                         acl = posix_acl_from_xattr(value, size);
                         if (IS_ERR(acl))
  +                               kfree(value);
 
 Be careful with the evil { }

Dang. Too much python recently i guess. Will send v2 shortly.

Thanks.

                                 return acl;
                         set_cached_acl(inode, type, acl);
                 }
  --
  1.7.0.4
 
  --
  To unsubscribe from this list: send the line unsubscribe linux-kernel in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
  Please read the FAQ at  http://www.tux.org/lkml/
 

-- 
Mariusz Kozlowski
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] btrfs: fix memory leak on error path in btrfs_get_acl()

2010-12-15 Thread Mariusz Kozlowski
If posix_acl_from_xattr() fails we leak memory stored in 'value'.

Signed-off-by: Mariusz Kozlowski m...@lab.zgora.pl
---
 fs/btrfs/acl.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c
index d16..6d1410e 100644
--- a/fs/btrfs/acl.c
+++ b/fs/btrfs/acl.c
@@ -60,8 +60,10 @@ static struct posix_acl *btrfs_get_acl(struct inode *inode, 
int type)
size = __btrfs_getxattr(inode, name, value, size);
if (size  0) {
acl = posix_acl_from_xattr(value, size);
-   if (IS_ERR(acl))
+   if (IS_ERR(acl)) {
+   kfree(value);
return acl;
+   }
set_cached_acl(inode, type, acl);
}
kfree(value);
-- 
1.7.0.4

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


BTRFS is broken with discard request?

2010-12-15 Thread gao bo
Hi Kyungmin,
Do you get some updates about your BTRFS is broken with discard request issue?

Do you know how to explicitly trigger discard operation over filesystem?

Thanks.

From: Kyungmin Park kmpark at infradead.org
Subject: BTRFS is broken with discard request?
Newsgroups: gmane.comp.file-systems.btrfs, gmane.linux.kernel.mmc
Date: 2010-06-15 23:17:49 GMT (26 weeks, 1 day, 3 hours and 51 minutes ago)

Deal all,

Now I got test the btrfs with discard option on MMC (implemented
discard request).
but I can't test the iozone benchmark test program with I/O error.

Of course I disabled the discard option. it works well. and discard
handling is tested with VFAT well.
So I think BTRFS discard request seems to broken.

Test kernel is commit 7e27d6e778cd87b6f2415515d7127eba53fe5d02
Author: Linus Torvalds torvalds at linux-foundation.org
Date:   Fri Jun 11 19:14:04 2010 -0700

Linux 2.6.35-rc3

scenarios are below.

Any clues?

Thank you,
Kyungmin Park

# mkfs.btrfs /dev/mmcblk0p1 -L mmc

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label mmc on /dev/mmcblk0p1
nodesize 4096 leafsize 4096 sectorsize 4096 size 7.40GB
Btrfs Btrfs v0.19

# mount -t btrfs -o ssd,discard /dev/mmcblk0p1 /mnt/mmc
[39310.465000] device label mmc devid 1 transid 7 /dev/mmcblk0p1
[39310.47] btrfs: use ssd allocation scheme

# iozone -A -s 10m -U /mnt/mmc -f /mnt/mmc/test -e
Iozone: Performance Test of File I/O
Version $Revision: 3.326 $
Compiled for 32 bit mode.
Build: linux-arm

Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
 Al Slater, Scott Rhine, Mike Wisner, Ken Goss
 Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
 Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
 Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy,
 Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root.

Run began: Thu Jan  1 10:55:35 1970

Auto Mode 2. This option is obsolete. Use -az -i0 -i1
File size set to 10240 KB
Include fsync in write timing
Command line used: iozone -A -s 10m -U /mnt/mmc -f /mnt/mmc/test -e
Output is in Kbytes/sec
Time Resolution = 0.01 seconds.
Processor cache size set to 1024 Kbytes.

Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.

  KB  reclen   write rewritereadreread

   10240   4
[39337.095000] device label mmc devid 1 transid 12 /dev/mmcblk0p1
[39337.10] btree_readpage_end_io_hook: 4 callbacks suppressed
[39337.105000] btrfs bad tree block start 0 29421568
[39337.11] btrfs bad tree block start 0 29421568
[39337.115000] btrfs bad tree block start 0 29425664
[39337.12] btrfs bad tree block start 0 29425664
[39337.125000] btrfs bad tree block start 0 29372416
[39337.13] btrfs bad tree block start 0 29372416
[39337.135000] btrfs bad tree block start 0 29376512
[39337.14] btrfs bad tree block start 0 29376512
[39337.145000] Btrfs detected SSD devices, enabling SSD mode
[39337.15] btrfs bad tree block start 0 29380608
[39337.155000] btrfs bad tree block start 0 29380608
[39338.88] device label mmc devid 1 transid 17 /dev/mmcblk0p1
[39338.895000] Btrfs detected SSD devices, enabling SSD mode
74948721
[39340.83] device label mmc devid 1 transid 21 /dev/mmcblk0p1
[39340.845000] Btrfs detected SSD devices, enabling SSD mode

Error reading block 0 4020
read: Input/output error
#
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html