Re: Please hammer my for-linus branch

2012-07-10 Thread Daniel J Blueman
On 11 July 2012 09:37, Liu Bo  wrote:
> On 07/10/2012 08:18 PM, Daniel J Blueman wrote:
>
>> On 2 July 2012 12:20, Liu Bo  wrote:
>>> On 07/02/2012 11:35 AM, Daniel J Blueman wrote:
>>>
>>>>> Hi everyone,
>>>>>
>>>>> I've got a nice set of fixes from Josef, Jan, Ilya and others in my
>>>>> for-linus branch:
>>>>>
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus
>>>>>
>>>>> Some of the changes are fixes for the tree logging code, so I ran some
>>>>> extra crash runs against them Friday night.
>>>>>
>>>>> I ended up with a new crash in the tree log directory deletion replay
>>>>> code, so I didn't send out the pull request to Linus.
>>>>>
>>>>> It isn't clear yet if the new crash is because I was testing differently
>>>>> or if it is a regression.  I'm nailing it down this weekend, but please
>>>>> give my for-linus a shot.
>>>> With this branch (3.4.0), my test has consistently been hitting the
>>>> BUG_ON(owner < BTRFS_FIRST_FREE_OBJECTID) in
>>>> insert_inline_extent_backref [1]. This is followed by a string of
>>>> other issues [2] and a hard lockup, so I used netconsole to collect
>>>> this.
>>>>
>>>> I'm preparing my btrfs test for xfstests integration, but can slip you
>>>> it if interested. It hits this case in ~30s.
>>>>
>>>
>>> IMO the BUG_ON is meant to keep the 'log tree' from being mixed in; it
>>> should be:
>>>
>>> BUG_ON(owner < BTRFS_FIRST_FREE_OBJECTID &&
>>>        root_objectid == BTRFS_TREE_LOG_OBJECTID);
>>>
>>> This should help you, can you give it a try?
>>
>> Bo, this did address the assertion I was tripping, so looks good from
>> here; it allowed me to report the second (different) assertion of
>> course.
>>
>> If you still think the fix is sound, is it a good idea for 3.5-rc7?
>
>
> Hi Daniel,
>
> I'm sorry, but it is not ready yet, as it does not address the root cause
> of the bug.
>
> Josef has found that the bug comes from disabling the merging of delayed
> refs and is working on it with Arne.  As the root cause has been found,
> the bug will be fixed soon IMO.

Now I see the two issues are connected.

> Btw, while testing with your great test scripts, I also posted patches for
> two bugs, which may address your other issues.  The links are:
>
> http://www.spinics.net/lists/linux-btrfs/msg17761.html
> http://www.spinics.net/lists/linux-btrfs/msg17764.html

Great work indeed!

Thanks Bo,
  Daniel
-- 
Daniel J Blueman
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] Btrfs: improve multi-thread buffer read

2012-07-10 Thread Liu Bo
On 07/11/2012 02:58 AM, Josef Bacik wrote:

> On Tue, Jul 10, 2012 at 05:27:59AM -0600, Liu Bo wrote:
>> While testing with my buffer read fio jobs[1], I find that btrfs does not
>> perform well enough.
>>
>> Here is a scenario in fio jobs:
>>
>> We have 4 threads, "t1 t2 t3 t4", starting to buffer read the same file,
>> and all of them will race on add_to_page_cache_lru(), and if one thread
>> successfully puts its page into the page cache, it takes the responsibility
>> to read the page's data.
>>
>> And what's more, reading a page needs a period of time to finish, in which
>> other threads can slide in and process the remaining pages:
>>
>>      t1          t2          t3          t4
>>    add Page1
>>    read Page1  add Page2
>>      |         read Page2  add Page3
>>      |          |          read Page3  add Page4
>>      |          |           |          read Page4
>> -----|----------|-----------|-----------|--------
>>      v          v           v           v
>>     bio        bio         bio         bio
>>
>> Now we have four bios, each of which holds only one page since we need to
>> maintain consecutive pages in bio.  Thus, we can end up with far more bios
>> than we need.
>>
>> Here we're going to
>> a) delay the real read-page section and
>> b) try to put more pages into page cache.
>>
>> With that said, we can make each bio hold more pages and reduce the number
>> of bios we need.
>>
>> Here are some numbers taken from fio results:
>>          w/o patch          w patch
>>          ---------          -------
>> READ:    745MB/s    +32%    987MB/s
>>
> 
> Um, I have this in btrfs-next
> 
> Btrfs: use large extent range for read and its endio
> 
> that seems to do the same thing, did you not want to do that anymore?  Thanks,
> 



I'm still working hard on that patchset. :)

Although the patchset is well worth testing, it is not good enough for
btrfs upstream.

While doing some tuning work on it, I realized that I could make this
improvement without the help of the rwlock extent state stuff, so I made
this smaller and cleaner patch for upstream so that we could gain some
performance here first.

thanks,
liubo
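
For readers following the thread, the bio-count argument in the quoted
description can be sketched in a few lines of C. This is only a toy model,
not code from the patch: count_bios() and the owner[] array are hypothetical
names, and the model assumes that the thread which inserted a page into the
page cache is the one that reads it, and that a bio can only hold consecutive
pages.

```c
#include <assert.h>
#include <stddef.h>

/*
 * owner[i] is the (hypothetical) id of the thread that inserted page i
 * into the page cache and therefore reads it.  A bio may only carry
 * consecutive pages, so every change of owner between neighbouring
 * pages starts a new bio.
 */
size_t count_bios(const int *owner, size_t nr_pages)
{
	size_t bios = 0;

	assert(owner != NULL || nr_pages == 0);
	for (size_t i = 0; i < nr_pages; i++) {
		if (i == 0 || owner[i] != owner[i - 1])
			bios++;
	}
	return bios;
}
```

With the interleaving from the diagram (pages owned by t1, t2, t3, t4 in
turn) the model needs one bio per page; if a single thread gets to add all
four pages before the read is issued, one bio is enough.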


> Josef


Re: Please hammer my for-linus branch

2012-07-10 Thread Liu Bo
On 07/10/2012 08:18 PM, Daniel J Blueman wrote:

> On 2 July 2012 12:20, Liu Bo  wrote:
>> On 07/02/2012 11:35 AM, Daniel J Blueman wrote:
>>
>>>> Hi everyone,
>>>>
>>>> I've got a nice set of fixes from Josef, Jan, Ilya and others in my
>>>> for-linus branch:
>>>>
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus
>>>>
>>>> Some of the changes are fixes for the tree logging code, so I ran some
>>>> extra crash runs against them Friday night.
>>>>
>>>> I ended up with a new crash in the tree log directory deletion replay
>>>> code, so I didn't send out the pull request to Linus.
>>>>
>>>> It isn't clear yet if the new crash is because I was testing differently
>>>> or if it is a regression.  I'm nailing it down this weekend, but please
>>>> give my for-linus a shot.
>>> With this branch (3.4.0), my test has consistently been hitting the
>>> BUG_ON(owner < BTRFS_FIRST_FREE_OBJECTID) in
>>> insert_inline_extent_backref [1]. This is followed by a string of
>>> other issues [2] and a hard lockup, so I used netconsole to collect
>>> this.
>>>
>>> I'm preparing my btrfs test for xfstests integration, but can slip you
>>> it if interested. It hits this case in ~30s.
>>>
>>
>> IMO the BUG_ON is meant to keep the 'log tree' from being mixed in; it
>> should be:
>>
>> BUG_ON(owner < BTRFS_FIRST_FREE_OBJECTID &&
>>        root_objectid == BTRFS_TREE_LOG_OBJECTID);
>>
>> This should help you, can you give it a try?
> 
> Bo, this did address the assertion I was tripping, so looks good from
> here; it allowed me to report the second (different) assertion of
> course.
> 
> If you still think the fix is sound, is it a good idea for 3.5-rc7?
> 


Hi Daniel,

I'm sorry, but it is not ready yet, as it does not address the root cause
of the bug.

Josef has found that the bug comes from disabling the merging of delayed
refs and is working on it with Arne.  As the root cause has been found,
the bug will be fixed soon IMO.

Btw, while testing with your great test scripts, I also posted patches for
two bugs, which may address your other issues.  The links are:

http://www.spinics.net/lists/linux-btrfs/msg17761.html
http://www.spinics.net/lists/linux-btrfs/msg17764.html

thanks,
liubo
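
To make the difference between the two assertions concrete, here is a small
standalone sketch of both predicates. It is not kernel code: the function
names are made up for illustration, and the constant values are the ones I
believe fs/btrfs/ctree.h used at the time, so treat them as assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, believed to match fs/btrfs/ctree.h of that era. */
#define BTRFS_FS_TREE_OBJECTID    5ULL
#define BTRFS_FIRST_FREE_OBJECTID 256ULL
#define BTRFS_TREE_LOG_OBJECTID   ((uint64_t)-6ULL)

/* Original check: fires for any owner below the first free objectid. */
bool original_check_fires(uint64_t owner, uint64_t root_objectid)
{
	(void)root_objectid;
	return owner < BTRFS_FIRST_FREE_OBJECTID;
}

/* Proposed check: fires only if the backref also belongs to the log tree. */
bool proposed_check_fires(uint64_t owner, uint64_t root_objectid)
{
	return owner < BTRFS_FIRST_FREE_OBJECTID &&
	       root_objectid == BTRFS_TREE_LOG_OBJECTID;
}

/* Hypothetical example: a low owner outside the log tree. */
void bug_on_examples(void)
{
	assert(original_check_fires(1, BTRFS_FS_TREE_OBJECTID));
	assert(!proposed_check_fires(1, BTRFS_FS_TREE_OBJECTID));
}
```

The narrower predicate stops the assertion from firing for the reported
case, but as noted above it only hides the symptom; the root cause is in the
delayed ref handling.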

> Thanks,
>   Daniel




Re: 3.5.0-rc6: btrfs and LVM snapshots -> wrong devicename in /proc/mounts

2012-07-10 Thread Arnd Hannemann
Hi Goffredo,

Am 10.07.2012 20:42, schrieb Goffredo Baroncelli:
> Hi Arnd,
> 
> I am trying to reproduce this bug. Which kernel version are you using ?

I'm using linus' vanilla tree from Sunday which is 3.5.0-rc6
plus some unsuspicious commits.

Best regards
Arnd


Re: [PATCH] Btrfs: allow delayed refs to be merged

2012-07-10 Thread Josef Bacik
On Tue, Jul 10, 2012 at 01:39:42PM -0600, Arne Jansen wrote:
> On 07/10/2012 08:52 PM, Josef Bacik wrote:
> > Daniel Blueman reported a bug with fio+balance on a ramdisk setup.
> > Basically what happens is the balance relocates a tree block which will drop
> > the implicit refs for all of its children and adds a full backref.  Once the
> > block is relocated we have to add the implicit refs back, so when we cow the
> > block again we add the implicit refs for its children back.  The problem
> > comes when the original drop ref doesn't get run before we add the implicit
> > refs back.  The delayed ref stuff will specifically prefer ADD operations
> > over DROP to keep us from freeing up an extent that will have references to
> > it, so we try to add the implicit ref before it is actually removed and we
> > panic.  This worked fine before because the add would have just canceled the
> > drop out and we would have been fine.  But the backref walking work needs to
> > be able to freeze the delayed ref stuff in time so we have this ever
> > increasing sequence number that gets attached to all new delayed ref updates
> > which makes us not merge refs and we run into this issue.
> >
> > So since the backref walking stuff doesn't get run all that often we just
> > ignore the sequence updates until somebody actually tries to do the freeze.
> > Then if we try to run the delayed refs we go back and try to merge them in
> > case we get a sequence like this again so we do not panic.
> 
> Subvolume quota will also use it, so it might get used _very_ often.
> Please give me some time to understand the problem deeper. This patch
> adds a lot of complexity, and I'd prefer to find a solution that adds
> none :)
> 

If you've got a better idea then go for it, but I'm coming up short.  One way or
another we need these operations to cancel out if they are both on the same ref
head at the same time.  We may be able to do something like make sure the full
backrefs are added first, then let implicit ref deletes happen, and then let
implicit ref adds happen, but then you are adding even more weird logic to what
can be run when.

The other option is to make relocate not do this dance at all, and I'm not
entirely sure how you would go about this.  I think we are ok leaving the
implicit ref because frankly the children all still belong to the original root,
but I don't understand the relocate code enough to decide if that's ok.  Thanks,

Josef
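
A toy model of why the operations need to cancel out may help here. The
names below (ref_action, net_ref_mod, run_unmerged_adds_first) are
hypothetical and nothing like the real delayed-ref code; the sketch only
assumes what the description says, namely that unmerged updates run ADDs
before DROPs and that adding a ref which is still present is fatal.

```c
#include <assert.h>
#include <stddef.h>

enum ref_action { REF_ADD, REF_DROP };

/* Net change to the reference count once queued updates are merged. */
int net_ref_mod(const enum ref_action *ops, size_t n)
{
	int mod = 0;

	assert(ops != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		mod += (ops[i] == REF_ADD) ? 1 : -1;
	return mod;
}

/*
 * Run unmerged updates with ADDs preferred over DROPs.  `present` says
 * whether the implicit ref currently exists.  Returns 0 on success and
 * -1 where the real code would hit its assertion.
 */
int run_unmerged_adds_first(const enum ref_action *ops, size_t n, int present)
{
	assert(ops != NULL || n == 0);
	for (int pass = 0; pass < 2; pass++) {
		enum ref_action want = (pass == 0) ? REF_ADD : REF_DROP;

		for (size_t i = 0; i < n; i++) {
			if (ops[i] != want)
				continue;
			if (want == REF_ADD) {
				if (present)
					return -1;	/* ref is still there */
				present = 1;
			} else {
				if (!present)
					return -1;	/* nothing to drop */
				present = 0;
			}
		}
	}
	return 0;
}
```

For the queue { DROP, ADD } on a ref that exists, the merged view has a net
change of zero and nothing needs to run, while the unmerged, ADD-first run
fails on the very first operation.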


Re: [PATCH] Btrfs: allow delayed refs to be merged

2012-07-10 Thread Arne Jansen
On 07/10/2012 08:52 PM, Josef Bacik wrote:
> Daniel Blueman reported a bug with fio+balance on a ramdisk setup.
> Basically what happens is the balance relocates a tree block which will drop
> the implicit refs for all of its children and adds a full backref.  Once the
> block is relocated we have to add the implicit refs back, so when we cow the
> block again we add the implicit refs for its children back.  The problem
> comes when the original drop ref doesn't get run before we add the implicit
> refs back.  The delayed ref stuff will specifically prefer ADD operations
> over DROP to keep us from freeing up an extent that will have references to
> it, so we try to add the implicit ref before it is actually removed and we
> panic.  This worked fine before because the add would have just canceled the
> drop out and we would have been fine.  But the backref walking work needs to
> be able to freeze the delayed ref stuff in time so we have this ever
> increasing sequence number that gets attached to all new delayed ref updates
> which makes us not merge refs and we run into this issue.
> 
> So since the backref walking stuff doesn't get run all that often we just
> ignore the sequence updates until somebody actually tries to do the freeze.
> Then if we try to run the delayed refs we go back and try to merge them in
> case we get a sequence like this again so we do not panic.

Subvolume quota will also use it, so it might get used _very_ often.
Please give me some time to understand the problem deeper. This patch
adds a lot of complexity, and I'd prefer to find a solution that adds
none :)

Thanks,
Arne

> 
> I need the consumers of the backref resolution code to test this heavily and
> make sure it doesn't break them.  It makes Daniel's original problem go away.
> Thanks,
> 
> Reported-by: Daniel J Blueman 
> Signed-off-by: Josef Bacik 
> ---
>  fs/btrfs/ctree.c       |    2 +-
>  fs/btrfs/ctree.h       |    4 +-
>  fs/btrfs/delayed-ref.c |  146
>  fs/btrfs/delayed-ref.h |    6 ++-
>  fs/btrfs/extent-tree.c |   20 +-
>  5 files changed, 146 insertions(+), 32 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> index 8206b39..a914580 100644
> --- a/fs/btrfs/ctree.c
> +++ b/fs/btrfs/ctree.c
> @@ -846,7 +846,7 @@ static noinline int update_ref_for_cow(struct btrfs_trans_handle *trans,
>  		if (new_flags != 0) {
>  			ret = btrfs_set_disk_extent_flags(trans, root,
>  							  buf->start,
> -							  buf->len,
> +							  buf->len, owner,
>  							  new_flags, 0);
>  			if (ret)
>  				return ret;
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index fa5c45b..1b527bc 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -2574,8 +2574,8 @@ int btrfs_dec_ref(struct btrfs_trans_handle *trans, struct btrfs_root *root,
>  		  struct extent_buffer *buf, int full_backref, int for_cow);
>  int btrfs_set_disk_extent_flags(struct btrfs_trans_handle *trans,
>  				struct btrfs_root *root,
> -				u64 bytenr, u64 num_bytes, u64 flags,
> -				int is_data);
> +				u64 bytenr, u64 num_bytes, u64 ref_root,
> +				u64 flags, int is_data);
>  int btrfs_free_extent(struct btrfs_trans_handle *trans,
>  		      struct btrfs_root *root,
>  		      u64 bytenr, u64 num_bytes, u64 parent, u64 root_objectid,
> diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
> index 13ae7b0..93b7df1 100644
> --- a/fs/btrfs/delayed-ref.c
> +++ b/fs/btrfs/delayed-ref.c
> @@ -85,7 +85,8 @@ static int comp_data_refs(struct btrfs_delayed_data_ref *ref2,
>   * type of the delayed backrefs and content of delayed backrefs.
>   */
>  static int comp_entry(struct btrfs_delayed_ref_node *ref2,
> -		      struct btrfs_delayed_ref_node *ref1)
> +		      struct btrfs_delayed_ref_node *ref1,
> +		      bool compare_seq)
>  {
>  	if (ref1->bytenr < ref2->bytenr)
>  		return -1;
> @@ -102,10 +103,12 @@ static int comp_entry(struct btrfs_delayed_ref_node *ref2,
>  	if (ref1->type > ref2->type)
>  		return 1;
>  	/* merging of sequenced refs is not allowed */
> -	if (ref1->seq < ref2->seq)
> -		return -1;
> -	if (ref1->seq > ref2->seq)
> -		return 1;
> +	if (compare_seq) {
> +		if (ref1->seq < ref2->seq)
> +			return -1;
> +		if (ref1->seq > ref2->seq)
> +			return 1;
> +	}
>  	if (ref1->type == BTRFS_TREE_BLOCK_REF_KEY ||
>  	    ref1->type == BTRFS_SHARED_BLOCK_REF_KEY) {
>  		retu

Re: [PATCH] Btrfs: kill free_space pointer from inode structure

2012-07-10 Thread Josef Bacik
On Mon, Jul 09, 2012 at 08:21:07PM -0600, Li Zefan wrote:
> Inodes always allocate free space with BTRFS_BLOCK_GROUP_DATA type,
> which means every inode has the same BTRFS_I(inode)->free_space pointer.
> 
> This shrinks struct btrfs_inode by 4 bytes (or 8 bytes on 64 bits).
> 
> Signed-off-by: Li Zefan 

Li, I can't apply any of your patches because they are all in base64 format and
I'm having a hell of a time pulling them out to apply them, can you resend with
git send-email or something so I can apply them properly?  Thanks,

Josef


Re: btrfs volume suddenly becomes read-only

2012-07-10 Thread Chester
So, I got the sysrq-w output plus the whole dmesg up to the crash

<6>[0.00] Initializing cgroup subsys cpuset
<6>[0.00] Initializing cgroup subsys cpu
<5>[0.00] Linux version 3.4.0-00091-gcb77fcd (root@navilaptop) (gcc version 4.6.2 (Gentoo 4.6.2-r1 p1.4, pie-0.5.0) ) #1 SMP Thu Jun 21 18:25:21 CDT 2012
<6>[0.00] Command line: BOOT_IMAGE=/bzImage-btrfs-3.4.0 root=/dev/sda6 rootflags=subvolid=264,thread_pool=32 rootfstype=btrfs radeon.modeset=1 funtoo-ati raid=noautodetect selinux=0 rescue
<6>[0.00] BIOS-provided physical RAM map:
<6>[0.00]  BIOS-e820:  - 0009f800 (usable)
<6>[0.00]  BIOS-e820: 0009f800 - 000a (reserved)
<6>[0.00]  BIOS-e820: 000e - 0010 (reserved)
<6>[0.00]  BIOS-e820: 0010 - bfb3f000 (usable)
<6>[0.00]  BIOS-e820: bfb3f000 - bfbbf000 (reserved)
<6>[0.00]  BIOS-e820: bfbbf000 - bfebf000 (ACPI NVS)
<6>[0.00]  BIOS-e820: bfebf000 - bfef6000 (ACPI data)
<6>[0.00]  BIOS-e820: bfef6000 - bff0 (usable)
<6>[0.00]  BIOS-e820: bff0 - e000 (reserved)
<6>[0.00]  BIOS-e820: f800 - fc00 (reserved)
<6>[0.00]  BIOS-e820: fec0 - fec01000 (reserved)
<6>[0.00]  BIOS-e820: fec1 - fec11000 (reserved)
<6>[0.00]  BIOS-e820: fee0 - fee01000 (reserved)
<6>[0.00]  BIOS-e820: ffe0 - 0001 (reserved)
<6>[0.00]  BIOS-e820: 0001 - 00019f00 (usable)
<6>[0.00] NX (Execute Disable) protection: active
<6>[0.00] DMI 2.7 present.
<7>[0.00] DMI: Hewlett-Packard HP Pavilion dv6 Notebook PC/358B, BIOS F.21 09/13/2011
<7>[0.00] e820 update range:  - 0001 (usable) ==> (reserved)
<7>[0.00] e820 remove range: 000a - 0010 (usable)
<6>[0.00] No AGP bridge found
<6>[0.00] last_pfn = 0x19f000 max_arch_pfn = 0x4
<7>[0.00] MTRR default type: uncachable
<7>[0.00] MTRR fixed ranges enabled:
<7>[0.00]   0-9 write-back
<7>[0.00]   A-B uncachable
<7>[0.00]   C-F write-through
<7>[0.00] MTRR variable ranges enabled:
<7>[0.00]   0 base 00 mask FF8000 write-back
<7>[0.00]   1 base 008000 mask FFC000 write-back
<7>[0.00]   2 base 00BFEBD000 mask FFF000 uncachable
<7>[0.00]   3 base 00FFE0 mask E0 write-protect
<7>[0.00]   4 disabled
<7>[0.00]   5 disabled
<7>[0.00]   6 disabled
<7>[0.00]   7 disabled
<7>[0.00] TOM2: 00019f00 aka 6640M
<6>[0.00] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
<6>[0.00] last_pfn = 0xbff00 max_arch_pfn = 0x4
<6>[0.00] found SMP MP-table at [880fe1b0] fe1b0
<7>[0.00] initial memory mapped : 0 - 2000
<7>[0.00] Base memory trampoline at [8809a000] 9a000 size 20480
<6>[0.00] Using GB pages for direct mapping
<6>[0.00] init_memory_mapping: -bff0
<7>[0.00]  00 - 008000 page 1G
<7>[0.00]  008000 - 00bfe0 page 2M
<7>[0.00]  00bfe0 - 00bff0 page 4k
<7>[0.00] kernel direct mapping tables up to bff0 @ 1fffd000-2000
<6>[0.00] init_memory_mapping: 0001-00019f00
<7>[0.00]  01 - 018000 page 1G
<7>[0.00]  018000 - 019f00 page 2M
<7>[0.00] kernel direct mapping tables up to 19f00 @ bfefe000-bff0
<6>[0.00] RAMDISK: 37dc7000 - 37ff
<4>[0.00] ACPI: RSDP 000fe020 00024 (v02 HPQOEM)
<4>[0.00] ACPI: XSDT bfef5120 0007C (v01 HPQOEM SLIC-MPC 0001  0113)
<4>[0.00] ACPI: FACP bfef4000 000F4 (v04 HPQOEM SLIC-MPC 0001 ACPI 0004)
<4>[0.00] ACPI: DSDT bfede000 11490 (v01 HP INSYDE F000 ACPI 0004)
<4>[0.00] ACPI: FACS bfc95000 00040
<4>[0.00] ACPI: HPET bfef3000 00038 (v01 HP INSYDE 0001 ACPI 0004)
<4>[0.00] ACPI: APIC bfef2000 00084 (v02 HP INSYDE 0001 ACPI 0004)
<4>[0.00] ACPI: MCFG bfef1000 0003C (v01 HP INSYDE 0001 ACPI 0004)
<4>[0.00] ACPI: ASF! bfef 000A5 (v32 HP INSYDE 0001 ACPI 0004)
<4>[0.00] ACPI: BOOT bfedd000 00028 (v01 HP INSYDE 0001 ACPI 0004)
<4>[0.00] ACPI: SLIC bfedc000 00176 (v01 HPQOEM SLIC-MPC 0001 ACPI 0004)
<4>[0.00] ACPI: WDRT bfedb000 00047 (v01 HP INSYDE 0001 ACPI 0004)
<4>[

Re: [PATCH RFC] Btrfs: improve multi-thread buffer read

2012-07-10 Thread Josef Bacik
On Tue, Jul 10, 2012 at 05:27:59AM -0600, Liu Bo wrote:
> While testing with my buffer read fio jobs[1], I find that btrfs does not
> perform well enough.
> 
> Here is a scenario in fio jobs:
> 
> We have 4 threads, "t1 t2 t3 t4", starting to buffer read the same file,
> and all of them will race on add_to_page_cache_lru(), and if one thread
> successfully puts its page into the page cache, it takes the responsibility
> to read the page's data.
> 
> And what's more, reading a page needs a period of time to finish, in which
> other threads can slide in and process the remaining pages:
> 
>      t1          t2          t3          t4
>    add Page1
>    read Page1  add Page2
>      |         read Page2  add Page3
>      |          |          read Page3  add Page4
>      |          |           |          read Page4
> -----|----------|-----------|-----------|--------
>      v          v           v           v
>     bio        bio         bio         bio
> 
> Now we have four bios, each of which holds only one page since we need to
> maintain consecutive pages in bio.  Thus, we can end up with far more bios
> than we need.
> 
> Here we're going to
> a) delay the real read-page section and
> b) try to put more pages into page cache.
> 
> With that said, we can make each bio hold more pages and reduce the number
> of bios we need.
> 
> Here are some numbers taken from fio results:
>          w/o patch          w patch
>          ---------          -------
> READ:    745MB/s    +32%    987MB/s
> 

Um, I have this in btrfs-next

Btrfs: use large extent range for read and its endio

that seems to do the same thing, did you not want to do that anymore?  Thanks,

Josef


[PATCH] Btrfs: allow delayed refs to be merged

2012-07-10 Thread Josef Bacik
Daniel Blueman reported a bug with fio+balance on a ramdisk setup.
Basically what happens is the balance relocates a tree block which will drop
the implicit refs for all of its children and adds a full backref.  Once the
block is relocated we have to add the implicit refs back, so when we cow the
block again we add the implicit refs for its children back.  The problem
comes when the original drop ref doesn't get run before we add the implicit
refs back.  The delayed ref stuff will specifically prefer ADD operations
over DROP to keep us from freeing up an extent that will have references to
it, so we try to add the implicit ref before it is actually removed and we
panic.  This worked fine before because the add would have just canceled the
drop out and we would have been fine.  But the backref walking work needs to
be able to freeze the delayed ref stuff in time so we have this ever
increasing sequence number that gets attached to all new delayed ref updates
which makes us not merge refs and we run into this issue.

So since the backref walking stuff doesn't get run all that often we just
ignore the sequence updates until somebody actually tries to do the freeze.
Then if we try to run the delayed refs we go back and try to merge them in
case we get a sequence like this again so we do not panic.
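
As a rough illustration of the comparison change, here is a stripped-down
sketch of a comparator with an optional sequence check. struct ref_node and
comp_entry_sketch() are hypothetical stand-ins that keep only the three
fields relevant here; the real comp_entry() in the diff below compares more
than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct btrfs_delayed_ref_node. */
struct ref_node {
	uint64_t bytenr;
	uint64_t seq;
	int type;
};

/* Order by bytenr, then type, then (only if asked) by sequence number. */
int comp_entry_sketch(const struct ref_node *ref2,
		      const struct ref_node *ref1, bool compare_seq)
{
	assert(ref1 != NULL && ref2 != NULL);
	if (ref1->bytenr < ref2->bytenr)
		return -1;
	if (ref1->bytenr > ref2->bytenr)
		return 1;
	if (ref1->type < ref2->type)
		return -1;
	if (ref1->type > ref2->type)
		return 1;
	if (compare_seq) {
		if (ref1->seq < ref2->seq)
			return -1;
		if (ref1->seq > ref2->seq)
			return 1;
	}
	return 0;	/* equal: the two updates are candidates for merging */
}
```

Two updates for the same block and type that differ only in seq compare as
different when compare_seq is true, so they stay separate; with compare_seq
false they compare equal and can be merged.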

I need the consumers of the backref resolution code to test this heavily and
make sure it doesn't break them.  It makes Daniel's original problem go away.
Thanks,

Reported-by: Daniel J Blueman 
Signed-off-by: Josef Bacik 
---
 fs/btrfs/ctree.c       |    2 +-
 fs/btrfs/ctree.h       |    4 +-
 fs/btrfs/delayed-ref.c |  146
 fs/btrfs/delayed-ref.h |    6 ++-
 fs/btrfs/extent-tree.c |   20 +-
 5 files changed, 146 insertions(+), 32 deletions(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 8206b39..a914580 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -846,7 +846,7 @@ static noinline int update_ref_for_cow(struct btrfs_trans_handle *trans,
 		if (new_flags != 0) {
 			ret = btrfs_set_disk_extent_flags(trans, root,
 							  buf->start,
-							  buf->len,
+							  buf->len, owner,
 							  new_flags, 0);
 			if (ret)
 				return ret;
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index fa5c45b..1b527bc 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2574,8 +2574,8 @@ int btrfs_dec_ref(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 		  struct extent_buffer *buf, int full_backref, int for_cow);
 int btrfs_set_disk_extent_flags(struct btrfs_trans_handle *trans,
 				struct btrfs_root *root,
-				u64 bytenr, u64 num_bytes, u64 flags,
-				int is_data);
+				u64 bytenr, u64 num_bytes, u64 ref_root,
+				u64 flags, int is_data);
 int btrfs_free_extent(struct btrfs_trans_handle *trans,
 		      struct btrfs_root *root,
 		      u64 bytenr, u64 num_bytes, u64 parent, u64 root_objectid,
diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
index 13ae7b0..93b7df1 100644
--- a/fs/btrfs/delayed-ref.c
+++ b/fs/btrfs/delayed-ref.c
@@ -85,7 +85,8 @@ static int comp_data_refs(struct btrfs_delayed_data_ref *ref2,
  * type of the delayed backrefs and content of delayed backrefs.
  */
 static int comp_entry(struct btrfs_delayed_ref_node *ref2,
-		      struct btrfs_delayed_ref_node *ref1)
+		      struct btrfs_delayed_ref_node *ref1,
+		      bool compare_seq)
 {
 	if (ref1->bytenr < ref2->bytenr)
 		return -1;
@@ -102,10 +103,12 @@ static int comp_entry(struct btrfs_delayed_ref_node *ref2,
 	if (ref1->type > ref2->type)
 		return 1;
 	/* merging of sequenced refs is not allowed */
-	if (ref1->seq < ref2->seq)
-		return -1;
-	if (ref1->seq > ref2->seq)
-		return 1;
+	if (compare_seq) {
+		if (ref1->seq < ref2->seq)
+			return -1;
+		if (ref1->seq > ref2->seq)
+			return 1;
+	}
 	if (ref1->type == BTRFS_TREE_BLOCK_REF_KEY ||
 	    ref1->type == BTRFS_SHARED_BLOCK_REF_KEY) {
 		return comp_tree_refs(btrfs_delayed_node_to_tree_ref(ref2),
@@ -139,7 +142,7 @@ static struct btrfs_delayed_ref_node *tree_insert(struct rb_root *root,
 		entry = rb_entry(parent_node, struct btrfs_delayed_ref_node,
 				 rb_node);
 
-		cmp = comp_entry(entry, ins);
+		cmp = comp_entry(entry, ins, 1);
 		if (cmp < 0)
  

Re: 3.5.0-rc6: btrfs and LVM snapshots -> wrong devicename in /proc/mounts

2012-07-10 Thread Goffredo Baroncelli
Hi Arnd,

I am trying to reproduce this bug. Which kernel version are you using ?

BR
G.Baroncelli

On 07/10/2012 07:55 PM, Goffredo Baroncelli wrote:
> On 07/10/2012 10:52 AM, Arnd Hannemann wrote:
>> Hi,
>>
>> Am 10.07.2012 05:30, schrieb Christian Robert:
>>> I agree with you, but you should never mount a snapshot of a btrfs
>>> filesystem at the same time the original is,
>>> because both the original and the snapshot have the same "device fsid
>>> 5c3e8ca2-da56-4ade-9fef-103a6a8a70c2"
> 
> I think that the kernel should be smarter in this regard.
> 
> At kernel level, as a golden rule, it should not be possible to add a
> duplicate fsid if the previous one is mounted. "btrfs dev scan" MUST
> return an error in this case. This in any case is an error.
> 
> Today it seems that if a device with the same fsid is already
> registered, the new one overwrites the old one (or at least the name is
> overwritten). This could be acceptable if the filesystem is unmounted. I
> think that it is a serious error otherwise.
> 
>>>
>>> the kernel will think twice and fold back to the same device.
>>
>> If that is correct, the bug is that the kernel lets me mount the same
>> device fsid on different devices twice.
>>
>>>
>>> btrfs does not behave like other filesystems, you can't snapshot a btrfs
>>> filesystem
>>> and hope to mount the snapshot somewhere else.
>>
>>> A snapshot also duplicates lots of things internally that make no sense in
>>> a snapshot (like raid level, single or multiple devices ...)
>>
>> I see. However, I expect a "simple" btrfs to just work or fail gracefully.
>>
>> Best regards,
>> Arnd
>>


Re: 3.5.0-rc6: btrfs and LVM snapshots -> wrong devicename in /proc/mounts

2012-07-10 Thread Goffredo Baroncelli
On 07/10/2012 10:52 AM, Arnd Hannemann wrote:
> Hi,
> 
> Am 10.07.2012 05:30, schrieb Christian Robert:
>> I agree with you, but you should never mount a snapshot of a btrfs
>> filesystem at the same time the original is,
>> because both the original and the snapshot have the same "device fsid
>> 5c3e8ca2-da56-4ade-9fef-103a6a8a70c2"

I think that the kernel should be smarter in this regard.

At kernel level, as a golden rule, it should not be possible to add a
duplicate fsid if the previous one is mounted. "btrfs dev scan" MUST
return an error in this case. This in any case is an error.

Today it seems that if a device with the same fsid is already
registered, the new one overwrites the old one (or at least the name is
overwritten). This could be acceptable if the filesystem is unmounted. I
think that it is a serious error otherwise.
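
To spell out the rule I have in mind, here is a minimal userspace sketch. It
is not how the kernel tracks devices: fs_registry, fs_entry and
register_device() are invented for illustration, and the return codes are
arbitrary. The only behaviour it is meant to show is "refuse a second device
with the same fsid while the first one is mounted, allow the overwrite
otherwise".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define FSID_LEN     16
#define MAX_FS       8
#define DEV_NAME_LEN 64

/* Hypothetical registry entry: one filesystem, one device name. */
struct fs_entry {
	unsigned char fsid[FSID_LEN];
	char dev[DEV_NAME_LEN];
	bool in_use;
	bool mounted;
};

struct fs_registry {
	struct fs_entry entries[MAX_FS];
};

/* Returns 0 on success, -1 if refused (fsid mounted), -2 if table is full. */
int register_device(struct fs_registry *reg, const unsigned char *fsid,
		    const char *dev)
{
	struct fs_entry *free_slot = NULL;

	assert(reg != NULL && fsid != NULL && dev != NULL);
	for (int i = 0; i < MAX_FS; i++) {
		struct fs_entry *e = &reg->entries[i];

		if (!e->in_use) {
			if (free_slot == NULL)
				free_slot = e;
			continue;
		}
		if (memcmp(e->fsid, fsid, FSID_LEN) != 0)
			continue;
		if (e->mounted && strcmp(e->dev, dev) != 0)
			return -1;	/* duplicate fsid while mounted */
		snprintf(e->dev, sizeof(e->dev), "%s", dev);
		return 0;
	}
	if (free_slot == NULL)
		return -2;
	memcpy(free_slot->fsid, fsid, FSID_LEN);
	snprintf(free_slot->dev, sizeof(free_slot->dev), "%s", dev);
	free_slot->in_use = true;
	free_slot->mounted = false;
	return 0;
}
```

In this model, scanning the LVM snapshot while the original is mounted fails
and leaves the registered device name alone; after unmounting, the same scan
succeeds and replaces the name.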

>>
>> the kernel will think twice and fold back to the same device.
> 
> If that is correct, the bug is that the kernel lets me mount the same
> device fsid on different devices twice.
> 
>>
>> btrfs does not behave like other filesystems, you can't snapshot a btrfs
>> filesystem
>> and hope to mount the snapshot somewhere else.
> 
>> A snapshot also duplicates lots of things internally that make no sense in
>> a snapshot (like raid level, single or multiple devices ...)
> 
> I see. However, I expect a "simple" btrfs to just work or fail gracefully.
> 
> Best regards,
> Arnd
> 


Re: [RFC] A way to tell if all the devices in a file system are available

2012-07-10 Thread Harald Hoyer
Am 21.06.2012 22:10, schrieb Josef Bacik:
> Harald Hoyer has had this as a feature request for ages and I've finally
> gotten around to hacking something up.  This is probably going to get
> bikeshedded to death, bring it on, I'm not married to any of the behaviors
> in these patches, I just want to get the ball rolling so we can have
> something in place for 3.6.
> 
> Basically all I've done is saved how many devices the super block thinks we
> have into the fs_devices struct whenever we scan a device.  Then all we have
> to do for the IOCTL is compare how many devices the fs_devices struct has in
> it to how many we think we need.
> 
> The command itself just spits out 0 for yay we're ready and 1 for boo no we're
> not.  This makes it easier for Harald to do his multi-device btrfs support in
> dracut.  Thanks,
> 
> Josef
> 

any news on this?
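
For reference, the check as I understand it from the description above boils
down to something like the following. This is only a sketch under that
reading: fs_devices_sketch and both function names are hypothetical, and the
real patches may compare the counts differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant fields of the fs_devices struct. */
struct fs_devices_sketch {
	uint64_t total_devices;	/* device count recorded in the super block */
	uint64_t num_devices;	/* devices scanned so far */
};

/* Called once per scanned device with the count from its super block. */
void note_scanned_device(struct fs_devices_sketch *fs, uint64_t sb_num_devices)
{
	assert(fs != NULL);
	fs->total_devices = sb_num_devices;
	fs->num_devices++;
}

/* Same convention as the command: 0 means ready, 1 means not ready. */
int devices_ready(const struct fs_devices_sketch *fs)
{
	assert(fs != NULL);
	if (fs->total_devices == 0)
		return 1;	/* nothing scanned yet */
	return fs->num_devices >= fs->total_devices ? 0 : 1;
}
```

With a two-device filesystem, the first scan leaves it "not ready" (1) and
the second scan flips it to "ready" (0), which is the signal dracut would
wait for.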



Re: [RFC PATCH 7/7] Btrfs: introduce BTRFS_IOC_SEND for btrfs send/receive (part 2)

2012-07-10 Thread Alex Lyakas
Alexander,
this review focuses on the area of sending file extents:

> +static int is_extent_unchanged(struct send_ctx *sctx,
> +			       struct btrfs_path *left_path,
> +			       struct btrfs_key *ekey)
> +{
> +	int ret = 0;
> +	struct btrfs_key key;
> +	struct btrfs_path *path = NULL;
> +	struct extent_buffer *eb;
> +	int slot;
> +	struct btrfs_key found_key;
> +	struct btrfs_file_extent_item *ei;
> +	u64 left_disknr;
> +	u64 right_disknr;
> +	u64 left_offset;
> +	u64 right_offset;
> +	u64 left_len;
> +	u64 right_len;
> +	u8 left_type;
> +	u8 right_type;
> +
> +	path = alloc_path_for_send();
> +	if (!path)
> +		return -ENOMEM;
> +
> +	eb = left_path->nodes[0];
> +	slot = left_path->slots[0];
> +
> +	ei = btrfs_item_ptr(eb, slot, struct btrfs_file_extent_item);
> +	left_type = btrfs_file_extent_type(eb, ei);
> +	left_disknr = btrfs_file_extent_disk_bytenr(eb, ei);
> +	left_len = btrfs_file_extent_num_bytes(eb, ei);
> +	left_offset = btrfs_file_extent_offset(eb, ei);
> +
> +	if (left_type != BTRFS_FILE_EXTENT_REG) {
> +		ret = 0;
> +		goto out;
> +	}
> +
> +	key.objectid = ekey->objectid;
> +	key.type = BTRFS_EXTENT_DATA_KEY;
> +	key.offset = ekey->offset;
> +
> +	while (1) {
> +		ret = btrfs_search_slot_for_read(sctx->parent_root, &key, path,
> +				0, 0);
> +		if (ret < 0)
> +			goto out;
> +		if (ret) {
> +			ret = 0;
> +			goto out;
> +		}
> +		btrfs_item_key_to_cpu(path->nodes[0], &found_key,
> +				path->slots[0]);
> +		if (found_key.objectid != key.objectid ||
> +		    found_key.type != key.type) {
> +			ret = 0;
> +			goto out;
> +		}
> +
> +		eb = path->nodes[0];
> +		slot = path->slots[0];
> +
> +		ei = btrfs_item_ptr(eb, slot, struct btrfs_file_extent_item);
> +		right_type = btrfs_file_extent_type(eb, ei);
> +		right_disknr = btrfs_file_extent_disk_bytenr(eb, ei);
> +		right_len = btrfs_file_extent_num_bytes(eb, ei);
> +		right_offset = btrfs_file_extent_offset(eb, ei);
> +		btrfs_release_path(path);
> +
> +		if (right_type != BTRFS_FILE_EXTENT_REG) {
> +			ret = 0;
> +			goto out;
> +		}
> +
> +		if (left_disknr != right_disknr) {
> +			ret = 0;
> +			goto out;
> +		}
> +
> +		key.offset = found_key.offset + right_len;
> +		if (key.offset >= ekey->offset + left_len) {
> +			ret = 1;
> +			goto out;
> +		}
> +	}
> +
> +out:
> +	btrfs_free_path(path);
> +	return ret;
> +}
> +

Should we always treat a left extent with bytenr==0 as unchanged?
Right now the code simply reads and sends the data of such an extent,
although bytenr==0 means "no data allocated here". Since we always do
send_truncate() afterwards, the file size will always be correct, so we
can just skip bytenr==0 extents.
The same is true for BTRFS_FILE_EXTENT_PREALLOC extents, I think. Those
also don't contain real data.
So something like:
	if (left_disknr == 0 || left_type == BTRFS_FILE_EXTENT_PREALLOC) {
		ret = 1;
		goto out;
	}
before we check for BTRFS_FILE_EXTENT_REG.

Now I have a question about the rest of the logic that decides that
extent is unchanged. I understand that if we see the same extent (same
disk_bytenr) shared between parent_root and send_root, then it must
contain the same data, even in nodatacow mode, because on a first
write to such shared extent, it is cow'ed even with nodatacow.

However, shouldn't we check btrfs_file_extent_offset(), to make sure
that both send_root and parent_root point at the same offset into the
extent from the same file offset? Because if the extent_offset values are
different, then the data of the file might be different, even though we
are talking about the same extent.

So I am thinking about something like:

- ekey.offset points at data at logical address
left_disknr+left_offset (logical address within CHUNK_ITEM address
space) for left_len bytes
- found_key.offset points at data at logical address
right_disknr+right_offset for right_len
- we know that found_key.offset <= ekey.offset

So we need to ensure that left_disknr==right_disknr and also:
right_disknr+right_offset + (ekey.offset - found_key.offset) ==
left_disknr+left_offset
or does this while loop somehow ensure this equation?
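
To make the proposed condition concrete, here is a minimal standalone sketch in plain C. It is not kernel code: struct extent_view and maps_same_data() are hypothetical names made up for illustration, and the fields only mirror the values read in is_extent_unchanged() (key offset, disk_bytenr, extent offset).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flattened view of one file extent item; not a btrfs struct. */
struct extent_view {
	uint64_t file_offset;   /* key.offset of the EXTENT_DATA item */
	uint64_t disk_bytenr;   /* logical start address of the extent */
	uint64_t extent_offset; /* offset into that extent */
};

/*
 * Returns true if the data at left->file_offset is backed by the same
 * bytes in both roots: same extent, and the right item, which starts at
 * or before the left file offset, reaches the same position inside it.
 */
static bool maps_same_data(const struct extent_view *left,
			   const struct extent_view *right)
{
	uint64_t delta;

	if (left->disk_bytenr != right->disk_bytenr)
		return false;
	if (right->file_offset > left->file_offset)
		return false;

	/* ekey.offset - found_key.offset in the notation above */
	delta = left->file_offset - right->file_offset;
	return right->disk_bytenr + right->extent_offset + delta ==
	       left->disk_bytenr + left->extent_offset;
}
```

With equal disk_bytenr values the comparison reduces to right_offset + (ekey.offset - found_key.offset) == left_offset, which is the extra check being asked about.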

However, I must admit I don't fully understand the logic behind
deciding that extent is unc

[PATCH] Btrfs: avoid I/O repair BUG() from btree_read_extent_buffer_pages()

2012-07-10 Thread Stefan Behrens
From btree_read_extent_buffer_pages(), currently repair_io_failure()
can be called with mirror_num being zero when submit_one_bio() returned
an error before. This used to cause a BUG_ON(!mirror_num) in
repair_io_failure() and indeed this is not a case that needs the I/O
repair code to rewrite disk blocks.
This commit prevents calling repair_io_failure() in this case and thus
avoids the BUG_ON() and malfunction.

Signed-off-by: Stefan Behrens 
---
 fs/btrfs/disk-io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 8cc4710..0a7a99b 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -407,7 +407,7 @@ static int btree_read_extent_buffer_pages(struct btrfs_root *root,
 			break;
 	}
 
-	if (failed && !ret)
+	if (failed && !ret && failed_mirror)
 		repair_eb_io_failure(root, eb, failed_mirror);
 
 	return ret;
-- 
1.7.11.1



Re: kernel BUG at fs/btrfs/extent_io.c:1893!

2012-07-10 Thread Stefan Behrens
On Tue, 10 Jul 2012 09:48:27 +1000, Shavi N wrote:
> Hi,
> 
> I have this problem after trying to run btrfsck.
> I have new 11 HDDs WD 2tb, on two RAID controllers
> Arch Linux, latest kernel. What I was doing was copying and reading
> multiple data at the same time
> After getting I/O errors while trying to access through samba and
> while trying to run a VM with files stored on volume, I ran a btrfsck
> but it failed.


> [ 7155.563435] kernel BUG at fs/btrfs/extent_io.c:1893!
> [ 7155.563437] invalid opcode:  [#2] PREEMPT SMP
> [ 7155.563439] CPU 1
> [ 7155.563439] Modules linked in: nfnetlink_log nfnetlink hwmon_vid
> reiserfs btrfs zlib_deflate libcrc32c microcode ghash_clmulni_intel
> cryptd cx22702 cx88_dvb videobuf_dvb cx88_vp3054_i2c dvb_core
> rc_winfast tuner_simple tuner_types eeepc_wmi asus_wmi pci_hotplug
> tuner cx88_alsa snd_pcm snd_page_alloc snd_timer snd i915
> drm_kms_helper cx8802 cx8800 cx88xx tveeprom btcx_risc videobuf_dma_sg
> mei(C) i2c_i801 drm intel_agp soundcore i2c_algo_bit videobuf_core
> v4l2_common e1000e videodev media rc_core i2c_core iTCO_wdt
> iTCO_vendor_support button acpi_cpufreq mperf processor pcspkr fan
> video thermal sparse_keymap rfkill coretemp intel_gtt wmi evdev
> vboxnetflt(O) crc32c_intel vboxdrv(O) ext4 crc16 jbd2 mbcache usbhid
> hid sd_mod mptsas scsi_transport_sas mptscsih mptbase ahci libahci
> libata xhci_hcd ehci_hcd scsi_mod usbcore usb_common
> [ 7155.563469]
> [ 7155.563471] Pid: 2550, comm: btrfs-delayed-m Tainted: G  D  C O
> 3.4.4-2-ARCH #1 System manufacturer System Product Name/P8Z68-V LX
> [ 7155.563473] RIP: 0010:[]  []
> repair_io_failure+0x17f/0x1c0 [btrfs]
> [ 7155.563485] RSP: 0018:8802f34bb7d0  EFLAGS: 00010246
> [ 7155.563487] RAX: 8802f34bb800 RBX: 023f77d8 RCX: 
> 023f77d8
> [ 7155.563488] RDX: 1000 RSI: 023f77d8 RDI: 
> 8803fa4a0108
> [ 7155.563489] RBP: 8802f34bb840 R08: ea000f21d400 R09: 
> 
> [ 7155.563491] R10: 57ffad78bf21d400 R11: 0001 R12: 
> 1000
> [ 7155.563492] R13: ea000f21d400 R14: 8803fa4a0108 R15: 
> 
> [ 7155.563494] FS:  () GS:88041f28()
> knlGS:
> [ 7155.563496] CS:  0010 DS:  ES:  CR0: 8005003b
> [ 7155.563497] CR2: 0111dc40 CR3: 0180b000 CR4: 
> 000427e0
> [ 7155.563498] DR0:  DR1:  DR2: 
> 
> [ 7155.563500] DR3:  DR6: 0ff0 DR7: 
> 0400
> [ 7155.563502] Process btrfs-delayed-m (pid: 2550, threadinfo
> 8802f34ba000, task 8803eac44f60)
> [ 7155.563503] Stack:
> [ 7155.563504]  8802f34bb7d0 023f77d8 
> 
> [ 7155.563506]    8802f34bb800
> 8802f34bb800
> [ 7155.563508]  88020001 023f77d8 
> 8803fa4a0108
> [ 7155.563511] Call Trace:
> [ 7155.563520]  [] repair_eb_io_failure+0x82/0xb0 [btrfs]
> [ 7155.563534]  []
> btree_read_extent_buffer_pages.constprop.111+0x112/0x120 [btrfs]
> [ 7155.563539]  [] read_tree_block+0x3a/0x50 [btrfs]
> [ 7155.563544]  []
> read_block_for_search.isra.32+0x124/0x3d0 [btrfs]
> [ 7155.563548]  [] ?
> generic_bin_search.constprop.34+0x6b/0x180 [btrfs]
> [ 7155.563554]  [] ? btrfs_tree_read_unlock+0x72/0xb0 
> [btrfs]
> [ 7155.563558]  [] btrfs_search_slot+0x3ec/0x900 [btrfs]
> [ 7155.563563]  [] ?
> add_delayed_tree_ref.isra.4+0xb1/0x1f0 [btrfs]
> [ 7155.563568]  [] ?
> add_delayed_ref_head.isra.1+0xbd/0x1b0 [btrfs]
> [ 7155.563573]  [] btrfs_lookup_extent_info+0x84/0x2f0 
> [btrfs]
> [ 7155.563578]  [] ?
> btrfs_alloc_free_block+0x25c/0x380 [btrfs]
> [ 7155.563582]  [] update_ref_for_cow+0x17a/0x300 [btrfs]
> [ 7155.563586]  [] __btrfs_cow_block+0x230/0x510 [btrfs]
> [ 7155.563591]  [] ? btrfs_buffer_uptodate+0x6d/0x80 [btrfs]
> [ 7155.563596]  [] btrfs_cow_block+0xf7/0x230 [btrfs]
> [ 7155.563600]  [] btrfs_search_slot+0x193/0x900 [btrfs]
> [ 7155.563605]  [] ?
> btrfs_run_delayed_refs+0x1cb/0x450 [btrfs]
> [ 7155.563610]  [] btrfs_lookup_inode+0x2f/0xa0 [btrfs]
> [ 7155.563612]  [] ? mutex_lock+0x16/0x30
> [ 7155.563617]  []
> btrfs_update_delayed_inode+0x71/0x150 [btrfs]
> [ 7155.563622]  []
> btrfs_async_run_delayed_node_done+0x12a/0x1b0 [btrfs]
> [ 7155.563628]  [] worker_loop+0x13d/0x570 [btrfs]
> [ 7155.563633]  [] ? btrfs_queue_worker+0x320/0x320 [btrfs]
> [ 7155.563635]  [] kthread+0x93/0xa0
> [ 7155.563637]  [] kernel_thread_helper+0x4/0x10
> [ 7155.563638]  [] ? kthread_freezable_should_stop+0x70/0x70
> [ 7155.563640]  [] ? gs_change+0x13/0x13
> [ 7155.563641] Code: 68 f4 ba e0 b8 fb ff ff ff 48 8b 5d d8 4c 8b 65
> e0 4c 8b 6d e8 4c 8b 75 f0 4c 8b 7d f8 c9 c3 0f 1f 44 00 00 b8 fb ff
> ff ff eb de <0f> 0b 0f 0b 49 8b 45 08 49 8b 8f 88 00 00 00 4d 89 f0 48
> 8b 55
> [ 7155.563652] RIP  [] repair_io_failure+0x17f/0x1c0 [btrfs]
> [ 7155.563658]  RSP 
> [ 7155.

Re: Please hammer my for-linus branch

2012-07-10 Thread Daniel J Blueman
On 2 July 2012 12:20, Liu Bo  wrote:
> On 07/02/2012 11:35 AM, Daniel J Blueman wrote:
>
>>> Hi everyone,
>>>
>>> I've got a nice set of fixes from Josef, Jan, Ilya and others in my
>>> for-linus branch:
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus
>>>
>>> Some of the changes are fixes for the tree logging code, so I ran some
>>> extra crash runs against them Friday night.
>>>
>>> I ended up with a new crash in the tree log directory deletion replay
>>> code, so I didn't send out the pull request to Linus.
>>>
>>> It isn't clear yet if the new crash is because I was testing differently
>>> or if it is a regression.  I'm nailing it down this weekend, but please
>>> give my for-linus a shot.
>>
>> With this branch (3.4.0), my test has consistently been hitting the
>> BUG_ON(owner < BTRFS_FIRST_FREE_OBJECTID) in
>> insert_inline_extent_backref [1]. This is followed by a string of
>> other issues [2] and a hard lockup, so I used netconsole to collect
>> this.
>>
>> I'm preparing my btrfs test for xfstests integration, but can slip you
>> it if interested. It hits this case in ~30s.
>>
>
>
> IMO the BUG_ON is meant to avoid to mix 'log tree' in, it should be:
>
> BUG_ON(owner < BTRFS_FIRST_FREE_OBJECTID &&
>        root_objectid == BTRFS_TREE_LOG_OBJECTID);
>
> This should help you, can you give it a try?

Bo, this did address the assertion I was tripping, so looks good from
here; it allowed me to report the second (different) assertion of
course.

If you still think the fix is sound, is it a good idea for 3.5-rc7?

Thanks,
  Daniel
-- 
Daniel J Blueman


[PATCH 2/2] Btrfs: kill root from btrfs_is_free_space_inode

2012-07-10 Thread Liu Bo
Since root can be fetched via the BTRFS_I macro directly, we can drop an
argument from btrfs_is_free_space_inode().

Signed-off-by: Liu Bo 
---
 fs/btrfs/btrfs_inode.h |5 +++--
 fs/btrfs/extent-tree.c |2 +-
 fs/btrfs/file-item.c   |2 +-
 fs/btrfs/inode.c   |   22 +++---
 4 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index b168238..21b8cfe 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -191,9 +191,10 @@ static inline void btrfs_i_size_write(struct inode *inode, u64 size)
 	BTRFS_I(inode)->disk_i_size = size;
 }
 
-static inline bool btrfs_is_free_space_inode(struct btrfs_root *root,
-					     struct inode *inode)
+static inline bool btrfs_is_free_space_inode(struct inode *inode)
 {
+	struct btrfs_root *root = BTRFS_I(inode)->root;
+
 	if (root == root->fs_info->tree_root &&
 	    btrfs_ino(inode) != BTRFS_BTREE_INODE_OBJECTID)
 		return true;
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 6e1d367..07087f6 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -,7 +,7 @@ int btrfs_delalloc_reserve_metadata(struct inode *inode, u64 num_bytes)
 	int ret;
 
 	/* Need to be holding the i_mutex here if we aren't free space cache */
-	if (btrfs_is_free_space_inode(root, inode))
+	if (btrfs_is_free_space_inode(inode))
 		flush = 0;
 
 	if (flush && btrfs_transaction_in_commit(root->fs_info))
diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index 5d158d3..af87025 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -183,7 +183,7 @@ static int __btrfs_lookup_bio_sums(struct btrfs_root *root,
 	 * read from the commit root and sidestep a nasty deadlock
 	 * between reading the free space cache and updating the csum tree.
 	 */
-	if (btrfs_is_free_space_inode(root, inode)) {
+	if (btrfs_is_free_space_inode(inode)) {
 		path->search_commit_root = 1;
 		path->skip_locking = 1;
 	}
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index a7d1921..5d463d4 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -825,7 +825,7 @@ static noinline int cow_file_range(struct inode *inode,
 	struct extent_map_tree *em_tree = &BTRFS_I(inode)->extent_tree;
 	int ret = 0;
 
-	BUG_ON(btrfs_is_free_space_inode(root, inode));
+	BUG_ON(btrfs_is_free_space_inode(inode));
 	trans = btrfs_join_transaction(root);
 	if (IS_ERR(trans)) {
 		extent_clear_unlock_delalloc(inode,
@@ -1153,7 +1153,7 @@ static noinline int run_delalloc_nocow(struct inode *inode,
 		return -ENOMEM;
 	}
 
-	nolock = btrfs_is_free_space_inode(root, inode);
+	nolock = btrfs_is_free_space_inode(inode);
 
 	if (nolock)
 		trans = btrfs_join_transaction_nolock(root);
@@ -1466,7 +1466,7 @@ static void btrfs_set_bit_hook(struct inode *inode,
 	if (!(state->state & EXTENT_DELALLOC) && (*bits & EXTENT_DELALLOC)) {
 		struct btrfs_root *root = BTRFS_I(inode)->root;
 		u64 len = state->end + 1 - state->start;
-		bool do_list = !btrfs_is_free_space_inode(root, inode);
+		bool do_list = !btrfs_is_free_space_inode(inode);
 
 		if (*bits & EXTENT_FIRST_DELALLOC) {
 			*bits &= ~EXTENT_FIRST_DELALLOC;
@@ -1501,7 +1501,7 @@ static void btrfs_clear_bit_hook(struct inode *inode,
 	if ((state->state & EXTENT_DELALLOC) && (*bits & EXTENT_DELALLOC)) {
 		struct btrfs_root *root = BTRFS_I(inode)->root;
 		u64 len = state->end + 1 - state->start;
-		bool do_list = !btrfs_is_free_space_inode(root, inode);
+		bool do_list = !btrfs_is_free_space_inode(inode);
 
 		if (*bits & EXTENT_FIRST_DELALLOC) {
 			*bits &= ~EXTENT_FIRST_DELALLOC;
@@ -1612,7 +1612,7 @@ static int btrfs_submit_bio_hook(struct inode *inode, int rw, struct bio *bio,
 
 	skip_sum = BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM;
 
-	if (btrfs_is_free_space_inode(root, inode))
+	if (btrfs_is_free_space_inode(inode))
 		metadata = 2;
 
 	if (!(rw & REQ_WRITE)) {
@@ -1869,7 +1869,7 @@ static int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent)
 	int ret;
 	bool nolock;
 
-	nolock = btrfs_is_free_space_inode(root, inode);
+	nolock = btrfs_is_free_space_inode(inode);
 
 	if (test_bit(BTRFS_ORDERED_IOERR, &ordered_extent->flags)) {
 		ret = -EIO;
@@ -2007,7 +2007,7 @@ static int btrfs_writepage_end_io_hook(struct page *page, u64 start, u64 end,
 	ordered_extent->work.func = finish_ordered_fn;
 	ordered_extent->work.flags = 0;
 
-	if (btrfs_is_free_space_inode(root, inode))
+	if (btrfs_is_free_spa

[PATCH RFC] Btrfs: improve multi-thread buffer read

2012-07-10 Thread Liu Bo
While testing with my buffered read fio jobs[1], I found that btrfs does not
perform well enough.

Here is a scenario from the fio jobs:

We have 4 threads, "t1 t2 t3 t4", starting a buffered read of the same file.
All of them race on add_to_page_cache_lru(), and whichever thread
successfully puts its page into the page cache takes responsibility
for reading that page's data.

What's more, reading a page takes some time to finish, during which
other threads can slide in and process the remaining pages:

     t1          t2          t3          t4
   add Page1
   read Page1  add Page2
     |         read Page2  add Page3
     |          |          read Page3  add Page4
     |          |           |          read Page4
-----|----------|-----------|-----------|--------
     v          v           v           v
    bio        bio         bio         bio

Now we have four bios, each of which holds only one page since we need to
maintain consecutive pages in bio.  Thus, we can end up with far more bios
than we need.

Here we're going to
a) delay the real read-page section and
b) try to put more pages into page cache.

With that said, we can make each bio hold more pages and reduce the number
of bios we need.
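
As a rough illustration of why batching helps, here is a toy model in plain C. It is not kernel code: count_bios() is a hypothetical helper, and the only assumption it encodes is the one stated above, that a bio can hold only pages with consecutive indices.

```c
#include <stddef.h>

/*
 * Count how many bios one submitter needs for the pages it reads, given
 * that a new bio must be started whenever the next page index does not
 * directly follow the previous one.
 */
static size_t count_bios(const unsigned long *page_index, size_t nr_pages)
{
	size_t bios = 0;
	size_t i;

	for (i = 0; i < nr_pages; i++) {
		if (i == 0 || page_index[i] != page_index[i - 1] + 1)
			bios++;
	}
	return bios;
}
```

In the scenario above each thread ends up reading a single page, so each call sees one page and the four threads need four bios in total. A thread that first collects pages 1 to 4 and only then reads them needs a single bio for the same range.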

Here are some numbers taken from fio results:
         w/o patch          w patch
       -------------  --------------------
READ:     745MB/s        +32%     987MB/s

[1]:
[global]
group_reporting
thread
numjobs=4
bs=32k
rw=read
ioengine=sync
directory=/mnt/btrfs/

[READ]
filename=foobar
size=2000M
invalidate=1

Signed-off-by: Liu Bo 
---
 fs/btrfs/extent_io.c |   37 +++--
 1 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 01c21b6..8f9c18d 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3549,6 +3549,11 @@ int extent_writepages(struct extent_io_tree *tree,
 	return ret;
 }
 
+struct pagelst {
+	struct page *page;
+	struct list_head lst;
+};
+
 int extent_readpages(struct extent_io_tree *tree,
 		     struct address_space *mapping,
 		     struct list_head *pages, unsigned nr_pages,
@@ -3557,19 +3562,47 @@ int extent_readpages(struct extent_io_tree *tree,
 	struct bio *bio = NULL;
 	unsigned page_idx;
 	unsigned long bio_flags = 0;
+	LIST_HEAD(page_pool);
+	struct pagelst *pagelst = NULL;
 
 	for (page_idx = 0; page_idx < nr_pages; page_idx++) {
 		struct page *page = list_entry(pages->prev, struct page, lru);
 
 		prefetchw(&page->flags);
 		list_del(&page->lru);
+
+		if (!pagelst)
+			pagelst = kmalloc(sizeof(*pagelst), GFP_NOFS);
+
+		if (!pagelst) {
+			page_cache_release(page);
+			continue;
+		}
 		if (!add_to_page_cache_lru(page, mapping,
 					page->index, GFP_NOFS)) {
-			__extent_read_full_page(tree, page, get_extent,
-						&bio, 0, &bio_flags);
+			pagelst->page = page;
+			list_add(&pagelst->lst, &page_pool);
+			page_cache_get(page);
+			pagelst = NULL;
 		}
 		page_cache_release(page);
 	}
+
+	while (!list_empty(&page_pool)) {
+		struct page *page;
+
+		pagelst = list_entry(page_pool.prev, struct pagelst, lst);
+		page = pagelst->page;
+
+		prefetchw(&page->flags);
+		__extent_read_full_page(tree, page, get_extent,
+					&bio, 0, &bio_flags);
+
+		page_cache_release(page);
+		list_del(&pagelst->lst);
+		kfree(pagelst);
+	}
+	BUG_ON(!list_empty(&page_pool));
 	BUG_ON(!list_empty(pages));
 	if (bio)
 		return submit_one_bio(READ, bio, 0, bio_flags);
-- 
1.6.5.2



[PATCH 1/2] Btrfs: fix btrfs_is_free_space_inode to recognize btree inode

2012-07-10 Thread Liu Bo
For the btree inode, its root is also the 'tree root', so the btree inode can
be mistaken for a free space inode.

We should add one more check for the btree inode.
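
A toy model of the check before and after this change may help when reading the diff. It is plain C, not kernel code: struct toy_inode, both helper names and the TOY_ constants are made up for illustration, and the constant values are placeholders rather than the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration only. */
#define TOY_BTREE_INODE_OBJECTID 1ULL
#define TOY_FREE_INO_OBJECTID    100ULL

struct toy_inode {
	bool in_tree_root;          /* stands in for root == fs_info->tree_root */
	uint64_t ino;               /* stands in for btrfs_ino(inode) */
	uint64_t location_objectid; /* stands in for location.objectid */
};

/* Old logic: every inode that lives in the tree root matches. */
static bool is_free_space_inode_old(const struct toy_inode *inode)
{
	return inode->in_tree_root ||
	       inode->location_objectid == TOY_FREE_INO_OBJECTID;
}

/* New logic: the btree inode is excluded from the tree root match. */
static bool is_free_space_inode_new(const struct toy_inode *inode)
{
	if (inode->in_tree_root && inode->ino != TOY_BTREE_INODE_OBJECTID)
		return true;
	return inode->location_objectid == TOY_FREE_INO_OBJECTID;
}
```

The only input on which the two versions differ is an inode in the tree root whose number is the btree inode objectid: the old check reports it as a free space inode, the new one does not.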

Signed-off-by: Liu Bo 
---
 fs/btrfs/btrfs_inode.h |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index 12394a9..b168238 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -194,8 +194,10 @@ static inline void btrfs_i_size_write(struct inode *inode, u64 size)
 static inline bool btrfs_is_free_space_inode(struct btrfs_root *root,
 					     struct inode *inode)
 {
-	if (root == root->fs_info->tree_root ||
-	    BTRFS_I(inode)->location.objectid == BTRFS_FREE_INO_OBJECTID)
+	if (root == root->fs_info->tree_root &&
+	    btrfs_ino(inode) != BTRFS_BTREE_INODE_OBJECTID)
+		return true;
+	if (BTRFS_I(inode)->location.objectid == BTRFS_FREE_INO_OBJECTID)
 		return true;
 	return false;
 }
-- 
1.6.5.2



Re: btrfsck crashes

2012-07-10 Thread haveanice...@cv-sv.de
This code should detect the problem with an assertion failure instead of a SIGSEGV.
...
Csum didn't match
btrfsck: btrfsck.c:1177: walk_down_tree: Assertion `!(1)' failed.
Aborted
...


--- btrfsck.c   2012-07-10 10:23:24.781622144 +0200
+++ btrfsck.c   2012-07-10 12:59:00.120146266 +0200
@@ -1173,7 +1173,7 @@
 		WARN_ON(*level >= BTRFS_MAX_LEVEL);
 		cur = path->nodes[*level];

-		if (btrfs_header_level(cur) != *level)
+		if (! cur || btrfs_header_level(cur) != *level)
 			WARN_ON(1);

 		if (path->slots[*level] >= btrfs_header_nritems(cur))

I tried to skip this error with the code below. The errors reported next are
also listed below.


--- btrfsck.c   2012-07-10 10:23:24.781622144 +0200
+++ btrfsck.c   2012-07-10 12:36:51.995996771 +0200
@@ -1173,8 +1173,13 @@
 		WARN_ON(*level >= BTRFS_MAX_LEVEL);
 		cur = path->nodes[*level];

-		if (btrfs_header_level(cur) != *level)
-			WARN_ON(1);
+		if (cur != 0 ) {
+			if ( btrfs_header_level(cur) != *level)
+				WARN_ON(1);
+		}else {
+			fprintf(stderr, "CVCV path->nodes[*level] is 0!\n");
+			break;
+		}

 		if (path->slots[*level] >= btrfs_header_nritems(cur))
 			break;
@@ -1213,7 +1218,11 @@
 		path->slots[*level] = 0;
 	}
 out:
+	if ( path->nodes[*level] != 0 ){
 	path->slots[*level] = btrfs_header_nritems(path->nodes[*level]);
+	} else {
+		path->slots[*level] = 0;
+	}
 	return 0;
 }

Next errors I get are:


checking fs roots
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
Csum didn't match
CVCV path->nodes[*level] is 0!
root 5 inode 265 errors 2000
unresolved ref dir 2658782 index 3 namelen 12 name aquota.group filetype 0 error 3
unresolved ref dir 2914579 index 3 namelen 12 name aquota.group filetype 0 error 3
root 5 inode 266 errors 2000
unresolved ref dir 2658782 index 4 namelen 11 name aquota.user filetype 0 error 3
unresolved ref dir 2914579 index 4 namelen 11 name aquota.user filetype 0 error 3
root 5 inode 285 errors 2000
unresolved ref dir 2658783 index 3 namelen 3 name awk filetype 0 error 3
unresolved ref dir 2914580 index 3 namelen 3 name awk filetype 0 error 3
root 5 inode 286 errors 2000
unresolved ref dir 2658783 index 16 namelen 3 name csh filetype 0 error 3
unresolved ref dir 2914580 index 16 namelen 3 name csh filetype 0 error 3
root 5 inode 287 errors 2000
unresolved ref dir 2658783 index 27 namelen 13 name dnsdomainname filetype 0 error 3
unresolved ref dir 2914580 index 27 namelen 13 name dnsdomainname filetype 0 error 3
root 5 inode 288 errors 2000
unresolved ref dir 2658783 index 28 namelen 10 name domainname filetype 0 error 3
unresolved ref dir 2914580 index 28 namelen 10 name domainname filetype 0 error 3
root 5 inode 289 errors 2000
unresolved ref dir 2658783 index 34 namelen 2 name ex filetype 0 error 3
unresolved ref dir 2914580 index 34 namelen 2 name ex filetype 0 error 3
root 5 inode 290 errors 2000
unresolved ref dir 2658783 index 48 namelen 2 name ip filetype 0 error 3
unresolved ref dir 2914580 index 48 namelen 2 name ip filetype 0 error 3
root 5 inode 291 errors 2000
unresolved ref dir 2658783 index 54 namelen 3 name ksh filetype 0 error 3
unresolved ref dir 2914580 index 54 namelen 3 name ksh filetype 0 error 3
root 5 inode 292 errors 2000
unresolved ref dir 2658783 index 63 namelen 4 name mail filetype 0 error 3
unresolved ref dir 2914580 index 63 namelen 4 name mail filetype 0 error 3
...


Re: btrfsck crashes

2012-07-10 Thread haveanice...@cv-sv.de



On 10 July 2012 at 08:30, Anand Jain wrote:

>
> Christian,
>
>   line # is still confusing to me as well. patch was to avoid seg
>   fault when csum_root node is null and it might not be the case
>   here then.
>
>   (If the original problem stack-trace has remained the same
>   which is as below)..

Hi Anand,

I have used git clone
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git

The stack now looks like the one below. I suspect the first 0x0 entry in
path->nodes causes the problem.
(gdb) p path->nodes
$2 = {0x0, 0x27ad82e0, 0x3893a930, 0x3dc24f0, 0x75efa0, 0x0, 0x0, 0x0}



speedy:/tmp/btrfs/btrfs-progs # gdb ./btrfsck
GNU gdb (GDB) SUSE (7.3-41.1.2)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
For bug reporting instructions, please see:
...
Reading symbols from /tmp/btrfs/btrfs-progs/btrfsck...done.
(gdb) r /dev/md3
Starting program: /tmp/btrfs/btrfs-progs/btrfsck /dev/md3
Missing separate debuginfo for /lib64/ld-linux-x86-64.so.2
Try: zypper install -C
"debuginfo(build-id)=f20c99249f5a5776e1377d3bd728502e3f455a3f"
Missing separate debuginfo for /lib64/libuuid.so.1
Try: zypper install -C
"debuginfo(build-id)=24ae727f9cd5fb29f81b0f965859d3cf4668bf17"
Missing separate debuginfo for /lib64/libc.so.6
Try: zypper install -C
"debuginfo(build-id)=7b169b1db50384b70e3e4b4884cd56432d5de796"
checking extents
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
Csum didn't match
owner ref check failed [2327654400 4096]
ref mismatch on [101138354176 98304] extent item 1, found 0
Incorrect local backref count on 101138354176 root 5 owner 1867898 offset 0 found 0 wanted 1 back 0x19e025d0
backpointer mismatch on [101138354176 98304]
owner ref check failed [101138354176 98304]
ref mismatch on [101138452480 106496] extent item 1, found 0
Incorrect local backref count on 101138452480 root 5 owner 1867899 offset 0 found 0 wanted 1 back 0x19e02610
backpointer mismatch on [101138452480 106496]
owner ref check failed [101138452480 106496]
ref mismatch on [101138558976 8192] extent item 1, found 0
Incorrect local backref count on 101138558976 root 5 owner 1867901 offset 0 found 0 wanted 1 back 0x5a60f90
backpointer mismatch on [101138558976 8192]
owner ref check failed [101138558976 8192]
ref mismatch on [101138567168 16384] extent item 1, found 0
Incorrect local backref count on 101138567168 root 5 owner 1867902 offset 0 found 0 wanted 1 back 0x5a60fd0
backpointer mismatch on [101138567168 16384]
owner ref check failed [101138567168 16384]
ref mismatch on [101138583552 16384] extent item 1, found 0
Incorrect local backref count on 101138583552 root 5 owner 1867903 offset 0 found 0 wanted 1 back 0x19e04420
backpointer mismatch on [101138583552 16384]
owner ref check failed [101138583552 16384]
Errors found in extent allocation tree
checking fs roots
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
checksum verify failed on 2327654400 wanted 73CDE79C found 72
Csum didn't match

Program received signal SIGSEGV, Segmentation fault.
0x00402264 in btrfs_header_level (eb=0x0) at ctree.h:1540
1540	BTRFS_SETGET_HEADER_FUNCS(header_level, struct btrfs_header, level, 8);
(gdb) bt
#0  0x00402264 in btrfs_header_level (eb=0x0) at ctree.h:1540
#1  0x0040531f in walk_down_tree (root=0x6f6c540, path=0x7fffde70, wc=0x7fffdf40, level=0x7fffdf00) at btrfsck.c:1176
#2  0x00406bf6 in check_fs_root (root=0x6f6c540, root_cache=0x7fffe0c0, wc=0x7fffdf40) at btrfsck.c:1702
#3  0x00406ecb in check_fs_roots (root=0x75ee00, root_cache=0x7fffe0c0) at btrfsck.c:1773
#4  0x0040b49c in main (ac=1, av=0x7fffe1f8) at btrfsck.c:3576
(gdb) l
1535	BTRFS_SETGET_HEADER_FUNCS(header_generation, struct btrfs_header,
1536			  generation, 64);
1537	BTRFS_SETGET_HEADER_FUNCS(header_owner, struct btrfs_header, owner, 64);
1538	BTRFS_SETGET_HEADER_FUNCS(header_nritems, struct btrfs_header, nritems, 32);
1539	BTRFS_SETGET_HEADER_FUNCS(header_flags, struct btrfs_header, flags, 64);
1540	BTRFS_SETGET_HEADER_FUNCS(header_level, struct btrfs_header, level, 8);
1541
1542	static inline int btrfs_header_flag(struct extent_buffer *eb, u64 flag)
1543	{
1544		return (btrfs_header_flags(eb) & flag) == flag;
(gdb) up
#1  0x0040531f in walk_down_tr

Re: 3.5.0-rc6: btrfs and LVM snapshots -> wrong devicename in /proc/mounts

2012-07-10 Thread Arnd Hannemann
On 10.07.2012 00:49, cwillu wrote:
> On Mon, Jul 9, 2012 at 4:22 PM, Arnd Hannemann  wrote:
>> Hi,
>>
>> using btrfs with LVM snapshots seems to be confusing /proc/mounts
>> After mounting a snapshot of an original filesystem, the devicename of the
>> original filesystem is overwritten with that of the snapshot in /proc/mounts.
> 
> If the lvm snapshot is visible to btrfs (i.e., btrfs dev scan), it
> will appear as another device which belongs to the original filesystem
> with a duplicate devid.  This might result in bad things happening, or
> possibly just hilarity.

You are right, the same bug seems to get triggered by "btrfs dev scan":

arnd@kallisto:/mnt$ sudo grep /mnt /proc/mounts
/dev/mapper/vg0-original /mnt/original btrfs rw,relatime,ssd,space_cache 0 0
arnd@kallisto:/mnt$ sudo btrfs dev scan
Scanning for Btrfs filesystems
failed to read /dev/sr0
ERROR: unable to scan the device '/dev/dm-4' - Device or resource busy
ERROR: unable to scan the device '/dev/dm-17' - Device or resource busy

arnd@kallisto:/mnt$ sudo grep /mnt /proc/mounts
/dev/dm-16 /mnt/original btrfs rw,relatime,ssd,space_cache 0 0

arnd@kallisto:/mnt$ sudo dmsetup info /dev/dm-16
Name:              vg0-testsnap
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        0
Event number:      0
Major, minor:      253, 16
Number of targets: 1
UUID: LVM-pUa0TTDg9Y1dII6a6WwcUanE0ai4AVXqpS7sNnWEGZOnww76lrMaZzIEB38rug9


> You have a backup that isn't just an lvm snapshot, right?

Now I will.

Best regards
Arnd






Re: 3.5.0-rc6: btrfs and LVM snapshots -> wrong devicename in /proc/mounts

2012-07-10 Thread Arnd Hannemann
Hi,

On 10.07.2012 05:30, Christian Robert wrote:
> I agree with you, but you should never mount a snapshot of a btrfs filesystem
> at the same time as the original, because both the original and the snapshot
> have the same "device fsid 5c3e8ca2-da56-4ade-9fef-103a6a8a70c2".
> 
> The kernel will think twice and fold back to the same device.

If that is correct, the bug is that the kernel lets me mount the same device
fsid on different devices twice.

> 
> btrfs does not behave like other filesystems; you can't snapshot a btrfs
> filesystem and hope to mount the snapshot somewhere else.

> A snapshot also duplicates lots of things internally that make no sense in a
> snapshot (like raid level, single or multiple devices ...)

I see. However, I expect a "simple" btrfs to just work or fail gracefully.

Best regards,
Arnd
