Re: how to understand "btrfs fi show" output? "No space left" issues

2016-11-14 Thread Johannes Hirte

On 2016 Sep 20, Peter Becker wrote:
> Data, RAID1: total=417.12GiB, used=131.33GiB
> 
> You have 417 (total) - 131 (used) GiB in blocks that are only partially
> filled. You should balance your file system.
> 
> First you need some free space. You could remove some files, old
> snapshots etc., or add an empty USB stick of at least 4 GB to your
> BTRFS pool (once the balance completes you can remove the stick from
> the pool).

He has plenty of space. What you're describing is the case where either
the data pool or the metadata pool is full, the other one has spare
space, and no unallocated space is left to extend the full pool. In that
case rebalancing would help. But in Tomasz's case there is enough space
in every pool, so the allocator should use it. This really sounds like a
bug.
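
To make the distinction between allocated and used space concrete, here is a minimal sketch in C. It is not btrfs code: the struct and function names are hypothetical, and the only real inputs are the total= and used= figures that btrfs fi df prints for one pool (here the Data, RAID1 line quoted above).

```c
#include <assert.h>

/* Hypothetical holder for one "btrfs fi df" line; not a btrfs structure. */
struct pool_usage {
    double total_gib; /* space allocated to chunks of this type */
    double used_gib;  /* space actually occupied inside those chunks */
};

/* Space that is allocated to the pool but not yet occupied. */
double pool_slack_gib(struct pool_usage p)
{
    assert(p.used_gib <= p.total_gib); /* sanity check for this sketch */
    return p.total_gib - p.used_gib;
}

/* Fraction of the allocated space that is in use, between 0 and 1. */
double pool_fill_ratio(struct pool_usage p)
{
    return p.total_gib > 0.0 ? p.used_gib / p.total_gib : 0.0;
}
```

With total=417.12 and used=131.33, pool_slack_gib() gives about 285.79 GiB of already allocated but unused data space, and pool_fill_ratio() gives roughly 0.31. A pool in that state is far from full, which is why a balance should not be needed just to write more data.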

> But first you should try to free empty data and metadata blocks:
> 
> btrfs balance start -musage=0 /mnt
> btrfs balance start -dusage=0 /mnt

Since kernel 3.18 this is done automatically.

regards,
  Johannes
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: "Btrfs: device_list_add() should not update list when mounted" breaks subvol mount

2014-09-15 Thread Johannes Hirte

On Tue, 16 Sep 2014 01:39:49 +0800
Anand Jain  wrote:

> 
> 
> On 16/09/2014 01:14, Johannes Hirte wrote:
> > On Mon, 15 Sep 2014 20:32:58 +0800
> > Anand Jain  wrote:
> >
> >>
> >>
> >> Hi Johannes,
> >>
> >>   Could you test this? Thanks.
> >>
> >>
> >> ---
> >> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> >> index e9676a4..1224b61 100644
> >> --- a/fs/btrfs/volumes.c
> >> +++ b/fs/btrfs/volumes.c
> >> @@ -533,7 +533,7 @@ static noinline int device_list_add(const char *path,
> >>                   * the btrfs dev scan cli, after FS has been mounted.
> >>                   */
> >>                  if (fs_devices->opened) {
> >> -                        return -EBUSY;
> >> +                        goto out;
> >>                  } else {
> >>                          /*
> >>                           * That is if the FS is _not_ mounted and if you
> >> @@ -566,6 +566,7 @@ static noinline int device_list_add(const char *path,
> >>          if (!fs_devices->opened)
> >>                  device->generation = found_transid;
> >>
> >> +out:
> >>          *fs_devices_ret = fs_devices;
> >>
> >>          return ret;
> >> ---
> >
> > With this change, it works again without initramfs.
> 
>   Thanks. I am a bit confused: is there any configuration that is
>   still not working, even with the above changes?
> 
> Anand

At the moment, I'm not aware of any. But I will test the remaining
systems I have.

regards,
  Johannes

Re: "Btrfs: device_list_add() should not update list when mounted" breaks subvol mount

2014-09-15 Thread Johannes Hirte

On Mon, 15 Sep 2014 20:32:58 +0800
Anand Jain  wrote:

> 
> 
> Hi Johannes,
> 
>   Could you test this? Thanks.
> 
> 
> ---
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index e9676a4..1224b61 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -533,7 +533,7 @@ static noinline int device_list_add(const char *path,
>                   * the btrfs dev scan cli, after FS has been mounted.
>                   */
>                  if (fs_devices->opened) {
> -                        return -EBUSY;
> +                        goto out;
>                  } else {
>                          /*
>                           * That is if the FS is _not_ mounted and if you
> @@ -566,6 +566,7 @@ static noinline int device_list_add(const char *path,
>          if (!fs_devices->opened)
>                  device->generation = found_transid;
>
> +out:
>          *fs_devices_ret = fs_devices;
>
>          return ret;
> ---

With this change, it works again without initramfs.
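
For readers following the diff, the visible difference between the two variants can be modelled in a few lines of userspace C. This is only a toy sketch: struct fs_devices_model and both function names are made up, the bookkeeping between the two hunks is left out, and the value of ret at the out: label is an assumption because the quoted excerpt does not show it. The sketch only illustrates that the early return skips the assignment to the output pointer while the goto reaches it; it does not claim to explain the mount failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for struct btrfs_fs_devices; only the field the hunk tests. */
struct fs_devices_model {
    int opened; /* non-zero once the filesystem is mounted */
};

/* Before the test patch: a mounted fs returns early, *ret_out is never written. */
int list_add_early_return(struct fs_devices_model *fsd,
                          struct fs_devices_model **ret_out)
{
    assert(fsd != NULL && ret_out != NULL);
    if (fsd->opened)
        return -EBUSY;
    *ret_out = fsd;
    return 0;
}

/* With the test patch: the mounted case jumps to the common exit path. */
int list_add_goto_out(struct fs_devices_model *fsd,
                      struct fs_devices_model **ret_out)
{
    int ret = 0; /* assumption: the real value of ret is not shown in the excerpt */

    assert(fsd != NULL && ret_out != NULL);
    if (fsd->opened)
        goto out;
    /* bookkeeping for the unmounted case would go here */
out:
    *ret_out = fsd;
    return ret;
}
```

In this model, a caller that passes a mounted fs_devices gets -EBUSY and an untouched output pointer from the first variant, and a valid pointer from the second.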

regards,
  Johannes

Re: "Btrfs: device_list_add() should not update list when mounted" breaks subvol mount

2014-09-15 Thread Johannes Hirte

On Sun, 14 Sep 2014 00:45:49 +0000 (UTC)
Duncan <1i5t5.dun...@cox.net> wrote:

> Johannes Hirte posted on Sat, 13 Sep 2014 23:23:20 +0200 as excerpted:
> 
> > On Sat, 13 Sep 2014 19:55:25 +0200 Johannes Hirte
> >  wrote:
> > 
> >> On Sat, 13 Sep 2014 13:36:37 +0800 Anand Jain
> >>  wrote:
> >> 
> >>> The quickest workaround for you will be to make the device path
> >>> in your fstab/mnttab entry match the one shown in the
> >>> btrfs fi show -m output.
> >> 
> >> Doesn't work here. I don't even get a path with the affected
> >> kernels. I'll get:
> >> 
> >> Label: none  uuid: 02edbd6b-f044-4800-b21e-ca8982c2c2e5
> >> Total devices 1 FS bytes used 270.10GiB
> >> *** Some devices missing
> >> 
> >> Btrfs v3.16
> >> 
> >> with a working kernel:
> >> 
> >> Label: none  uuid: 02edbd6b-f044-4800-b21e-ca8982c2c2e5
> >> Total devices 1 FS bytes used 270.10GiB
> >> devid  1 size 293.89GiB used 289.06GiB path /dev/sda1
> >> 
> >> Btrfs v3.16
> 
> >> And now I was able to reproduce on a second machine. The main
> >> difference between the affected and the unaffected systems is
> >> initramfs. On the affected systems, I don't use one. On the working
> >> systems, the rootfs is mounted via initramfs before. I'll test, if
> >> an initramfs will solve the issue. Seems likely, cause if I put
> >> the disk of an affected system into a working system and mount it
> >> there, everything works.
> > 
> > Of course, with the initramfs it works.
> 
> I see a btrfs device scan in the initramfs script.  What happens if
> you simply run btrfs device scan manually, before doing btrfs
> filesystem show?
> 
> I'm guessing that'll fix it.

Not tested, but I doubt it will fix it. In this initramfs the scan is a
leftover from the time when the system was multi-device. On two other
systems, the initramfs works without the scan. Additionally, I can take
the affected HDD out of the laptop and attach it to another system that
is affected without initramfs, and mount it there without any prior
scan. But I'm not 100% sure whether udev takes care of the scanning
there.

regards,
  Johannes

Re: "Btrfs: device_list_add() should not update list when mounted" breaks subvol mount

2014-09-13 Thread Johannes Hirte

On Sat, 13 Sep 2014 19:55:25 +0200
Johannes Hirte  wrote:

> On Sat, 13 Sep 2014 13:36:37 +0800
> Anand Jain  wrote:
> 
> > Xavier, Johannes,
> > 
> >  The quickest workaround for you will be to make the device path
> >   in your fstab/mnttab entry match the one shown in the
> >   btrfs fi show -m output.
> 
> Doesn't work here. I don't even get a path with the affected kernels.
> I'll get:
> 
> Label: none  uuid: 02edbd6b-f044-4800-b21e-ca8982c2c2e5
> Total devices 1 FS bytes used 270.10GiB
> *** Some devices missing
> 
> Btrfs v3.16
> 
> with a working kernel:
> 
> Label: none  uuid: 02edbd6b-f044-4800-b21e-ca8982c2c2e5
> Total devices 1 FS bytes used 270.10GiB
> devid    1 size 293.89GiB used 289.06GiB path /dev/sda1
> 
> Btrfs v3.16
> 
> Filesystem layout is:
> 
> subvolid 0 contains only the different subvolumes
> 
> ID 257 gen 414674 top level 5 path rootfs
> ID 269 gen 414615 top level 5 path home-USER1
> ID 317 gen 411498 top level 5 path home-USER2
> ID 363 gen 410939 top level 5 path home-USER3
> ID 382 gen 315844 top level 5 path home-USER4
> ID 933 gen 410514 top level 5 path home-USER5
> ID 995 gen 315756 top level 5 path homefs-USER6
> 
> subvol rootfs (ID 257) is set to the default subvolume, mounted at
> start. Grub commandline is like following:
> 
> root=/dev/sda1 ro rootflags=subvol=rootfs,inode_cache,autodefrag
> 
> It doesn't matter whether the subvol parameter is set. I've tried with
> it, without it, and with subvolid=0. Every time the same result.
> 
> 
> And now I was able to reproduce on a second machine. The main
> difference between the affected and the unaffected systems is
> initramfs. On the affected systems, I don't use one. On the working
> systems, the rootfs is mounted via initramfs before. I'll test, if an
> initramfs will solve the issue. Seems likely, cause if I put the disk
> of an affected system into a working system and mount it there,
> everything works.

Of course, with the initramfs it works. Content of the init-script:

#!/bin/sh

mount -t devtmpfs devtmpfs /dev
mount -t proc proc /proc
mount -t sysfs sysfs /sys
mount -t tmpfs tmpfs /run
sleep 3  # wait for kernel msgs to quiet

echo "loading initrd"

btrfs dev scan
sleep 5

mount -o ro,subvol=rootfs,inode_cache,autodefrag /dev/sda1 /newroot

if [[ -x /newroot/sbin/init ]]; then
    umount /sys /proc
    exec switch_root /newroot /sbin/init
fi

#rescue shell
exec sh

Re: "Btrfs: device_list_add() should not update list when mounted" breaks subvol mount

2014-09-13 Thread Johannes Hirte

On Sat, 13 Sep 2014 13:36:37 +0800
Anand Jain  wrote:

> Xavier, Johannes,
> 
>  The quickest workaround for you will be to make the device path
>   in your fstab/mnttab entry match the one shown in the
>   btrfs fi show -m output.

Doesn't work here. I don't even get a path with the affected kernels.
I'll get:

Label: none  uuid: 02edbd6b-f044-4800-b21e-ca8982c2c2e5
Total devices 1 FS bytes used 270.10GiB
*** Some devices missing

Btrfs v3.16

with a working kernel:

Label: none  uuid: 02edbd6b-f044-4800-b21e-ca8982c2c2e5
Total devices 1 FS bytes used 270.10GiB
devid    1 size 293.89GiB used 289.06GiB path /dev/sda1

Btrfs v3.16

Filesystem layout is:

subvolid 0 contains only the different subvolumes

ID 257 gen 414674 top level 5 path rootfs
ID 269 gen 414615 top level 5 path home-USER1
ID 317 gen 411498 top level 5 path home-USER2
ID 363 gen 410939 top level 5 path home-USER3
ID 382 gen 315844 top level 5 path home-USER4
ID 933 gen 410514 top level 5 path home-USER5
ID 995 gen 315756 top level 5 path homefs-USER6

subvol rootfs (ID 257) is set to the default subvolume, mounted at
start. Grub commandline is like following:

root=/dev/sda1 ro rootflags=subvol=rootfs,inode_cache,autodefrag

It doesn't matter whether the subvol parameter is set. I've tried with
it, without it, and with subvolid=0. Every time the same result.

And now I was able to reproduce on a second machine. The main
difference between the affected and the unaffected systems is
initramfs. On the affected systems, I don't use one. On the working
systems, the rootfs is mounted via initramfs before. I'll test, if an
initramfs will solve the issue. Seems likely, cause if I put the disk
of an affected system into a working system and mount it there,
everything works.

regards,
  Johannes

"Btrfs: device_list_add() should not update list when mounted" breaks subvol mount

2014-09-10 Thread Johannes Hirte

commit b96de000bc8bc9688b3a2abea4332bd57648a49f breaks subvolume mount
on one of my systems. I've bisected a mount problem to this commit.
Situation is:

- one hdd with btrfs
- default subvolume (rootfs) is different from subvolid=0
- at boot, several subvols are mounted at /home/$DIR

after commit b96de000bc8bc9688b3a2abea4332bd57648a49f this is not
possible anymore. Trying to mount results in (example):

mount: /dev/sda1 is already mounted or /home/video busy

The output of btrfs fi show is curious too:

btrfs fi show
Label: none  uuid: 43438ef5-adac-46a9-823e-14951ee6866a
Total devices 1 FS bytes used 150.05GiB
*** Some devices missing

Btrfs v3.16

As this is a laptop with only one drive bay, this was never a multi
device setup. I've two more systems with kernel version >3.17-rc3
running and no problem like this.

regards,
  Johannes

Re: BTRFS setup advice for laptop performance ?

2014-04-07 Thread Johannes Hirte

On Fri, 04 Apr 2014 08:33:10 -0400
Austin S Hemmelgarn  wrote:

> On 2014-04-04 04:02, Swâmi Petaramesh wrote:
> > - Is it still recommended to mkfs with a nodesize or leafsize
> > different (bigger) than the default ? I wouldn't like to lose too
> > much disk space anyway (1/2 nodesize per file on average ?), as it
> > will be limited...
> This depends on many things, the average size of the files on the disk
> is the biggest factor.  In general you should get the best disk
> utilization by setting nodesize so that a majority of the files are
> less than the leafsize minus 256 bytes, and all but a few are smaller
> than two times the leafsize minus 256 bytes.  However, if you want to
> really benefit from the data compression, you should just use the
> smallest leaf/nodesize for your system (which is what mkfs defaults
> to), as BTRFS stores files whose size is at least (roughly) 256 bytes
> less than the leafsize inline with the metadata, and doesn't compress
> such files.

With commit c652e4efb8e2dd76ef1627d8cd649c6af5905902 the default
node-/leafsize has changed:

commit c652e4efb8e2dd76ef1627d8cd649c6af5905902
Author: Chris Mason 
Date:   Fri Nov 8 13:51:52 2013 -0500

mkfs: change default metadata blocksize to 16KB


Re: [PATCH] Btrfs: throttle delayed refs better

2014-02-15 Thread Johannes Hirte

On Fri, 14 Feb 2014 14:29:35 -0500
Josef Bacik  wrote:

> On 02/14/2014 02:25 PM, Johannes Hirte wrote:
> > On Thu, 6 Feb 2014 16:19:46 -0500 Josef Bacik 
> > wrote:
> > 
> >> Ok so I thought I reproduced the problem but I just reproduced a 
> >> different problem.  Please undo any changes you've made and
> >> apply this patch and reproduce and then provide me with any debug
> >> output that gets spit out.  I'm sending this via thunderbird with
> >> 6 different extensions to make sure it comes out right so if it
> >> doesn't work let me know and I'll just paste it somewhere.
> >> Thanks,
> > 
> > Sorry for the long delay. Was too busy last week.
> > 
> 
> Ok perfect this is fixed by
> 
> [PATCH] Btrfs: don't loop forever if we can't run because of the tree
> mod log
> 
> and it went into -rc2 iirc, so give that a whirl and make sure it
> fixes your problem.  Thanks,

Yes, seems to be fixed now. I wasn't able to reproduce it anymore.

regards,
  Johannes

Re: [PATCH] Btrfs: throttle delayed refs better

2014-02-14 Thread Johannes Hirte

On Thu, 6 Feb 2014 16:19:46 -0500
Josef Bacik  wrote:

> Ok so I thought I reproduced the problem but I just reproduced a
> different problem.  Please undo any changes you've made and apply
> this patch and reproduce and then provide me with any debug output
> that gets spit out.  I'm sending this via thunderbird with 6
> different extensions to make sure it comes out right so if it doesn't
> work let me know and I'll just paste it somewhere.  Thanks,

Sorry for the long delay. Was too busy last week.

Here is the output:

[   25.240971] looped a lot, count 14, nr 32, no_selected_ref 99986
[   25.267639] looped a lot, count 14, nr 32, no_selected_ref 199987
[   25.294308] looped a lot, count 14, nr 32, no_selected_ref 299988
[   25.320605] looped a lot, count 14, nr 32, no_selected_ref 399989
[   25.346639] looped a lot, count 14, nr 32, no_selected_ref 499990
[   25.372517] looped a lot, count 14, nr 32, no_selected_ref 599991
[   25.398924] looped a lot, count 14, nr 32, no_selected_ref 699992
[   25.425443] looped a lot, count 14, nr 32, no_selected_ref 799993
[   25.451344] looped a lot, count 14, nr 32, no_selected_ref 899994
[   25.477350] looped a lot, count 14, nr 32, no_selected_ref 999995
[   25.503069] looped a lot, count 14, nr 32, no_selected_ref 1099996
[   25.529372] looped a lot, count 14, nr 32, no_selected_ref 1199997
[   25.555549] looped a lot, count 14, nr 32, no_selected_ref 1299998
[   25.581418] looped a lot, count 14, nr 32, no_selected_ref 1399999
[   25.607514] looped a lot, count 14, nr 32, no_selected_ref 1500000
[   25.633794] looped a lot, count 14, nr 32, no_selected_ref 1600001
[   25.659699] looped a lot, count 14, nr 32, no_selected_ref 1700002
[   25.686095] looped a lot, count 14, nr 32, no_selected_ref 1800003
[   25.711906] looped a lot, count 14, nr 32, no_selected_ref 1900004
[   25.752255] looped a lot, count 14, nr 32, no_selected_ref 2000005
[   25.788077] looped a lot, count 0, nr 32, no_selected_ref 10
[   25.811966] looped a lot, count 14, nr 32, no_selected_ref 2100006
[  360.749227] looped a lot, count 8, nr 32, no_selected_ref 99992
[  360.770434] looped a lot, count 8, nr 32, no_selected_ref 199993
[  360.792136] looped a lot, count 8, nr 32, no_selected_ref 299994
[  360.813571] looped a lot, count 8, nr 32, no_selected_ref 399995
[  360.834932] looped a lot, count 8, nr 32, no_selected_ref 499996
[  360.856085] looped a lot, count 8, nr 32, no_selected_ref 599997
[  360.877374] looped a lot, count 8, nr 32, no_selected_ref 699998
[  360.899455] looped a lot, count 8, nr 32, no_selected_ref 799999
[  360.921175] looped a lot, count 8, nr 32, no_selected_ref 900000
[  360.942409] looped a lot, count 8, nr 32, no_selected_ref 1000001
[  360.963800] looped a lot, count 8, nr 32, no_selected_ref 1100002
[  360.985397] looped a lot, count 8, nr 32, no_selected_ref 1200003
[  361.007148] looped a lot, count 8, nr 32, no_selected_ref 1300004
[  361.028789] looped a lot, count 8, nr 32, no_selected_ref 1400005
[  361.050564] looped a lot, count 8, nr 32, no_selected_ref 1500006
[  361.072008] looped a lot, count 8, nr 32, no_selected_ref 1600007
[  361.093269] looped a lot, count 8, nr 32, no_selected_ref 1700008
[  361.114645] looped a lot, count 8, nr 32, no_selected_ref 1800009
[  361.136099] looped a lot, count 8, nr 32, no_selected_ref 1900010
[  361.157566] looped a lot, count 8, nr 32, no_selected_ref 2000011
[  361.178969] looped a lot, count 8, nr 32, no_selected_ref 2100012
[  361.200397] looped a lot, count 8, nr 32, no_selected_ref 2200013
[  361.221980] looped a lot, count 8, nr 32, no_selected_ref 2300014
[  361.243435] looped a lot, count 8, nr 32, no_selected_ref 2400015
[  361.264777] looped a lot, count 8, nr 32, no_selected_ref 2500016
[  361.286518] looped a lot, count 8, nr 32, no_selected_ref 2600017
[  361.308240] looped a lot, count 8, nr 32, no_selected_ref 2700018
[  361.329850] looped a lot, count 8, nr 32, no_selected_ref 2800019
[  361.351420] looped a lot, count 8, nr 32, no_selected_ref 2900020
[  361.372633] looped a lot, count 8, nr 32, no_selected_ref 3000021
[  361.394330] looped a lot, count 8, nr 32, no_selected_ref 3100022
[  361.416039] looped a lot, count 8, nr 32, no_selected_ref 3200023
[  361.437659] looped a lot, count 8, nr 32, no_selected_ref 3300024
[  361.459181] looped a lot, count 8, nr 32, no_selected_ref 3400025
[  361.481058] looped a lot, count 8, nr 32, no_selected_ref 3500026
[  361.502441] looped a lot, count 8, nr 32, no_selected_ref 3600027
[  361.523964] looped a lot, count 8, nr 32, no_selected_ref 3700028
[  361.545387] looped a lot, count 8, nr 32, no_selected_ref 3800029
[  361.566717] looped a lot, count 8, nr 32, no_selected_ref 3900030
[  361.588079] looped a lot, count 8, nr 32, no_selected_ref 4000031
[  361.609673] looped a lot, count 8, nr 32, no_selected_ref 4100032
[  361.631028] looped a lot, count 8, nr 32, no_selected_ref 4200033
[  361.652498] looped a lot, count 8, nr 32, no_selected

Re: [PATCH] Btrfs: throttle delayed refs better

2014-02-05 Thread Johannes Hirte

On Wed, 5 Feb 2014 16:46:57 -0500
Josef Bacik  wrote:

> 
> On 02/05/2014 04:42 PM, Johannes Hirte wrote:
> > On Wed, 5 Feb 2014 14:36:39 -0500
> > Josef Bacik  wrote:
> >
> >> On 02/05/2014 02:30 PM, Johannes Hirte wrote:
> >>> On Wed, 5 Feb 2014 14:00:57 -0500
> >>> Josef Bacik  wrote:
> >>>
> >>>> On 02/05/2014 12:34 PM, Johannes Hirte wrote:
> >>>>> On Wed, 5 Feb 2014 10:49:15 -0500
> >>>>> Josef Bacik  wrote:
> >>>>>
> >>>>>> Ok none of those make sense which makes me think it may be the
> >>>>>> ktime bits, instead of un-applying the whole patch could you
> >>>>>> just comment out the parts
> >>>>>>
> >>>>>>         ktime_t start = ktime_get();
> >>>>>>
> >>>>>> and
> >>>>>>
> >>>>>>         if (actual_count > 0) {
> >>>>>>                 u64 runtime = ktime_to_ns(ktime_sub(ktime_get(), start));
> >>>>>>                 u64 avg;
> >>>>>>
> >>>>>>                 /*
> >>>>>>                  * We weigh the current average higher than our current runtime
> >>>>>>                  * to avoid large swings in the average.
> >>>>>>                  */
> >>>>>>                 spin_lock(&delayed_refs->lock);
> >>>>>>                 avg = fs_info->avg_delayed_ref_runtime * 3 + runtime;
> >>>>>>                 avg = div64_u64(avg, 4);
> >>>>>>                 fs_info->avg_delayed_ref_runtime = avg;
> >>>>>>                 spin_unlock(&delayed_refs->lock);
> >>>>>>         }
> >>>>>>
> >>>>>> in __btrfs_run_delayed_refs and see if that makes the problem
> >>>>>> stop? If it does will you try chris's for-linus branch to see
> >>>>>> if it still reproduces there?  Maybe some patch changed
> >>>>>> ktime_get() in -rc1 that is causing issues and we're just now
> >>>>>> exposing it. Thanks,
> >>>>> With the ktime bits disabled, I wasn't able to reproduce the
> >>>>> problem anymore. With Chris' for-linus branch it took longer but
> >>>>> still appeared.
> >>>>>
> >>>> Ok can you send your .config, maybe there's some weird time bug
> >>>> being exposed.  What kind of CPU do you have?  Thanks,
> >>>>
> >>>> Josef
> >>> It's a Core i5-540M, dualcore + hyperthreading
> >> Ok while I'm doing this can you change
> >> btrfs_should_throttle_delayed_refs to _always_ return 1, still with
> >> all the ktime stuff commented out, and see if that causes the
> >> problem to happen?  Thanks,
> > Yes it does. Same behavior as without ktime stuff commented out.
> >
> Ok perfect, can you send me a btrfs fi df of that volume, and do you 
> have any snapshots or anything?  Thanks,

btrfs fi df /
Data, single: total=220.01GiB, used=210.85GiB
System, DUP: total=8.00MiB, used=32.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=4.00GiB, used=2.93GiB
Metadata, single: total=8.00MiB, used=0.00

No snapshots but several subvolumes. / itself is a separate subvolume
and subvol 0 only contains the other subvolumes (5 at the moment). qgroups
aren't enabled.

mount options are noatime,inode_cache, if this matters

regards,
  Johannes

Re: [PATCH] Btrfs: throttle delayed refs better

2014-02-05 Thread Johannes Hirte

On Wed, 5 Feb 2014 14:36:39 -0500
Josef Bacik  wrote:

> 
> On 02/05/2014 02:30 PM, Johannes Hirte wrote:
> > On Wed, 5 Feb 2014 14:00:57 -0500
> > Josef Bacik  wrote:
> >
> >> On 02/05/2014 12:34 PM, Johannes Hirte wrote:
> >>> On Wed, 5 Feb 2014 10:49:15 -0500
> >>> Josef Bacik  wrote:
> >>>
> >>>> Ok none of those make sense which makes me think it may be the
> >>>> ktime bits, instead of un-applying the whole patch could you just
> >>>> comment out the parts
> >>>>
> >>>>         ktime_t start = ktime_get();
> >>>>
> >>>> and
> >>>>
> >>>>         if (actual_count > 0) {
> >>>>                 u64 runtime = ktime_to_ns(ktime_sub(ktime_get(), start));
> >>>>                 u64 avg;
> >>>>
> >>>>                 /*
> >>>>                  * We weigh the current average higher than our current runtime
> >>>>                  * to avoid large swings in the average.
> >>>>                  */
> >>>>                 spin_lock(&delayed_refs->lock);
> >>>>                 avg = fs_info->avg_delayed_ref_runtime * 3 + runtime;
> >>>>                 avg = div64_u64(avg, 4);
> >>>>                 fs_info->avg_delayed_ref_runtime = avg;
> >>>>                 spin_unlock(&delayed_refs->lock);
> >>>>         }
> >>>>
> >>>> in __btrfs_run_delayed_refs and see if that makes the problem
> >>>> stop? If it does will you try chris's for-linus branch to see if
> >>>> it still reproduces there?  Maybe some patch changed ktime_get()
> >>>> in -rc1 that is causing issues and we're just now exposing it.
> >>>> Thanks,
> >>> With the ktime bits disabled, I wasn't able to reproduce the
> >>> problem anymore. With Chris' for-linus branch it took longer but
> >>> still appeared.
> >>>
> >> Ok can you send your .config, maybe there's some weird time bug
> >> being exposed.  What kind of CPU do you have?  Thanks,
> >>
> >> Josef
> > It's a Core i5-540M, dualcore + hyperthreading
> Ok while I'm doing this can you change 
> btrfs_should_throttle_delayed_refs to _always_ return 1, still with
> all the ktime stuff commented out, and see if that causes the problem
> to happen?  Thanks,

Yes it does. Same behavior as without ktime stuff commented out.


Re: [PATCH] Btrfs: throttle delayed refs better

2014-02-05 Thread Johannes Hirte

On Wed, 5 Feb 2014 10:49:15 -0500
Josef Bacik  wrote:

> Ok none of those make sense which makes me think it may be the ktime 
> bits, instead of un-applying the whole patch could you just comment
> out the parts
> 
>         ktime_t start = ktime_get();
>
> and
>
>         if (actual_count > 0) {
>                 u64 runtime = ktime_to_ns(ktime_sub(ktime_get(), start));
>                 u64 avg;
>
>                 /*
>                  * We weigh the current average higher than our current runtime
>                  * to avoid large swings in the average.
>                  */
>                 spin_lock(&delayed_refs->lock);
>                 avg = fs_info->avg_delayed_ref_runtime * 3 + runtime;
>                 avg = div64_u64(avg, 4);
>                 fs_info->avg_delayed_ref_runtime = avg;
>                 spin_unlock(&delayed_refs->lock);
>         }
> 
> in __btrfs_run_delayed_refs and see if that makes the problem stop?
> If it does will you try chris's for-linus branch to see if it still 
> reproduces there?  Maybe some patch changed ktime_get() in -rc1 that
> is causing issues and we're just now exposing it.  Thanks,

With the ktime bits disabled, I wasn't able to reproduce the
problem anymore. With Chris' for-linus branch it took longer but still
appeared.
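
For reference, the arithmetic in the quoted "ktime bits" is a 3:1 weighted running average. The following is a minimal userspace sketch of just that formula; the function name is made up, plain 64-bit division stands in for the kernel's div64_u64(), and the locking and the ktime calls are left out on purpose.

```c
#include <assert.h>
#include <stdint.h>

/* New average = (3 * old average + latest runtime) / 4, all in nanoseconds. */
uint64_t update_avg_runtime(uint64_t avg_ns, uint64_t runtime_ns)
{
    /* Guard for this sketch only: keep avg * 3 + runtime within 64 bits. */
    assert(avg_ns <= (UINT64_MAX - runtime_ns) / 3);
    return (avg_ns * 3 + runtime_ns) / 4;
}
```

Starting from an average of 1000 ns, a single 5000 ns sample moves the average to 2000 ns instead of jumping to 5000 ns, which is the damping the quoted comment describes.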

regards,
  Johannes

Re: [PATCH] Btrfs: throttle delayed refs better

2014-02-05 Thread Johannes Hirte

On Tue, 4 Feb 2014 09:12:54 -0500
Josef Bacik  wrote:

> Hrm I was hoping that was going to be more helpful.  Can you get perf 
> record -ag and then perf report while it's at full cpu and get the
> first 3 or 4 things with their traces?

Here it comes:

#
# captured on: Wed Feb  5 00:11:41 2014
#
no symbols found in /usr/sbin/acpid, maybe install a debug package?
unexpected end of event stream
# Samples: 168K of event 'cycles'
# Event count (approx.): 126847081763
#
# Overhead  Command          Shared Object      Symbol
# ........  ...............  .................  .................................
#
    18.48%  btrfs-freespace  [kernel.kallsyms]  [k] state_store
            |
            --- state_store

    10.25%  btrfs-freespace  [kernel.kallsyms]  [k] sys_sched_rr_get_interval
            |
            --- sys_sched_rr_get_interval

     9.02%  btrfs-freespace  [kernel.kallsyms]  [k] rt_mutex_slowunlock
            |
            --- rt_mutex_slowunlock

     8.76%  btrfs-freespace  [kernel.kallsyms]  [k] btrfs_submit_compressed_write
            |
            --- btrfs_submit_compressed_write

     6.63%  btrfs-freespace  [kernel.kallsyms]  [k] sched_show_task
            |
            --- sched_show_task

     5.19%  btrfs-freespace  [kernel.kallsyms]  [k] find_free_extent
            |
            --- find_free_extent

     5.15%  btrfs-freespace  [kernel.kallsyms]  [k] trace_print_graph_duration
            |
            --- trace_print_graph_duration

> I'm going to try and
> reproduce today, is there anything special about your fs?
> Compression, large blocksizes, skinny metadata?  Thanks,

Filesystem was created with -l 32768 -n 32768 and skinny metadata enabled.

regards,
  Johannes

Re: [PATCH] Btrfs: throttle delayed refs better

2014-02-03 Thread Johannes Hirte

On Mon, 3 Feb 2014 16:08:08 -0500
Josef Bacik  wrote:

> 
> On 02/03/2014 01:28 PM, Johannes Hirte wrote:
> > On Thu, 23 Jan 2014 13:07:52 -0500
> > Josef Bacik  wrote:
> >
> >> On one of our gluster clusters we noticed some pretty big lag
> >> spikes.  This turned out to be because our transaction commit was
> >> taking like 3 minutes to complete.  This is because we have like 30
> >> gigs of metadata, so our global reserve would end up being the max
> >> which is like 512 mb.  So our throttling code would allow a
> >> ridiculous amount of delayed refs to build up and then they'd all
> >> get run at transaction commit time, and for a cold mounted file
> >> system that could take up to 3 minutes to run.  So fix the
> >> throttling to be based on both the size of the global reserve and
> >> how long it takes us to run delayed refs. This patch tracks the
> >> time it takes to run delayed refs and then only allows 1 seconds
> >> worth of outstanding delayed refs at a time.  This way it will
> >> auto-tune itself from cold cache up to when everything is in
> >> memory and it no longer has to go to disk.  This makes our
> >> transaction commits take much less time to run. Thanks,
> >>
> >> Signed-off-by: Josef Bacik 
> > This one breaks my system. Shortly after boot the btrfs-freespace
> > thread goes up to 100% CPU usage and the system is nearly
> > unresponsive. I've seen it first with the full pull request for
> > 3.14-rc1 and was able to track it down to this patch.
> Could you turn on the softlockup timer and see if you can get a 
> backtrace of where it is stuck?  In the meantime I will go through
> and see if I can pinpoint where it may be happening.  Thanks,
> 
> Josef

This is what I've got with

CONFIG_LOCKUP_DETECTOR=y
CONFIG_HARDLOCKUP_DETECTOR=y
# CONFIG_BOOTPARAM_HARDLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=0
# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
CONFIG_DETECT_HUNG_TASK=y
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
# CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=0
# CONFIG_PANIC_ON_OOPS is not set
CONFIG_PANIC_ON_OOPS_VALUE=0
CONFIG_PANIC_TIMEOUT=0
CONFIG_SCHED_DEBUG=y
CONFIG_SCHEDSTATS=y
CONFIG_TIMER_STATS=y
CONFIG_DEBUG_PREEMPT=y

[  203.610758] perf samples too long (2513 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[  360.625822] INFO: task btrfs-endio-wri:1075 blocked for more than 120 
seconds.
[  360.625826]   Not tainted 3.14.0-rc1 #19
[  360.625828] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[  360.625829] btrfs-endio-wri D 880137c12d00 0  1075  2 0x
[  360.625833]  8800b6b10950 0002 00012d00 
8800b6b10950
[  360.625837]  8801325b3fd8 8800a2dcc000 8801325719e8 

[  360.625840]   880132571800 8800b635ba00 
81256192
[  360.625844] Call Trace:
[  360.625854]  [] ? wait_current_trans.isra.19+0xbb/0xdf
[  360.625858]  [] ? finish_wait+0x65/0x65
[  360.625860]  [] ? start_transaction+0x2f1/0x4e3
[  360.625864]  [] ? btrfs_finish_ordered_io+0x44c/0x7b2
[  360.625869]  [] ? try_to_del_timer_sync+0x53/0x5e
[  360.625871]  [] ? del_timer_sync+0x26/0x43
[  360.625875]  [] ? schedule_timeout+0xeb/0x104
[  360.625877]  [] ? rcu_read_unlock_sched_notrace+0x11/0x11
[  360.625882]  [] ? worker_loop+0x162/0x4c3
[  360.625884]  [] ? btrfs_queue_worker+0x275/0x275
[  360.625888]  [] ? kthread+0xa3/0xab
[  360.625893]  [] ? trace_preempt_on+0xd/0x2a
[  360.625895]  [] ? freeze_workqueues_begin+0x8/0x11e
[  360.625897]  [] ? __kthread_parkme+0x5a/0x5a
[  360.625901]  [] ? ret_from_fork+0x7c/0xb0
[  360.625903]  [] ? __kthread_parkme+0x5a/0x5a
[  360.625906] INFO: task btrfs-transacti:1084 blocked for more than 120 
seconds.
[  360.625908]   Not tainted 3.14.0-rc1 #19
[  360.625909] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[  360.625910] btrfs-transacti D 880137c52d00 0  1084  2 0x
[  360.625912]  880132428950 0002 00012d00 
880132428950
[  360.625915]  8800b5a35fd8 8801331a5a70 8801331a5ae8 

[  360.625918]  8800aba981b8 00015000 0001 
8126b986
[  360.625921] Call Trace:
[  360.625925]  [] ? btrfs_start_ordered_extent+0x91/0xdf
[  360.625928]  [] ? finish_wait+0x65/0x65
[  360.625931]  [] ? btrfs_wait_ordered_range+0xab/0x10a
[  360.625934]  [] ? __btrfs_write_out_cache+0x43c/0x67f
[  360.625939]  [] ? kmem_cache_free+0x66/0x10d
[  360.625942]  [] ? btrfs_update_inode_item+0xb9/0xcd
[  360.625944]  [] ? __btrfs_prealloc_file_range+0x276/0x2db
[

Re: [PATCH] Btrfs: throttle delayed refs better

2014-02-03 Thread Johannes Hirte

On Thu, 23 Jan 2014 13:07:52 -0500
Josef Bacik  wrote:

> On one of our gluster clusters we noticed some pretty big lag
> spikes.  This turned out to be because our transaction commit was
> taking like 3 minutes to complete.  This is because we have like 30
> gigs of metadata, so our global reserve would end up being the max
> which is like 512 mb.  So our throttling code would allow a
> ridiculous amount of delayed refs to build up and then they'd all get
> run at transaction commit time, and for a cold mounted file system
> that could take up to 3 minutes to run.  So fix the throttling to be
> based on both the size of the global reserve and how long it takes us
> to run delayed refs. This patch tracks the time it takes to run
> delayed refs and then only allows 1 seconds worth of outstanding
> delayed refs at a time.  This way it will auto-tune itself from cold
> cache up to when everything is in memory and it no longer has to go
> to disk.  This makes our transaction commits take much less time to
> run. Thanks,
> 
> Signed-off-by: Josef Bacik 
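
To make the throttling idea described in the patch text above concrete, here is a minimal user-space sketch in Python. It is not the kernel code: the class name, the 3:1 weighting of the running average and the initial estimate are all assumptions made for illustration. It only shows the principle of measuring how long delayed refs take to run and allowing roughly one second's worth to be outstanding.

```python
class DelayedRefThrottle:
    """Toy model (hypothetical): cap outstanding delayed refs to ~1 second of work."""

    def __init__(self, budget_seconds=1.0, initial_avg=0.001):
        self.budget_seconds = budget_seconds
        # Running average of the seconds needed to run one delayed ref.
        self.avg_seconds_per_ref = initial_avg

    def record_run(self, refs_run, elapsed_seconds):
        """Fold a measured batch into the running average (assumed 3:1 old:new)."""
        if refs_run <= 0:
            return
        sample = elapsed_seconds / refs_run
        self.avg_seconds_per_ref = (3 * self.avg_seconds_per_ref + sample) / 4

    def max_outstanding(self):
        """Number of refs that fit into the time budget at the current average."""
        return max(1, int(self.budget_seconds / self.avg_seconds_per_ref))

    def should_throttle(self, outstanding_refs):
        """True when more refs are queued than the budget allows."""
        return outstanding_refs > self.max_outstanding()
```

With a cold cache (slow refs) the limit stays small; as measured runs get faster, the limit rises on its own, which is the auto-tuning behaviour the description refers to.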

This one breaks my system. Shortly after boot the btrfs-freespace
thread goes up to 100% CPU usage and the system is nearly unresponsive.
I've seen it first with the full pull request for 3.14-rc1 and was able
to track it down to this patch.

regards,
  Johannes

Re: [PATCH] Btrfs: relocate csums properly with prealloc extents

2013-10-04 Thread Johannes Hirte

On Fri, 27 Sep 2013 09:37:00 -0400
Josef Bacik  wrote:

> A user reported a problem where they were getting csum errors when
> running a balance and running systemd's journal.  This is because
> systemd is awesome and fallocate()'s its log space and writes into
> it.  Unfortunately we assume that when we read in all the csums for
> an extent that they are sequential starting at the bytenr we care
> about.  This obviously isn't the case for prealloc extents, where we
> could have written to the middle of the prealloc extent only, which
> means the csum would be for the bytenr in the middle of our range and
> not the front of our range.  Fix this by offsetting the new bytenr we
> are logging to based on the original bytenr the csum was for.  With
> this patch I no longer see the csum errors I was seeing.  Thanks,
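
The offset fix described above can be illustrated with a small Python sketch. This is not the kernel implementation; the function names, the 4096-byte sector size and the sample addresses are assumptions chosen for the example. It contrasts the faulty "csums are sequential from the front" assumption with keeping each csum's offset relative to the original extent start.

```python
def relocate_assuming_sequential(csum_bytenrs, new_extent_start, sector_size=4096):
    """Faulty model: pretend the csums start at the front of the range."""
    return [new_extent_start + i * sector_size for i in range(len(csum_bytenrs))]


def relocate_with_offset(csum_bytenrs, old_extent_start, new_extent_start):
    """Fixed model: keep each csum's offset from the original extent start."""
    return [new_extent_start + (b - old_extent_start) for b in csum_bytenrs]


# Hypothetical prealloc extent that was only written in the middle.
OLD_START = 4096 * 100
NEW_START = 4096 * 500
WRITTEN = [OLD_START + 8192, OLD_START + 12288]

if __name__ == "__main__":
    print(relocate_assuming_sequential(WRITTEN, NEW_START))
    print(relocate_with_offset(WRITTEN, OLD_START, NEW_START))
```

In the sequential model the csums end up attached to the first two sectors of the relocated extent, which were never written, so later reads of the real data fail their checksum lookup.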

Any estimate of when this goes upstream? Until it hits Linus' tree it
won't appear in stable. And this seems rather important.

Re: balance induced csum errors, systemd-journal

2013-09-27 Thread Johannes Hirte

On Tue, 24 Sep 2013 22:34:20 -0600
Chris Murphy  wrote:

> OK so I think I'm narrowing this down to just the systemd journal,
> and it's not checksums that are corrupted, it's the journal itself.

I doubt it's systemd-dependent, because I've seen similar behaviour on a
Gentoo system without systemd. Before the balance the filesystem was OK;
afterwards I get

root 257 inode 2875 errors 1800
root 257 inode 2881 errors 1800
root 257 inode 2969 errors 1800
root 257 inode 3063 errors 1800
root 257 inode 3120 errors 1800
root 257 inode 12407 errors 1800
root 257 inode 19496 errors 1800
root 257 inode 19500 errors 1800
root 257 inode 19564 errors 1800
root 257 inode 19643 errors 1800
root 257 inode 19693 errors 1800
root 257 inode 19949 errors 1800
root 257 inode 20178 errors 1800
root 257 inode 20320 errors 1800
root 257 inode 20406 errors 1800
root 257 inode 20512 errors 1800
root 257 inode 20586 errors 1800
root 257 inode 20654 errors 1800
root 257 inode 20727 errors 1800
root 257 inode 20728 errors 1800
root 257 inode 20821 errors 1800
root 257 inode 20843 errors 1800
root 257 inode 21062 errors 1800
root 257 inode 21078 errors 1800
root 257 inode 21222 errors 1800
root 257 inode 21356 errors 1800
root 257 inode 21437 errors 1800
root 257 inode 55082 errors 1800
root 257 inode 65343 errors 1800
root 257 inode 72413 errors 1800

on an fsck, and scrub tells me that there are unfixable csum errors.
Kernel is 3.12.0-rc2-00083-g4b97280.

I've observed this twice, and each time only the first
subvolume (root 257) was affected.
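
For anyone wanting to check the same thing on their own fsck output, here is a small Python sketch that groups lines of the form shown above by root and error value. The function name and the sample text are assumptions for illustration; it only parses the "root N inode M errors X" line shape and does not interpret the error value.

```python
import re
from collections import Counter

# Matches lines such as "root 257 inode 2875 errors 1800".
ERROR_LINE = re.compile(r"^root (\d+) inode (\d+) errors ([0-9a-fA-F]+)$")


def count_errors_by_root(fsck_output):
    """Return a Counter keyed by (root id, error value) for matching lines."""
    counts = Counter()
    for line in fsck_output.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            counts[(int(match.group(1)), match.group(3))] += 1
    return counts


# Three lines copied from the report above, plus one unrelated line.
SAMPLE = """\
root 257 inode 2875 errors 1800
root 257 inode 2881 errors 1800
some unrelated line
root 257 inode 72413 errors 1800
"""

if __name__ == "__main__":
    for (root, errors), n in sorted(count_errors_by_root(SAMPLE).items()):
        print(f"root {root}: {n} inodes with errors {errors}")
```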

regards,
  Johannes

Re: hang on 3.9, 3.10-rc5

2013-06-21 Thread Johannes Hirte

On Tue, 18 Jun 2013 17:19:04 + (UTC)
Jon Nelson  wrote:

> Josef Bacik  fusionio.com> writes:
> 
> > 
> > On Tue, Jun 11, 2013 at 11:43:30AM -0400, Sage Weil wrote:
> > > I'm also seeing this hang regularly with both 3.9 and 3.10-rc5.
> > > Is this is a known problem?  In this case there is no
> > > powercycling; just a regular ceph-osd workload.
> 
> ..
> 
> 
> I'm able to cause a complete kernel hang by defrag'ing even one 
> file on 3.9.X (3.9.0 through 3.9.4, so far).

I see similar behavior with autodefrag enabled. When fetching mail
with claws (piped through bogofilter) the whole system more or less
gets stuck. I can switch between tasks, but everything involving I/O
hangs. Most of the time I was able to resolve this by running sync in a
shell. Only once did I get a backtrace from the hung task checker:

Jun 20 12:37:47 localhost kernel: INFO: task btrfs-cleaner:771 blocked for more 
than 120 seconds.
Jun 20 12:37:47 localhost kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 20 12:37:47 localhost kernel: btrfs-cleaner   D 88011fc91740 0   
771  2 0x
Jun 20 12:37:47 localhost kernel: 88011aaff530 0002 
880119929fd8 00011740
Jun 20 12:37:47 localhost kernel: 880119929fd8 88011aabddc0 
880119928000 880119929cb8
Jun 20 12:37:47 localhost kernel: 8800aaac22b4 8800aaac22b0 
 8800aaac22b8
Jun 20 12:37:47 localhost kernel: Call Trace:
Jun 20 12:37:47 localhost kernel: [] ? 
schedule_preempt_disabled+0x19/0x1e
Jun 20 12:37:47 localhost kernel: [] ? 
__mutex_lock_common.isra.9+0x19d/0x283
Jun 20 12:37:47 localhost kernel: [] ? 
ondemand_readahead+0x15d/0x200
Jun 20 12:37:47 localhost kernel: [] ? mutex_lock+0xe/0x1d
Jun 20 12:37:47 localhost kernel: [] ? 
btrfs_defrag_file+0x41a/0xa4e
Jun 20 12:37:47 localhost kernel: [] ? 
_raw_spin_unlock+0x27/0x31
Jun 20 12:37:47 localhost kernel: [] ? 
btrfs_run_defrag_inodes+0x1f7/0x2d3
Jun 20 12:37:47 localhost kernel: [] ? 
btrfs_run_delayed_iputs+0x44/0xbe
Jun 20 12:37:47 localhost kernel: [] ? 
cleaner_kthread+0x89/0xf3
Jun 20 12:37:47 localhost kernel: [] ? 
transaction_kthread+0x17a/0x17a
Jun 20 12:37:47 localhost kernel: [] ? kthread+0x7d/0x85
Jun 20 12:37:47 localhost kernel: [] ? 
thaw_workqueues+0xd3/0xff
Jun 20 12:37:47 localhost kernel: [] ? 
__kthread_parkme+0x59/0x59
Jun 20 12:37:47 localhost kernel: [] ? ret_from_fork+0x7c/0xb0
Jun 20 12:37:47 localhost kernel: [] ? 
__kthread_parkme+0x59/0x59
Jun 20 12:37:47 localhost kernel: INFO: task konqueror:2384 blocked for more 
than 120 seconds.
Jun 20 12:37:47 localhost kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 20 12:37:47 localhost kernel: konqueror   D 88011fc11740 0  
2384   2133 0x
Jun 20 12:37:47 localhost kernel: 8800c82dce20 0002 
8800c0c55fd8 00011740
Jun 20 12:37:47 localhost kernel: 8800c0c55fd8 81a10400 
8800c0c54000 8800c0c55e90
Jun 20 12:37:47 localhost kernel: 8800c40aeb44 8800c40aeb40 
 8800c40aeb48
Jun 20 12:37:47 localhost kernel: Call Trace:
Jun 20 12:37:47 localhost kernel: [] ? 
schedule_preempt_disabled+0x19/0x1e
Jun 20 12:37:47 localhost kernel: [] ? 
__mutex_lock_common.isra.9+0x19d/0x283
Jun 20 12:37:47 localhost kernel: [] ? mutex_lock+0xe/0x1d
Jun 20 12:37:47 localhost kernel: [] ? do_unlinkat+0x88/0x17a
Jun 20 12:37:47 localhost kernel: [] ? 
_raw_spin_unlock+0x27/0x31
Jun 20 12:37:47 localhost kernel: [] ? 
syscall_trace_enter+0xcf/0x145
Jun 20 12:37:47 localhost kernel: [] ? tracesys+0xdd/0xe2
Jun 20 12:37:47 localhost kernel: INFO: task konqueror:2385 blocked for more 
than 120 seconds.
Jun 20 12:37:47 localhost kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 20 12:37:47 localhost kernel: konqueror   D 88011fc91740 0  
2385   2133 0x
Jun 20 12:37:47 localhost kernel: 8800c82dd5f0 0002 
8800c0d1dfd8 00011740
Jun 20 12:37:47 localhost kernel: 8800c0d1dfd8 88011aabddc0 
8800c0d1c000 8800c0d1de90
Jun 20 12:37:47 localhost kernel: 8800c40aeb44 8800c40aeb40 
 8800c40aeb48
Jun 20 12:37:47 localhost kernel: Call Trace:
Jun 20 12:37:47 localhost kernel: [] ? 
schedule_preempt_disabled+0x19/0x1e
Jun 20 12:37:47 localhost kernel: [] ? 
__mutex_lock_common.isra.9+0x19d/0x283
Jun 20 12:37:47 localhost kernel: [] ? mutex_lock+0xe/0x1d
Jun 20 12:37:47 localhost kernel: [] ? do_unlinkat+0x88/0x17a
Jun 20 12:37:47 localhost kernel: [] ? 
_raw_spin_unlock+0x27/0x31
Jun 20 12:37:47 localhost kernel: [] ? 
syscall_trace_enter+0xcf/0x145
Jun 20 12:37:47 localhost kernel: [] ? tracesys+0xdd/0xe2
Jun 20 12:37:47 localhost kernel: INFO: task konqueror:2393 blocked for more 
than 120 seconds.
Jun 20 12:37:47 localhost kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 20 12:

Re: WARNING: at fs/btrfs/extent_map.c:77 free_extent_map

2013-03-12 Thread Johannes Hirte

On Tue, 12 Mar 2013 09:39:35 +0800
Liu Bo  wrote:

> Hi Johannes,
> 
> Could you please tell us what mount options you're with?
> 
> thanks,
> liubo

The Filesystem has six subvolumes, so mount options are:

noatime,inode_cache,autodefrag,subvolid=...

for each subvol.

I was able to reproduce it and got a full backtrace
with netconsole:

[ cut here ]
WARNING: at fs/btrfs/extent_map.c:77 free_extent_map+0x64/0x76()
Hardware name: EasyNote TK81
Modules linked in: netconsole configfs nfsd exportfs auth_rpcgss nfs_acl fuse 
nfs lockd sunrpc snd_hda_codec_hdmi ath9k sr_mod snd_hda_codec_realtek 
acpi_cpufreq broadcom tg3 acer_wmi k10temp mperf ath9k_common ath9k_hw ath 
snd_hda_intel snd_hda_codec snd_hwdep snd_pcm cdrom snd_page_alloc snd_timer 
snd wmi soundcore i2c_piix4 ohci_hcd
Pid: 15870, comm: bogofilter Not tainted 3.9.0-rc2-00112-g7c6baa3 #294
Call Trace:
 [] ? warn_slowpath_common+0x76/0x8c
 [] ? free_extent_map+0x64/0x76
 [] ? btrfs_drop_extent_cache+0x363/0x39f
 [] ? __cow_file_range+0x175/0x3c1
 [] ? find_get_pages_contig+0x100/0x115
 [] ? join_transaction.isra.34+0x30f/0x31a
 [] ? start_transaction+0x2d8/0x3e8
 [] ? cow_file_range+0xa9/0xc5
 [] ? run_delalloc_range+0x9d/0x33b
 [] ? free_extent_state+0x12/0x21
 [] ? __extent_writepage+0x1a8/0x5d8
 [] ? end_extent_writepage+0x5d/0x5d
 [] ? 
extent_write_cache_pages.isra.29.constprop.47+0x14a/0x255
 [] ? extent_writepages+0x49/0x60
 [] ? btrfs_update_inode_item+0xde/0xde
 [] ? __filemap_fdatawrite_range+0x4d/0x52
 [] ? btrfs_sync_file+0x48/0x203
 [] ? do_fsync+0x2b/0x50
 [] ? sys_fdatasync+0xb/0xf
 [] ? system_call_fastpath+0x16/0x1b
---[ end trace a3b02a44716bacc5 ]---
[ cut here ]
WARNING: at fs/btrfs/extent_map.c:77 free_extent_map+0x64/0x76()
Hardware name: EasyNote TK81
Modules linked in: netconsole configfs nfsd exportfs auth_rpcgss nfs_acl fuse 
nfs lockd sunrpc snd_hda_codec_hdmi ath9k sr_mod snd_hda_codec_realtek 
acpi_cpufreq broadcom tg3 acer_wmi k10temp mperf ath9k_common ath9k_hw ath 
snd_hda_intel snd_hda_codec snd_hwdep snd_pcm cdrom snd_page_alloc snd_timer 
snd wmi soundcore i2c_piix4 ohci_hcd
Pid: 15870, comm: bogofilter Tainted: GW3.9.0-rc2-00112-g7c6baa3 
#294
Call Trace:
 [] ? warn_slowpath_common+0x76/0x8c
 [] ? free_extent_map+0x64/0x76
 [] ? btrfs_drop_extent_cache+0x363/0x39f
 [] ? __cow_file_range+0x175/0x3c1
 [] ? find_get_pages_contig+0x100/0x115
 [] ? join_transaction.isra.34+0x30f/0x31a
 [] ? start_transaction+0x2d8/0x3e8
 [] ? cow_file_range+0xa9/0xc5
 [] ? run_delalloc_range+0x9d/0x33b
 [] ? free_extent_state+0x12/0x21
 [] ? __extent_writepage+0x1a8/0x5d8
 [] ? end_extent_writepage+0x5d/0x5d
 [] ? 
extent_write_cache_pages.isra.29.constprop.47+0x14a/0x255
 [] ? extent_writepages+0x49/0x60
 [] ? btrfs_update_inode_item+0xde/0xde
 [] ? __filemap_fdatawrite_range+0x4d/0x52
 [] ? btrfs_sync_file+0x48/0x203
 [] ? do_fsync+0x2b/0x50
 [] ? sys_fdatasync+0xb/0xf
 [] ? system_call_fastpath+0x16/0x1b
---[ end trace a3b02a44716bacc6 ]---
[ cut here ]
WARNING: at fs/btrfs/extent_map.c:77 free_extent_map+0x64/0x76()
Hardware name: EasyNote TK81
Modules linked in: netconsole configfs nfsd exportfs auth_rpcgss nfs_acl fuse 
nfs lockd sunrpc snd_hda_codec_hdmi ath9k sr_mod snd_hda_codec_realtek 
acpi_cpufreq broadcom tg3 acer_wmi k10temp mperf ath9k_common ath9k_hw ath 
snd_hda_intel snd_hda_codec snd_hwdep snd_pcm cdrom snd_page_alloc snd_timer 
snd wmi soundcore i2c_piix4 ohci_hcd
Pid: 15887, comm: bogofilter Tainted: GW3.9.0-rc2-00112-g7c6baa3 
#294
Call Trace:
 [] ? warn_slowpath_common+0x76/0x8c
 [] ? free_extent_map+0x64/0x76
 [] ? btrfs_drop_extent_cache+0x363/0x39f
 [] ? __cow_file_range+0x175/0x3c1
 [] ? join_transaction.isra.34+0x30f/0x31a
 [] ? start_transaction+0x2d8/0x3e8
 [] ? cow_file_range+0xa9/0xc5
 [] ? run_delalloc_range+0x9d/0x33b
 [] ? free_extent_state+0x12/0x21
 [] ? __extent_writepage+0x1a8/0x5d8
 [] ? end_extent_writepage+0x5d/0x5d
 [] ? 
extent_write_cache_pages.isra.29.constprop.47+0x14a/0x255
 [] ? extent_writepages+0x49/0x60
 [] ? btrfs_update_inode_item+0xde/0xde
 [] ? __filemap_fdatawrite_range+0x4d/0x52
 [] ? btrfs_sync_file+0x48/0x203
 [] ? vfs_write+0xaf/0xf8
 [] ? do_fsync+0x2b/0x50
 [] ? sys_fdatasync+0xb/0xf
 [] ? system_call_fastpath+0x16/0x1b
---[ end trace a3b02a44716bacc7 ]---
[ cut here ]
WARNING: at fs/btrfs/extent_map.c:77 free_extent_map+0x64/0x76()
Hardware name: EasyNote TK81
Modules linked in: netconsole configfs nfsd exportfs auth_rpcgss nfs_acl fuse 
nfs lockd sunrpc snd_hda_codec_hdmi ath9k sr_mod snd_hda_codec_realtek 
acpi_cpufreq broadcom tg3 acer_wmi k10temp mperf ath9k_common ath9k_hw ath 
snd_hda_intel snd_hda_codec snd_hwdep snd_pcm cdrom snd_page_alloc snd_timer 
snd wmi soundcore i2c_piix4 ohci_hcd
Pid: 15887, comm: bogofilter Tainted: GW3.9.0-rc2-00112-g7c6baa3 
#294
Call Trace:
 [] ? warn_slowpath_common+0x76/0x8c
 [] ? free_extent_map+0x6

WARNING: at fs/btrfs/extent_map.c:77 free_extent_map

2013-03-11 Thread Johannes Hirte

Since the updates for linux-3.9 I've had a system freeze three or four
times, and only a reset (Magic SysRq) helped. After the reboot I found
a bunch of these in syslog:

Mar 11 21:56:09 localhost kernel: [ cut here ]
Mar 11 21:56:09 localhost kernel: WARNING: at fs/btrfs/extent_map.c:77 
free_extent_map+0x64/0x76()
Mar 11 21:56:09 localhost kernel: Hardware name: EasyNote TK81
Mar 11 21:56:09 localhost kernel: Modules linked in: nfsv4 nfsd exportfs 
auth_rpcgss nfs_acl fuse nfs lockd sunrpc snd_hda_codec_hdmi 
snd_hda_codec_realtek snd_hda_intel ath9k snd_hda_codec ath9k_common ath9k_hw 
acer_wmi snd_hwdep snd_pcm ath sr_mod wmi broadcom snd_page_alloc snd_timer 
cdrom tg3 k10temp snd acpi_cpufreq ohci_hcd soundcore i2c_piix4 mperf
Mar 11 21:56:09 localhost kernel: Pid: 11260, comm: bogofilter Tainted: G   
 W3.9.0-rc2 #293
Mar 11 21:56:09 localhost kernel: Call Trace:
Mar 11 21:56:09 localhost kernel: [] ? 
warn_slowpath_common+0x76/0x8c
Mar 11 21:56:09 localhost kernel: [] ? 
free_extent_map+0x64/0x76
Mar 11 21:56:09 localhost kernel: [] ? 
btrfs_drop_extent_cache+0x363/0x39f
Mar 11 21:56:09 localhost kernel: [] ? 
__cow_file_range+0x175/0x3c1
Mar 11 21:56:09 localhost kernel: [] ? 
join_transaction.isra.34+0x30f/0x31a
Mar 11 21:56:09 localhost kernel: [] ? 
start_transaction+0x2d8/0x3e8
Mar 11 21:56:09 localhost kernel: [] ? 
cow_file_range+0xa9/0xc5
Mar 11 21:56:09 localhost kernel: [] ? 
run_delalloc_range+0x9d/0x33b
Mar 11 21:56:09 localhost kernel: [] ? 
free_extent_state+0x12/0x21
Mar 11 21:56:09 localhost kernel: [] ? 
__extent_writepage+0x1a8/0x5d8
Mar 11 21:56:09 localhost kernel: [] ? 
end_extent_writepage+0x5d/0x5d
Mar 11 21:56:09 localhost kernel: [] ? 
extent_write_cache_pages.isra.29.constprop.47+0x14a/0x255
Mar 11 21:56:09 localhost kernel: [] ? 
extent_writepages+0x49/0x60
Mar 11 21:56:09 localhost kernel: [] ? 
btrfs_update_inode_item+0xde/0xde
Mar 11 21:56:09 localhost kernel: [] ? 
__filemap_fdatawrite_range+0x4d/0x52
Mar 11 21:56:09 localhost kernel: [] ? 
btrfs_sync_file+0x48/0x203
Mar 11 21:56:09 localhost kernel: [] ? vfs_write+0xaf/0xf8
Mar 11 21:56:09 localhost kernel: [] ? do_fsync+0x2b/0x50
Mar 11 21:56:09 localhost kernel: [] ? sys_fdatasync+0xb/0xf
Mar 11 21:56:09 localhost kernel: [] ? 
system_call_fastpath+0x16/0x1b
Mar 11 21:56:09 localhost kernel: ---[ end trace 3eaea449d8d56f92 ]---

As far as I remember, it happened when fetching emails with claws. But
it's not a reliable testcase.

Another trace from the first time I found it in the logs. But here the
system didn't hang:

Mar  4 14:28:35 localhost kernel: [ cut here ]
Mar  4 14:28:35 localhost kernel: WARNING: at fs/btrfs/extent_map.c:77 
free_extent_map+0x64/0x76()
Mar  4 14:28:35 localhost kernel: Hardware name: EasyNote TK81
Mar  4 14:28:35 localhost kernel: Modules linked in: nfsd exportfs auth_rpcgss 
nfs_acl fuse nfs lockd sunrpc snd_hda_codec_hdmi snd_hda_codec_realtek 
snd_hda_intel ath9k snd_hda_codec ath9k_common snd_hwdep snd_pcm broadcom 
ath9k_hw snd_page_alloc ath sr_mod snd_timer acer_wmi snd cdrom wmi tg3 
ohci_hcd soundcore k10temp edac_core acpi_cpufreq i2c_piix4 mperf
Mar  4 14:28:35 localhost kernel: Pid: 1574, comm: flush-btrfs-1 Not tainted 
3.9.0-rc1 #289
Mar  4 14:28:35 localhost kernel: Call Trace:
Mar  4 14:28:35 localhost kernel: [] ? 
warn_slowpath_common+0x76/0x8c
Mar  4 14:28:35 localhost kernel: [] ? 
free_extent_map+0x64/0x76
Mar  4 14:28:35 localhost kernel: [] ? 
btrfs_drop_extent_cache+0x363/0x39f
Mar  4 14:28:35 localhost kernel: [] ? 
__cow_file_range+0x175/0x3c1
Mar  4 14:28:36 localhost kernel: [] ? 
_raw_spin_unlock+0x1c/0x28
Mar  4 14:28:36 localhost kernel: [] ? 
release_extent_buffer.isra.25+0x90/0x97
Mar  4 14:28:36 localhost kernel: [] ? 
run_delalloc_nocow+0x6fa/0x795
Mar  4 14:28:36 localhost kernel: [] ? 
run_delalloc_range+0x64/0x33b
Mar  4 14:28:36 localhost kernel: [] ? 
free_extent_state+0x12/0x21
Mar  4 14:28:36 localhost kernel: [] ? 
__extent_writepage+0x1a8/0x5d8
Mar  4 14:28:36 localhost kernel: [] ? 
end_extent_writepage+0x5d/0x5d
Mar  4 14:28:36 localhost kernel: [] ? 
cpumask_any_but+0x25/0x34
Mar  4 14:28:36 localhost kernel: [] ? 
vma_interval_tree_subtree_search+0x33/0x55
Mar  4 14:28:36 localhost kernel: [] ? 
page_mkclean+0x107/0x119
Mar  4 14:28:36 localhost kernel: [] ? 
extent_write_cache_pages.isra.29.constprop.47+0x14a/0x255
Mar  4 14:28:36 localhost kernel: [] ? 
btrfs_submit_bio_hook+0x14f/0x14f
Mar  4 14:28:36 localhost kernel: [] ? 
extent_writepages+0x49/0x60
Mar  4 14:28:36 localhost kernel: [] ? 
btrfs_update_inode_item+0xde/0xde
Mar  4 14:28:36 localhost kernel: [] ? 
_raw_spin_unlock+0x1c/0x28
Mar  4 14:28:36 localhost kernel: [] ? 
__writeback_single_inode+0x37/0xd6
Mar  4 14:28:36 localhost kernel: [] ? 
writeback_sb_inodes+0x1b8/0x2d3
Mar  4 14:28:36 localhost kernel: [] ? 
__writeback_inodes_wb+0x69/0xab
Mar  4 14:28:36 localhost kernel: [] ? wb_writeback+0xfa/0x193
Mar  4 14:28:36 lo

Re: Another defrag question

2013-02-21 Thread Johannes Hirte

On Thu, 21 Feb 2013 18:47:28 +0100
Swâmi Petaramesh  wrote:

> Le 21/02/2013 18:25, Hugo Mills a écrit :
> > Correct. But btrfs isn't at that stage yet. It's getting visibly
> > closer, but it's not quite there. Hence the very strong
> > recommendation to keep up with the latest code. Hugo. 
> 
> The matter is that BTRFS had many early adopters just because it is -
> and has been for long now - in the mainline Linux kernel, so supposed
> stable and good choice for the future.

And it's marked as EXPERIMENTAL. So if you want to join the game, you
have to accept the rules.

> OTOH my 6th machine runs native ZFS on Linux, and I have to tell that
> it shows orders of magnitude better performance and never gave me a
> single problem in several (3 ?) years. Only upgrading the distro is
> always a big frightening and problematic. And initial installation
> was a bit tricky.

You didn't, but many others did. I remember a lot of threads in the
OpenSolaris forum, where the solution for problems was: recreate your
filesystem, replay your backup.

> Everytime I show my Linux machines to friends and say : “Hey, I got
> the most advanced filesystem on earth !” I soon get the answer “Oh
> boy, that's the slowest machine boot and FS I've ever seen since I was
> reading floppy disks on my 386SX in 1991 ! Can you really live with
> this ?”

Did you present it while (re)creating the inode_cache? Sounds a
little like that.

> So, for "not quite there" and the return codes "+20" that have been a
> minor pain in the arse for a couple years but the line is still in the
> code... I can understand developer's PoV, been there, done that, but
> still, BTRFS might in the end lose a number of its early adopters if it
> keeps being "not quite there" too long.
> 
> Shifting to ZFS is just a PPA and 2 apt-get install commands away...
> It will definitely be easier than start playing with mainline PPA
> Ubuntu kernels...

So why do you bother with btrfs, if ZFS fits your needs?


regards,
  Johannes

Re: btrfs and 1 billion small files

2012-05-07 Thread Johannes Hirte

On Mon, 7 May 2012 12:39:28 +0100
Hugo Mills wrote:

> On Mon, May 07, 2012 at 01:15:26PM +0200, Alessio Focardi wrote:
...
> > That's a very clever suggestion, I'm preparing a test server right
> > now: going to use the -m single option. Any other suggestion
> > regarding format options?
> > 
> > pagesize? leafsize?
> 
>I'm not sure about these -- some values of them definitely break
> things. I think they are required to be the same, and that you could
> take them up to 64k with no major problems, but do check that first
> with someone who actually knows.

First, if you have this filesystem as rootfs, a separate /boot
partition is needed. Grub is unable to boot from btrfs with a
non-default node-/leafsize. Second, a very recent kernel is needed
(linux-3.4-rc1 at least).

regards,
  Johannes

Re: kernel BUG at fs/btrfs/delayed-inode.c:1466!

2012-03-12 Thread Johannes Hirte

On Mon, 12 Mar 2012 15:21:49 +0100
Jacek Luczak wrote:

> > 2) A *regression* in 3.3.0-rc6-00197-g9f8050c
> > - completely unusable as reports ENOSPC
> > - to reproduce, mount volume and issue:
> > # CNT=1 ; while [ $CNT -lt 1 ] ; do  rm -f /btrfs/dd ; ! touch
> > /btrfs/dd && echo "$CNT" && break  ; CNT=$(( $CNT + 1 )) ; done
> > On my host this shows:
> > # CNT=1 ; while [ $CNT -lt 1 ] ; do  rm -f /btrfs/dd ; ! touch
> > /btrfs/dd && echo "$CNT" && break  ; CNT=$(( $CNT + 1 )) ; done
> > touch: cannot touch `/btrfs/dd': No space left on device
> > 423
> > - remount to reset:
> > # CNT=1 ; while [ $CNT -lt 1 ] ; do  rm -f /btrfs/dd ; ! touch
> > /btrfs/dd && echo "$CNT" && break  ; CNT=$(( $CNT + 1 )) ; done
> > touch: cannot touch `/btrfs/dd': No space left on device
> > 1
> > # umount /btrfs/
> > # mount -t btrfs /dev/vg00/btrfs /btrfs/ -o noatime,nodatacow,defaults
> > # CNT=1 ; while [ $CNT -lt 1 ] ; do  rm -f /btrfs/dd ; ! touch
> > /vdd && echo "$CNT" && break  ; CNT=$(( $CNT + 1 )) ; done
> > touch: cannot touch `/btrfs/dd': No space left on device
> > 423
> > - bisected down to 5500cdb (Btrfs: increase the global block reserve
> > estimates). After reverting this one Linus master works for me
> > again.
> 
> With above patch reverted after a longer run I've got ENOSPC again:
> 1) # df -hP /btrfs
> Filesystem              Size  Used Avail Use% Mounted on
> /dev/mapper/vg00-btrfs  195G  179G   11G  95% /btrfs
> 2) # rm -f /btrfs/dd
> rm: cannot remove `/btrfs/dd': No space left on device
> 3) strace
> unlink("/btrfs/dd") = -1 ENOSPC (No space left on device)
> 4) last message from kernel (except WARN_ONs):
> btrfs: fail to dirty inode 116882385
> 
> I've remounted the volume and after that I've been able to remove the
> dd file from the volume. In dmesg there's a bunch of new WARN_ONs:
> [ cut here ]
> WARNING: at fs/btrfs/extent-tree.c:4185
> btrfs_free_block_groups+0x17d/0x2b8 [btrfs]()
> Hardware name: ProLiant BL460c G6
> Modules linked in: btrfs zlib_deflate lzo_compress ipmi_devintf
> autofs4 be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa
> ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi iw_cxgb3
> ib_core cxgb3 libiscsi_tcp libiscsi scsi_transport_iscsi dm_mirror
> dm_region_hash dm_log dm_multipath video battery acpi_pad acpi_ipmi ac
> parport usbhid evdev acpi_power_meter radeon ttm drm_kms_helper drm
> hwmon ipmi_si bnx2x backlight i2c_algo_bit ipmi_msghandler i2c_core
> hpilo mdio hpwdt psmouse uhci_hcd ehci_hcd
> Pid: 9518, comm: umount Tainted: GW 3.3.0-rc6-00197-g9f8050c-dirty #1
> Call Trace:
>  [] ? print_oops_end_marker+0x9/0x20
>  [] ? btrfs_free_block_groups+0x17d/0x2b8 [btrfs]
>  [] ? warn_slowpath_common+0x78/0x8d
>  [] ? btrfs_free_block_groups+0x17d/0x2b8 [btrfs]
>  [] ? close_ctree+0x1e1/0x380 [btrfs]
>  [] ? dispose_list+0x27/0x31
>  [] ? evict_inodes+0xc5/0xcc
>  [] ? generic_shutdown_super+0x4d/0xc1
>  [] ? kill_anon_super+0x9/0x11
>  [] ? btrfs_kill_super+0xd/0x73 [btrfs]
>  [] ? deactivate_locked_super+0x2f/0x5f
>  [] ? sys_umount+0x2c1/0x30b
>  [] ? system_call_fastpath+0x16/0x1b
> ---[ end trace fd6da849e53b77dd ]---
> [ cut here ]
> WARNING: at fs/btrfs/extent-tree.c:4186
> btrfs_free_block_groups+0x198/0x2b8 [btrfs]()
> Hardware name: ProLiant BL460c G6
> Modules linked in: btrfs zlib_deflate lzo_compress ipmi_devintf
> autofs4 be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa
> ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi iw_cxgb3
> ib_core cxgb3 libiscsi_tcp libiscsi scsi_transport_iscsi dm_mirror
> dm_region_hash dm_log dm_multipath video battery acpi_pad acpi_ipmi ac
> parport usbhid evdev acpi_power_meter radeon ttm drm_kms_helper drm
> hwmon ipmi_si bnx2x backlight i2c_algo_bit ipmi_msghandler i2c_core
> hpilo mdio hpwdt psmouse uhci_hcd ehci_hcd
> Pid: 9518, comm: umount Tainted: GW 3.3.0-rc6-00197-g9f8050c-dirty #1
> Call Trace:
>  [] ? print_oops_end_marker+0x9/0x20
>  [] ? btrfs_free_block_groups+0x198/0x2b8 [btrfs]
>  [] ? warn_slowpath_common+0x78/0x8d
>  [] ? btrfs_free_block_groups+0x198/0x2b8 [btrfs]
>  [] ? close_ctree+0x1e1/0x380 [btrfs]
>  [] ? dispose_list+0x27/0x31
>  [] ? evict_inodes+0xc5/0xcc
>  [] ? generic_shutdown_super+0x4d/0xc1
>  [] ? kill_anon_super+0x9/0x11
>  [] ? btrfs_kill_super+0xd/0x73 [btrfs]
>  [] ? deactivate_locked_super+0x2f/0x5f
>  [] ? sys_umount+0x2c1/0x30b
>  [] ? system_call_fastpath+0x16/0x1b
> ---[ end trace fd6da849e53b77de ]---
> [ cut here ]
> WARNING: at fs/btrfs/extent-tree.c:4187
> btrfs_free_block_groups+0x1b3/0x2b8 [btrfs]()
> Hardware name: ProLiant BL460c G6
> Modules linked in: btrfs zlib_deflate lzo_compress ipmi_devintf
> autofs4 be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa
> ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi iw_cxgb3
> ib_core cxgb3 libiscsi_tcp libiscsi scsi_transport_iscsi dm_mirror
> dm_region_hash dm_log

Re: [PATCH] Btrfs: hold enough space for global_rsv

2012-03-10 Thread Johannes Hirte

On Fri, 09 Mar 2012 09:28:56 +0800
Liu Bo wrote:

> On 03/09/2012 03:22 AM, Johannes Hirte wrote:
> > On Tue, 6 Mar 2012 14:50:32 +0100
> > Johannes Hirte wrote:
> > 
> >> I've backed up the filesystem, deleted the subvolumes, recreated
> >> them and copied the data back. Now everything seems to work again.
> >> I've also a full image of the damaged filesystem for further
> >> investigation. If someone has an idea for testing, I'm happy to try
> >> it.
> > 
> > It's much worse than I thought. After a short time the same error
> > happened again (no space left on device). So recreated the
> > filesystem (mkbtrfs with default values) and copied the data from
> > the backup back, but the error still came back. I'm now on kernel
> > 3.2 which seems to work. I'll try to bisect the bad commit. For
> > info, df says:
> > 
> 
> OK, plz show us the results after your bisect, let's narrow down
> where goes wrong.
> 
> thanks,
> liubo

Bisect points again to:

5500cdbe14d7435e04f66ff3cfb8ecd8b8e44ebf is the first bad commit
commit 5500cdbe14d7435e04f66ff3cfb8ecd8b8e44ebf
Author: Liu Bo 
Date:   Thu Feb 23 10:49:04 2012 -0500

Btrfs: increase the global block reserve estimates

When doing IO with large amounts of data fragmentation, the global
block reserve calulations are too low.  This increases them to avoid
ENOSPC crashes.

Signed-off-by: Liu Bo 
Signed-off-by: Chris Mason 

The revision before it works, and reverting this commit from master
works too. But as mentioned before, I'm not sure this is the root
cause: the first time I saw the error, it also showed up later on
without this patch.

Re: filesystem full when it's not? out of inodes? huh?

2012-03-09 Thread Johannes Hirte

On Sat, 25 Feb 2012 20:05:13 -0800
Fahrzin Hemmati wrote:

> No, at least not yet, nor am I aware of any plans for subvolume
> quotas, though I could be wrong. 

Arne Jansen is working on it, IIRC.

regards,
  Johannes

Re: [PATCH] Btrfs: hold enough space for global_rsv

2012-03-08 Thread Johannes Hirte

On Tue, 6 Mar 2012 14:50:32 +0100
Johannes Hirte wrote:

> I've backed up the filesystem, deleted the subvolumes, recreated them
> and copied the data back. Now everything seems to work again. I've
> also a full image of the damaged filesystem for further
> investigation. If someone has an idea for testing, I'm happy to try
> it.

It's much worse than I thought. After a short time the same error
happened again (no space left on device). So I recreated the filesystem
(mkfs.btrfs with default values) and copied the data back from the
backup, but the error still came back. I'm now on kernel 3.2, which
seems to work. I'll try to bisect the bad commit. For info, df says:

Filesystem  Size  Used Avail Use% Mounted on
rootfs  200G  128G   69G  66% /
/dev/sda1   200G  128G   69G  66% /
rc-svcdir   1.0M  128K  896K  13% /lib64/rc/init.d
cgroup_root  10M   52K   10M   1% /sys/fs/cgroup
udev 10M  168K  9.9M   2% /dev
shm 2.0G 0  2.0G   0% /dev/shm
/dev/sda1   200G  128G   69G  66% /home

and btrfs fi df:

Data: total=149.01GB, used=118.57GB
System, DUP: total=8.00MB, used=24.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=6.38GB, used=4.55GB
Metadata: total=8.00MB, used=0.0

Kernel 3.3-rc6 fails on this with "no space left on device".
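
For reference, the `btrfs fi df` figures above can be turned into per-pool fill ratios with a few lines of Python. This is only a sketch: `parse_fi_df` is a hypothetical helper (not part of btrfs-progs), it assumes exactly the output format quoted in this message, and it assumes the old "GB"/"MB"/"KB" suffixes are binary units.

```python
import re

# Output of `btrfs fi df` as quoted in the message above.
FI_DF = """\
Data: total=149.01GB, used=118.57GB
System, DUP: total=8.00MB, used=24.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=6.38GB, used=4.55GB
Metadata: total=8.00MB, used=0.0
"""

# Assumption: the suffixes are binary multiples.
UNITS = {"": 1, "KB": 2**10, "MB": 2**20, "GB": 2**30}
LINE = re.compile(
    r"^(?P<pool>[^:]+): total=(?P<total>[\d.]+)(?P<tu>[KMG]B)?, "
    r"used=(?P<used>[\d.]+)(?P<uu>[KMG]B)?$"
)

def parse_fi_df(text):
    """Return a list of (pool, total_bytes, used_bytes) tuples."""
    pools = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # skip anything that is not a pool line
        total = float(m["total"]) * UNITS[m["tu"] or ""]
        used = float(m["used"]) * UNITS[m["uu"] or ""]
        pools.append((m["pool"], total, used))
    return pools

for pool, total, used in parse_fi_df(FI_DF):
    pct = 100.0 * used / total if total else 0.0
    print(f"{pool:14s} {pct:5.1f}% used")
```

With the numbers above, the Data pool comes out at roughly 80% and the DUP Metadata pool at roughly 71%, so neither allocated pool looks full by this measure.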

Re: [PATCH] Btrfs: hold enough space for global_rsv

2012-03-06 Thread Johannes Hirte

On Tue, 28 Feb 2012 10:06:14 +0800
Liu Bo wrote:

> On 02/27/2012 09:29 PM, Johannes Hirte wrote:
> > On Tue, 17 Jan 2012 17:51:59 +0800
> > Liu Bo wrote:
> > 
> >> I've kept hitting enospc warnings of global_rsv while running
> >> defragment on files:
> >> btrfs: block rsv returned -28
> >> WARNING: at fs/btrfs/extent-tree.c:5984
> >> btrfs_alloc_free_block+0x333/0x340 [btrfs]() ...
> >>
> >> I used a fio jobs to create a file with lots of fragments:
> >> $ filefrag /mnt/btrfs/foobar
> >> /mnt/btrfs/foobar: 66964 extents found
> >>
> >> and then "btrfs fi defrag /mnt/btrfs/foobar && sync" would pop the
> >> warnings.
> >>
> >> I found that the global_rsv size is just not enough for defragment,
> >> and didn't find any space leak in using global_rsv, so double it
> >> and go ahead.
> >>
> >> Signed-off-by: Liu Bo 
> >> ---
> >>  fs/btrfs/extent-tree.c |2 +-
> >>  1 files changed, 1 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> >> index 8603ee4..77ea23c 100644
> >> --- a/fs/btrfs/extent-tree.c
> >> +++ b/fs/btrfs/extent-tree.c
> >> @@ -3979,7 +3979,7 @@ static u64 calc_global_metadata_size(struct btrfs_fs_info *fs_info)
> >>  	num_bytes += div64_u64(data_used + meta_used, 50);
> >>  
> >>  	if (num_bytes * 3 > meta_used)
> >> -		num_bytes = div64_u64(meta_used, 3);
> >> +		num_bytes = div64_u64(meta_used, 3) * 2;
> >>  
> >>  	return ALIGN(num_bytes, fs_info->extent_root->leafsize << 10);
> >>  }
> > 
> > This patch breakes my system. With this applied all services fail on
> > boot with "no space left" messages.
> > 
> 
> It's weird since this patch is just aiming to enlarge our metadata
> reservation count.
> 
> so you've tried a revert or a bisect, right?  Can you show me the
> environment or any log messages?
> 
> thanks,
> liubo

Sorry for the long delay. My system was really screwed up and it
took time to fix.
First, it wasn't your patch that made the system fail. At that time, it
was merely the first revision that didn't work anymore. I don't know why this
one. A short time later, earlier revisions showed the error as well. I was
able to boot a live system from a USB stick. The filesystem was
mountable and readable, but I couldn't modify or create a single file.
Two or three times I got a

btrfs: fail to dirty inode 256 error -28

but most times nothing was reported in the logs.

The filesystem consists of three subvolumes, the default one, one for
rootfs and one for home. If I did a defrag on the rootfs, I was able to
create files. But after unmounting and remounting the filesystem, the
same error appeared again. Also a balance of the filesystem resulted in
no space error after some time.
I've backed up the filesystem, deleted the subvolumes, recreated them
and copied the data back. Now everything seems to work again. I've also
a full image of the damaged filesystem for further investigation. If
someone has an idea for testing, I'm happy to try it.

regards,
  Johannes

Re: [PATCH] Btrfs: hold enough space for global_rsv

2012-02-27 Thread Johannes Hirte

On Tue, 17 Jan 2012 17:51:59 +0800
Liu Bo wrote:

> I've kept hitting enospc warnings of global_rsv while running
> defragment on files:
> btrfs: block rsv returned -28
> WARNING: at fs/btrfs/extent-tree.c:5984
> btrfs_alloc_free_block+0x333/0x340 [btrfs]() ...
> 
> I used a fio jobs to create a file with lots of fragments:
> $ filefrag /mnt/btrfs/foobar
> /mnt/btrfs/foobar: 66964 extents found
> 
> and then "btrfs fi defrag /mnt/btrfs/foobar && sync" would pop the
> warnings.
> 
> I found that the global_rsv size is just not enough for defragment,
> and didn't find any space leak in using global_rsv, so double it and
> go ahead.
> 
> Signed-off-by: Liu Bo 
> ---
>  fs/btrfs/extent-tree.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 8603ee4..77ea23c 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -3979,7 +3979,7 @@ static u64 calc_global_metadata_size(struct btrfs_fs_info *fs_info)
>  	num_bytes += div64_u64(data_used + meta_used, 50);
>  
>  	if (num_bytes * 3 > meta_used)
> -		num_bytes = div64_u64(meta_used, 3);
> +		num_bytes = div64_u64(meta_used, 3) * 2;
>  
>  	return ALIGN(num_bytes, fs_info->extent_root->leafsize << 10);
>  }

This patch breaks my system. With this applied all services fail on
boot with "no space left" messages.
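
To make the effect of the one-line change easier to see, the visible part of the hunk can be modelled in Python. This is a rough sketch, not the kernel code: `base` stands for whatever `num_bytes` held before the lines shown in the diff (that part is not visible here), and the default `leafsize` of 4096 as well as the example sizes are assumptions chosen for illustration.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

def global_rsv_size(data_used, meta_used, base=0, leafsize=4096, patched=False):
    """Model of the visible part of calc_global_metadata_size().

    base is a placeholder for the value of num_bytes computed before
    the hunk; it is not shown in the quoted diff.
    """
    num_bytes = base + (data_used + meta_used) // 50
    if num_bytes * 3 > meta_used:
        # Original code caps the estimate at a third of used metadata;
        # the patch doubles that cap.
        num_bytes = meta_used // 3
        if patched:
            num_bytes *= 2
    return align_up(num_bytes, leafsize << 10)

GiB = 1 << 30
MiB = 1 << 20
# Hypothetical filesystem: 100 GiB of data and 4 GiB of metadata in use.
before = global_rsv_size(100 * GiB, 4 * GiB)
after = global_rsv_size(100 * GiB, 4 * GiB, patched=True)
print(before // MiB, "MiB ->", after // MiB, "MiB")
```

In this model the cap only applies when the estimate exceeds a third of the used metadata, which is the case for data-heavy filesystems like the hypothetical one above; there the reserve roughly doubles.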

Re: defrag makes fragmentation worse

2011-06-22 Thread Johannes Hirte

On Friday 10 June 2011 01:53:41 David Sterba wrote:
> On Fri, Jun 10, 2011 at 12:48:36AM +0200, Johannes Hirte wrote:
> > I've observed several times that after a btrfs filesystem defrag a file
> > was way more fragmented than before. For example, a file that was
> > recently written, had 10 extents (output from filefrag). After a defrag
> > filefrag showed more than 1900 extents.
> > For curiosity, a simple copy of this "defragmented" file reduced the
> > number of fragments to 1. With a different file I got 63 extents before
> > and over 3000 extents after defrag.
> 
> Do you have compression enabled? Or autodefrag mount option?

No compression, only autodefrag was enabled this time. The last time before I 
saw this, autodefrag didn't exist. 

> 'filefrag -v' will tell you size of the extents, would be interesting
> to see.

Needed some tries but now I have one. Before defrag the file consisted of 174 
extents. Now after defrag there are 786 extents. filefrag -v shows:

Filesystem type is: 9123683e
File size of test1 is 106857600 (26089 blocks, blocksize 4096)
 ext logical physical expected length flags
   0   0  7037185 299 
   1 299  7989102  7037483 64 
   2 363  7037548  7989165  1 
   3 364  7989538  7037548 64 
   4 428  7037613  7989601  1 
   5 429  7990288  7037613 64 
   6 493  7037678  7990351  1 
   7 494  7992819  7037678 64 
   8 558  7037743  7992882  1 
   9 559  7993037  7037743 64 
  10 623  7037809  7993100  1 
  11 624  7993171  7037809 64 
  12 688  9547471  7993234  1 
  13 689  9547947  9547471 64 
  14 753  9547536  9548010  1 
...
 489   16159  9696920  9696590 64 
 490   16223  9696655  9696983  1 
 491   16224  9700654  9696655 64 
 492   16288  9697127  9700717    276 
 493   16564  9700718  9697402 64 
 494   16628  9697467  9700781  1 
...
 781   25924 12575962 12535459 64 
 782   25988 12535524 12576025  1 
 783   25989 12576026 12535524 64 
 784   26053 12531601 12576089  1 
 785   26054 12541596 12531601 35 eof
test1: 786 extents found
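
The alternating 64-block/1-block pattern is easier to see when the lengths are tallied. A small sketch, assuming the column layout shown above (ext, logical, physical, expected, length, optional flags); `extent_lengths` is a hypothetical helper and only the first few rows are repeated here:

```python
from collections import Counter

# First rows of the `filefrag -v` output above.
ROWS = """\
   0   0  7037185 299
   1 299  7989102  7037483 64
   2 363  7037548  7989165  1
   3 364  7989538  7037548 64
   4 428  7037613  7989601  1
   5 429  7990288  7037613 64
   6 493  7037678  7990351  1
"""

def extent_lengths(text):
    """Return the length column (in blocks) of each extent row."""
    lengths = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # header, "..." or summary line
        if fields[-1] == "eof":
            fields = fields[:-1]  # drop the trailing flag
        lengths.append(int(fields[-1]))
    return lengths

lengths = extent_lengths(ROWS)
print(Counter(lengths).most_common())
print("logical blocks covered:", sum(lengths))
```

The seven rows cover 494 blocks, which matches the logical offset of extent 7 in the full listing.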

> > It's no problem if defrag can't reduce the fragmentation. But in this
> > case it shouldn't be done at all.
> 
> AFAIK defragmentation just reads the file, marks all pages dirty and
> lets it be written  back. If the free space is fragmented, so will be
> the newly written copy. I do not know if there is some logic comparing
> old and new extent layout (or if it's even possible).

If there is some comparison between old and new, it seems to be broken (or the 
fiemap from btrfs).

regards,
  Johannes

how reliable is btrfsck?

2011-06-22 Thread Johannes Hirte

After some problems with btrfs (Oopses), I've now checked the btrfs filesystems 
on my systems as a precaution. On the first system I got

root 256 inode 404596 errors 400
root 256 inode 404603 errors 400
root 256 inode 409540 errors 400
root 256 inode 409562 errors 400
root 258 inode  errors 400
root 258 inode 5556 errors 400
root 258 inode 27624 errors 400
root 258 inode 27659 errors 400
root 258 inode 29243 errors 400
root 258 inode 29255 errors 400
root 258 inode 29256 errors 400
root 258 inode 90400 errors 400
root 258 inode 90401 errors 400
root 258 inode 91155 errors 400
root 258 inode 166025 errors 400
found 71626346496 bytes used err is 1
total csum bytes: 68939428
total tree bytes: 1028964352
total fs tree bytes: 901152768
btree space waste bytes: 274375495
file data blocks allocated: 73391845376
 referenced 70346993664
Btrfs v0.19-50-ge6bd18d

This is a single disk system with three subvolumes in addition to the default.
On a second system with a RAID1 setup with two 500G disks I get:

failed to read /dev/sr0
failed to read /dev/sr0
root 5 inode 1891143 errors 400
root 5 inode 1891166 errors 400
root 5 inode 1915207 errors 400
root 5 inode 1915214 errors 400
root 5 inode 1915531 errors 400
root 5 inode 1915547 errors 400
root 5 inode 1915599 errors 400
root 5 inode 1915750 errors 400
root 5 inode 1915777 errors 400
found 391921790976 bytes used err is 1
total csum bytes: 379086532
total tree bytes: 3737182208
total fs tree bytes: 3013656576
btree space waste bytes: 1024370594
file data blocks allocated: 388184608768
 referenced 388184051712
Btrfs v0.19-36-g28da90f

The output is identical for both disks. A third disk in this system doesn't 
show any errors.

So the question is, is btrfsck misreporting something here? Or are these 
real errors?
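
When comparing reports like the two above, it can help to group the affected inodes per root. The following is a minimal sketch that only parses the `root N inode M errors X` lines in the form quoted here; `errors_by_root` is a hypothetical helper, the sample is an excerpt of the first report, and no interpretation of the error value is attempted.

```python
import re
from collections import defaultdict

# Excerpt of the first btrfsck report above.
REPORT = """\
root 256 inode 404596 errors 400
root 256 inode 404603 errors 400
root 258 inode 5556 errors 400
root 258 inode 27624 errors 400
found 71626346496 bytes used err is 1
"""

PATTERN = re.compile(r"^root (\d+) inode (\d+) errors ([0-9a-fA-F]+)$")

def errors_by_root(text):
    """Map root id -> list of (inode, error value as printed)."""
    result = defaultdict(list)
    for line in text.splitlines():
        m = PATTERN.match(line.strip())
        if m:
            root, inode, code = m.groups()
            result[int(root)].append((int(inode), code))
    return dict(result)

for root, entries in sorted(errors_by_root(REPORT).items()):
    print(f"root {root}: {len(entries)} inode(s) flagged")
```

Lines that do not match the pattern (summary lines, or entries with a missing inode number) are simply skipped.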

regards,
  Johannes

defrag makes fragmentation worse

2011-06-09 Thread Johannes Hirte

I've observed several times that after a btrfs filesystem defrag a file was way 
more fragmented than before. For example, a recently written file had 10 
extents (output from filefrag). After a defrag, filefrag showed more than 1900 
extents. Curiously, a simple copy of this "defragmented" file reduced the 
number of fragments to 1. With a different file I got 63 extents before and 
over 3000 extents after defrag. 
It's no problem if defrag can't reduce the fragmentation. But in this case it 
shouldn't be done at all. 

regards,
  Johannes

Re: Quota Implementation

2011-06-03 Thread Johannes Hirte

On Friday 03 June 2011 18:24:41 Arne Jansen wrote:
> Hi,
> 
> If no one is already working on it, I'd like to take the Quota lock and
> see how far I come.
> Let me sketch out in short what I'm planning to do:
> 
>   - Quota will be subvolume based. Only the FS-trees and data extents
> will be accounted.
>   - Quota Groups can be defined. Every quota group can comprise any
> number of subvolumes. A subvolume can be assigned to any number
> of quota groups.
>   - A Quota Group can account/limit the total amount of space that is
> referenced by it and/or the amount of space that is exclusively
> referenced (i.e. referenced by no other quota group).
>   - With this it is possible to define a hierarchical quota that need
> not necessarily reflect the filesystem hierarchy.
>   - It is also possible to decide for each snapshot if it should be
> accounted into the parent group. So in a scenario where each
> subvolume reflect a user home, it's possible to have some snapshots
> accounted to the user and others not (e.g. the ones needed for system
> backups).
>   - Quota information will be stored in new records, possibly in a
> separate tree.
>   - It should be possible to change the Quota config and group
> assignments online, though this might need a full re-scan of the fs.
>   - It does NOT include any kind of user/group (UID/GID) quota.
> 
> Any addenda or arguments why it's impossible or insane welcome.

What's the benefit of this complexity? Why not a simpler quota/reservation 
per subvolume? The semantics you described can be achieved by user/group 
quotas too. And we need them anyway. Perhaps this can be implemented together, 
reusing the code. Then we have the question of whether user/group quotas are per 
filesystem or per subvolume.
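
To make the proposed semantics concrete, here is a toy model of "referenced" versus "exclusively referenced" space per quota group, as described in the quoted sketch. It is not an implementation of any btrfs interface: the function name, the subvolume and group names, and the extent sizes are all made up for illustration.

```python
def group_usage(groups, subvol_extents, extent_size):
    """Return {group: (referenced_bytes, exclusive_bytes)}.

    groups:         group name -> set of subvolume names
    subvol_extents: subvolume name -> set of extent ids it references
    extent_size:    extent id -> size in bytes
    """
    # Extents referenced by each group (union over its subvolumes).
    referenced_by = {}
    for name, subvols in groups.items():
        extents = set()
        for sv in subvols:
            extents |= subvol_extents[sv]
        referenced_by[name] = extents

    usage = {}
    for name, extents in referenced_by.items():
        # Everything referenced by any other group is not exclusive.
        others = set()
        for other, other_extents in referenced_by.items():
            if other != name:
                others |= other_extents
        exclusive = extents - others
        usage[name] = (
            sum(extent_size[e] for e in extents),
            sum(extent_size[e] for e in exclusive),
        )
    return usage

# Hypothetical layout: a home subvolume, a snapshot of it, and a second home.
subvol_extents = {
    "home_alice": {"e1", "e2"},
    "snap_alice": {"e1", "e3"},
    "home_bob": {"e4"},
}
extent_size = {"e1": 10, "e2": 5, "e3": 2, "e4": 7}
groups = {
    "alice": {"home_alice", "snap_alice"},
    "bob": {"home_bob"},
    "backup": {"snap_alice"},
}
print(group_usage(groups, subvol_extents, extent_size))
```

In this example the snapshot belongs to two groups, so only the extent unique to `home_alice` counts as exclusive for the `alice` group, and the `backup` group has no exclusive space at all.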

regards,
  Johannes

Re: Having parent transid verify failed

2011-06-02 Thread Johannes Hirte

On Thursday 05 May 2011 22:32:42 Chris Mason wrote:
> Excerpts from Konstantinos Skarlatos's message of 2011-05-05 16:27:54 -0400:
> > I think i made some progress. When i tried to remove the directory that
> > i suspect contains the problematic file, i got this on the console
> > 
> > rm -rf serverloft/
> 
> Ok, our one bad block is in the extent allocation tree.  This is going
> to be the very hardest thing to fix.
> 
> Until I finish off the code to rebuild parts of the extent allocation
> tree, I think your best bet is to copy the files off.
> 
> The big question is, what happened to make this error?  Can you describe
> your setup in more detail?
> 
> -chris

It seems that I ran into the same problem:

parent transid verify failed on 32940560384 wanted 210334 found 210342
BUG: scheduling while atomic: chrome/17058/0x0002
Modules linked in: snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device 
snd_pcm_oss snd_mixer_oss fuse dm_crypt dm_mod usbhid snd_intel8x0 
snd_ac97_codec sr_mod cdrom ac97_bus snd_pcm sg snd_timer snd e1000 fschmd 
uhci_hcd snd_page_alloc i2c_i801 [last unloaded: microcode]
Pid: 17058, comm: chrome Tainted: GW   2.6.39 #29
Call Trace:
 [] ? schedule+0x78/0x6ef
 [] ? generic_make_request+0x1d5/0x22f
 [] ? submit_bio+0x98/0x9f
 [] ? btrfs_map_bio+0x1ab/0x1b5
 [] ? io_schedule+0x3f/0x50
 [] ? sleep_on_page+0x5/0x8
 [] ? __wait_on_bit+0x31/0x58
 [] ? __lock_page+0x52/0x52
 [] ? wait_on_page_bit+0x5a/0x62
 [] ? autoremove_wake_function+0x29/0x29
 [] ? read_extent_buffer_pages+0x33a/0x3b5
 [] ? btree_read_extent_buffer_pages.clone.51+0x44/0x9e
 [] ? verify_parent_transid+0x147/0x147
 [] ? read_tree_block+0x2d/0x3e
 [] ? read_block_for_search.clone.36+0xc3/0x35d
 [] ? btrfs_tree_unlock+0x19/0x3a
 [] ? unlock_up+0x88/0x9f
 [] ? btrfs_search_slot+0x39d/0x4fe
 [] ? lookup_inline_extent_backref+0x116/0x49b
 [] ? set_extent_dirty+0x19/0x1d
 [] ? __btrfs_free_extent+0xe2/0x6c6
 [] ? run_clustered_refs+0x6ad/0x720
 [] ? btrfs_find_ref_cluster+0x53/0x11f
 [] ? btrfs_run_delayed_refs+0xb8/0x18d
 [] ? __btrfs_end_transaction+0x5a/0x17f
 [] ? btrfs_end_transaction+0x9/0xb
 [] ? btrfs_evict_inode+0x190/0x1a7
 [] ? evict+0x56/0xeb
 [] ? do_unlinkat+0xc3/0x103
 [] ? sysenter_do_call+0x12/0x26
 [] ? console_conditional_schedule+0x8/0xf
parent transid verify failed on 32940560384 wanted 210334 found 210342
BUG: scheduling while atomic: chrome/17058/0x0002
Modules linked in: snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device 
snd_pcm_oss snd_mixer_oss fuse dm_crypt dm_mod usbhid snd_intel8x0 
snd_ac97_codec sr_mod cdrom ac97_bus snd_pcm sg snd_timer snd e1000 fschmd 
uhci_hcd snd_page_alloc i2c_i801 [last unloaded: microcode]
Pid: 17058, comm: chrome Tainted: GW   2.6.39 #29
Call Trace:
 [] ? schedule+0x78/0x6ef
 [] ? generic_make_request+0x1d5/0x22f
 [] ? submit_bio+0x98/0x9f
 [] ? btrfs_map_bio+0x1ab/0x1b5
 [] ? io_schedule+0x3f/0x50
 [] ? sleep_on_page+0x5/0x8
 [] ? __wait_on_bit+0x31/0x58
 [] ? __lock_page+0x52/0x52
 [] ? wait_on_page_bit+0x5a/0x62
 [] ? autoremove_wake_function+0x29/0x29
 [] ? read_extent_buffer_pages+0x33a/0x3b5
 [] ? lookup_extent_mapping+0x5a/0x148
 [] ? btree_read_extent_buffer_pages.clone.51+0x44/0x9e
 [] ? verify_parent_transid+0x147/0x147
 [] ? read_tree_block+0x2d/0x3e
 [] ? read_block_for_search.clone.36+0xc3/0x35d
 [] ? btrfs_tree_unlock+0x19/0x3a
 [] ? unlock_up+0x88/0x9f
 [] ? btrfs_search_slot+0x39d/0x4fe
 [] ? lookup_inline_extent_backref+0x116/0x49b
 [] ? set_extent_dirty+0x19/0x1d
 [] ? __btrfs_free_extent+0xe2/0x6c6
 [] ? run_clustered_refs+0x6ad/0x720
 [] ? btrfs_find_ref_cluster+0x53/0x11f
 [] ? btrfs_run_delayed_refs+0xb8/0x18d
 [] ? __btrfs_end_transaction+0x5a/0x17f
 [] ? btrfs_end_transaction+0x9/0xb
 [] ? btrfs_evict_inode+0x190/0x1a7
 [] ? evict+0x56/0xeb
 [] ? do_unlinkat+0xc3/0x103
 [] ? sysenter_do_call+0x12/0x26
 [] ? console_conditional_schedule+0x8/0xf
parent transid verify failed on 32940560384 wanted 210334 found 210342
BUG: scheduling while atomic: chrome/17058/0x0002
Modules linked in: snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device 
snd_pcm_oss snd_mixer_oss fuse dm_crypt dm_mod usbhid snd_intel8x0 
snd_ac97_codec sr_mod cdrom ac97_bus snd_pcm sg snd_timer snd e1000 fschmd 
uhci_hcd snd_page_alloc i2c_i801 [last unloaded: microcode]
Pid: 17058, comm: chrome Tainted: GW   2.6.39 #29
Call Trace:
 [] ? schedule+0x78/0x6ef
 [] ? generic_make_request+0x1d5/0x22f
 [] ? submit_bio+0x98/0x9f
 [] ? btrfs_map_bio+0x1ab/0x1b5
 [] ? io_schedule+0x3f/0x50
 [] ? sleep_on_page+0x5/0x8
 [] ? __wait_on_bit+0x31/0x58
 [] ? __lock_page+0x52/0x52
 [] ? wait_on_page_bit+0x5a/0x62
 [] ? autoremove_wake_function+0x29/0x29
 [] ? read_extent_buffer_pages+0x33a/0x3b5
 [] ? lookup_extent_mapping+0x5a/0x148
 [] ? btree_read_extent_buffer_pages.clone.51+0x44/0x9e
 [] ? verify_parent_transid+0x147/0x147
 [] ? read_tree_block+0x2d/0x3e
 [] ? read_block_for_search.clone.36+0xc3/0x35d
 [] ? btrfs_tree_unlock+0x19/0x3a
 [] ? unlock_up+0x8

Re: [PATCH v2 0/7] Btrfs: New inode number allocator

2011-05-31 Thread Johannes Hirte

On Monday 25 April 2011 10:57:47 Li Zefan wrote:
> Currently btrfs stores the highest objectid of the fs tree, and it always
> returns (highest+1) inode number when we create a file, so inode numbers
> won't be reclaimed when we delete files, so we'll run out of inode numbers
> as we keep create/delete files in 32bits machines.
> 
> This patchset aims to fix this, and it works similar to free space caching
> for block groups.
> 
> I've run xfstests, and I also tested it with snapshot, balance etc.
> 
> More testing is appreciated!
> 
> Changelog v2:
> 
> - Rebased against latest btrfs-unstable tree
> - Fixed several small bugs.
> 
> ---
>  fs/btrfs/btrfs_inode.h  |9 +
>  fs/btrfs/compression.c  |5 +-
>  fs/btrfs/ctree.h|   29 +-
>  fs/btrfs/disk-io.c  |   19 +
>  fs/btrfs/export.c   |   25 +-
>  fs/btrfs/extent-tree.c  |   50 ++-
>  fs/btrfs/extent_io.c|4 +-
>  fs/btrfs/file-item.c|5 +-
>  fs/btrfs/file.c |   27 +-
>  fs/btrfs/free-space-cache.c |  968
> ++- fs/btrfs/free-space-cache.h | 
>  48 ++-
>  fs/btrfs/inode-map.c|  428 +++-
>  fs/btrfs/inode-map.h|   13 +
>  fs/btrfs/inode.c|  282 +++--
>  fs/btrfs/ioctl.c|   22 +-
>  fs/btrfs/relocation.c   |   27 +-
>  fs/btrfs/transaction.c  |   13 +-
>  fs/btrfs/tree-log.c |   54 ++--
>  fs/btrfs/xattr.c|8 +-
>  19 files changed, 1402 insertions(+), 634 deletions(-)

This makes my laptop unusable here. With linux-3.0-rc1 I have the problem that 
booting is horribly slow with a lot of I/O (the kernel thread that reads the 
file tree?). I never found out how long it would take to boot the system. After 
10-15 minutes I canceled the boot (sysrq-u, sysrq-b) and went back to a working 
kernel.

git bisect pointed me to:

commit 581bb050941b4f220f84d3e5ed6dace3d42dd382
Author: Li Zefan 
Date:   Wed Apr 20 10:06:11 2011 +0800

Btrfs: Cache free inode numbers in memory

Currently btrfs stores the highest objectid of the fs tree, and it always
returns (highest+1) inode number when we create a file, so inode numbers
won't be reclaimed when we delete files, so we'll run out of inode numbers
as we keep create/delete files in 32bits machines.

This fixes it, and it works similarly to how we cache free space in block
cgroups.

We start a kernel thread to read the file tree. By scanning inode items,
we know which chunks of inode numbers are free, and we cache them in
an rb-tree.

Because we are searching the commit root, we have to carefully handle the
cross-transaction case.

The rb-tree is a hybrid extent+bitmap tree, so if we have too many small
chunks of inode numbers, we'll use bitmaps. Initially we allow 16K ram
of extents, and a bitmap will be used if we exceed this threshold. The
extents threshold is adjusted in runtime.

Signed-off-by: Li Zefan 

I have three subvolumes here, the default one, one for / and one for /home. 
Don't know if this matters. If you need more infos, please tell me.
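
For readers unfamiliar with the idea in the commit message, the scan for free inode numbers can be pictured like this. The sketch uses a plain sorted list instead of the rb-tree and bitmaps the commit describes, and the function name and the number range are assumptions made for the example:

```python
def free_inode_ranges(used, first, last):
    """Return (start, count) ranges of unused inode numbers in [first, last]."""
    ranges = []
    next_free = first
    for ino in sorted(used):
        if ino < first or ino > last:
            continue  # outside the range we care about
        if ino > next_free:
            # Gap between the previous used number and this one.
            ranges.append((next_free, ino - next_free))
        next_free = max(next_free, ino + 1)
    if next_free <= last:
        ranges.append((next_free, last - next_free + 1))
    return ranges

# Hypothetical subvolume with four inodes in use.
print(free_inode_ranges({256, 257, 260, 300}, first=256, last=310))
```

Each gap between used numbers becomes one extent-like entry; a filesystem with many small gaps would produce many such entries, which is the case the commit message says is handled with bitmaps.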

regards,
  Johannes

Re: BUG: unable to handle kernel NULL pointer dereference at (null)

2011-04-07 Thread Johannes Hirte

On Wednesday 06 April 2011 19:15:41 Josef Bacik wrote:
> On Wed, Apr 06, 2011 at 01:10:38PM +0200, Johannes Hirte wrote:
> > On Tuesday 05 April 2011 23:57:53 Josef Bacik wrote:
> > > > Now it hit
> > > 
> > > Man I cannot catch a break.  I hope this is the last one.  Thanks,
> 
> Ok I give up, I just cleaned it all up and don't mark the pages as dirty
> unless we're actually going to succeed at writing them.  This should fix
> everything
> 
> ---
>  fs/btrfs/ctree.h|5 ++
>  fs/btrfs/file.c |   21 +++
>  fs/btrfs/free-space-cache.c |  117
> --- 3 files changed, 69
> insertions(+), 74 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 3458b57..0d00a07 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -2576,6 +2576,11 @@ int btrfs_drop_extents(struct btrfs_trans_handle *trans, struct inode *inode,
>  int btrfs_mark_extent_written(struct btrfs_trans_handle *trans,
>  			      struct inode *inode, u64 start, u64 end);
>  int btrfs_release_file(struct inode *inode, struct file *file);
> +void btrfs_drop_pages(struct page **pages, size_t num_pages);
> +int btrfs_dirty_pages(struct btrfs_root *root, struct inode *inode,
> +   struct page **pages, size_t num_pages,
> +   loff_t pos, size_t write_bytes,
> +   struct extent_state **cached);
> 
>  /* tree-defrag.c */
>  int btrfs_defrag_leaves(struct btrfs_trans_handle *trans,
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index e621ea5..75899a0 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -104,7 +104,7 @@ static noinline int btrfs_copy_from_user(loff_t pos, int num_pages,
>  /*
>   * unlocks pages after btrfs_file_write is done with them
>   */
> -static noinline void btrfs_drop_pages(struct page **pages, size_t num_pages)
> +void btrfs_drop_pages(struct page **pages, size_t num_pages)
>  {
>   size_t i;
>   for (i = 0; i < num_pages; i++) {
> @@ -127,16 +127,13 @@ static noinline void btrfs_drop_pages(struct page **pages, size_t num_pages)
>   * this also makes the decision about creating an inline extent vs
>   * doing real data extents, marking pages dirty and delalloc as required.
>   */
> -static noinline int dirty_and_release_pages(struct btrfs_root *root,
> - struct file *file,
> - struct page **pages,
> - size_t num_pages,
> - loff_t pos,
> - size_t write_bytes)
> +int btrfs_dirty_pages(struct btrfs_root *root, struct inode *inode,
> +   struct page **pages, size_t num_pages,
> +   loff_t pos, size_t write_bytes,
> +   struct extent_state **cached)
>  {
>   int err = 0;
>   int i;
> - struct inode *inode = fdentry(file)->d_inode;
>   u64 num_bytes;
>   u64 start_pos;
>   u64 end_of_last_block;
> @@ -149,7 +146,7 @@ static noinline int dirty_and_release_pages(struct btrfs_root *root,
> 
>   end_of_last_block = start_pos + num_bytes - 1;
>   err = btrfs_set_extent_delalloc(inode, start_pos, end_of_last_block,
> - NULL);
> + cached);
>   if (err)
>   return err;
> 
> @@ -992,9 +989,9 @@ static noinline ssize_t __btrfs_buffered_write(struct
> file *file, }
> 
>   if (copied > 0) {
> - ret = dirty_and_release_pages(root, file, pages,
> -   dirty_pages, pos,
> -   copied);
> + ret = btrfs_dirty_pages(root, inode, pages,
> + dirty_pages, pos, copied,
> + NULL);
>   if (ret) {
>   btrfs_delalloc_release_space(inode,
>   dirty_pages << PAGE_CACHE_SHIFT);
> diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> index f561c95..a3f420d 100644
> --- a/fs/btrfs/free-space-cache.c
> +++ b/fs/btrfs/free-space-cache.c
> @@ -508,6 +508,7 @@ int btrfs_write_out_cache(struct btrfs_root *root,
>   struct inode *inode;
>   struct rb_node *node;
>   struct list_head *pos, *n;
> + struct page **pages;
>   struct page *page;
>   struct extent_state *cached_state = NULL;
>   struct btrfs_free_cluster *cluster = NU

Re: BUG: unable to handle kernel NULL pointer dereference at (null)

2011-04-06 Thread Johannes Hirte

On Wednesday 06 April 2011 22:47:28 Jordan Patterson wrote:
> Hi Josef:
> 
> I tried your latest patch, since I had the same issue from the first
> email.  With the patch applied, I am now hitting the
> BUG_ON(block_group->total_bitmaps >= max_bitmaps); in add_new_bitmap
> in
> fs/btrfs/free-space-cache.c:1246 as soon as I mount the filesystem,
> with or without -o clear_cache.
> 
> It works fine in 2.6.38.  I get the same error after mounting with
> clear_cache under 2.6.38 and rebooting into the current kernel with
> your patch.
> 
> Jordan

What filesystem is it and how did you mount it with -o clear_cache? If it is 
your rootfs, did you add clear_cache to /etc/fstab or to your bootloader? If 
it was the former, it won't work. For the rootfs you need to add it to the boot 
options. For me this worked every time.
Josef, is there any way to detect a wrong cache saved by a pre-2.6.39 kernel 
and discard it?

regards,
  Johannes

Re: BUG: unable to handle kernel NULL pointer dereference at (null)

2011-04-06 Thread Johannes Hirte

On Tuesday 05 April 2011 23:57:53 Josef Bacik wrote:
> > Now it hit
> 
> Man I cannot catch a break.  I hope this is the last one.  Thanks,
> 
> Josef
> 
> ---
>  fs/btrfs/free-space-cache.c |   32 
>  1 files changed, 32 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> index 74bc432..b8052be 100644
> --- a/fs/btrfs/free-space-cache.c
> +++ b/fs/btrfs/free-space-cache.c
> @@ -522,6 +522,7 @@ int btrfs_write_out_cache(struct btrfs_root *root,
>   int bitmaps = 0;
>   int ret = 0;
>   bool next_page = false;
> + bool out_of_space = false;
> 
>   root = root->fs_info->tree_root;
> 
> @@ -629,6 +630,11 @@ int btrfs_write_out_cache(struct btrfs_root *root,
>   offset = start_offset;
>   }
> 
> + if (index > last_index) {
> + out_of_space = true;
> + break;
> + }
> +
>   page = find_get_page(inode->i_mapping, index);
> 
>   addr = kmap(page);
> @@ -732,6 +738,10 @@ int btrfs_write_out_cache(struct btrfs_root *root,
>   struct btrfs_free_space *entry =
>   list_entry(pos, struct btrfs_free_space, list);
> 
> + if (index > last_index) {
> + out_of_space = true;
> + break;
> + }
>   page = find_get_page(inode->i_mapping, index);
> 
>   addr = kmap(page);
> @@ -754,6 +764,28 @@ int btrfs_write_out_cache(struct btrfs_root *root,
>   index++;
>   }
> 
> + if (out_of_space) {
> + page = find_get_page(inode->i_mapping, 0);
> +
> + /*
> +  * Have to do the normal stuff in case writeback gets started on
> +  * this page before we invalidate it.
> +  */
> + ClearPageChecked(page);
> + set_page_extent_mapped(page);
> + SetPageUptodate(page);
> + set_page_dirty(page);
> + unlock_page(page);
> + page_cache_release(page);
> + page_cache_release(page);
> +
> + ret = 0;
> + unlock_extent_cached(&BTRFS_I(inode)->io_tree, 0,
> +  i_size_read(inode) - 1, &cached_state,
> +  GFP_NOFS);
> + goto out_free;
> + }
> +
>   /* Zero out the rest of the pages just to make sure */
>   while (index <= last_index) {
>   void *addr;

Sorry no, it still hits the BUG() in inode.c (line 1565). It takes longer to 
hit than before but is still reproducible.

Re: BUG: unable to handle kernel NULL pointer dereference at (null)

2011-04-05 Thread Johannes Hirte

On Tuesday 05 April 2011 23:12:27 Josef Bacik wrote:
> On Tue, Apr 05, 2011 at 11:08:52PM +0200, Johannes Hirte wrote:
> > On Tuesday 05 April 2011 21:31:43 Josef Bacik wrote:
> > > On Tue, Apr 05, 2011 at 09:21:55PM +0200, Johannes Hirte wrote:
> > > > On Tuesday 05 April 2011 20:53:24 Josef Bacik wrote:
> > > > > On Tue, Apr 05, 2011 at 08:52:21PM +0200, Johannes Hirte wrote:
> > > > > > On Tuesday 05 April 2011 19:42:03 Josef Bacik wrote:
> > > > > > > On Tue, Apr 05, 2011 at 07:38:13PM +0200, Johannes Hirte wrote:
> > > > > > > > With the latest btrfs changes, I got this Oops when doing rm
> > > > > > > > on a large directory:
> > > > > > > > 
> > > > > > > > BUG: unable to handle kernel NULL pointer dereference at  
> > > > > > > > (null) IP: [] kunmap+0x46/0x46
> > > > > > > > *pdpt = 34a85001 *pde = 
> > > > > > > > Oops:  [#1] PREEMPT SMP
> > > > > > > > last sysfs file: /sys/devices/virtual/vtconsole/vtcon1/uevent
> > > > > > > > Modules linked in: snd_seq_oss snd_seq_midi_event snd_seq
> > > > > > > > snd_seq_device snd_pcm_oss snd_mixer_oss fuse dm_crypt dm_mod
> > > > > > > > usbhid snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_timer
> > > > > > > > sr_mod cdrom sg snd fschmd e1000 uhci_hcd snd_page_alloc
> > > > > > > > i2c_i801 [last unloaded: microcode]
> > > > > > > > 
> > > > > > > > Pid: 1156, comm: btrfs-transacti Tainted: GW
> > > > > > > > 2.6.39-rc1-00262- gc53813f #20 FUJITSU SIEMENS SCENIC P /
> > > > > > > > SCENICO P/D1561
> > > > > > > > EIP: 0060:[] EFLAGS: 00010296 CPU: 1
> > > > > > > > EIP is at kmap+0x0/0x38
> > > > > > > > EAX:  EBX:  ECX:  EDX: 0010
> > > > > > > > ESI: f5bc6400 EDI: f3c75520 EBP: f3c755f0 ESP: f58f9e10
> > > > > > > > 
> > > > > > > >  DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
> > > > > > > > 
> > > > > > > > Process btrfs-transacti (pid: 1156, ti=f58f8000 task=f6516f40
> > > > > > > > task.ti=f58f8000)
> > > > > > > > 
> > > > > > > > Stack:
> > > > > > > >  c1186d15 ffc22000 f58f9ec0 0010 f3c75610 
> > > > > > > >  f5885780 f52339e8 0009 f5bc6400 0001 
> > > > > > > >  f6415800 f3c75638 08bb f5bc63c0 f58857b4 f60b68a0
> > > > > > > >  0040 f52338e8 ffc22000  0008 0010
> > > > > > > > 
> > > > > > > > Call Trace:
> > > > > > > >  [] ? btrfs_write_out_cache+0x60c/0xa3c
> > > > > > > >  [] ? btrfs_write_dirty_block_groups+0x400/0x494
> > > > > > > >  [] ? commit_cowonly_roots+0xa9/0x180
> > > > > > > >  [] ? btrfs_commit_transaction+0x2ee/0x59c
> > > > > > > >  [] ? wake_up_bit+0x16/0x16
> > > > > > > >  [] ? transaction_kthread+0x149/0x1d6
> > > > > > > >  [] ? complete+0x28/0x36
> > > > > > > >  [] ? btrfs_congested_fn+0x5d/0x5d
> > > > > > > >  [] ? kthread+0x63/0x68
> > > > > > > >  [] ? kthread_worker_fn+0xeb/0xeb
> > > > > > > >  [] ? kernel_thread_helper+0x6/0xd
> > > > > > > > 
> > > > > > > > Code: 8d 8a 00 e4 54 c1 2b 8a 8c e7 54 c1 81 f9 00 08 00 00
> > > > > > > > 74 11 81 f9 00 0c 00 00 75 0e 83 3d 10 2f 60 c1 02 75 05 e9
> > > > > > > > 5e a3 04 00 c3 <8b> 10 c1 ea 1e c1 e2 0a 8d 8a 00 e4 54 c1
> > > > > > > > 2b 8a 8c e7 54 c1 81 EIP: [] kmap+0x0/0x38 SS:ESP
> > > > > > > > 0068:f58f9e10 CR2: 
> > > > > > > > ---[ end trace c8511126ee91dfdf ]---
> > > > > > > > 
> > > > > > > > This is the second Oops. On the first one I wasn't able to
> > > > > > > > catch the backtrace, but IIRC the bug happened on kmap not
> > > > > > > > kunmap the first time.
> > > > > > > 
> >

Re: BUG: unable to handle kernel NULL pointer dereference at (null)

2011-04-05 Thread Johannes Hirte

On Tuesday 05 April 2011 20:53:24 Josef Bacik wrote:
> On Tue, Apr 05, 2011 at 08:52:21PM +0200, Johannes Hirte wrote:
> > On Tuesday 05 April 2011 19:42:03 Josef Bacik wrote:
> > > On Tue, Apr 05, 2011 at 07:38:13PM +0200, Johannes Hirte wrote:
> > > > With the latest btrfs changes, I got this Oops when doing rm on a
> > > > large directory:
> > > > 
> > > > BUG: unable to handle kernel NULL pointer dereference at   (null)
> > > > IP: [] kunmap+0x46/0x46
> > > > *pdpt = 34a85001 *pde = 
> > > > Oops:  [#1] PREEMPT SMP
> > > > last sysfs file: /sys/devices/virtual/vtconsole/vtcon1/uevent
> > > > Modules linked in: snd_seq_oss snd_seq_midi_event snd_seq
> > > > snd_seq_device snd_pcm_oss snd_mixer_oss fuse dm_crypt dm_mod usbhid
> > > > snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_timer sr_mod cdrom
> > > > sg snd fschmd e1000 uhci_hcd snd_page_alloc i2c_i801 [last unloaded:
> > > > microcode]
> > > > 
> > > > Pid: 1156, comm: btrfs-transacti Tainted: GW  
> > > > 2.6.39-rc1-00262-gc53813f #20 FUJITSU SIEMENS SCENIC P / SCENICO
> > > > P/D1561
> > > > EIP: 0060:[] EFLAGS: 00010296 CPU: 1
> > > > EIP is at kmap+0x0/0x38
> > > > EAX:  EBX:  ECX:  EDX: 0010
> > > > ESI: f5bc6400 EDI: f3c75520 EBP: f3c755f0 ESP: f58f9e10
> > > > 
> > > >  DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
> > > > 
> > > > Process btrfs-transacti (pid: 1156, ti=f58f8000 task=f6516f40
> > > > task.ti=f58f8000)
> > > > 
> > > > Stack:
> > > >  c1186d15 ffc22000 f58f9ec0 0010 f3c75610  f5885780
> > > >  f52339e8 0009 f5bc6400 0001  f6415800 f3c75638
> > > >  08bb f5bc63c0 f58857b4 f60b68a0 0040 f52338e8 ffc22000
> > > >   0008 0010
> > > > 
> > > > Call Trace:
> > > >  [] ? btrfs_write_out_cache+0x60c/0xa3c
> > > >  [] ? btrfs_write_dirty_block_groups+0x400/0x494
> > > >  [] ? commit_cowonly_roots+0xa9/0x180
> > > >  [] ? btrfs_commit_transaction+0x2ee/0x59c
> > > >  [] ? wake_up_bit+0x16/0x16
> > > >  [] ? transaction_kthread+0x149/0x1d6
> > > >  [] ? complete+0x28/0x36
> > > >  [] ? btrfs_congested_fn+0x5d/0x5d
> > > >  [] ? kthread+0x63/0x68
> > > >  [] ? kthread_worker_fn+0xeb/0xeb
> > > >  [] ? kernel_thread_helper+0x6/0xd
> > > > 
> > > > Code: 8d 8a 00 e4 54 c1 2b 8a 8c e7 54 c1 81 f9 00 08 00 00 74 11 81
> > > > f9 00 0c 00 00 75 0e 83 3d 10 2f 60 c1 02 75 05 e9 5e a3 04 00 c3
> > > > <8b> 10 c1 ea 1e c1 e2 0a 8d 8a 00 e4 54 c1 2b 8a 8c e7 54 c1 81
> > > > EIP: [] kmap+0x0/0x38 SS:ESP 0068:f58f9e10
> > > > CR2: 
> > > > ---[ end trace c8511126ee91dfdf ]---
> > > > 
> > > > This is the second Oops. On the first one I wasn't able to catch the
> > > > backtrace, but IIRC the bug happened on kmap not kunmap the first
> > > > time.
> > > 
> > > Yeah I think I know what this is but I need somebody to verify it for
> > > me. Can you run with this patch and let me know what happens?  Thanks,
> > > 
> > > Josef
> > > 
> > > diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> > > index 74bc432..5e6f4b3 100644
> > > --- a/fs/btrfs/free-space-cache.c
> > > +++ b/fs/btrfs/free-space-cache.c
> > > @@ -624,6 +624,7 @@ int btrfs_write_out_cache(struct btrfs_root *root,
> > > 
> > >   next_page = false;
> > > 
> > > + BUG_ON(index > last_index);
> > > 
> > >   if (index == 0) {
> > >   
> > >   start_offset = first_page_offset;
> > >   offset = start_offset;
> > > 
> > > @@ -732,6 +733,7 @@ int btrfs_write_out_cache(struct btrfs_root *root,
> > > 
> > >   struct btrfs_free_space *entry =
> > >   
> > >   list_entry(pos, struct btrfs_free_space, list);
> > > 
> > > + BUG_ON(index > last_index);
> > > 
> > >   page = find_get_page(inode->i_mapping, index);
> > >   
> > >   addr = kmap(page);
> > 
> > Hm, I tried but now I hit the
> > BUG_ON(block_group->total_bitmaps >= max_bitmaps); in add_new_bitmap in
> > fs/btrfs/free-space-cache.c:1255 when booting the system.
> 
> Can you mount -o clear_cache to make sure it's not the cache that's causing
> that? Thanks,
> 
> Josef

Mounting with clear_cache under 2.6.38 helped. I was able to boot and test
with your patch and hit the second BUG_ON at free-space-cache.c:738.

regards,
  Johannes

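A note on the two BUG_ON hits reported above. The trace faults at kmap+0x0
called from btrfs_write_out_cache(), which fits kmap() being handed a NULL
page, and find_get_page() does return NULL when no page is cached at the
given index. Josef's debug patch checks index > last_index just before that
lookup. What follows is only a user-space sketch of that pattern, not kernel
code: NUM_PAGES, lookup_page() and write_entry() are made-up names, and the
-1 return stands in for BUG_ON.

```c
#include <stddef.h>

#define NUM_PAGES 4 /* hypothetical: pages set aside for the cache file */

struct page { char data[16]; };
static struct page pages[NUM_PAGES];

/* Stand-in for find_get_page(): NULL when no page exists at this index. */
static struct page *lookup_page(unsigned long index)
{
	return index < NUM_PAGES ? &pages[index] : NULL;
}

/*
 * Stand-in for one loop iteration in btrfs_write_out_cache().
 * Returns 0 on success, -1 where the debug patch would hit BUG_ON.
 */
static int write_entry(unsigned long index, unsigned long last_index)
{
	struct page *page;

	if (index > last_index)     /* the check added by the debug patch */
		return -1;

	page = lookup_page(index);  /* NULL only if index is out of range */
	page->data[0] = 1;          /* kmap(page) + write in the real code */
	return 0;
}
```

With last_index = 3, write_entry(4, 3) is rejected before the lookup. Without
the check, lookup_page(4) would return NULL and the store would fault, much
like kmap(NULL) in the trace.
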
Re: BUG: unable to handle kernel NULL pointer dereference at (null)

2011-04-05 Thread Johannes Hirte

On Tuesday 05 April 2011 19:42:03 Josef Bacik wrote:
> On Tue, Apr 05, 2011 at 07:38:13PM +0200, Johannes Hirte wrote:
> > With the latest btrfs changes, I got this Oops when doing rm on a large
> > directory:
> > 
> > BUG: unable to handle kernel NULL pointer dereference at   (null)
> > IP: [] kunmap+0x46/0x46
> > *pdpt = 34a85001 *pde = 
> > Oops:  [#1] PREEMPT SMP
> > last sysfs file: /sys/devices/virtual/vtconsole/vtcon1/uevent
> > Modules linked in: snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device
> > snd_pcm_oss snd_mixer_oss fuse dm_crypt dm_mod usbhid snd_intel8x0
> > snd_ac97_codec ac97_bus snd_pcm snd_timer sr_mod cdrom sg snd fschmd
> > e1000 uhci_hcd snd_page_alloc i2c_i801 [last unloaded: microcode]
> > 
> > Pid: 1156, comm: btrfs-transacti Tainted: GW   2.6.39-rc1-00262-gc53813f
> > #20 FUJITSU SIEMENS SCENIC P / SCENICO P/D1561
> > EIP: 0060:[] EFLAGS: 00010296 CPU: 1
> > EIP is at kmap+0x0/0x38
> > EAX:  EBX:  ECX:  EDX: 0010
> > ESI: f5bc6400 EDI: f3c75520 EBP: f3c755f0 ESP: f58f9e10
> > 
> >  DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
> > 
> > Process btrfs-transacti (pid: 1156, ti=f58f8000 task=f6516f40
> > task.ti=f58f8000)
> > 
> > Stack:
> >  c1186d15 ffc22000 f58f9ec0 0010 f3c75610  f5885780 f52339e8
> >  0009 f5bc6400 0001  f6415800 f3c75638 08bb f5bc63c0
> >  f58857b4 f60b68a0 0040 f52338e8 ffc22000  0008 0010
> > 
> > Call Trace:
> >  [] ? btrfs_write_out_cache+0x60c/0xa3c
> >  [] ? btrfs_write_dirty_block_groups+0x400/0x494
> >  [] ? commit_cowonly_roots+0xa9/0x180
> >  [] ? btrfs_commit_transaction+0x2ee/0x59c
> >  [] ? wake_up_bit+0x16/0x16
> >  [] ? transaction_kthread+0x149/0x1d6
> >  [] ? complete+0x28/0x36
> >  [] ? btrfs_congested_fn+0x5d/0x5d
> >  [] ? kthread+0x63/0x68
> >  [] ? kthread_worker_fn+0xeb/0xeb
> >  [] ? kernel_thread_helper+0x6/0xd
> > 
> > Code: 8d 8a 00 e4 54 c1 2b 8a 8c e7 54 c1 81 f9 00 08 00 00 74 11 81 f9
> > 00 0c 00 00 75 0e 83 3d 10 2f 60 c1 02 75 05 e9 5e a3 04 00 c3 <8b> 10
> > c1 ea 1e c1 e2 0a 8d 8a 00 e4 54 c1 2b 8a 8c e7 54 c1 81
> > EIP: [] kmap+0x0/0x38 SS:ESP 0068:f58f9e10
> > CR2: 
> > ---[ end trace c8511126ee91dfdf ]---
> > 
> > This is the second Oops. On the first one I wasn't able to catch the
> > backtrace, but IIRC the bug happened on kmap not kunmap the first time.
> 
> Yeah I think I know what this is but I need somebody to verify it for me. 
> Can you run with this patch and let me know what happens?  Thanks,
> 
> Josef
> 
> diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> index 74bc432..5e6f4b3 100644
> --- a/fs/btrfs/free-space-cache.c
> +++ b/fs/btrfs/free-space-cache.c
> @@ -624,6 +624,7 @@ int btrfs_write_out_cache(struct btrfs_root *root,
> 
>   next_page = false;
> 
> + BUG_ON(index > last_index);
>   if (index == 0) {
>   start_offset = first_page_offset;
>   offset = start_offset;
> @@ -732,6 +733,7 @@ int btrfs_write_out_cache(struct btrfs_root *root,
>   struct btrfs_free_space *entry =
>   list_entry(pos, struct btrfs_free_space, list);
> 
> + BUG_ON(index > last_index);
>   page = find_get_page(inode->i_mapping, index);
> 
>   addr = kmap(page);

Hm, I tried but now I hit the 
BUG_ON(block_group->total_bitmaps >= max_bitmaps); in add_new_bitmap in
fs/btrfs/free-space-cache.c:1255 when booting the system.

BUG: unable to handle kernel NULL pointer dereference at (null)

2011-04-05 Thread Johannes Hirte

With the latest btrfs changes, I got this Oops when doing rm on a large 
directory:

BUG: unable to handle kernel NULL pointer dereference at   (null)
IP: [] kunmap+0x46/0x46
*pdpt = 34a85001 *pde =  
Oops:  [#1] PREEMPT SMP 
last sysfs file: /sys/devices/virtual/vtconsole/vtcon1/uevent
Modules linked in: snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device 
snd_pcm_oss snd_mixer_oss fuse dm_crypt dm_mod usbhid snd_intel8x0 
snd_ac97_codec ac97_bus snd_pcm snd_timer sr_mod cdrom sg snd fschmd e1000 
uhci_hcd snd_page_alloc i2c_i801 [last unloaded: microcode]

Pid: 1156, comm: btrfs-transacti Tainted: GW   2.6.39-rc1-00262-gc53813f #20 FUJITSU SIEMENS SCENIC P / SCENICO P/D1561
EIP: 0060:[] EFLAGS: 00010296 CPU: 1
EIP is at kmap+0x0/0x38
EAX:  EBX:  ECX:  EDX: 0010
ESI: f5bc6400 EDI: f3c75520 EBP: f3c755f0 ESP: f58f9e10
 DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
Process btrfs-transacti (pid: 1156, ti=f58f8000 task=f6516f40 task.ti=f58f8000)
Stack:
 c1186d15 ffc22000 f58f9ec0 0010 f3c75610  f5885780 f52339e8
 0009 f5bc6400 0001  f6415800 f3c75638 08bb f5bc63c0
 f58857b4 f60b68a0 0040 f52338e8 ffc22000  0008 0010
Call Trace:
 [] ? btrfs_write_out_cache+0x60c/0xa3c
 [] ? btrfs_write_dirty_block_groups+0x400/0x494
 [] ? commit_cowonly_roots+0xa9/0x180
 [] ? btrfs_commit_transaction+0x2ee/0x59c
 [] ? wake_up_bit+0x16/0x16
 [] ? transaction_kthread+0x149/0x1d6
 [] ? complete+0x28/0x36
 [] ? btrfs_congested_fn+0x5d/0x5d
 [] ? kthread+0x63/0x68
 [] ? kthread_worker_fn+0xeb/0xeb
 [] ? kernel_thread_helper+0x6/0xd
Code: 8d 8a 00 e4 54 c1 2b 8a 8c e7 54 c1 81 f9 00 08 00 00 74 11 81 f9 00 0c 
00 00 75 0e 83 3d 10 2f 60 c1 02 75 05 e9 5e a3 04 00 c3 <8b> 10 c1 ea 1e c1 
e2 0a 8d 8a 00 e4 54 c1 2b 8a 8c e7 54 c1 81 
EIP: [] kmap+0x0/0x38 SS:ESP 0068:f58f9e10
CR2: 
---[ end trace c8511126ee91dfdf ]---

This is the second Oops. On the first one I wasn't able to catch the backtrace,
but IIRC the bug happened on kmap, not kunmap, the first time.

regards,
  Johannes

Re: [PATCH] btrfs file write debugging patch

2011-03-07 Thread Johannes Hirte

On Monday 07 March 2011 20:56:50 Maria Wikström wrote:
> On Mon, 2011-03-07 at 00:07 -0600, Mitch Harder wrote:
> > On Sun, Mar 6, 2011 at 6:58 PM, Chris Mason wrote:
> > > Excerpts from Chris Mason's message of 2011-03-06 13:00:27 -0500:
> > >> Excerpts from Mitch Harder's message of 2011-03-05 11:50:14 -0500:
> > >> > I've constructed a test patch that is currently addressing all the
> > >> > issues on my system.
> > >> > 
> > >> > The portion of Openmotif that was having issues with page faults
> > >> > works correctly with this patch, and gcc-4.4.5 builds without
> > >> > issue.
> > >> > 
> > >> > I extracted only the portion of the first patch that corrects the
> > >> > handling of dirty_pages when copied==0, and incorporated the second
> > >> > patch that falls back to one-page-at-a-time if there are troubles
> > >> > with page faults.
> > >> 
> > >> Just to make sure I understand, could you please post the full
> > >> combined path that was giving you trouble with gcc?  We do need to
> > >> make sure the pages are properly up to date if we fall back to
> > >> partial writes.
> > > 
> > > Ok, I was able to reproduce this easily with fsx.  The problem is that
> > > I wasn't making sure the last partial page in the write was up to date
> > > when it was also the first page in the write.
> > 
> > > Here is the updated patch, it has all the fixes we've found so far:
> > This latest patch that Chris has sent out fixes the issues I've been
> > encountering.
> > 
> > I can build gcc-4.4.5 without problems.
> > 
> > Also, the portion of Openmotif that was having issues with page faults
> > is working correctly.
> > 
> > Let me know if you still would like to see the path names for the
> > portions of the gcc-4.4.5 build that were giving me issues.  I didn't
> > save that information, but I can regenerate it.  But it sounds like
> > it's irrelevant now.
> 
> With the patch I can compile libgcrypt without any problem, so it solves
> my problems too.

Can confirm this. And the bug seems to be hardware-related. On my Pentium4 
system it was 100% reproducible, on my Atom-based system I couldn't trigger 
it.

Re: [PATCH] btrfs file write debugging patch

2011-02-28 Thread Johannes Hirte

On Monday 28 February 2011 02:46:05 Chris Mason wrote:
> Excerpts from Mitch Harder's message of 2011-02-25 13:43:37 -0500:
> > Some clarification on my previous message...
> > 
> > After looking at my ftrace log more closely, I can see where Btrfs is
> > trying to release the allocated pages.  However, the calculation for
> > the number of dirty_pages is equal to 1 when "copied == 0".
> > 
> > So I'm seeing at least two problems:
> > (1)  It keeps looping when "copied == 0".
> > (2)  One dirty page is not being released on every loop even though
> > "copied == 0" (at least this problem keeps it from being an infinite
> > loop by eventually exhausting reserveable space on the disk).
> 
> Hi everyone,
> 
> There are actually two bugs here.  First the one that Mitch hit, and a
> second one that still results in bad file_write results with my
> debugging hunks (the first two hunks below) in place.
> 
> My patch fixes Mitch's bug by checking for copied == 0 after
> btrfs_copy_from_user and doing the correct delalloc accounting.  This
> one looks solved, but you'll notice the patch is bigger.
> 
> First, I add some random failures to btrfs_copy_from_user() by failing
> every once in a while.  This was much more reliable than trying to
> use memory pressure to make copy_from_user fail.
> 
> If copy_from_user fails and we partially update a page, we end up with a
> page that may go away due to memory pressure.  But, btrfs_file_write
> assumes that only the first and last page may have good data that needs
> to be read off the disk.
> 
> This patch ditches that code and puts it into prepare_pages instead.
> But I'm still having some errors during long stress.sh runs.  Ideas are
> more than welcome, hopefully some other timezones will kick in ideas
> while I sleep.

It doesn't fix the emerge problem for me, at least. The behavior is now the
same as with 2.6.38-rc3: running 'emerge --oneshot dev-libs/libgcrypt' with no
further interaction is enough to make the emerge process hang, with an svn
process consuming 100% CPU. I can cancel the emerge process with Ctrl-C, but
the spawned svn process stays, and it takes a reboot to get rid of it.

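A note on Mitch's point (2) above, that dirty_pages comes out as 1 even when
"copied == 0". A round-up over the copied bytes alone gives 0 pages for 0
bytes, so a result of 1 suggests the calculation also counts the offset of
the write into its first page. That second form is an assumption made here
for illustration, not code quoted from any patch in this thread; the names
and the 4 KiB page size are likewise made up for the sketch.

```c
#include <stddef.h>

/* Assumed 4 KiB pages, as on the 32-bit x86 machines in this thread. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/* Round-up that depends only on the number of bytes copied. */
static size_t dirty_pages_from_copied(size_t copied)
{
	return (copied + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/* Hypothetical variant that also counts the offset into the first page. */
static size_t dirty_pages_with_offset(size_t offset, size_t copied)
{
	return (offset + copied + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}
```

dirty_pages_from_copied(0) is 0, while dirty_pages_with_offset(100, 0) is 1:
with a non-zero in-page offset, one page is still counted as dirty although
nothing was copied into it.
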
Re: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2011-02-23 Thread Johannes Hirte

On Wednesday 23 February 2011 22:56:27 Chris Mason wrote:
> Excerpts from Zhong, Xin's message of 2011-02-23 02:27:05 -0500:
> > In the dmesg of rc4, I can see svn hang in shrink_delalloc and there are
> > two flush-btrfs threads hanging there too.
> > 
> > Josef, it seems you are the expert in this area. Could you take a quick
> > look? Thanks!
> 
> Ok, it does look like the flush-btrfs threads are busy trying to flush
> things.
> 
> Could you please do a btrfs-show and a btrfs fi df /xxx (where xxx is
> your mount point) and send the results here?
> 
> -chris

failed to read /dev/sr0
Label: none  uuid: 00eab15f-c4cf-4403-a529-9bc11fa50167
Total devices 1 FS bytes used 47.72GB
devid    1 size 65.69GB used 65.69GB path /dev/sda2

Label: none  uuid: c6f4e6e6-c4ba-4394-9e9c-bbc3d0b32793
Total devices 1 FS bytes used 9.48GB
devid    1 size 20.01GB used 20.01GB path /dev/sda1

Btrfs v0.19-35-g1b444cd-dirty

and btrfs fi df on

/

Data: total=15.49GB, used=8.35GB
System, DUP: total=8.00MB, used=12.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=2.25GB, used=1.13GB

/home

Data: total=63.42GB, used=47.47GB
System: total=4.00MB, used=16.00KB
Metadata: total=2.27GB, used=251.34MB

The bug is reproducible on both filesystems.

regards,
  Johannes

Re: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2011-02-01 Thread Johannes Hirte

On Friday 28 January 2011 04:53:24 Zhong, Xin wrote:
> Could you describe the steps to recreate it?
> It will be a great help for me to look further. Thanks!

It's a little strange. I have two systems with btrfs, both Gentoo-based. One
is affected by this bug, the other is not. On the affected system it is enough
to run 'emerge dev-libs/libgcrypt', which should normally compile and install
libgcrypt. The emerge command is part of Portage, the package manager of
Gentoo.
The strace output looks similar to the one from Maria:

open("/home/tmp/portage/dev-libs/libgcrypt-1.4.6/.ipc_in", O_RDONLY|
O_NONBLOCK|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFIFO|0770, st_size=0, ...}) = 0
ioctl(3, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 
0xbff5f678) = -1 EINVAL (Invalid argument)
open("/dev/ptmx", O_RDWR)   = 5
ioctl(5, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 
{B38400 opost isig icanon echo ...}) = 0
ioctl(5, TIOCGPTN, [2]) = 0
stat64("/dev/pts/2", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
getuid32()  = 0
ioctl(5, TIOCSPTLCK, [0])   = 0
ioctl(5, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 
{B38400 opost isig icanon echo ...}) = 0
ioctl(5, TIOCGPTN, [2]) = 0
stat64("/dev/pts/2", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
open("/dev/pts/2", O_RDWR|O_NOCTTY) = 6
ioctl(6, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 
{B38400 opost isig icanon echo ...}) = 0
ioctl(6, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 
{B38400 opost isig icanon echo ...}) = 0
ioctl(6, SNDCTL_TMR_START or SNDRV_TIMER_IOCTL_TREAD or TCSETS, {B38400 -opost 
isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 
{B38400 opost isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 
{B38400 opost isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 
{B38400 opost isig icanon echo ...}) = 0
stat64("/root/.terminfo", 0xbff5e790)   = -1 ENOENT (No such file or directory)
stat64("/etc/terminfo", {st_mode=S_IFDIR|0755, st_size=14, ...}) = 0
access("/etc/terminfo/x/xterm", R_OK)   = 0
open("/etc/terminfo/x/xterm", O_RDONLY|O_LARGEFILE) = 7
read(7, "\32\0010\0&\0\17\0\235\1l\5xterm|xterm terminal"..., 4097) = 3258
close(7)= 0
ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 
{B38400 opost isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 
{B38400 opost isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 
{B38400 opost isig icanon echo ...}) = 0
ioctl(1, TIOCGWINSZ, {ws_row=40, ws_col=207, ws_xpixel=0, ws_ypixel=0}) = 0
access("/usr/local/sbin/stty", X_OK)= -1 ENOENT (No such file or directory)
access("/usr/local/bin/stty", X_OK) = -1 ENOENT (No such file or directory)
access("/usr/sbin/stty", X_OK)  = -1 ENOENT (No such file or directory)
access("/usr/bin/stty", X_OK)   = -1 ENOENT (No such file or directory)
access("/sbin/stty", X_OK)  = -1 ENOENT (No such file or directory)
access("/bin/stty", X_OK)   = 0
stat64("/bin/stty", {st_mode=S_IFREG|0755, st_size=58836, ...}) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0xb753d728) = 2752
waitpid(2752, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 2752
--- SIGCHLD (Child exited) @ 0 (0) ---
fcntl64(5, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(5, F_SETFL, O_RDWR|O_NONBLOCK)  = 0
fstat64(5, {st_mode=S_IFCHR|0666, st_rdev=makedev(5, 2), ...}) = 0
ioctl(5, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 
{B38400 -opost isig icanon echo ...}) = 0
open("/home/tmp/portage/dev-libs/libgcrypt-1.4.6/temp/build.log", O_WRONLY|
O_CREAT|O_APPEND|O_LARGEFILE, 0666) = 7
fstat64(7, {st_mode=S_IFREG|0660, st_size=480, ...}) = 0
_llseek(7, 0, [480], SEEK_END)  = 0
ioctl(7, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 
0xbff5fad8) = -1 ENOTTY (Inappropriate ioctl for device)
fstat64(7, {st_mode=S_IFREG|0660, st_size=480, ...}) = 0
_llseek(7, 0, [480], SEEK_CUR)  = 0
stat64("/home/tmp/portage/dev-libs/libgcrypt-1.4.6/temp/build.log", 
{st_mode=S_IFREG|0660, st_size=480, ...}) = 0
dup(1)  = 8
fstat64(8, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
ioctl(8, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 
{B38400 opost isig icanon echo ...}) = 0
fstat64(8, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
_llseek(8, 0, 0xbff5f820, SEEK_CUR) = -1 ESPIPE (Illegal seek)
stat64("/home/tmp/portage/dev-libs/libgcrypt-1.4.6/temp/environment", 
{st_mode=S_IFREG|0664, st_size=106597, ...}) = 0
clone(child

Re: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2011-01-27 Thread Johannes Hirte

On Friday 28 January 2011 02:26:43 Zhong, Xin wrote:
> Please try the fix in below link:
> http://www.spinics.net/lists/linux-btrfs/msg08051.html
> 
> Thanks!

This doesn't fix it for me, but at least there is a difference. Before, the
svn process started consuming 100% CPU without any further interaction; now
the system just hangs. The svn process starts eating the CPU when I cancel the
emerge via Ctrl-C. Additionally, I now see a flush-btrfs task consuming CPU
time.

regards,
  Johannes

Re: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page

2011-01-27 Thread Johannes Hirte

On Thursday 09 December 2010 10:30:14 Zhong, Xin wrote:
> This problem is found in meego testing:
> http://bugs.meego.com/show_bug.cgi?id=6672
> A file in btrfs is mmaped and the mmaped buffer is passed to pwrite to
> write to the same page of the same file. In btrfs_file_aio_write(), the
> pages are locked by prepare_pages(). So when btrfs_copy_from_user() is
> called, a page fault happens and the same page needs to be locked again in
> filemap_fault(). The fix is to move iov_iter_fault_in_readable() before
> prepare_pages() to make the page fault happen before the pages are locked,
> and also to disable page faults in the critical region in
> btrfs_copy_from_user().
> 
> Reviewed-by: Yan, Zheng
> Signed-off-by: Zhong, Xin 
> ---
>  fs/btrfs/file.c |   92
>  1 files changed, 60 insertions(+), 32 deletions(-)
> 
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index c1faded..66836d8 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -48,30 +48,34 @@ static noinline int btrfs_copy_from_user(loff_t pos,
> int num_pages, struct page **prepared_pages,
>struct iov_iter *i)
>  {
> - size_t copied;
> + size_t copied = 0;
>   int pg = 0;
>   int offset = pos & (PAGE_CACHE_SIZE - 1);
> + int total_copied = 0;
> 
>   while (write_bytes > 0) {
>   size_t count = min_t(size_t,
>PAGE_CACHE_SIZE - offset, write_bytes);
>   struct page *page = prepared_pages[pg];
> -again:
> - if (unlikely(iov_iter_fault_in_readable(i, count)))
> - return -EFAULT;
> -
> - /* Copy data from userspace to the current page */
> - copied = iov_iter_copy_from_user(page, i, offset, count);
> + /*
> +  * Copy data from userspace to the current page
> +  *
> +  * Disable pagefault to avoid recursive lock since
> +  * the pages are already locked
> +  */
> + pagefault_disable();
> + copied = iov_iter_copy_from_user_atomic(page, i, offset, count);
> + pagefault_enable();
> 
>   /* Flush processor's dcache for this page */
>   flush_dcache_page(page);
>   iov_iter_advance(i, copied);
>   write_bytes -= copied;
> + total_copied += copied;
> 
> + /* Return to btrfs_file_aio_write to fault page */
>   if (unlikely(copied == 0)) {
> - count = min_t(size_t, PAGE_CACHE_SIZE - offset,
> -   iov_iter_single_seg_count(i));
> - goto again;
> + break;
>   }
> 
>   if (unlikely(copied < PAGE_CACHE_SIZE - offset)) {
> @@ -81,7 +85,7 @@ again:
>   offset = 0;
>   }
>   }
> - return 0;
> + return total_copied;
>  }
> 
>  /*
> @@ -854,6 +858,8 @@ static ssize_t btrfs_file_aio_write(struct kiocb *iocb,
>   unsigned long last_index;
>   int will_write;
>   int buffered = 0;
> + int copied = 0;
> + int dirty_pages = 0;
> 
>   will_write = ((file->f_flags & O_DSYNC) || IS_SYNC(inode) ||
> (file->f_flags & O_DIRECT));
> @@ -970,7 +976,17 @@ static ssize_t btrfs_file_aio_write(struct kiocb
> *iocb, WARN_ON(num_pages > nrptrs);
>   memset(pages, 0, sizeof(struct page *) * nrptrs);
> 
> - ret = btrfs_delalloc_reserve_space(inode, write_bytes);
> + /*
> +  * Fault pages before locking them in prepare_pages
> +  * to avoid recursive lock
> +  */
> + if (unlikely(iov_iter_fault_in_readable(&i, write_bytes))) {
> + ret = -EFAULT;
> + goto out;
> + }
> +
> + ret = btrfs_delalloc_reserve_space(inode,
> + num_pages << PAGE_CACHE_SHIFT);
>   if (ret)
>   goto out;
> 
> @@ -978,37 +994,49 @@ static ssize_t btrfs_file_aio_write(struct kiocb
> *iocb, pos, first_index, last_index,
>   write_bytes);
>   if (ret) {
> - btrfs_delalloc_release_space(inode, write_bytes);
> + btrfs_delalloc_release_space(inode,
> + num_pages << PAGE_CACHE_SHIFT);
>   goto out;
>   }
> 
> - ret = btrfs_copy_from_user(pos, num_pages,
> + copied = btrfs_copy_from_user(pos, num_pages,
>  write_bytes, pages, &i);
> - if (ret == 0) {
> + dirty_pages = (copied + PAGE_CACHE_SIZE - 1) >>
> + PAGE_CACHE_SHIFT;
> +
> + if (num_pages > dirty_pages) {
> + if (copied > 0)
> +

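The scenario from the patch description above (a file is mmaped and the
mapping is passed to pwrite() for the same page of the same file) can be
exercised from user space roughly as follows. This is a minimal sketch, not
the test case from the MeeGo bug report: the function name and the file path
are made up, and on a kernel with the locking problem the pwrite() call would
be expected to block rather than return.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map the first page of `path` and pwrite() that mapping back into the
 * same page of the same file. Returns the number of bytes written, or -1
 * on error.
 */
static ssize_t pwrite_from_own_mapping(const char *path)
{
	long page = sysconf(_SC_PAGESIZE);
	ssize_t written;
	char *buf;
	int fd;

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	/* Give the file exactly one page so the mapping is valid. */
	if (ftruncate(fd, page) < 0) {
		close(fd);
		return -1;
	}

	buf = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (buf == MAP_FAILED) {
		close(fd);
		return -1;
	}

	/*
	 * The source buffer is the file's own first page. The mapping has not
	 * been touched yet, so the copy inside the write path should have to
	 * fault the page in while writing to that same page.
	 */
	written = pwrite(fd, buf, (size_t)page, 0);

	munmap(buf, (size_t)page);
	close(fd);
	return written;
}
```

On a kernel without the problem, pwrite_from_own_mapping() on a scratch file
simply returns the page size.
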
Re: version

2011-01-24 Thread Johannes Hirte

On Monday 24 January 2011 09:33:00 Helmut Hullen wrote:
> Hello, Chris,
> 
> You wrote on 24.01.11:
> >> Thank you - that's more simple for me than first cloning via
> >> "git clone" and then running "git log".
> > 
> > Ah, but the repo you were asked to clone was for the user tools,
> > not for the kernel code that implements the filesystem itself
> > which is what that Wiki page is for.
> 
> Yes - I've seen that small difference ... but I've installed 2.6.37.
> 
> It includes btrfs changes (from the btrfs-unstable branch) until 2010-
> 12-14.
> 
> My other problem: where has the ENOSPC problem been cured? Is it a kernel
> problem?

Which one? And yes, ENOSPC is kernel related.

regards,
  Johannes

Re: kernel BUG at fs/btrfs/inode.c:806

2010-12-02 Thread Johannes Hirte

On Thursday 02 December 2010 20:21:30 Chris Mason wrote:
> Excerpts from Johannes Hirte's message of 2010-12-02 12:02:16 -0500:
> > On Thursday 02 December 2010 17:52:50 Johannes Hirte wrote:
> > > On Thursday 02 December 2010 17:19:56 Chris Mason wrote:
> > > > Excerpts from Johannes Hirte's message of 2010-12-01 08:11:01 -0500:
> > > > > On one of my machines with btrfs I got this bug:
> > > > > 
> > > > > entry offset 29085974528, bytes 4096, bitmap no
> > > > > entry offset 29162995712, bytes 20480, bitmap yes
> > > > > entry offset 29171744768, bytes 4096, bitmap no
> > > > > block group has cluster?: no
> > > > > 0 blocks of free space at or bigger than bytes is
> > > > > block group 29834084352 has 1073741824 bytes, 1072648192 used 0 pinned 0 reserved
> > > > 
> > > > Well, you've had an ENOSPC explosion.
> > > > 
> > > > > 
> > > > > There were far more "block group" messages, too many for the dmesg
> > > > > log buffer.
> > > > > The kernel is a 2.6.37-rc3+ without the latest btrfs fixes. The bug
> > > > > occurred when compiling openoffice.org. After the bug a 'df -h'
> > > > > showed:
> > > > > 
> > > > > df -h:
> > > > > Filesystem      Size  Used Avail Use% Mounted on
> > > > > rootfs           21G   17G  770M  96% /
> > > > > /dev/root        21G   17G  770M  96% /
> > > > > rc-svcdir       1.0M  108K  916K  11% /lib/rc/init.d
> > > > > udev             10M  116K  9.9M   2% /dev
> > > > > shm            1013M     0 1013M   0% /dev/shm
> > > > > /dev/sda2        66G   46G   20G  71% /home
> > > > > /dev/sdb1        75G   56G   19G  75% /mnt/windows
> > > > 
> > > > Which of these filesystems were you compiling on?
> > > 
> > > On /. It's a Gentoo system and the bug happened during an 'emerge
> > > openoffice'.
> > > The compilation is usually done under /var/tmp/portage.
> > 
> > Btw, I was able to reproduce this with a second try to emerge openoffice.
> 
> Ok, there is one related fix in the git tree right now that you don't
> have.  I'm not 100% sure it'll fix this, but it can't hurt.
> 
> -chris
> 
Unfortunately it didn't fix the bug. The system crashed again while emerging
openoffice.

regards,
  Johannes

Re: disk space caching generation missmatch

2010-12-02 Thread Johannes Hirte

On Friday 03 December 2010 01:44:49 C Anthony Risinger wrote:
> Did you fix that typo I posted?
> 
> C Anthony [mobile]
> 

Yes, without the fix it wouldn't compile.

regards,
  Johannes

Re: disk space caching generation missmatch

2010-12-02 Thread Johannes Hirte

On Thursday 02 December 2010 21:34:10 Josef Bacik wrote:
> On Wed, Dec 01, 2010 at 10:40:29PM +0100, Johannes Hirte wrote:
> > On Wednesday 01 December 2010 22:22:45 Johannes Hirte wrote:
> > > On Wednesday 01 December 2010 21:03:13 Josef Bacik wrote:
> > > > On Wed, Dec 01, 2010 at 08:56:14PM +0100, Johannes Hirte wrote:
> > > > > On Wednesday 01 December 2010 18:40:18 Josef Bacik wrote:
> > > > > > On Wed, Dec 01, 2010 at 05:46:14PM +0100, Johannes Hirte wrote:
> > > > > > > After enabling disk space caching I've observed several log 
> > > > > > > entries like this:
> > > > > > > 
> > > > > > > btrfs: free space inode generation (0) did not match free space 
> > > > > > > cache generation (169594) for block group 15464398848
> > > > > > > 
> > > > > > > I'm not sure, but it seems this happens on every reboot. Is this
> > > > > > > something to worry about?
> > > > > > > 
> > > > > > 
> > > > > > So that usually means 1 of a couple of things
> > > > > > 
> > > > > > 1) You didn't have space for us to save the free space cache
> > > > > > 2) When trying to write out the cache we hit one of those cases
> > > > > > where we would deadlock so we couldn't write the cache out
> > > > > > 
> > > > > > It's nothing to worry about, it's doing what it is supposed to.
> > > > > > However I'd like to know why we're not able to write out the
> > > > > > cache.  Are you running close to full?  Thanks,
> > > > > > 
> > > > > > Josef
> > > > > >
> > > > > 
> > > > > I think there should be enough free space:
> > > > > 
> > > > 
> 
> Ok it doesn't look like theres an actual problem, we're just being 
> sub-optimal.
> Take out the other patch and apply this one, boot into that kernel and then
> reboot and then give me the dmesg. 

Here it comes:

Initializing cgroup subsys cpuset
Linux version 2.6.37-rc4-space-cache-dbg-00022-g620731b-dirty (r...@netbook) 
(gcc version 4.5.1 (Gentoo 4.5.1-r1 p1.3, pie-0.4.5) ) #126 SMP PREEMPT Fri Dec 
3 00:40:04 CET 2010
Atom PSE erratum detected, BIOS microcode update recommended
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009f800 (usable)
 BIOS-e820: 0009f800 - 000a (reserved)
 BIOS-e820: 000dc000 - 000e4000 (reserved)
 BIOS-e820: 000e8000 - 0010 (reserved)
 BIOS-e820: 0010 - 7f6d (usable)
 BIOS-e820: 7f6d - 7f6e2000 (ACPI data)
 BIOS-e820: 7f6e2000 - 7f6e3000 (ACPI NVS)
 BIOS-e820: 7f6e3000 - 8000 (reserved)
 BIOS-e820: e000 - f000 (reserved)
 BIOS-e820: fec0 - fec1 (reserved)
 BIOS-e820: fed0 - fed00400 (reserved)
 BIOS-e820: fed14000 - fed1a000 (reserved)
 BIOS-e820: fed1c000 - fed9 (reserved)
 BIOS-e820: fee0 - fee01000 (reserved)
 BIOS-e820: ff00 - 0001 (reserved)
NX (Execute Disable) protection: active
DMI present.
DMI: M912/M912, BIOS R02 05/04/2009
e820 update range:  - 0001 (usable) ==> (reserved)
e820 remove range: 000a - 0010 (usable)
last_pfn = 0x7f6d0 max_arch_pfn = 0x100
MTRR default type: uncachable
MTRR fixed ranges enabled:
  0-9 write-back
  A-B uncachable
  C-C write-protect
  D-D uncachable
  E-F write-protect
MTRR variable ranges enabled:
  0 base 0 mask 08000 write-back
  1 base 07F70 mask 0FFF0 uncachable
  2 base 07F80 mask 0FF80 uncachable
  3 disabled
  4 disabled
  5 disabled
  6 disabled
  7 disabled
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
Scanning 0 areas for low memory corruption
initial memory mapped : 0 - 01a0
init_memory_mapping: -37bfe000
 00 - 0037bfe000 page 4k
kernel direct mapping tables up to 37bfe000 @ 183f000-1a0
ACPI: RSDP 000f7e40 00024 (v02 GBT   )
ACPI: XSDT 7f6dc705 00084 (v01 GBTGBTUACPI 0604  LTP )
ACPI: FACP 7f6e1bd2 000F4 (v03 INTEL  CALISTGA 0604 ALAN 0001)
ACPI: DSDT 7f6dd907 04257 (v01 INTEL  CALISTGA 0604 INTL 20050624)
ACPI: FACS 7f6e2fc0 00040
ACPI: APIC 7f6e1cc6 00068 (v01 INT

Re: kernel BUG at fs/btrfs/inode.c:806

2010-12-02 Thread Johannes Hirte

On Thursday 02 December 2010 17:52:50 Johannes Hirte wrote:
> On Thursday 02 December 2010 17:19:56 Chris Mason wrote:
> > Excerpts from Johannes Hirte's message of 2010-12-01 08:11:01 -0500:
> > > On one of my machines with btrfs I got this bug:
> > > 
> > > entry offset 29085974528, bytes 4096, bitmap no
> > > entry offset 29162995712, bytes 20480, bitmap yes
> > > entry offset 29171744768, bytes 4096, bitmap no
> > > block group has cluster?: no
> > > 0 blocks of free space at or bigger than bytes is
> > > block group 29834084352 has 1073741824 bytes, 1072648192 used 0 pinned 0 
> > > reserved
> > 
> > Well, you've had an ENOSPC explosion.
> > 
> > > 
> > > The "block group" messages were far more numerous, too many for the dmesg log
> > > buffer.
> > > Kernel is a 2.6.37-rc3+ without the latest btrfs-fixes. The bug occurred
> > > when compiling openoffice.org. After the bug a 'df -h' showed:
> > > 
> > > df -h:
> > > Filesystem      Size  Used Avail Use% Mounted on
> > > rootfs           21G   17G  770M  96% /
> > > /dev/root        21G   17G  770M  96% /
> > > rc-svcdir       1.0M  108K  916K  11% /lib/rc/init.d
> > > udev             10M  116K  9.9M   2% /dev
> > > shm            1013M     0 1013M   0% /dev/shm
> > > /dev/sda2        66G   46G   20G  71% /home
> > > /dev/sdb1        75G   56G   19G  75% /mnt/windows
> > 
> > Which of these filesystems were you compiling on?
> 
> On /. It's a gentoo system and the bug happened during an 'emerge openoffice'.
> The compilation is usually done under /var/tmp/portage.

Btw, I was able to reproduce this with a second try to emerge openoffice.

regards,
  Johannes

Re: kernel BUG at fs/btrfs/inode.c:806

2010-12-02 Thread Johannes Hirte

On Thursday 02 December 2010 17:19:56 Chris Mason wrote:
> Excerpts from Johannes Hirte's message of 2010-12-01 08:11:01 -0500:
> > On one of my machines with btrfs I got this bug:
> > 
> > entry offset 29085974528, bytes 4096, bitmap no
> > entry offset 29162995712, bytes 20480, bitmap yes
> > entry offset 29171744768, bytes 4096, bitmap no
> > block group has cluster?: no
> > 0 blocks of free space at or bigger than bytes is
> > block group 29834084352 has 1073741824 bytes, 1072648192 used 0 pinned 0 
> > reserved
> 
> Well, you've had an ENOSPC explosion.
> 
> > 
> > The "block group" messages were far more numerous, too many for the dmesg log
> > buffer.
> > Kernel is a 2.6.37-rc3+ without the latest btrfs-fixes. The bug occurred
> > when compiling openoffice.org. After the bug a 'df -h' showed:
> > 
> > df -h:
> > Filesystem      Size  Used Avail Use% Mounted on
> > rootfs           21G   17G  770M  96% /
> > /dev/root        21G   17G  770M  96% /
> > rc-svcdir       1.0M  108K  916K  11% /lib/rc/init.d
> > udev             10M  116K  9.9M   2% /dev
> > shm            1013M     0 1013M   0% /dev/shm
> > /dev/sda2        66G   46G   20G  71% /home
> > /dev/sdb1        75G   56G   19G  75% /mnt/windows
> 
> Which of these filesystems were you compiling on?

On /. It's a gentoo system and the bug happened during an 'emerge openoffice'.
The compilation is usually done under /var/tmp/portage.

regards,
  Johannes

Re: disk space caching generation missmatch

2010-12-01 Thread Johannes Hirte

On Wednesday 01 December 2010 22:22:45 Johannes Hirte wrote:
> On Wednesday 01 December 2010 21:03:13 Josef Bacik wrote:
> > On Wed, Dec 01, 2010 at 08:56:14PM +0100, Johannes Hirte wrote:
> > > On Wednesday 01 December 2010 18:40:18 Josef Bacik wrote:
> > > > On Wed, Dec 01, 2010 at 05:46:14PM +0100, Johannes Hirte wrote:
> > > > > After enabling disk space caching I've observed several log entries 
> > > > > like this:
> > > > > 
> > > > > btrfs: free space inode generation (0) did not match free space cache 
> > > > > generation (169594) for block group 15464398848
> > > > > 
> > > > > I'm not sure, but it seems this happens on every reboot. Is this 
> > > > > something to
> > > > > worry about?
> > > > > 
> > > > 
> > > > So that usually means 1 of a couple of things
> > > > 
> > > > 1) You didn't have space for us to save the free space cache
> > > > 2) When trying to write out the cache we hit one of those cases where 
> > > > we would
> > > > deadlock so we couldn't write the cache out
> > > > 
> > > > It's nothing to worry about, it's doing what it is supposed to.  
> > > > However I'd
> > > > like to know why we're not able to write out the cache.  Are you 
> > > > running close
> > > > to full?  Thanks,
> > > > 
> > > > Josef
> > > >
> > > 
> > > I think there should be enough free space:
> > > 
> > 
> > Hmm well then we're hitting one of the other corner cases.  Can you run with
> > this debug thread and reboot.  Hopefully it will tell me why we're not 
> > saving
> > the free space cache. Thanks,
> > 
> > Josef
> > 
> > diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> > index 87aae66..4fd5659 100644
> > --- a/fs/btrfs/extent-tree.c
> > +++ b/fs/btrfs/extent-tree.c
> > @@ -2794,13 +2794,17 @@ again:
> > if (i_size_read(inode) > 0) {
> > ret = btrfs_truncate_free_space_cache(root, trans, path,
> >   inode);
> > -   if (ret)
> > +   if (ret) {
> > +   printk(KERN_ERR "truncate free space cache failed for 
> > %llu, %d\n",
> > +  block_group->key.objectid, ret);
> > goto out_put;
> > +   }
> > }
> >  
> > spin_lock(&block_group->lock);
> > if (block_group->cached != BTRFS_CACHE_FINISHED) {
> > spin_unlock(&block_group->lock);
> > +   printk(KERN_ERR "block group %llu not cached\n", 
> > block_group->key.objectid);
> > goto out_put;
> > }
> > spin_unlock(&block_group->lock);
> > @@ -2820,8 +2824,10 @@ again:
> > num_pages *= PAGE_CACHE_SIZE;
> >  
> > ret = btrfs_check_data_free_space(inode, num_pages);
> > -   if (ret)
> > +   if (ret) {
> > +   printk(KERN_ERR "not enough free space for cache %llu\n", 
> > block_group->key.objectid);
> > goto out_put;
> > +   }
> >  
> > ret = btrfs_prealloc_file_range_trans(inode, trans, 0, 0, num_pages,
> >   num_pages, num_pages,
> > diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> > index 22ee0dc..0078172 100644
> > --- a/fs/btrfs/free-space-cache.c
> > +++ b/fs/btrfs/free-space-cache.c
> > @@ -511,6 +511,8 @@ int btrfs_write_out_cache(struct btrfs_root *root,
> > spin_lock(&block_group->lock);
> > if (block_group->disk_cache_state < BTRFS_DC_SETUP) {
> > spin_unlock(&block_group->lock);
> > +   printk(KERN_ERR "block group %llu, wrong dcs %d\n", 
> > block_group->key.objectid,
> > +  block_group->disk_cache_state);
> > return 0;
> > }
> > spin_unlock(&block_group->lock);
> > @@ -520,6 +522,7 @@ int btrfs_write_out_cache(struct btrfs_root *root,
> > return 0;
> >  
> > if (!i_size_read(inode)) {
> > +   printk(KERN_ERR "no allocated space for block group %llu\n", 
> > block_group->key.objectid);
> > iput(inode);
> > return 0;
> > }
> > @@ -771,6 +774,7 @@ out_free:
>

Re: disk space caching generation missmatch

2010-12-01 Thread Johannes Hirte

On Wednesday 01 December 2010 21:03:13 Josef Bacik wrote:
> On Wed, Dec 01, 2010 at 08:56:14PM +0100, Johannes Hirte wrote:
> > On Wednesday 01 December 2010 18:40:18 Josef Bacik wrote:
> > > On Wed, Dec 01, 2010 at 05:46:14PM +0100, Johannes Hirte wrote:
> > > > After enabling disk space caching I've observed several log entries 
> > > > like this:
> > > > 
> > > > btrfs: free space inode generation (0) did not match free space cache 
> > > > generation (169594) for block group 15464398848
> > > > 
> > > > I'm not sure, but it seems this happens on every reboot. Is this 
> > > > something to
> > > > worry about?
> > > > 
> > > 
> > > So that usually means 1 of a couple of things
> > > 
> > > 1) You didn't have space for us to save the free space cache
> > > 2) When trying to write out the cache we hit one of those cases where we 
> > > would
> > > deadlock so we couldn't write the cache out
> > > 
> > > It's nothing to worry about, it's doing what it is supposed to.  However 
> > > I'd
> > > like to know why we're not able to write out the cache.  Are you running 
> > > close
> > > to full?  Thanks,
> > > 
> > > Josef
> > >
> > 
> > I think there should be enough free space:
> > 
> 
> Hmm well then we're hitting one of the other corner cases.  Can you run with
> this debug thread and reboot.  Hopefully it will tell me why we're not saving
> the free space cache. Thanks,
> 
> Josef
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 87aae66..4fd5659 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -2794,13 +2794,17 @@ again:
>   if (i_size_read(inode) > 0) {
>   ret = btrfs_truncate_free_space_cache(root, trans, path,
> inode);
> - if (ret)
> + if (ret) {
> + printk(KERN_ERR "truncate free space cache failed for 
> %llu, %d\n",
> +block_group->key.objectid, ret);
>   goto out_put;
> + }
>   }
>  
>   spin_lock(&block_group->lock);
>   if (block_group->cached != BTRFS_CACHE_FINISHED) {
>   spin_unlock(&block_group->lock);
> + printk(KERN_ERR "block group %llu not cached\n", 
> block_group->key.objectid);
>   goto out_put;
>   }
>   spin_unlock(&block_group->lock);
> @@ -2820,8 +2824,10 @@ again:
>   num_pages *= PAGE_CACHE_SIZE;
>  
>   ret = btrfs_check_data_free_space(inode, num_pages);
> - if (ret)
> + if (ret) {
> + printk(KERN_ERR "not enough free space for cache %llu\n", 
> block_group->key.objectid);
>   goto out_put;
> + }
>  
>   ret = btrfs_prealloc_file_range_trans(inode, trans, 0, 0, num_pages,
> num_pages, num_pages,
> diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> index 22ee0dc..0078172 100644
> --- a/fs/btrfs/free-space-cache.c
> +++ b/fs/btrfs/free-space-cache.c
> @@ -511,6 +511,8 @@ int btrfs_write_out_cache(struct btrfs_root *root,
>   spin_lock(&block_group->lock);
>   if (block_group->disk_cache_state < BTRFS_DC_SETUP) {
>   spin_unlock(&block_group->lock);
> + printk(KERN_ERR "block group %llu, wrong dcs %d\n", 
> block_group->key.objectid,
> +block_group->disk_cache_state);
>   return 0;
>   }
>   spin_unlock(&block_group->lock);
> @@ -520,6 +522,7 @@ int btrfs_write_out_cache(struct btrfs_root *root,
>   return 0;
>  
>   if (!i_size_read(inode)) {
> + printk(KERN_ERR "no allocated space for block group %llu\n", 
> block_group->key.objectid);
>   iput(inode);
>   return 0;
>   }
> @@ -771,6 +774,7 @@ out_free:
>   block_group->disk_cache_state = BTRFS_DC_ERROR;
>   spin_unlock(&block_group->lock);
>   BTRFS_I(inode)->generation = 0;
> + printk(KERN_ERR "problem writing out block group cache for 
> %llu\n", block_group->key.objectid);
>   }
>   kfree(checksums);
>   btrfs_update_inode(trans, root, inode);
> 

This is from dmesg shortly after reboot with the debug patch:

btrfs: free space inode g

Re: disk space caching generation missmatch

2010-12-01 Thread Johannes Hirte

On Wednesday 01 December 2010 18:40:18 Josef Bacik wrote:
> On Wed, Dec 01, 2010 at 05:46:14PM +0100, Johannes Hirte wrote:
> > After enabling disk space caching I've observed several log entries like 
> > this:
> > 
> > btrfs: free space inode generation (0) did not match free space cache 
> > generation (169594) for block group 15464398848
> > 
> > I'm not sure, but it seems this happens on every reboot. Is this something 
> > to
> > worry about?
> > 
> 
> So that usually means 1 of a couple of things
> 
> 1) You didn't have space for us to save the free space cache
> 2) When trying to write out the cache we hit one of those cases where we would
> deadlock so we couldn't write the cache out
> 
> It's nothing to worry about, it's doing what it is supposed to.  However I'd
> like to know why we're not able to write out the cache.  Are you running close
> to full?  Thanks,
> 
> Josef
>

I think there should be enough free space:

df -h

Filesystem      Size  Used Avail Use% Mounted on
rootfs           41G   29G  8.4G  78% /
/dev/root        41G   29G  8.4G  78% /
rc-svcdir       1.0M  112K  912K  11% /lib/rc/init.d
udev             10M  284K  9.8M   3% /dev
shm            1008M     0 1008M   0% /dev/shm
/dev/sda3       108G   90G   15G  87% /home

btrfs filesystem df /

Data: total=34.48GB, used=26.13GB
System, DUP: total=8.00MB, used=12.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=2.75GB, used=1.26GB
Metadata: total=8.00MB, used=0.00

btrfs filesystem df /home

Data: total=88.01GB, used=84.84GB
System, DUP: total=8.00MB, used=20.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=4.00GB, used=2.43GB
Metadata: total=8.00MB, used=0.00
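
To judge how full each pool really is, the "btrfs filesystem df" output can be
turned into percentages. The following is only a rough Python sketch: the names
parse_fi_df and usage_percent are made up for illustration, and it assumes the
old "total=..., used=..." line format shown here (lines without a profile are
labelled "unspecified").

```python
import re

# Matches lines such as "Metadata, DUP: total=2.75GB, used=1.26GB".
# A value of "0.00" carries no unit, so the unit group is optional.
LINE_RE = re.compile(
    r"^(?P<kind>\w+)(?:, (?P<profile>\w+))?: "
    r"total=(?P<total>[\d.]+)(?P<tunit>[KMGT]?B)?, "
    r"used=(?P<used>[\d.]+)(?P<uunit>[KMGT]?B)?$"
)
UNITS = {None: 1, "B": 1, "KB": 2**10, "MB": 2**20, "GB": 2**30, "TB": 2**40}


def parse_fi_df(text):
    """Return a list of (kind, profile, total_bytes, used_bytes) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a pool line
        total = float(m["total"]) * UNITS[m["tunit"]]
        used = float(m["used"]) * UNITS[m["uunit"]]
        rows.append((m["kind"], m["profile"] or "unspecified", total, used))
    return rows


def usage_percent(total, used):
    """Percentage of the allocated chunks that is in use (0 for empty pools)."""
    return 100.0 * used / total if total else 0.0
```

Fed with the output for / above, this gives roughly 76% for Data and 46% for
Metadata, DUP, so neither pool looks close to full.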

disk space caching generation missmatch

2010-12-01 Thread Johannes Hirte

After enabling disk space caching I've observed several log entries like this:

btrfs: free space inode generation (0) did not match free space cache 
generation (169594) for block group 15464398848

I'm not sure, but it seems this happens on every reboot. Is this something to
worry about?
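
To see which block groups are affected across reboots, the messages can be
pulled out of a saved dmesg with a small script. This is a sketch only: the
function name find_mismatches is invented here, and the pattern is taken from
the single log line quoted above.

```python
import re

# Pattern built from the log message quoted above.
MISMATCH_RE = re.compile(
    r"free space inode generation \((\d+)\) did not match free space cache "
    r"generation \((\d+)\) for block group (\d+)"
)


def find_mismatches(log_text):
    """Return (inode_generation, cache_generation, block_group) tuples."""
    # Mail clients and dmesg buffers wrap long lines, so collapse all
    # whitespace to single spaces before matching.
    normalized = " ".join(log_text.split())
    return [
        tuple(int(group) for group in match.groups())
        for match in MISMATCH_RE.finditer(normalized)
    ]
```

For the entry above it returns [(0, 169594, 15464398848)].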

regards,
  Johannes

kernel BUG at fs/btrfs/inode.c:806

2010-12-01 Thread Johannes Hirte

On one of my machines with btrfs I got this bug:

entry offset 29085974528, bytes 4096, bitmap no
entry offset 29162995712, bytes 20480, bitmap yes
entry offset 29171744768, bytes 4096, bitmap no
block group has cluster?: no
0 blocks of free space at or bigger than bytes is
block group 29834084352 has 1073741824 bytes, 1072648192 used 0 pinned 0 
reserved
entry offset 29834084352, bytes 376832, bitmap yes
entry offset 29890392064, bytes 4096, bitmap no
entry offset 29895069696, bytes 4096, bitmap no
entry offset 29896048640, bytes 4096, bitmap no
entry offset 29896364032, bytes 4096, bitmap no
entry offset 29896482816, bytes 4096, bitmap no
entry offset 29905817600, bytes 4096, bitmap no
entry offset 29906878464, bytes 4096, bitmap no
entry offset 29908029440, bytes 4096, bitmap no
entry offset 29908418560, bytes 4096, bitmap no
entry offset 29910061056, bytes 4096, bitmap no
entry offset 29911105536, bytes 4096, bitmap no
entry offset 29912371200, bytes 4096, bitmap no
entry offset 29912748032, bytes 4096, bitmap no
entry offset 29914660864, bytes 4096, bitmap no
entry offset 29914755072, bytes 4096, bitmap no
entry offset 29915865088, bytes 4096, bitmap no
entry offset 29915914240, bytes 4096, bitmap no
entry offset 29916409856, bytes 4096, bitmap no
entry offset 29916471296, bytes 4096, bitmap no
entry offset 29924597760, bytes 4096, bitmap no
entry offset 29931642880, bytes 4096, bitmap no
entry offset 29931925504, bytes 4096, bitmap no
entry offset 29932732416, bytes 4096, bitmap no
entry offset 29933383680, bytes 4096, bitmap no
entry offset 29933412352, bytes 4096, bitmap no
entry offset 29933596672, bytes 4096, bitmap no
entry offset 29935316992, bytes 4096, bitmap no
entry offset 29938610176, bytes 4096, bitmap no
entry offset 29939154944, bytes 4096, bitmap no
entry offset 29944033280, bytes 4096, bitmap no
entry offset 29946318848, bytes 4096, bitmap no
entry offset 29964181504, bytes 4096, bitmap no
entry offset 29964828672, bytes 4096, bitmap no
entry offset 29966233600, bytes 4096, bitmap no
entry offset 29968302080, bytes 98304, bitmap yes
entry offset 29983170560, bytes 4096, bitmap no
entry offset 29984059392, bytes 4096, bitmap no
entry offset 29992976384, bytes 4096, bitmap no
entry offset 30008422400, bytes 4096, bitmap no
entry offset 30025895936, bytes 4096, bitmap no
entry offset 30034280448, bytes 4096, bitmap no
entry offset 30055174144, bytes 4096, bitmap no
entry offset 30067208192, bytes 4096, bitmap no
entry offset 30094012416, bytes 4096, bitmap no
entry offset 30098358272, bytes 4096, bitmap no
entry offset 30098722816, bytes 4096, bitmap no
entry offset 30102491136, bytes 4096, bitmap no
entry offset 30102519808, bytes 143360, bitmap yes
entry offset 30103207936, bytes 4096, bitmap no
entry offset 30103601152, bytes 4096, bitmap no
entry offset 30105415680, bytes 4096, bitmap no
entry offset 30112169984, bytes 4096, bitmap no
entry offset 30139326464, bytes 4096, bitmap no
entry offset 30173143040, bytes 4096, bitmap no
entry offset 30176014336, bytes 4096, bitmap no
entry offset 30202048512, bytes 4096, bitmap no
entry offset 30229487616, bytes 4096, bitmap no
entry offset 30230700032, bytes 4096, bitmap no
entry offset 30230777856, bytes 4096, bitmap no
entry offset 30232813568, bytes 4096, bitmap no
entry offset 30235348992, bytes 4096, bitmap no
entry offset 30236737536, bytes 49152, bitmap yes
entry offset 30241488896, bytes 4096, bitmap no
entry offset 30252662784, bytes 4096, bitmap no
entry offset 30370955264, bytes 49152, bitmap yes
entry offset 30425870336, bytes 4096, bitmap no
entry offset 30505172992, bytes 61440, bitmap yes
entry offset 30507831296, bytes 4096, bitmap no
entry offset 30639390720, bytes 8192, bitmap yes
entry offset 30760058880, bytes 4096, bitmap no
entry offset 30773608448, bytes 45056, bitmap yes
block group has cluster?: no
3 blocks of free space at or bigger than bytes is
block group 30907826176 has 536870912 bytes, 533860352 used 0 pinned 0 reserved
entry offset 30907826176, bytes 1441792, bitmap yes
entry offset 31042043904, bytes 995328, bitmap yes
entry offset 31176261632, bytes 212992, bitmap yes
entry offset 31310479360, bytes 8192, bitmap yes
block group has cluster?: no
3 blocks of free space at or bigger than bytes is
block group 31444697088 has 268435456 bytes, 266985472 used 0 pinned 0 reserved
entry offset 31444697088, bytes 1298432, bitmap yes
entry offset 31578914816, bytes 151552, bitmap yes
block group has cluster?: no
2 blocks of free space at or bigger than bytes is
block group 31713132544 has 268435456 bytes, 267300864 used 0 pinned 0 reserved
entry offset 31713132544, bytes 1093632, bitmap yes
entry offset 31847350272, bytes 40960, bitmap yes
block group has cluster?: no
1 blocks of free space at or bigger than bytes is
block group 31981568000 has 268435456 bytes, 268029952 used 0 pinned 0 reserved
entry offset 31981568000, bytes 360448, bitmap yes
entry offset 32115785728, bytes 45056, bitmap yes
block group has cluster
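
The dump is easier to read when reduced to the remaining space per block
group. A rough Python sketch follows; the helper name block_group_free is
invented, and it assumes that "free" simply means size minus used, pinned and
reserved bytes as printed in the "block group ... has ..." lines.

```python
import re

# "reserved" may be wrapped onto the next line, hence \s+ before it.
BLOCK_GROUP_RE = re.compile(
    r"block group (\d+) has (\d+) bytes, (\d+) used (\d+) pinned (\d+)\s+reserved"
)


def block_group_free(log_text):
    """Map block group offset -> bytes not used, pinned or reserved."""
    free = {}
    for match in BLOCK_GROUP_RE.finditer(log_text):
        offset, size, used, pinned, reserved = map(int, match.groups())
        free[offset] = size - used - pinned - reserved
    return free
```

For block group 29834084352 above this leaves 1093632 bytes of the 1 GiB
group, scattered over many small entries.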

Re: btrfs on LVM: Out of space

2010-09-17 Thread Johannes Hirte

On Friday 10 September 2010 21:46:49 Marcel Lohmann wrote:
> Trying to use "-l 2048" during mkfs was rejected as being invalid. But
> who cares...?
> 
> Marcel

That's because btrfs supports only leafsize equal to pagesize for now.
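
As a small illustration of that rule (not the actual mkfs.btrfs check; the
function name leafsize_accepted is made up), in Python:

```python
import mmap


def leafsize_accepted(leafsize, pagesize=mmap.PAGESIZE):
    """Mirror the restriction described above: leafsize must equal pagesize."""
    return leafsize == pagesize
```

On a machine with 4096-byte pages, leafsize_accepted(2048) is False, which
matches the rejected "-l 2048".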

regards,
  Johannes

Re: machine gets unresponsive during btrfs balance

2010-08-26 Thread Johannes Hirte

On Thursday 26 August 2010 15:39:25 Andreas Philipp wrote:
> On 26.08.2010 15:27, Johannes Hirte wrote:
> > Looks like another manifestation of the csum bug. Are you able to read all
> > files from the affected volume? Did you try a balance with a 2.6.34
> > kernel after the test with 2.6.35?
> >
> Till now I did not see any unreadable files but I did not do a
> complete test. No, I did not try to balance with an 2.6.34 kernel. If
> it helps I can switch back and try.

I hope it helps to localize the error. It's still not clear where this starts 
and which kernels are affected.

regards,
  Johannes

Re: machine gets unresponsive during btrfs balance

2010-08-26 Thread Johannes Hirte

On Saturday 14 August 2010 00:11:55 Andreas Philipp wrote:
> On 12.08.2010 10:04, Yan, Zheng wrote:
> > On Thu, Aug 12, 2010 at 3:14 PM, Andreas Philipp
> >  wrote:
> >   
> >> Hi,
> >>
> >> I am using a btrfs filesystem created with raid0 for data and metadata
> >> for (temporary) storage of tv recordings from my vdr. The filesystem was
> >> created under kernel version 2.6.34. An initial btrfs balance command
> >> succeeded. Since I upgraded to 2.6.35-rcX and 2.6.35 btrfs balance no
> >> longer finishes but puts the machine in some unresponsive state.
> >> Unfortunately, I do not see any kernel oops or other debug information
> >> because even the display freezes. The last thing that happens are that
> >> those two lines are written to /var/log/messages:
> >> Aug 11 21:42:23 thor kernel: btrfs: found 62911 extents
> >> Aug 11 21:42:24 thor kernel: btrfs: relocating block group 1723913469952
> >> flags 9
> >> After that the machine becomes immediately unresponsive.
> >>
> >> As I did not see anything that might be related to my problem in the
> >> changelog for 2.6.35.1 I did not try again with this version.
> >>
> >> 
> > Do you have more than one machines? would you please setup netconsole
> > to see what happen.
> >   
> I have reproduced the error on v2.6.35.1 and recorded all kernel output
> with netconsole. The interesting point is that this time the machine did
> not crash but the btrfs balance segfaulted at exact the same position
> where the previous crashes had happened.

Looks like another manifestation of the csum bug. Are you able to read all 
files from the affected volume? Did you try a balance with a 2.6.34 kernel 
after the test with 2.6.35?

regards,
  Johannes

Re: 2.6.36-rc1 btrfs still unstable

2010-08-25 Thread Johannes Hirte

On Monday 16 August 2010 16:17:54 Morten P.D. Stevens wrote:
> Hi Chris,
> 
> the other big question is:
> 
> Is btrfs with 2.6.36 really rockstable and ready to use in productive 
> environments?
> 
> Thanks
> 
> Morten

I don't think so. There is at least one checksum bug and ENOSPC problems are 
also still present.

regards,
  Johannes

Re: kernel BUG at fs/btrfs/extent-tree.c:1353

2010-07-29 Thread Johannes Hirte

On Thursday 22 July 2010, 20:07:23 Johannes Hirte wrote:
> On Monday 19 July 2010, 10:01:46 Miao Xie wrote:
> > On Thu, 15 Jul 2010 20:14:51 +0200, Johannes Hirte wrote:
> > > On Thursday 15 July 2010, 02:11:04 Dave Chinner wrote:
> > >> On Wed, Jul 14, 2010 at 05:25:23PM +0200, Johannes Hirte wrote:
> > >>> On Thursday 08 July 2010, 16:31:09 Chris Mason wrote:
> > >>> I'm not sure if btrfs is to blame for this error. After the errors I
> > >>> switched to XFS on this system and got now this error:
> > >>> 
> > >>> ls -l .kde4/share/apps/akregator/data/
> > >>> ls: cannot access .kde4/share/apps/akregator/data/feeds.opml:
> > >>> Structure needs cleaning
> > >>> total 4
> > >>> ?? ? ???? feeds.opml
> > >> 
> > >> What is the error reported in dmesg when the XFS filesytem shuts down?
> > > 
> > > Nothing. I double checked the logs. There are only the messages when
> > > mounting the filesystem. No other errors are reported than the
> > > inaccessible file and the output from xfs_check.
> > 
> > Is there anything wrong with your disks or memory?
> > Sometimes the bad memory can break the filesystem. I have met this kind
> > of problem some time ago.
> 
> I don't think that's the case. I've checked the RAM with memtest86+ and got
> no errors. I got the errors with two different disks, the first one with
> btrfs the second one now with XFS. Before changing to the second disk,
> I've run badblocks on it to be sure it has no errors.

I think I've found it. The bug was introduced by 

commit 7f0e7bed936a0c422641a046551829a01341dd80
Author: Christoph Hellwig 
Date:   Tue Jun 8 18:14:34 2010 +0200

writeback: fix writeback completion notifications

The code dealing with bdi_work->state and completion of a bdi_work is a
major mess currently.  This patch makes sure we directly use one set of
flags to deal with it, and use it consistently, which means:

 - always notify about completion from the rcu callback.  We only ever
   wait for it from on-stack callers, so this simplification does not
   even cause a theoretical slowdown currently.  It also makes sure we
   don't miss out on the notification if we ever add other callers to
   wait for it.
 - make earlier completion notification depending on the on-stack
   allocation, not the sync mode.  If we introduce new callers that
   want to do WB_SYNC_NONE writeback from on-stack callers this will
   be nessecary.

Also rename bdi_wait_on_work_clear to bdi_wait_on_work_done and inline
a few small functions into their only caller to make the code
understandable.

Signed-off-by: Christoph Hellwig 
Signed-off-by: Jens Axboe 

and seems to be fixed by

commit 83ba7b071f30f7c01f72518ad72d5cd203c27502
Author: Christoph Hellwig 
Date:   Tue Jul 6 08:59:53 2010 +0200

writeback: simplify the write back thread queue

First remove items from work_list as soon as we start working on them.  This
means we don't have to track any pending or visited state and can get
rid of all the RCU magic freeing the work items - we can simply free
them once the operation has finished.  Second use a real completion for
tracking synchronous requests - if the caller sets the completion pointer
we complete it, otherwise use it as a boolean indicator that we can free
the work item directly.  Third unify struct wb_writeback_args and struct
bdi_work into a single data structure, wb_writeback_work.  Previous we
set all parameters into a struct wb_writeback_args, copied it into
struct bdi_work, copied it again on the stack to use it there.  Instead
of just allocate one structure dynamically or on the stack and use it
all the way through the stack.

Signed-off-by: Christoph Hellwig 
Signed-off-by: Jens Axboe 

I was able to reproduce the bug by repeatedly unpacking a big tar file and 
deleting the unpacked files. With btrfs the kernel normally crashed within 20 
runs. After commit 83ba7b071f30f7c01f72518ad72d5cd203c27502 it survived more 
than 500 runs.
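
The reproducer amounts to a simple unpack/delete loop. A minimal Python sketch
of such a loop is below; it is not the script that was actually used, and the
function name and arguments are invented for illustration.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def unpack_delete_cycles(archive, workdir, runs):
    """Unpack `archive` below `workdir` and delete it again, `runs` times.

    Returns the number of completed cycles.
    """
    workdir = Path(workdir)
    completed = 0
    for _ in range(runs):
        # A fresh directory per run, created on the filesystem under test.
        target = Path(tempfile.mkdtemp(prefix="unpack-", dir=workdir))
        with tarfile.open(archive) as tar:
            tar.extractall(target)
        shutil.rmtree(target)
        completed += 1
    return completed
```

Called with a large tarball, a directory on the filesystem under test and e.g.
runs=500, it either returns 500 or the machine goes down somewhere on the way.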

regards,
  Johannes

Re: kernel BUG at fs/btrfs/extent-tree.c:1353

2010-07-22 Thread Johannes Hirte

On Monday 19 July 2010, 10:01:46 Miao Xie wrote:
> On Thu, 15 Jul 2010 20:14:51 +0200, Johannes Hirte wrote:
> > On Thursday 15 July 2010, 02:11:04 Dave Chinner wrote:
> >> On Wed, Jul 14, 2010 at 05:25:23PM +0200, Johannes Hirte wrote:
> >>> On Thursday 08 July 2010, 16:31:09 Chris Mason wrote:
> >>> I'm not sure if btrfs is to blame for this error. After the errors I
> >>> switched to XFS on this system and got now this error:
> >>> 
> >>> ls -l .kde4/share/apps/akregator/data/
> >>> ls: cannot access .kde4/share/apps/akregator/data/feeds.opml: Structure
> >>> needs cleaning
> >>> total 4
> >>> ?? ? ???? feeds.opml
> >> 
> >> What is the error reported in dmesg when the XFS filesytem shuts down?
> > 
> > Nothing. I double checked the logs. There are only the messages when
> > mounting the filesystem. No other errors are reported than the
> > inaccessible file and the output from xfs_check.
> 
> Is there anything wrong with your disks or memory?
> Sometimes the bad memory can break the filesystem. I have met this kind of
> problem some time ago.

I don't think that's the case. I've checked the RAM with memtest86+ and got no 
errors. I got the errors with two different disks, the first one with btrfs, 
the second one now with XFS. Before changing to the second disk, I ran 
badblocks on it to be sure it has no errors.

> 
> If there is no problem with your disk and memory, Could you tell us the
> parameter of mkfs.btrfs and mount?

I'm not sure which parameters I used for mkfs.btrfs. It was either none or 
'-m single'. The only mount parameter is noatime. Some time ago I played a 
little with max_inline.

On the current disk with XFS I have now got some more errors on my root-fs. A 
similar error on one file:

ls: cannot access /var/tmp/portage/app-office/krita-2.2.1/work/krita-2.2.1/krita/image/tiles3/tests/dm_consistancy_test/dm_consistancy_test.pr: Invalid argument

xfs_check shows on this fs:

localhost ~ # xfs_check /dev/sda1
agi unlinked bucket 10 is 7279754 in ag 0 (inode=7279754)
agi unlinked bucket 11 is 7279755 in ag 0 (inode=7279755)
dir 91466358 entry dm_consistancy_test.pr bad inode number 1862628266
dir 91466358 size is 36, should be 35
agi unlinked bucket 48 is 11677104 in ag 2 (inode=78785968)
agi unlinked bucket 49 is 11677105 in ag 2 (inode=78785969)
agi unlinked bucket 50 is 11677106 in ag 2 (inode=78785970)
agi unlinked bucket 51 is 11677107 in ag 2 (inode=78785971)
agi unlinked bucket 52 is 11677108 in ag 2 (inode=78785972)
agi unlinked bucket 53 is 11677109 in ag 2 (inode=78785973)
agi unlinked bucket 54 is 11677110 in ag 2 (inode=78785974)
agi unlinked bucket 55 is 11677111 in ag 2 (inode=78785975)
agi unlinked bucket 58 is 11677114 in ag 2 (inode=78785978)
agi unlinked bucket 59 is 11677115 in ag 2 (inode=78785979)
agi unlinked bucket 60 is 11677116 in ag 2 (inode=78785980)
agi unlinked bucket 61 is 11677117 in ag 2 (inode=78785981)
allocated inode 7279754 has 0 link count
allocated inode 7279755 has 0 link count
disconnected inode 91466360, nlink 1
allocated inode 78785968 has 0 link count
allocated inode 78785969 has 0 link count
allocated inode 78785970 has 0 link count
allocated inode 78785971 has 0 link count
allocated inode 78785972 has 0 link count
allocated inode 78785973 has 0 link count
allocated inode 78785974 has 0 link count
allocated inode 78785975 has 0 link count
allocated inode 78785978 has 0 link count
allocated inode 78785979 has 0 link count
allocated inode 78785980 has 0 link count
allocated inode 78785981 has 0 link count

And again I don't find any related message in dmesg.
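
The bucket numbers in the xfs_check output above seem to follow a simple rule, and the reported inode numbers look like the AG number packed on top of the AG-relative inode. A minimal sketch to check that, assuming 64 unlinked buckets per AG and 25 AG-relative inode bits (the 25 is only inferred from this output and depends on the filesystem geometry); the helper name check_unlinked is made up for illustration:

```python
import re

# Assumption: XFS keeps 64 unlinked buckets per AG, indexed by
# (AG-relative inode number) % 64.
AGI_UNLINKED_BUCKETS = 64
# Assumption: 25 AG-relative inode bits, inferred from the output above
# (78785968 - 11677104 == 2 << 25). Other filesystems will differ.
AGINO_BITS = 25

LINE = re.compile(
    r"agi unlinked bucket (\d+) is (\d+) in ag (\d+) \(inode=(\d+)\)"
)

def check_unlinked(line):
    """Return (bucket_ok, inode_ok) for one xfs_check line, or None."""
    m = LINE.search(line)
    if m is None:
        return None
    bucket, agino, agno, inode = map(int, m.groups())
    bucket_ok = agino % AGI_UNLINKED_BUCKETS == bucket
    inode_ok = ((agno << AGINO_BITS) | agino) == inode
    return bucket_ok, inode_ok

sample = [
    "agi unlinked bucket 10 is 7279754 in ag 0 (inode=7279754)",
    "agi unlinked bucket 48 is 11677104 in ag 2 (inode=78785968)",
    "agi unlinked bucket 61 is 11677117 in ag 2 (inode=78785981)",
]
results = [check_unlinked(s) for s in sample]
print(results)
```

If both checks hold for every line, the unlinked lists themselves look internally consistent, which would point at the directory entry with the bad inode number as the odd one out. That is only a reading of the output, not a diagnosis.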

regards,
  Johannes

Re: Status of BTRFS

2010-07-16 Thread Johannes Hirte

On Friday, 16 July 2010, 13:55:26, Edward Ned Harvey wrote:
> Is this a good place to get a clue about the status of BTRFS?  Like ...  Is
> it usable yet, and stuff like that?
> 
> Thank you...

I wouldn't suggest using it in production environments, especially as the
error handling is very rudimentary for now. If you run into errors, you
won't be able to repair the filesystem. For production environments there are
also way too many situations where the whole kernel panics instead of only the
affected filesystem failing. The ENOSPC handling is also still under construction.

regards,
  Johannes

Re: kernel BUG at fs/btrfs/extent-tree.c:1353

2010-07-16 Thread Johannes Hirte

On Thursday, 15 July 2010, 20:14:51, Johannes Hirte wrote:
> On Thursday, 15 July 2010, 02:11:04, Dave Chinner wrote:
> > On Wed, Jul 14, 2010 at 05:25:23PM +0200, Johannes Hirte wrote:
> > > On Thursday, 08 July 2010, 16:31:09, Chris Mason wrote:
> > > I'm not sure if btrfs is to blame for this error. After the errors I
> > > switched to XFS on this system and got now this error:
> > > 
> > > ls -l .kde4/share/apps/akregator/data/
> > > ls: cannot access .kde4/share/apps/akregator/data/feeds.opml: Structure
> > > needs cleaning
> > > total 4
> > > ?? ? ???? feeds.opml
> > 
> > What is the error reported in dmesg when the XFS filesystem shuts down?
> 
> Nothing. I double checked the logs. There are only the messages when
> mounting the filesystem. No other errors are reported than the
> inaccessible file and the output from xfs_check.

I'm now running a kernel with more debug options enabled and got this:

[ 6794.810935] 
[ 6794.810941] =
[ 6794.810955] [ INFO: inconsistent lock state ]
[ 6794.810966] 2.6.35-rc4-btrfs-debug #7
[ 6794.810975] -
[ 6794.810984] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
[ 6794.810996] kswapd0/361 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 6794.811006]  (&(&ip->i_iolock)->mr_lock#2){?+}, at: [] xfs_ilock+0x22/0x67
[ 6794.811039] {RECLAIM_FS-ON-W} state was registered at:
[ 6794.811046]   [] mark_held_locks+0x42/0x5e
[ 6794.811046]   [] lockdep_trace_alloc+0x99/0xb0
[ 6794.811046]   [] __alloc_pages_nodemask+0x6a/0x4a1
[ 6794.811046]   [] __page_cache_alloc+0x11/0x13
[ 6794.811046]   [] grab_cache_page_write_begin+0x47/0x81
[ 6794.811046]   [] block_write_begin_newtrunc+0x2e/0x9c
[ 6794.811046]   [] block_write_begin+0x23/0x5d
[ 6794.811046]   [] xfs_vm_write_begin+0x26/0x28
[ 6794.811046]   [] generic_file_buffered_write+0xb5/0x1bd
[ 6794.811046]   [] xfs_file_aio_write+0x40e/0x66d
[ 6794.811046]   [] do_sync_write+0x8b/0xc6
[ 6794.811046]   [] vfs_write+0x77/0xa4
[ 6794.811046]   [] sys_write+0x3c/0x5e
[ 6794.811046]   [] sysenter_do_call+0x12/0x36
[ 6794.811046] irq event stamp: 141369
[ 6794.811046] hardirqs last  enabled at (141369): [] _raw_spin_unlock_irqrestore+0x36/0x5b
[ 6794.811046] hardirqs last disabled at (141368): [] _raw_spin_lock_irqsave+0x14/0x68
[ 6794.811046] softirqs last  enabled at (141300): [] __do_softirq+0xfe/0x10d
[ 6794.811046] softirqs last disabled at (141295): [] do_softirq+0x2f/0x47
[ 6794.811046] 
[ 6794.811046] other info that might help us debug this:
[ 6794.811046] 2 locks held by kswapd0/361:
[ 6794.811046]  #0:  (shrinker_rwsem){..}, at: [] shrink_slab+0x25/0x13f
[ 6794.811046]  #1:  (&xfs_mount_list_lock){.-}, at: [] xfs_reclaim_inode_shrink+0x2a/0xe8
[ 6794.811046] 
[ 6794.811046] stack backtrace:
[ 6794.811046] Pid: 361, comm: kswapd0 Not tainted 2.6.35-rc4-btrfs-debug #7
[ 6794.811046] Call Trace:
[ 6794.811046]  [] ? printk+0xf/0x17
[ 6794.811046]  [] valid_state+0x134/0x142
[ 6794.811046]  [] mark_lock+0xd0/0x1e9
[ 6794.811046]  [] ? check_usage_forwards+0x0/0x5f
[ 6794.811046]  [] __lock_acquire+0x374/0xc80
[ 6794.811046]  [] ? sched_clock_local+0x12/0x121
[ 6794.811046]  [] ? sched_clock_cpu+0x122/0x133
[ 6794.811046]  [] lock_acquire+0x5f/0x76
[ 6794.811046]  [] ? xfs_ilock+0x22/0x67
[ 6794.811046]  [] down_write_nested+0x32/0x63
[ 6794.811046]  [] ? xfs_ilock+0x22/0x67
[ 6794.811046]  [] xfs_ilock+0x22/0x67
[ 6794.811046]  [] xfs_ireclaim+0x98/0xbb
[ 6794.811046]  [] ? up_write+0x16/0x2b
[ 6794.811046]  [] xfs_reclaim_inode+0x1a7/0x1b1
[ 6794.811046]  [] xfs_inode_ag_walk+0x77/0xbc
[ 6794.811046]  [] ? xfs_reclaim_inode+0x0/0x1b1
[ 6794.811046]  [] xfs_inode_ag_iterator+0x52/0x99
[ 6794.811046]  [] ? xfs_reclaim_inode_shrink+0x2a/0xe8
[ 6794.811046]  [] ? xfs_reclaim_inode+0x0/0x1b1
[ 6794.811046]  [] xfs_reclaim_inode_shrink+0x4b/0xe8
[ 6794.811046]  [] shrink_slab+0xd2/0x13f
[ 6794.811046]  [] kswapd+0x37d/0x4e9
[ 6794.811046]  [] ? autoremove_wake_function+0x0/0x2f
[ 6794.811046]  [] ? kswapd+0x0/0x4e9
[ 6794.811046]  [] kthread+0x60/0x65
[ 6794.811046]  [] ? kthread+0x0/0x65
[ 6794.811046]  [] kernel_thread_helper+0x6/0x10

Don't know if this is related to the problem.
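
To see the call chain in a report like the one above more easily, the frames can be pulled out of the trace. A rough sketch, assuming the addresses show up either as empty "[]" (as in this archive) or as "[<hex>]"; frames marked with "?" are the ones the kernel found on the stack but could not verify, so they are kept separate. The names FRAME and frames are only illustrative:

```python
import re

# Matches e.g. "[ 6794.811046]  [] ? xfs_ilock+0x22/0x67".
# The timestamp "[ 6794.811046]" is skipped because it is neither "[]"
# nor "[<hex>]".
FRAME = re.compile(
    r"\[(?:<[0-9a-f]+>)?\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

def frames(text):
    """Yield (function, reliable) for every stack frame line in text."""
    for line in text.splitlines():
        m = FRAME.search(line)
        if m:
            yield m.group(2), m.group(1) is None

trace = """\
[ 6794.811046]  [] ? xfs_ilock+0x22/0x67
[ 6794.811046]  [] xfs_ilock+0x22/0x67
[ 6794.811046]  [] xfs_ireclaim+0x98/0xbb
[ 6794.811046]  [] ? up_write+0x16/0x2b
[ 6794.811046]  [] xfs_reclaim_inode+0x1a7/0x1b1
"""
reliable = [fn for fn, ok in frames(trace) if ok]
print(" <- ".join(reliable))
```

Reading only the reliable frames of the full trace gives the path from kswapd through shrink_slab and xfs_reclaim_inode_shrink down to xfs_ilock, which is the lock the report complains about.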


regards,
  Johannes

Re: csum errors

2010-07-15 Thread Johannes Hirte

On Thursday, 15 July 2010, 21:35:51, Chris Mason wrote:
> On Thu, Jul 15, 2010 at 09:32:12PM +0200, Johannes Hirte wrote:
> > On Thursday, 15 July 2010, 21:03:09, Chris Mason wrote:
> > > On Thu, Jul 15, 2010 at 08:30:17PM +0200, Johannes Hirte wrote:
> > > > On Tuesday, 13 July 2010, 14:23:58, Johannes Hirte wrote:
> > > > > ino 1959333 off 898342912 csum 4271223884 private 4271223883
> > > > 
> > > > I think, this is a different error. I've only seen them on
> > > > filesystems from my Opteron system. It seems that the recorded csums
> > > > are wrong and it looks to me like rounding errors. The data itself
> > > > should be correct, as I've tested one affected file via md5sum
> > > > against the original on another filesystem. Any ideas what is going
> > > > wrong here?
> > > 
> > > Are you doing data mirroring?
> > 
> > No, I don't.
> > 
> > > We can map that block and do a raw read off the device to see what the
> > > data blocks actually contain.
> > 
> > I've modified the btrfs-source a little to get the data. In inode.c I've
> > changed the code to:
> 
> Great.   The bad csums are all just one bit off, that can't be an
> accident. When were they written (which kernel?).  Did you boot a 32
> bit kernel on there at any time?

No, I don't have a bootable 32bit installation on this system. I've now tested
it with a 32bit system by dumping the whole filesystem to an external drive
and mounting it on a 32bit system. The result was the same.

The affected files were written by different kernels. I think at least 2.6.34,
2.6.35-rc3 and 2.6.35-rc4 were involved, perhaps 2.6.33 too. I'll try to
narrow it down more exactly.
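
Whether the two values differ by one bit or by one unit can be checked directly from the numbers in the log. A small sketch using the six distinct csum/private pairs reported in this thread; the function name compare is just for illustration:

```python
# The six distinct (csum, private) pairs from the dmesg output in this thread.
pairs = [
    (337776576, 337776575),
    (3354885689, 3354885688),
    (847944548, 847944547),
    (686735346, 686735345),
    (2851505977, 2851505976),
    (4271223884, 4271223883),
]

def compare(csum, private):
    """Return (arithmetic difference, number of differing bits)."""
    return csum - private, bin(csum ^ private).count("1")

summary = [compare(c, p) for c, p in pairs]
for (csum, private), (delta, bits) in zip(pairs, summary):
    print(f"csum {csum} private {private}: delta {delta}, {bits} bit(s) differ")
```

In every pair the value after "csum" is exactly one larger than the value after "private". An increment by one flips a single bit only when the smaller value is even, so the number of differing bits varies from pair to pair. This says nothing yet about where the +1 comes from.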

regards,
  Johannes

Re: csum errors

2010-07-15 Thread Johannes Hirte

On Thursday, 15 July 2010, 21:03:09, Chris Mason wrote:
> On Thu, Jul 15, 2010 at 08:30:17PM +0200, Johannes Hirte wrote:
> > On Tuesday, 13 July 2010, 14:23:58, Johannes Hirte wrote:
> > > ino 1959333 off 898342912 csum 4271223884 private 4271223883
> > 
> > I think, this is a different error. I've only seen them on filesystems
> > from my Opteron system. It seems that the recorded csums are wrong and
> > it looks to me like rounding errors. The data itself should be correct,
> > as I've tested one affected file via md5sum against the original on
> > another filesystem. Any ideas what is going wrong here?
> 
> Are you doing data mirroring?

No, I don't.

> We can map that block and do a raw read off the device to see what the
> data blocks actually contain.

I've modified the btrfs-source a little to get the data. In inode.c I've 
changed the code to:


	csum = btrfs_csum_data(root, kaddr + offset, csum,  end - start + 1);
	btrfs_csum_final(csum, (char *)&csum);
	/* debug hack: only log a csum mismatch, do not jump to zeroit,
	 * so the page contents stay readable */
	if (csum != private)
		if (printk_ratelimit()) {
			printk(KERN_INFO "csum != private; ino %lu off %llu "
			       "csum %u private %llu\n", page->mapping->host->i_ino,
			       (unsigned long long)start, csum,
			       (unsigned long long)private);
		}
	// goto zeroit;

	kunmap_atomic(kaddr, KM_USER0);

This way I could read the files with wrong csums too. As I wrote, I've compared
the md5sum of one file with a copy on another filesystem. As they are the
same, at least for this file the data should be correct. The big question is:
why do the csums differ?
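
To check the recorded values from userspace, the checksum of a data block can be recomputed once the block has been read back. A minimal sketch, assuming the data csums are plain CRC-32C (Castagnoli) over each 4096-byte block with the usual all-ones seed and final inversion; that matches my reading of the code above but is not verified against the on-disk format here. The helper names crc32c and block_checksums are made up:

```python
def crc32c(data, crc=0):
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; apply the polynomial if that bit was set.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

def block_checksums(data, block_size=4096):
    """Return one CRC-32C per block_size chunk of data (assumed layout)."""
    return [
        crc32c(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]

# Standard CRC-32C check value for the ASCII string "123456789".
print(hex(crc32c(b"123456789")))
```

The bitwise loop is slow but easy to verify by hand. Comparing its result for an affected block with both the "csum" and the "private" value from dmesg would show which of the two is the wrong one.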

regards,
  Johannes

csum errors

2010-07-15 Thread Johannes Hirte

On Tuesday, 13 July 2010, 14:23:58, Johannes Hirte wrote:
> On the Opteron system I got now csum errors. I've synced some data from the
> netbook to the Opteron yesterday. After hitting ENOSPC with 4GB free, I've
> run 'btrfs-vol -b' on this fs in hope to get some more free space. It
> worked but the command failed and I found in dmesg:
> 
> btrfs csum failed ino 339 off 935280640 csum 337776576 private 337776575
> btrfs csum failed ino 339 off 935280640 csum 337776576 private 337776575
> btrfs csum failed ino 339 off 935280640 csum 337776576 private 337776575
> btrfs csum failed ino 339 off 935280640 csum 337776576 private 337776575
> 
> So I've tested the new synced data by syncing them to another disk on the
> Opteron system (XFS). As I've expected (or better feared), some data wasn't
> readable and I found more csum errors in dmesg:
> 
> btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
> btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
> btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
> btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
> btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
> btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
> btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
> btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
> btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
> btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
> btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
> btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
> btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
> btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
> btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
> btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
> btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
> btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
> btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
> btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
> btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
> btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
> btrfs csum failed ino 1959333 off 252362752 csum 686735346 private 686735345
> btrfs csum failed ino 1959333 off 252362752 csum 686735346 private 686735345
> btrfs csum failed ino 1959333 off 252362752 csum 686735346 private 686735345
> btrfs csum failed ino 1959333 off 252362752 csum 686735346 private 686735345
> btrfs csum failed ino 1959333 off 252362752 csum 686735346 private 686735345
> btrfs csum failed ino 1959333 off 252362752 csum 686735346 private 686735345
> btrfs csum failed ino 1959333 off 651108352 csum 2851505977 private 2851505976
> btrfs csum failed ino 1959333 off 651108352 csum 2851505977 private 2851505976
> btrfs csum failed ino 1959333 off 651108352 csum 2851505977 private 2851505976
> btrfs csum failed ino 1959333 off 651108352 csum 2851505977 private 2851505976
> btrfs csum failed ino 1959333 off 651108352 csum 2851505977 private 2851505976
> btrfs csum failed ino 1959333 off 651108352 csum 2851505977 private 2851505976
> btrfs csum failed ino 1959333 off 898342912 csum 4271223884 private 4271223883
> btrfs csum failed ino 1959333 off 898342912 csum 4271223884 private 4271223883
> btrfs csum failed ino 1959333 off 898342912 csum 4271223884 private 4271223883
> btrfs csum failed ino 1959333 off 898342912 csum 4271223884 private 4271223883
> btrfs csum failed ino 1959333 off 898342912 csum 4271223884 private 4271223883
> btrfs csum failed ino 1959333 off 898342912 csum 4271223884 private 4271223883

I think this is a different error. I've only seen these on filesystems from my
Opteron system. It seems that the recorded csums are wrong, and it looks to me
like rounding errors. The data itself should be correct, as I've tested one
affected file via md5sum against the original on another filesystem.
Any ideas what is going wrong here?
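
The md5sum comparison can be scripted for more than one file. A small sketch using only hashlib from the Python standard library; the function names and the chunk size are arbitrary choices for illustration:

```python
import hashlib

def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops at the empty read that marks EOF.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_content(path_a, path_b):
    """True if both files have the same MD5 digest."""
    return md5_of(path_a) == md5_of(path_b)
```

Reading in chunks keeps memory use flat for large files. Note that on an unpatched kernel the read of an affected file would be expected to fail with an I/O error instead of returning data, so this only helps together with the debugging change that skips the zeroing.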

regards,
  Johannes

Re: kernel BUG at fs/btrfs/extent-tree.c:1353

2010-07-15 Thread Johannes Hirte

On Thursday, 15 July 2010, 02:11:04, Dave Chinner wrote:
> On Wed, Jul 14, 2010 at 05:25:23PM +0200, Johannes Hirte wrote:
> > On Thursday, 08 July 2010, 16:31:09, Chris Mason wrote:
> > I'm not sure if btrfs is to blame for this error. After the errors I
> > switched to XFS on this system and got now this error:
> > 
> > ls -l .kde4/share/apps/akregator/data/
> > ls: cannot access .kde4/share/apps/akregator/data/feeds.opml: Structure
> > needs cleaning
> > total 4
> > ?? ? ???? feeds.opml
> 
> What is the error reported in dmesg when the XFS filesystem shuts down?

Nothing. I double checked the logs. There are only the messages when mounting 
the filesystem. No other errors are reported than the inaccessible file and the 
output from xfs_check.

regards,
  Johannes

Re: kernel BUG at fs/btrfs/extent-tree.c:1353

2010-07-14 Thread Johannes Hirte

On Thursday, 08 July 2010, 16:31:09, Chris Mason wrote:
> Neither Yan nor I have been able to reproduce this locally, but a few
> people have now hit it.  Johannes, are you available to try out a
> debugging kernel to try and track this down?
> 
> -chris
> 
> On Thu, Jul 08, 2010 at 04:27:23PM +0200, Johannes Hirte wrote:
> > When doing a 'rm -r /var/tmp/portage/sys-devel' I get the following Oops:
> > 
> > [ cut here ]
> > kernel BUG at fs/btrfs/extent-tree.c:1353!
> > invalid opcode:  [#1] PREEMPT SMP
> > last sysfs file:
> > /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0A:00/power_supply/BAT0/charge_full
> > Modules linked in: snd_seq_dummy snd_seq_oss snd_seq_midi_event
> > snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss nfs lockd nfs_acl
> > auth_rpcgss sunrpc sco rfcomm bnep l2cap crc16 xts gf128mul usb_storage
> > dm_crypt dm_mod coretemp hwmon acpi_cpufreq mperf snd_hda_codec_realtek
> > uvcvideo iwl3945 snd_hda_intel snd_hda_codec iwlcore videodev r8169
> > snd_hwdep btusb snd_pcm v4l1_compat mac80211 snd_timer bluetooth snd mii
> > cfg80211 soundcore sg rfkill ac i2c_i801 snd_page_alloc uhci_hcd battery
> > [last unloaded: microcode]
> > 
> > Pid: 2358, comm: rm Not tainted 2.6.35-rc4 #32 M912/M912
> > EIP: 0060:[] EFLAGS: 00010202 CPU: 1
> > EIP is at lookup_inline_extent_backref+0xf2/0x406
> > EAX: 0001 EBX: 0007 ECX:  EDX: 
> > ESI: 0004 EDI: f7268150 EBP: 0004 ESP: f5aa5d08
> > 
> >  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> > 
> > Process rm (pid: 2358, ti=f5aa4000 task=f6f0fa70 task.ti=f5aa4000)
> > 
> > Stack:
> >  f702f8c0 f744e080 f665f380 00b0    f6c80f00
> > 
> > <0> f744e080 c10ec226 e98acfff f6c98000 1001 0e987000 0004
> >  <0> 0850 040e9870 a800 1000  0007
> >  0e987000
> > 
> > Call Trace:
> >  [] ? set_extent_dirty+0x19/0x1d
> >  [] ? __btrfs_free_extent+0xda/0x675
> >  [] ? run_clustered_refs+0x699/0x6d7
> >  [] ? btrfs_mark_buffer_dirty+0xa3/0xef
> >  [] ? btrfs_find_ref_cluster+0xf9/0x13a
> >  [] ? btrfs_run_delayed_refs+0xbf/0x155
> >  [] ? __btrfs_end_transaction+0x53/0x16c
> >  [] ? btrfs_delete_inode+0x166/0x17e
> >  [] ? get_parent_ip+0x8/0x19
> >  [] ? generic_delete_inode+0x6f/0xbd
> >  [] ? iput+0x46/0x48
> >  [] ? do_unlinkat+0xc7/0x109
> >  [] ? get_parent_ip+0x8/0x19
> >  [] ? fput+0x12/0x15c
> >  [] ? dnotify_flush+0x41/0xc2
> >  [] ? filp_close+0x4c/0x52
> >  [] ? sys_close+0x62/0x9b
> >  [] ? sysenter_do_call+0x12/0x26
> > 
> > Code: 80 4e 68 02 8d 4c 24 43 89 f8 6a 01 ff 74 24 1c ff 74 24 08 8b 54
> > 24 38 e8 01 c2 ff ff 83 c4 0c 83 f8 00 0f 8c e1 02 00 00 74 02 <0f> 0b
> > 8b 04 24 8b 34 24 8b 00 8b 56 20 89 44 24 08 e8 2e fa ff
> > EIP: [] lookup_inline_extent_backref+0xf2/0x406 SS:ESP 0068:f5aa5d08
> > ---[ end trace d97601f0b455ca72 ]---
> > note: rm[2358] exited with preempt_count 2
> > BUG: scheduling while atomic: rm/2358/0x1003
> > Modules linked in: snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq
> > snd_seq_device snd_pcm_oss snd_mixer_oss nfs lockd nfs_acl auth_rpcgss
> > sunrpc sco rfcomm bnep l2cap crc16 xts gf128mul usb_storage dm_crypt
> > dm_mod coretemp hwmon acpi_cpufreq mperf snd_hda_codec_realtek uvcvideo
> > iwl3945 snd_hda_intel snd_hda_codec iwlcore videodev r8169 snd_hwdep
> > btusb snd_pcm v4l1_compat mac80211 snd_timer bluetooth snd mii cfg80211
> > soundcore sg rfkill ac i2c_i801 snd_page_alloc uhci_hcd battery [last
> > unloaded: microcode]
> > Pid: 2358, comm: rm Tainted: G  D 2.6.35-rc4 #32
> > 
> > Call Trace:
> >  [] ? schedule+0x88/0x332
> >  [] ? __cond_resched+0xf/0x19
> >  [] ? _cond_resched+0x12/0x18
> >  [] ? unmap_vmas+0x4e7/0x534
> >  [] ? exit_mmap+0x64/0xa4
> >  [] ? mmput+0x21/0x96
> >  [] ? exit_mm+0xe7/0xf0
> >  [] ? _raw_spin_unlock_irqrestore+0x1a/0x24
> >  [] ? hrtimer_try_to_cancel+0x31/0x3a
> >  [] ? do_exit+0x17b/0x57d
> >  [] ? kmsg_dump+0x81/0xf9
> >  [] ? do_invalid_op+0x0/0x76
> >  [] ? oops_end+0x72/0x75
> >  [] ? do_invalid_op+0x69/0x76
> >  [] ? lookup_inline_extent_backref+0xf2/0x406
> >  [] ? generic_bin_search.clone.0+0x145/0x150
> >  [] ? btrfs_cow_block+0x106/0x112
> >  [] ? bin_search+0x37/0x3d
> >  [] ? btrfs_search_slot+0x405/0x477
> >  [] ? error_code+0x66/0x6c
> >  [] ? do_invalid_op+0x0/0x76
> >  [] ? lookup_inline_extent_back

Re: kernel BUG at fs/btrfs/extent-tree.c:1353

2010-07-13 Thread Johannes Hirte

On Sunday, 11 July 2010, 14:28:09, Johannes Hirte wrote:
...
> I've three systems running with btrfs, a dual Opteron (252), a Pentium 4
> system and a netbook with N270 Atom. The netbook is the only one that shows
> the errors. It's also the only system where I'm using gcc-4.5. Perhaps it's
> related, but I doubt it's the only reason as I'm using gcc-4.5 since May.

On the Opteron system I now got csum errors. I synced some data from the
netbook to the Opteron yesterday. After hitting ENOSPC with 4GB free, I ran
'btrfs-vol -b' on this fs in the hope of getting some more free space. It
worked, but the command failed and I found in dmesg:

btrfs csum failed ino 339 off 935280640 csum 337776576 private 337776575
btrfs csum failed ino 339 off 935280640 csum 337776576 private 337776575
btrfs csum failed ino 339 off 935280640 csum 337776576 private 337776575
btrfs csum failed ino 339 off 935280640 csum 337776576 private 337776575

So I tested the newly synced data by syncing it to another disk on the
Opteron system (XFS). As I expected (or rather feared), some data wasn't
readable and I found more csum errors in dmesg:

btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
btrfs csum failed ino 1912210 off 5095424 csum 847944548 private 847944547
btrfs csum failed ino 1959333 off 252362752 csum 686735346 private 686735345
btrfs csum failed ino 1959333 off 252362752 csum 686735346 private 686735345
btrfs csum failed ino 1959333 off 252362752 csum 686735346 private 686735345
btrfs csum failed ino 1959333 off 252362752 csum 686735346 private 686735345
btrfs csum failed ino 1959333 off 252362752 csum 686735346 private 686735345
btrfs csum failed ino 1959333 off 252362752 csum 686735346 private 686735345
btrfs csum failed ino 1959333 off 651108352 csum 2851505977 private 2851505976
btrfs csum failed ino 1959333 off 651108352 csum 2851505977 private 2851505976
btrfs csum failed ino 1959333 off 651108352 csum 2851505977 private 2851505976
btrfs csum failed ino 1959333 off 651108352 csum 2851505977 private 2851505976
btrfs csum failed ino 1959333 off 651108352 csum 2851505977 private 2851505976
btrfs csum failed ino 1959333 off 651108352 csum 2851505977 private 2851505976
btrfs csum failed ino 1959333 off 898342912 csum 4271223884 private 4271223883
btrfs csum failed ino 1959333 off 898342912 csum 4271223884 private 4271223883
btrfs csum failed ino 1959333 off 898342912 csum 4271223884 private 4271223883
btrfs csum failed ino 1959333 off 898342912 csum 4271223884 private 4271223883
btrfs csum failed ino 1959333 off 898342912 csum 4271223884 private 4271223883
btrfs csum failed ino 1959333 off 898342912 csum 4271223884 private 4271223883

I suspect something goes horribly wrong when writing to disk within btrfs. On
the netbook I got missing blocks, on the Opteron system bad csums. Both
systems are running linux-2.6.35-rc4, the netbook with gcc-4.5.0, the Opteron
system with gcc-4.4.4. I'll test the P4 system later to see if there are
similar errors too.
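
Since the same failure is logged many times, it helps to collapse the dmesg output to the distinct records first. A rough sketch that assumes the exact "btrfs csum failed ino ... off ... csum ... private ..." wording shown above; the 4096-byte block size used for the block index is an assumption too:

```python
import re
from collections import Counter

CSUM_LINE = re.compile(
    r"btrfs csum failed ino (\d+) off (\d+) csum (\d+) private (\d+)"
)

def summarize(log_text):
    """Count how often each distinct (ino, off, csum, private) is logged."""
    return Counter(
        tuple(int(x) for x in m.groups())
        for m in CSUM_LINE.finditer(log_text)
    )

log = """\
btrfs csum failed ino 339 off 935280640 csum 337776576 private 337776575
btrfs csum failed ino 339 off 935280640 csum 337776576 private 337776575
btrfs csum failed ino 1849137 off 368640 csum 3354885689 private 3354885688
"""
counts = summarize(log)
for (ino, off, csum, private), n in counts.items():
    # Assumption: 4096-byte blocks, so off // 4096 is the block index.
    print(f"ino {ino} block {off // 4096}: logged {n} times")
```

Applied to the full output above this leaves six distinct records in four inodes, which is a much shorter list to check by hand.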

regards,
  Johannes

Re: kernel BUG at fs/btrfs/extent-tree.c:1353

2010-07-11 Thread Johannes Hirte

It's getting worse. The /home partition is now affected too. I get the Oops
simply on unmounting the fs. btrfsck gives me this output for this fs:

btrfsck /dev/mapper/sdb3 
leaf 123780497408 items 49 free space 271 generation 62207 owner 2
fs uuid 7f013285-88d8-452f-a139-7d44bffd14b6
chunk uuid 365526c9-e209-46a1-8963-3157306d9e05
item 0 key (123780108288 EXTENT_ITEM 4096) itemoff 3944 itemsize 51
extent refs 1 gen 45309 flags 2
tree block key (836797 6c 0) level 1
tree block backref root 5
item 1 key (123780112384 EXTENT_ITEM 4096) itemoff 3893 itemsize 51
extent refs 1 gen 45309 flags 2
tree block key (836955 c 416051) level 0
tree block backref root 5
item 2 key (123780116480 EXTENT_ITEM 4096) itemoff 3842 itemsize 51
extent refs 1 gen 24599 flags 2
tree block key (18446744073709551606 80 122869407744) level 0
tree block backref root 7
item 3 key (123780120576 EXTENT_ITEM 4096) itemoff 3791 itemsize 51
extent refs 1 gen 49958 flags 2
tree block key (979249 1 0) level 0
tree block backref root 5
item 4 key (123780124672 EXTENT_ITEM 4096) itemoff 3740 itemsize 51
extent refs 1 gen 62191 flags 2
tree block key (1220 1 0) level 0
tree block backref root 5
item 5 key (123780128768 EXTENT_ITEM 4096) itemoff 3689 itemsize 51
extent refs 1 gen 54817 flags 2
tree block key (1001168 c 455590) level 0
tree block backref root 5
item 6 key (123780132864 EXTENT_ITEM 4096) itemoff 3638 itemsize 51
extent refs 1 gen 62201 flags 2
tree block key (28712 1 0) level 1
tree block backref root 5
item 7 key (123780136960 EXTENT_ITEM 4096) itemoff 3587 itemsize 51
extent refs 1 gen 62191 flags 2
tree block key (34645 c 33037) level 0
tree block backref root 5
item 8 key (123780141056 EXTENT_ITEM 4096) itemoff 3536 itemsize 51
extent refs 1 gen 50007 flags 2
tree block key (31007 60 4044) level 0
tree block backref root 5
item 9 key (123780145152 EXTENT_ITEM 4096) itemoff 3485 itemsize 51
extent refs 1 gen 62202 flags 2
tree block key (123644329984 a8 4096) level 0
tree block backref root 2
item 10 key (123780149248 EXTENT_ITEM 4096) itemoff 3434 itemsize 51
extent refs 1 gen 62202 flags 2
tree block key (123644854272 a8 4096) level 0
tree block backref root 2
item 11 key (123780153344 EXTENT_ITEM 4096) itemoff 3383 itemsize 51
extent refs 1 gen 62202 flags 2
tree block key (123645849600 a8 4096) level 0
tree block backref root 2
item 12 key (123780157440 EXTENT_ITEM 4096) itemoff 3332 itemsize 51
extent refs 1 gen 62207 flags 2
tree block key (123411308544 a8 4096) level 2
tree block backref root 2
item 13 key (123780161536 EXTENT_ITEM 4096) itemoff 3281 itemsize 51
extent refs 1 gen 62200 flags 2
tree block key (1325101 c 1264) level 0
tree block backref root 5
item 14 key (123780165632 EXTENT_ITEM 4096) itemoff 3230 itemsize 51
extent refs 1 gen 56401 flags 2
tree block key (59621 1 0) level 1
tree block backref root 5
item 15 key (123780169728 EXTENT_ITEM 4096) itemoff 3179 itemsize 51
extent refs 1 gen 24613 flags 2
tree block key (18446744073709551606 80 125996056576) level 0
tree block backref root 7
item 16 key (123780173824 EXTENT_ITEM 4096) itemoff 3128 itemsize 51
extent refs 1 gen 62189 flags 2
tree block key (1324334 1 0) level 0
tree block backref root 5
item 17 key (123780177920 EXTENT_ITEM 4095) itemoff 3077 itemsize 51
extent refs 1 gen 62207 flags 2
tree block key (123682791424 a8 4096) level 1
tree block backref root 2
item 18 key (123780182016 EXTENT_ITEM 4096) itemoff 3026 itemsize 51
extent refs 1 gen 62202 flags 2
tree block key (123648741376 a8 4096) level 0
tree block backref root 2
item 19 key (123780186112 EXTENT_ITEM 4096) itemoff 2975 itemsize 51
extent refs 1 gen 62201 flags 2
tree block key (28854 54 1781866506) level 0
tree block backref root 5
item 20 key (123780190208 EXTENT_ITEM 4096) itemoff 2924 itemsize 51
extent refs 1 gen 62192 flags 2
tree block key
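
In the part of the btrfsck dump shown above, every EXTENT_ITEM key has size 4096 except item 17, which shows 4095. A small sketch to flag such entries in a longer dump, assuming the "item N key (objectid EXTENT_ITEM size)" line format seen here and a 4096-byte block size; whether that entry has anything to do with the Oops is not established here:

```python
import re

ITEM = re.compile(r"item (\d+) key \((\d+) EXTENT_ITEM (\d+)\)")

def odd_sized_extents(text, block_size=4096):
    """Return (item, objectid, size) for extents not a multiple of block_size."""
    return [
        (int(item), int(objectid), int(size))
        for item, objectid, size in ITEM.findall(text)
        if int(size) % block_size != 0
    ]

sample = """\
item 16 key (123780173824 EXTENT_ITEM 4096) itemoff 3128 itemsize 51
item 17 key (123780177920 EXTENT_ITEM 4095) itemoff 3077 itemsize 51
item 18 key (123780182016 EXTENT_ITEM 4096) itemoff 3026 itemsize 51
"""
flagged = odd_sized_extents(sample)
print(flagged)
```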

Re: kernel BUG at fs/btrfs/extent-tree.c:1353

2010-07-08 Thread Johannes Hirte

Am Donnerstag 08 Juli 2010, 16:31:09 schrieb Chris Mason:
> Neither Yan nor I have been able to reproduce this locally, but a few
> people have now hit it.  Johannes, are you available to try out a
> debugging kernel to try and track this down?

Sure, just tell me what to do. Is it enough to recompile the kernel with debug 
options enabled or are special debug patches necessary?

regards,
  Johannes

kernel BUG at fs/btrfs/extent-tree.c:1353

2010-07-08 Thread Johannes Hirte

When doing a 'rm -r /var/tmp/portage/sys-devel' I get the following Oops:

[ cut here ]
kernel BUG at fs/btrfs/extent-tree.c:1353!
invalid opcode:  [#1] PREEMPT SMP 
last sysfs file: 
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0A:00/power_supply/BAT0/charge_full
Modules linked in: snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq 
snd_seq_device snd_pcm_oss snd_mixer_oss nfs lockd nfs_acl auth_rpcgss sunrpc 
sco rfcomm bnep l2cap crc16 xts gf128mul usb_storage dm_crypt dm_mod coretemp 
hwmon acpi_cpufreq mperf snd_hda_codec_realtek uvcvideo iwl3945 snd_hda_intel 
snd_hda_codec iwlcore videodev r8169 snd_hwdep btusb snd_pcm v4l1_compat 
mac80211 snd_timer bluetooth snd mii cfg80211 soundcore sg rfkill ac i2c_i801 
snd_page_alloc uhci_hcd battery [last unloaded: microcode]

Pid: 2358, comm: rm Not tainted 2.6.35-rc4 #32 M912/M912
EIP: 0060:[] EFLAGS: 00010202 CPU: 1
EIP is at lookup_inline_extent_backref+0xf2/0x406
EAX: 0001 EBX: 0007 ECX:  EDX: 
ESI: 0004 EDI: f7268150 EBP: 0004 ESP: f5aa5d08
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process rm (pid: 2358, ti=f5aa4000 task=f6f0fa70 task.ti=f5aa4000)
Stack:
 f702f8c0 f744e080 f665f380 00b0    f6c80f00
<0> f744e080 c10ec226 e98acfff f6c98000 1001 0e987000 0004 
<0> 0850 040e9870 a800 1000  0007  0e987000
Call Trace:
 [] ? set_extent_dirty+0x19/0x1d
 [] ? __btrfs_free_extent+0xda/0x675
 [] ? run_clustered_refs+0x699/0x6d7
 [] ? btrfs_mark_buffer_dirty+0xa3/0xef
 [] ? btrfs_find_ref_cluster+0xf9/0x13a
 [] ? btrfs_run_delayed_refs+0xbf/0x155
 [] ? __btrfs_end_transaction+0x53/0x16c
 [] ? btrfs_delete_inode+0x166/0x17e
 [] ? get_parent_ip+0x8/0x19
 [] ? generic_delete_inode+0x6f/0xbd
 [] ? iput+0x46/0x48
 [] ? do_unlinkat+0xc7/0x109
 [] ? get_parent_ip+0x8/0x19
 [] ? fput+0x12/0x15c
 [] ? dnotify_flush+0x41/0xc2
 [] ? filp_close+0x4c/0x52
 [] ? sys_close+0x62/0x9b
 [] ? sysenter_do_call+0x12/0x26
Code: 80 4e 68 02 8d 4c 24 43 89 f8 6a 01 ff 74 24 1c ff 74 24 08 8b 54 24 38
e8 01 c2 ff ff 83 c4 0c 83 f8 00 0f 8c e1 02 00 00 74 02 <0f> 0b 8b 04 24 8b 34 24
8b 00 8b 56 20 89 44 24 08 e8 2e fa ff
EIP: [] lookup_inline_extent_backref+0xf2/0x406 SS:ESP 0068:f5aa5d08
---[ end trace d97601f0b455ca72 ]---
note: rm[2358] exited with preempt_count 2
BUG: scheduling while atomic: rm/2358/0x1003
Modules linked in: snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq 
snd_seq_device snd_pcm_oss snd_mixer_oss nfs lockd nfs_acl auth_rpcgss sunrpc 
sco rfcomm bnep l2cap crc16 xts gf128mul usb_storage dm_crypt dm_mod coretemp 
hwmon acpi_cpufreq mperf snd_hda_codec_realtek uvcvideo iwl3945 snd_hda_intel 
snd_hda_codec iwlcore videodev r8169 snd_hwdep btusb snd_pcm v4l1_compat 
mac80211 snd_timer bluetooth snd mii cfg80211 soundcore sg rfkill ac i2c_i801 
snd_page_alloc uhci_hcd battery [last unloaded: microcode]
Pid: 2358, comm: rm Tainted: G  D 2.6.35-rc4 #32
Call Trace:
 [] ? schedule+0x88/0x332
 [] ? __cond_resched+0xf/0x19
 [] ? _cond_resched+0x12/0x18
 [] ? unmap_vmas+0x4e7/0x534
 [] ? exit_mmap+0x64/0xa4
 [] ? mmput+0x21/0x96
 [] ? exit_mm+0xe7/0xf0
 [] ? _raw_spin_unlock_irqrestore+0x1a/0x24
 [] ? hrtimer_try_to_cancel+0x31/0x3a
 [] ? do_exit+0x17b/0x57d
 [] ? kmsg_dump+0x81/0xf9
 [] ? do_invalid_op+0x0/0x76
 [] ? oops_end+0x72/0x75
 [] ? do_invalid_op+0x69/0x76
 [] ? lookup_inline_extent_backref+0xf2/0x406
 [] ? generic_bin_search.clone.0+0x145/0x150
 [] ? btrfs_cow_block+0x106/0x112
 [] ? bin_search+0x37/0x3d
 [] ? btrfs_search_slot+0x405/0x477
 [] ? error_code+0x66/0x6c
 [] ? do_invalid_op+0x0/0x76
 [] ? lookup_inline_extent_backref+0xf2/0x406
 [] ? set_extent_dirty+0x19/0x1d
 [] ? __btrfs_free_extent+0xda/0x675
 [] ? run_clustered_refs+0x699/0x6d7
 [] ? btrfs_mark_buffer_dirty+0xa3/0xef
 [] ? btrfs_find_ref_cluster+0xf9/0x13a
 [] ? btrfs_run_delayed_refs+0xbf/0x155
 [] ? __btrfs_end_transaction+0x53/0x16c
 [] ? btrfs_delete_inode+0x166/0x17e
 [] ? get_parent_ip+0x8/0x19
 [] ? generic_delete_inode+0x6f/0xbd
 [] ? iput+0x46/0x48
 [] ? do_unlinkat+0xc7/0x109
 [] ? get_parent_ip+0x8/0x19
 [] ? fput+0x12/0x15c
 [] ? dnotify_flush+0x41/0xc2
 [] ? filp_close+0x4c/0x52
 [] ? sys_close+0x62/0x9b
 [] ? sysenter_do_call+0x12/0x26
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Still ENOSPC problems with 2.6.35-rc3

2010-06-17 Thread Johannes Hirte

On Thursday, 17 June 2010, 02:47:07, Yan, Zheng wrote:
> On Thu, Jun 17, 2010 at 7:56 AM, Johannes Hirte
> 
>  wrote:
> > On Thursday, 17 June 2010, 01:12:54, Yan, Zheng wrote:
> >> On Thu, Jun 17, 2010 at 1:48 AM, Johannes Hirte
> >> 
> >>  wrote:
> >> > With kernel-2.6.34 I ran into the ENOSPC problems that were reported
> >> > on this list recently. The filesystem was somewhat over 90% full and
> >> > most operations on it caused an Oops. I was able to delete files by
> >> > trial and error and freed up half of the filesystem space. Operations
> >> > on the other files still caused an Oops.
> >> > 
> >> > Some patches went into 2.6.35 that address this problem.
> >> > Sadly they don't fix it but only avoid the Oops. A simple 'ls' on
> >> > this filesystem results in
> >> 
> >> To avoid ENOSPC oops, btrfs in 2.6.35 reserves more metadata space for
> >> system use than older btrfs. If the FS has already run out of metadata
> >> space, using btrfs in 2.6.35 doesn't help.
> >> 
> >> Yan, Zheng
> > 
> > So how can this be fixed/avoided? There must be some free metadata space,
> > since I was able to delete files, more than 20Gig, mostly small files.
> > Also from my understanding, when freeing space by deleting files,
> > metadata space should be freed. Or am I getting something wrong here?
> > 2.6.35 does change something, since I can delete more files, where 2.6.34
> > does Oops. But you're right, it doesn't help at all. So, where is this
> > space and why can't it be used?
> 
> what will happen if you keep deleting files using 2.6.35?

With 2.6.35 I'm able to continue deleting files, even those where 2.6.34 would
Oops. It's slow and gives me many of these warnings in dmesg, but the files do
get deleted. I haven't tried it to the end, but I can if you want. I've saved
the affected filesystem to a separate partition, so I can test with it.

regards,
  Johannes

Re: Still ENOSPC problems with 2.6.35-rc3

2010-06-16 Thread Johannes Hirte

On Thursday, 17 June 2010, 01:12:54, Yan, Zheng wrote:
> On Thu, Jun 17, 2010 at 1:48 AM, Johannes Hirte
> 
>  wrote:
> > With kernel-2.6.34 I ran into the ENOSPC problems that were reported on
> > this list recently. The filesystem was somewhat over 90% full and most
> > operations on it caused an Oops. I was able to delete files by trial and
> > error and freed up half of the filesystem space. Operations on the other
> > files still caused an Oops.
> > 
> > Some patches went into 2.6.35 that address this problem. Sadly
> > they don't fix it but only avoid the Oops. A simple 'ls' on this
> > filesystem results in
> 
> To avoid ENOSPC oops, btrfs in 2.6.35 reserves more metadata space for
> system use than older btrfs. If the FS has already run out of metadata
> space, using btrfs in 2.6.35 doesn't help.
> 
> Yan, Zheng

So how can this be fixed/avoided? There must be some free metadata space, since
I was able to delete files, more than 20Gig, mostly small files. Also from my
understanding, when freeing space by deleting files, metadata space should be
freed. Or am I getting something wrong here?
2.6.35 does change something, since I can delete more files, where 2.6.34 does
Oops. But you're right, it doesn't help at all. So, where is this space and
why can't it be used?
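
One way to put numbers on this question is to look at the
"block_rsv size ... reserved ..." line that 2.6.35-rc3 prints after each
warning in the original report of this thread. What follows is only a minimal
Python sketch, not kernel code. The helper name parse_block_rsv is made up for
illustration, and reading "size" as the wanted reservation and "reserved" as
what is currently held is an assumption based on the field names alone.

```python
import re

# Matches the debug line printed after the btrfs_block_rsv_check warning,
# e.g. "block_rsv size 654311424 reserved 67809280 freed 0 0".
_BLOCK_RSV_RE = re.compile(r"block_rsv size (\d+) reserved (\d+)")


def parse_block_rsv(line):
    """Return (size, reserved, shortfall) in bytes, or None if no match.

    Hypothetical helper: 'shortfall' is simply size - reserved, on the
    assumption that 'size' is the wanted reservation and 'reserved' is
    what the filesystem currently holds.
    """
    match = _BLOCK_RSV_RE.search(line)
    if match is None:
        return None
    size, reserved = int(match.group(1)), int(match.group(2))
    return size, reserved, size - reserved


if __name__ == "__main__":
    # Line copied from the warning output in the original report.
    sample = "block_rsv size 654311424 reserved 67809280 freed 0 0"
    size, reserved, shortfall = parse_block_rsv(sample)
    mib = 1024 * 1024
    print(f"wanted {size / mib:.1f} MiB, held {reserved / mib:.1f} MiB, "
          f"short by {shortfall / mib:.1f} MiB")
```

If that reading of the fields is right, the reservation is far below its
target, which would fit the -28 (ENOSPC) error even though files can still
be deleted.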

regards,
  Johannes

Re: [patch 11/11] btrfs: The file argument for fsync() is never null

2010-06-16 Thread Johannes Hirte

On Tuesday, 15 June 2010, 02:08:20, Chris Mason wrote:
> On Mon, Jun 14, 2010 at 11:45:40PM +0200, Johannes Hirte wrote:
> > On Monday, 14 June 2010, 23:16:01, Christoph Hellwig wrote:
> > > On Mon, Jun 14, 2010 at 11:11:20PM +0200, Dan Carpenter wrote:
> > > > > Looks like you've applied the patch to a far too old kernel.  It
> > > > > can't be NULL for quite a while already.
> > > > 
> > > > You're the expert, but it looks like it could be null in 2.6.34 like
> > > > he says.  I'm just looking at vfs_fsync_range() in
> > > > "git show v2.6.34:fs/sync.c".
> > > 
> > > 2.6.34 is far too old.
> > 
> > For the changes yes, but not for working. I needed the btrfs fixes
> > without all the other bugs introduced with 2.6.35-rc. I was too careless
> > and pulled too many changes in. My fault.
> 
> Well, my fault.  I usually keep the btrfs-unstable tree against one
> release old, and the users have come to expect it.
> 
> I'll make a .34 branch that works.
> 
> -chris

What about backporting only the important patches to the stable series? Or
would this be too much work for a still experimental filesystem?

regards,
  Johannes

Still ENOSPC problems with 2.6.35-rc3

2010-06-16 Thread Johannes Hirte

With kernel-2.6.34 I ran into the ENOSPC problems that were reported on this
list recently. The filesystem was somewhat over 90% full and most operations on
it caused an Oops. I was able to delete files by trial and error and freed up
half of the filesystem space. Operations on the other files still caused an Oops.

Some patches went into 2.6.35 that address this problem. Sadly they
don't fix it but only avoid the Oops. A simple 'ls' on this filesystem results
in

[ cut here ]
WARNING: at fs/btrfs/extent-tree.c:3441 btrfs_block_rsv_check+0x10c/0x13e()
Hardware name: To Be Filled By O.E.M.
Modules linked in: snd_seq_midi snd_emu10k1_synth snd_emux_synth 
snd_seq_virmidi snd_seq_midi_emul snd_seq_oss snd_seq_midi_event snd_seq 
snd_pcm_oss snd_mixer_oss radeon ttm drm_kms_helper drm i2c_algo_bit 
snd_emu10k1 snd_rawmidi snd_ac97_codec ac97_bus snd_pcm snd_seq_device 
snd_timer snd_page_alloc snd_util_mem snd_hwdep snd amd64_edac_mod sata_sil sg 
sr_mod uhci_hcd ohci_hcd edac_core edac_mce_amd k8temp i2c_amd8111 i2c_amd756 
hwmon
Pid: 26973, comm: ls Not tainted 2.6.35-rc3 #1
Call Trace:
 [] ? warn_slowpath_common+0x78/0x8c
 [] ? btrfs_block_rsv_check+0x10c/0x13e
 [] ? __btrfs_end_transaction+0x9f/0x1b1
 [] ? btrfs_dirty_inode+0x58/0xf9
 [] ? __mark_inode_dirty+0x25/0x149
 [] ? touch_atime+0xfc/0x125
 [] ? filldir+0x0/0xc3
 [] ? vfs_readdir+0x76/0x9c
 [] ? sys_getdents+0x7d/0xcd
 [] ? page_fault+0x1f/0x30
 [] ? system_call_fastpath+0x16/0x1b
---[ end trace 4aa882f64f792d16 ]---
block_rsv size 654311424 reserved 67809280 freed 0 0
[ cut here ]
WARNING: at fs/btrfs/extent-tree.c:3441 btrfs_block_rsv_check+0x10c/0x13e()
Hardware name: To Be Filled By O.E.M.
Modules linked in: snd_seq_midi snd_emu10k1_synth snd_emux_synth 
snd_seq_virmidi snd_seq_midi_emul snd_seq_oss snd_seq_midi_event snd_seq 
snd_pcm_oss snd_mixer_oss radeon ttm drm_kms_helper drm i2c_algo_bit 
snd_emu10k1 snd_rawmidi snd_ac97_codec ac97_bus snd_pcm snd_seq_device 
snd_timer snd_page_alloc snd_util_mem snd_hwdep snd amd64_edac_mod sata_sil sg 
sr_mod uhci_hcd ohci_hcd edac_core edac_mce_amd k8temp i2c_amd8111 i2c_amd756 
hwmon
Pid: 26970, comm: btrfs-transacti Tainted: GW   2.6.35-rc3 #1
Call Trace:
 [] ? warn_slowpath_common+0x78/0x8c
 [] ? btrfs_block_rsv_check+0x10c/0x13e
 [] ? __btrfs_end_transaction+0x9f/0x1b1
 [] ? btrfs_commit_transaction+0xf4/0x5fd
 [] ? enqueue_task+0x39/0x47
 [] ? mutex_lock+0xd/0x31
 [] ? autoremove_wake_function+0x0/0x2a
 [] ? transaction_kthread+0x16d/0x213
 [] ? transaction_kthread+0x0/0x213
 [] ? kthread+0x75/0x7d
 [] ? kernel_thread_helper+0x4/0x10
 [] ? kthread+0x0/0x7d
 [] ? kernel_thread_helper+0x0/0x10
---[ end trace 4aa882f64f792d17 ]---
block_rsv size 654311424 reserved 67809280 freed 0 0
[ cut here ]
WARNING: at fs/btrfs/extent-tree.c:3441 btrfs_block_rsv_check+0x10c/0x13e()
Hardware name: To Be Filled By O.E.M.
Modules linked in: snd_seq_midi snd_emu10k1_synth snd_emux_synth 
snd_seq_virmidi snd_seq_midi_emul snd_seq_oss snd_seq_midi_event snd_seq 
snd_pcm_oss snd_mixer_oss radeon ttm drm_kms_helper drm i2c_algo_bit 
snd_emu10k1 snd_rawmidi snd_ac97_codec ac97_bus snd_pcm snd_seq_device 
snd_timer snd_page_alloc snd_util_mem snd_hwdep snd amd64_edac_mod sata_sil sg 
sr_mod uhci_hcd ohci_hcd edac_core edac_mce_amd k8temp i2c_amd8111 i2c_amd756 
hwmon
Pid: 26973, comm: ls Tainted: GW   2.6.35-rc3 #1
Call Trace:
 [] ? warn_slowpath_common+0x78/0x8c
 [] ? btrfs_block_rsv_check+0x10c/0x13e
 [] ? __btrfs_end_transaction+0x9f/0x1b1
 [] ? start_transaction+0x15f/0x1c4
 [] ? btrfs_dirty_inode+0x65/0xf9
 [] ? __mark_inode_dirty+0x25/0x149
 [] ? touch_atime+0xfc/0x125
 [] ? filldir+0x0/0xc3
 [] ? vfs_readdir+0x76/0x9c
 [] ? sys_getdents+0x7d/0xcd
 [] ? page_fault+0x1f/0x30
 [] ? system_call_fastpath+0x16/0x1b
---[ end trace 4aa882f64f792d18 ]---
block_rsv size 654311424 reserved 67809280 freed 0 0
btrfs: fail to dirty  inode 256 error -28
[ cut here ]
WARNING: at fs/btrfs/extent-tree.c:3441 btrfs_block_rsv_check+0x10c/0x13e()
Hardware name: To Be Filled By O.E.M.
Modules linked in: snd_seq_midi snd_emu10k1_synth snd_emux_synth 
snd_seq_virmidi snd_seq_midi_emul snd_seq_oss snd_seq_midi_event snd_seq 
snd_pcm_oss snd_mixer_oss radeon ttm drm_kms_helper drm i2c_algo_bit 
snd_emu10k1 snd_rawmidi snd_ac97_codec ac97_bus snd_pcm snd_seq_device 
snd_timer snd_page_alloc snd_util_mem snd_hwdep snd amd64_edac_mod sata_sil sg 
sr_mod uhci_hcd ohci_hcd edac_core edac_mce_amd k8temp i2c_amd8111 i2c_amd756 
hwmon
Pid: 26973, comm: ls Tainted: GW   2.6.35-rc3 #1
Call Trace:
 [] ? warn_slowpath_common+0x78/0x8c
 [] ? btrfs_block_rsv_check+0x10c/0x13e
 [] ? __btrfs_end_transaction+0x9f/0x1b1
 [] ? btrfs_dirty_inode+0x58/0xf9
 [] ? __mark_inode_dirty+0x25/0x149
 [] ? touch_atime+0xfc/0x125
 [] ? sys_readlinkat+0x4f/0x81
 [] ? system_call_fastpath+0x16/0x1b
---[ end trace 4aa882f64f792d19 ]---
bloc

Re: [patch 11/11] btrfs: The file argument for fsync() is never null

2010-06-14 Thread Johannes Hirte

On Monday, 14 June 2010, 23:16:01, Christoph Hellwig wrote:
> On Mon, Jun 14, 2010 at 11:11:20PM +0200, Dan Carpenter wrote:
> > > Looks like you've applied the patch to a far too old kernel.  It can't
> > > be NULL for quite a while already.
> > 
> > You're the expert, but it looks like it could be null in 2.6.34 like he
> > says.  I'm just looking at vfs_fsync_range() in
> > "git show v2.6.34:fs/sync.c".
> 
> 2.6.34 is far too old.

For the changes yes, but not for working. I needed the btrfs fixes without all
the other bugs introduced with 2.6.35-rc. I was too careless and pulled too many
changes in. My fault.

regards,
  Johannes

Re: [patch 11/11] btrfs: The file argument for fsync() is never null

2010-06-14 Thread Johannes Hirte

On Saturday, 29 May 2010, 11:49:07, Dan Carpenter wrote:
> The "file" argument for fsync is never null so we can remove this check.
> 
> What drew my attention here is that 7ea8085910e: "drop unused dentry
> argument to ->fsync" introduced an unconditional dereference at the
> start of the function and that generated a smatch warning.
> 
> Signed-off-by: Dan Carpenter 
> 
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 787b50a..e252d23 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1140,7 +1140,7 @@ int btrfs_sync_file(struct file *file, int datasync)
>   /*
>* ok we haven't committed the transaction yet, lets do a commit
>*/
> - if (file && file->private_data)
> + if (file->private_data)
>   btrfs_ioctl_trans_end(file);
> 
>   trans = btrfs_start_transaction(root, 0);

I think you're wrong here. I've run into a kernel NULL pointer dereference at
this point with an NFS-exported btrfs filesystem:


BUG: unable to handle kernel NULL pointer dereference at 0098
IP: [] btrfs_sync_file+0xa7/0x15a
PGD 2a72e067 PUD 1c29f067 PMD 0 
Oops:  [#1] SMP 
last sysfs file: /sys/devices/pci:00/:00:01.0/:01:00.0/power_method
CPU 1 
Modules linked in: snd_seq_midi snd_emu10k1_synth snd_emux_synth 
snd_seq_virmidi snd_seq_midi_emul snd_seq_oss snd_seq_midi_event snd_seq 
snd_pcm_oss snd_mixer_oss radeon ttm drm_kms_helper drm i2c_algo_bit 
snd_emu10k1 snd_rawmidi snd_ac97_codec ac97_bus snd_pcm snd_seq_device 
snd_timer snd_page_alloc snd_util_mem snd_hwdep snd amd64_edac_mod edac_core 
uhci_hcd ohci_hcd sata_sil edac_mce_amd sg sr_mod k8temp i2c_amd756 
i2c_amd8111 hwmon

Pid: 2330, comm: nfsd Not tainted 2.6.34-btrfs-2-drm #1 TYAN Tiger K8W Dual 
AMD Opteron, S2875/To Be Filled By O.E.M.
RIP: 0010:[]  [] btrfs_sync_file+0xa7/0x15a
RSP: :880119e9bcf0  EFLAGS: 00010202
RAX:  RBX: 88011e4da000 RCX: 0020
RDX: 0df8 RSI:  RDI: 88011e495b28
RBP: 88011c30eb78 R08: 8141bab0 R09: 
R10: 880119e9bc50 R11: 88011c30eb78 R12: 88011c305240
R13:  R14: 8141bab0 R15: 880119d1d040
FS:  2ac4c0279180() GS:88000188() knlGS:f74726d0
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 0098 CR3: 15c84000 CR4: 06f0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process nfsd (pid: 2330, threadinfo 880119e9a000, task 880119e185e0)
Stack:
 8800273da3d8 0002 0003 
<0> 88011c305240  88011c30ec90 810b1a99
<0> 880119d1d040 8800  880023e63240
Call Trace:
 [] ? vfs_fsync_range+0x7a/0xa7
 [] ? nfsd4_create_clid_dir+0x12d/0x15d
 [] ? nfsd4_open_confirm+0xec/0x118
 [] ? nfsd4_proc_compound+0x1fd/0x3b5
 [] ? nfsd_dispatch+0xdf/0x1b5
 [] ? svc_process+0x41d/0x703
 [] ? default_wake_function+0x0/0xf
 [] ? nfsd+0xe1/0x125
 [] ? nfsd+0x0/0x125
 [] ? kthread+0x75/0x7d
 [] ? kernel_thread_helper+0x4/0x10
 [] ? kthread+0x0/0x7d
 [] ? kernel_thread_helper+0x0/0x10
Code: 8b bb 28 01 00 00 89 44 24 08 48 81 c7 28 1b 00 00 e8 1e 79 1f 00 8b 44 
24 08 e9 b4 00 00 00 48 81 c7 28 1b 00 00 e8 09 79 1f 00 <49> 83 bd 98 00 00 
00 00 74 08 4c 89 ef e8 06 75 01 00 31 f6 48 
RIP  [] btrfs_sync_file+0xa7/0x15a
 RSP 
CR2: 0098
---[ end trace 5c7989eaf4eda923 ]---

addr2line pointed me to exactly the line you changed.

It happened with linux-2.6.34 with the latest btrfs changes on top.
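
For readers who want to repeat the addr2line step: the oops gives the faulting
location as symbol+offset/size (here btrfs_sync_file+0xa7/0x15a), while
addr2line wants an absolute address. A minimal Python sketch of that
conversion follows. The function names are invented for illustration, and the
symbol start address has to come from the matching kernel build (for example
its System.map), since the addresses in this trace were stripped.

```python
import re

# A frame such as "btrfs_sync_file+0xa7/0x15a" means: 0xa7 bytes into a
# function that is 0x15a bytes long.
_FRAME_RE = re.compile(r"([A-Za-z_][\w.]*)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_frame(text):
    """Split 'symbol+0xOFF/0xSIZE' into (symbol, offset, size) or None."""
    match = _FRAME_RE.search(text)
    if match is None:
        return None
    return match.group(1), int(match.group(2), 16), int(match.group(3), 16)


def resolve_address(symbol_start, offset):
    """Absolute address to hand to addr2line, given the symbol's start."""
    return symbol_start + offset


if __name__ == "__main__":
    symbol, offset, size = parse_frame("RIP  [] btrfs_sync_file+0xa7/0x15a")
    # 0x1000 is a placeholder start address, NOT the real one.
    address = resolve_address(0x1000, offset)
    print(f"{symbol}: offset {offset} of {size} bytes -> {address:#x}")
```

The resulting address can then be passed to something like
"addr2line -e vmlinux <address>" against a vmlinux built with debug info.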

regards,
  Johannes

Re: task imap:2958 blocked for more than 120 seconds

2010-03-07 Thread Johannes Hirte

On Sunday, 10 January 2010, 21:05:46, Johannes Hirte wrote:
> I've now observed this hanging task several times. I'm not sure when this
> started, but 2.6.32 is affected too, IIRC. I don't have a test pattern for
> this. Dovecot imap triggers it from time to time. I've now enabled
> CONFIG_DETECT_HUNG_TASK and got these two hung tasks:
> 
> INFO: task imap:2958 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> imap  D  0  2958   2653 0x
>  88008caf5a28 0046  810544cf
>  88008caf5998 0001 88008caf5fd8 88008caf9530
>  de78 001d2700 001d2700 88008caf9530
> Call Trace:
>  [] ? trace_hardirqs_off+0xd/0xf
>  [] ? trace_hardirqs_on_caller+0x10c/0x130
>  [] ? sync_page+0x0/0x48
>  [] io_schedule+0x38/0x4d
>  [] sync_page+0x44/0x48
>  [] __wait_on_bit_lock+0x41/0x8a
>  [] __lock_page+0x61/0x68
>  [] ? wake_bit_function+0x0/0x2e
>  [] filemap_fault+0xea/0x345
>  [] __do_fault+0x50/0x3d3
>  [] handle_mm_fault+0x32f/0x65d
>  [] ? do_page_fault+0xf4/0x26f
>  [] ? __down_read_trylock+0x46/0x4e
>  [] ? down_read_trylock+0x3f/0x49
>  [] ? do_page_fault+0xf4/0x26f
>  [] do_page_fault+0x257/0x26f
>  [] page_fault+0x1f/0x30
>  [] ? might_fault+0x57/0xa7
>  [] ? btrfs_copy_from_user+0x4f/0x113
>  [] ? btrfs_copy_from_user+0xde/0x113
>  [] btrfs_file_write+0x439/0x6fe
>  [] vfs_write+0xad/0x14e
>  [] ? trace_hardirqs_on_caller+0x10c/0x130
>  [] sys_pwrite64+0x55/0x74
>  [] system_call_fastpath+0x16/0x1b
> 2 locks held by imap/2958:
>  #0:  (&sb->s_type->i_mutex_key#4){+.+.+.}, at: []
> btrfs_file_write+0x169/0x6fe
>  #1:  (&mm->mmap_sem){++}, at: []
> do_page_fault+0xf4/0x26f INFO: task flush-btrfs-2:2783 blocked for more
> than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> flush-btrfs-2 D  0  2783  2 0x
>  88010cdcf9d0 0046  810544cf
>  88010cdcf940  88010cdcffd8 88010cd18290
>  de78 001d2700 001d2700 88010cd18290
> Call Trace:
>  [] ? trace_hardirqs_off+0xd/0xf
>  [] ? trace_hardirqs_on_caller+0x10c/0x130
>  [] ? sync_page+0x0/0x48
>  [] io_schedule+0x38/0x4d
>  [] sync_page+0x44/0x48
>  [] __wait_on_bit_lock+0x41/0x8a
>  [] ? find_get_pages_tag+0x0/0x130
>  [] __lock_page+0x61/0x68
>  [] ? wake_bit_function+0x0/0x2e
>  [] T.858+0xf1/0x2cd
>  [] ? sched_clock_cpu+0xc6/0xd4
>  [] ? sched_clock_local+0x1c/0x82
>  [] ? sched_clock_cpu+0xc6/0xd4
>  [] ? trace_hardirqs_off+0xd/0xf
>  [] extent_writepages+0x3f/0x54
>  [] ? btrfs_get_extent+0x0/0x7ee
>  [] btrfs_writepages+0x22/0x24
>  [] do_writepages+0x1f/0x28
>  [] writeback_single_inode+0xf1/0x2f0
>  [] writeback_inodes_wb+0x3a9/0x4b2
>  [] wb_writeback+0x12b/0x1af
>  [] wb_do_writeback+0x17f/0x195
>  [] ? wb_do_writeback+0x8b/0x195
>  [] bdi_writeback_task+0x2b/0x84
>  [] ? bdi_start_fn+0x0/0xcf
>  [] bdi_start_fn+0x71/0xcf
>  [] ? bdi_start_fn+0x0/0xcf
>  [] kthread+0x7a/0x82
>  [] kernel_thread_helper+0x4/0x10
>  [] ? restore_args+0x0/0x30
>  [] ? kthread+0x0/0x82
>  [] ? kernel_thread_helper+0x0/0x10
> 1 lock held by flush-btrfs-2/2783:
>  #0:  (&type->s_umount_key#20){..}, at: []
> writeback_inodes_wb+0x2d4/0x4b2
> 
> regards,
>   Johannes

It happened again. Today I found this in the logs:

INFO: task imap:2590 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
imap  D 88002828ddf8 0  2590   2472 0x
 88011fa53000 0086  8800282923a0
 88011b7cc480  880028294c58 880117a5e000
 fffdcf81 880117a5e280 00012340 00012340
Call Trace:
 [] ? sync_page+0x0/0x46
 [] ? io_schedule+0x35/0x48
 [] ? sync_page+0x41/0x46
 [] ? __wait_on_bit_lock+0x3c/0x85
 [] ? __lock_page+0x5d/0x63
 [] ? wake_bit_function+0x0/0x2e
 [] ? filemap_fault+0xcc/0x305
 [] ? __do_fault+0x52/0x3f0
 [] ? handle_mm_fault+0x346/0x6b6
 [] ? do_page_fault+0x264/0x280
 [] ? page_fault+0x1f/0x30
 [] ? btrfs_copy_from_user+0x50/0x109
 [] ? btrfs_copy_from_user+0xd2/0x109
 [] ? btrfs_file_write+0x448/0x6db
 [] ? vfs_write+0xa8/0x14c
 [] ? sys_pwrite64+0x53/0x71
 [] ? system_call_fastpath+0x16/0x1b

and btrfsck said:

datengrab ~ # btrfsck /dev/mapper/sdb 
root 5 inode 842742 errors 400
found 186575216640 bytes used err is 1
total csum bytes: 181076596
total tree bytes: 1147568128
total fs tree bytes: 884981760
btree space waste bytes: 289677575
file

Re: [btrfs] kernel BUG at include/linux/spinlock.h:376!

2010-01-23 Thread Johannes Hirte

On Thursday, 14 January 2010, 20:37:08, Chris Mason wrote:
> On Thu, Jan 07, 2010 at 10:29:32PM +0100, Johannes Hirte wrote:
> > One of my btrfs filesystems gives the following bug message on access:
> > 
> > Jan  6 23:08:12 datengrab kernel: [ cut here ]
> > Jan  6 23:08:12 datengrab kernel: kernel BUG at
> > include/linux/spinlock.h:376! Jan  6 23:08:12 datengrab kernel: invalid
> > opcode:  [#1] SMP
> > Jan  6 23:08:12 datengrab kernel: last sysfs file:
> > /sys/devices/pci:00/:00:18.3/temp1_input
> > Jan  6 23:08:12 datengrab kernel: CPU 1
> > Jan  6 23:08:12 datengrab kernel: Pid: 2837, comm: btrfs-endio-wri Not
> > tainted 2.6.33-rc3-00033-g03b7675 #12 TYAN Tiger K8W Dual AMD Opteron,
> > S2875/To Be Filled
> > By O.E.M.
> > Jan  6 23:08:12 datengrab kernel: RIP: 0010:[] 
> > [] btrfs_assert_tree_locked+0x16/0x1c
> 
> Well, we really should have this tree block locked, but
> btrfs_mark_extent_written is doing some special things.  Is the trace
> always the same?

Sorry for the long delay. Yes, the trace was always the same. I can't test
patches, since I'm not working on the corrupted FS anymore. The bug only
occurred on the corrupted filesystem. But as I've seen, Yan Zheng has tracked
it down (commit 6c7d54ac87f338c479d9729e8392eca3f76e11e1).

I still suspect that the FS corruption was caused by this bug. It hasn't
happened again. If it does, I'll report it.

regards,
  Johannes

Re: [btrfs] kernel BUG at include/linux/spinlock.h:376!

2010-01-14 Thread Johannes Hirte

On Thursday, 07 January 2010, 22:29:32, Johannes Hirte wrote:
> One of my btrfs filesystems gives the following bug message on access:
> 
> Jan  6 23:08:12 datengrab kernel: [ cut here ]
> Jan  6 23:08:12 datengrab kernel: kernel BUG at
> include/linux/spinlock.h:376! Jan  6 23:08:12 datengrab kernel: invalid
> opcode:  [#1] SMP
> Jan  6 23:08:12 datengrab kernel: last sysfs file:
> /sys/devices/pci:00/:00:18.3/temp1_input
> Jan  6 23:08:12 datengrab kernel: CPU 1
> Jan  6 23:08:12 datengrab kernel: Pid: 2837, comm: btrfs-endio-wri Not
> tainted 2.6.33-rc3-00033-g03b7675 #12 TYAN Tiger K8W Dual AMD Opteron,
> S2875/To Be Filled
> By O.E.M.
> Jan  6 23:08:12 datengrab kernel: RIP: 0010:[] 
> [] btrfs_assert_tree_locked+0x16/0x1c
> Jan  6 23:08:12 datengrab kernel: RSP: 0018:8800237b5a50  EFLAGS:
> 00010246 Jan  6 23:08:12 datengrab kernel: RAX: 0404 RBX:
> 88011f444ea0 RCX: 8800
> Jan  6 23:08:12 datengrab kernel: RDX: 0004 RSI:
> 88011c219000 RDI: 8800829b3c00
> Jan  6 23:08:12 datengrab kernel: RBP: 8800237b5a50 R08:
> 0016 R09: 8800237b5a30
> Jan  6 23:08:12 datengrab kernel: R10: 8800237b5a28 R11:
> 0191 R12: 88011c219000
> Jan  6 23:08:12 datengrab kernel: R13: 000c R14:
> 0001 R15: 88011981e740
> Jan  6 23:08:12 datengrab kernel: FS:  7f2c79ac8700()
> GS:88002b40() knlGS:
> Jan  6 23:08:12 datengrab kernel: CS:  0010 DS:  ES:  CR0:
> 8005003b
> Jan  6 23:08:12 datengrab kernel: CR2: 026300a0 CR3:
> 000116b7f000 CR4: 06f0
> Jan  6 23:08:12 datengrab kernel: DR0:  DR1:
>  DR2: 
> Jan  6 23:08:12 datengrab kernel: DR3:  DR6:
> 0ff0 DR7: 0400
> Jan  6 23:08:12 datengrab kernel: Process btrfs-endio-wri (pid: 2837,
> threadinfo 8800237b4000, task 8800235037e0)
> Jan  6 23:08:12 datengrab kernel: Stack:
> Jan  6 23:08:12 datengrab kernel: 8800237b5ac0 81154ded
> 012c 000c
> Jan  6 23:08:12 datengrab kernel: <0> 8816 000181150b93
> 0ce3 0f66
> Jan  6 23:08:12 datengrab kernel: <0> 88007ff44000 8800829b3d00
>  88011f444ea0
> Jan  6 23:08:12 datengrab kernel: Call Trace:
> Jan  6 23:08:12 datengrab kernel: []
> push_leaf_left+0x9f/0x158 Jan  6 23:08:12 datengrab kernel:
> [] btrfs_del_items+0x363/0x48f Jan  6 23:08:12 datengrab
> kernel: []
> btrfs_mark_extent_written+0x53b/0x55f
> Jan  6 23:08:12 datengrab kernel: [] ?
> trace_hardirqs_on+0xd/0xf Jan  6 23:08:12 datengrab kernel:
> [] ? mutex_unlock+0x9/0xb Jan  6 23:08:12 datengrab
> kernel: []
> btrfs_finish_ordered_io+0x176/0x247
> Jan  6 23:08:12 datengrab kernel: [] ?
> trace_hardirqs_off+0xd/0xf Jan  6 23:08:12 datengrab kernel:
> []
> btrfs_writepage_end_io_hook+0x15/0x17
> Jan  6 23:08:12 datengrab kernel: []
> end_bio_extent_writepage+0xa9/0x154
> Jan  6 23:08:12 datengrab kernel: [] ?
> trace_hardirqs_on_caller+0x10c/0x130
> Jan  6 23:08:12 datengrab kernel: [] bio_endio+0x26/0x28
> Jan  6 23:08:12 datengrab kernel: []
> end_workqueue_fn+0x10c/0x11b
> Jan  6 23:08:12 datengrab kernel: []
> worker_loop+0x175/0x44d Jan  6 23:08:12 datengrab kernel:
> [] ? worker_loop+0x0/0x44d Jan  6 23:08:12 datengrab
> kernel: [] kthread+0x7a/0x82 Jan  6 23:08:12 datengrab
> kernel: []
> kernel_thread_helper+0x4/0x10
> Jan  6 23:08:12 datengrab kernel: [] ?
> restore_args+0x0/0x30 Jan  6 23:08:12 datengrab kernel:
> [] ? kthread+0x0/0x82 Jan  6 23:08:12 datengrab kernel:
> [] ?
> kernel_thread_helper+0x0/0x10
> Jan  6 23:08:12 datengrab kernel: Code: c8 ff 48 81 c4 88 00 00 00 5b 41 5c
> 41 5d 41 5e 41 5f c9 c3 90 f6 47 38 02 55 48 89 e5 75 10 8b 47 70 89 c2 c1
> fa 08 38 c
> 2 75 04 <0f> 0b eb fe c9 c3 55 31 c0 65 48 8b 14 25 48 b5 00 00 48 89 e5
> Jan  6 23:08:12 datengrab kernel: RIP  []
> btrfs_assert_tree_locked+0x16/0x1c
> Jan  6 23:08:12 datengrab kernel: RSP 
> Jan  6 23:08:12 datengrab kernel: ---[ end trace 96d932f09da027f6 ]---
> 
> It only happens on write access. I was able to copy all the data to another
> drive without any error. The filesystem is damaged, btrfsck gives
> 
> root 5 inode 6969680 errors 2000
> found 191511994368 bytes used err is 1
> total csum bytes: 186404900
> total tree bytes: 629936128
> total fs tree bytes: 388333568
> btree space waste bytes: 146015924
> file data blocks allocated: 191957340160
>  referenced 190751694848
> Btrfs v0.19-4-gab8fb4c
> 
> It's the btrfs-co

Re: task imap:2958 blocked for more than 120 seconds

2010-01-13 Thread Johannes Hirte

On Sunday, 10 January 2010, 21:19:26, Chris Mason wrote:
> On Sun, Jan 10, 2010 at 09:05:46PM +0100, Johannes Hirte wrote:
> > I've now observed this hanging task several times. I'm not sure when this
> > started, but 2.6.32 is affected too, IIRC. I don't have a test pattern
> > for this. Dovecot imap triggers it from time to time. I've now enabled
> > CONFIG_DETECT_HUNG_TASK
> 
> > and got these two hung tasks:
> You're stuck on a read, could you please do a sysrq-w when this happens?

Will do so when it happens again.

> Also, do you eventually recover or are you stuck forever?

I didn't wait too long when it happened, so I'm not sure. The longest I've
waited was 20-30 minutes before rebooting, and it had not recovered by then.
So either it's stuck forever or it takes really long to recover.

And one question I have: How do you identify a read in this call trace? 
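
A partial, hedged pointer for anyone reading along: in these traces, entries
prefixed with "?" are addresses the unwinder found on the stack but could not
confirm as part of the call chain, so the entries without "?" are the ones to
read first. The Python sketch below only does that filtering. The function
name is invented, and the interpretation afterwards is one plausible reading,
not Chris's stated answer.

```python
def reliable_frames(trace_lines):
    """Return function names of frames that are not marked with '?'.

    Hypothetical helper for traces in the format shown in this thread,
    where each frame looks like ' [] ? name+0xOFF/0xSIZE' or
    ' [] name+0xOFF/0xSIZE' (addresses stripped by the archive).
    """
    frames = []
    for line in trace_lines:
        text = line.strip()
        if not text.startswith("["):
            continue  # not a frame line (e.g. "Call Trace:")
        # Drop the bracketed address, which is empty in this archive.
        _, _, rest = text.partition("]")
        rest = rest.strip()
        if not rest or rest.startswith("?"):
            continue  # "?" marks a frame the unwinder could not verify
        frames.append(rest.split("+", 1)[0])
    return frames


if __name__ == "__main__":
    # A few lines copied from the imap:2958 trace in the original report.
    sample = [
        "Call Trace:",
        " [] ? sync_page+0x0/0x48",
        " [] io_schedule+0x38/0x4d",
        " [] sync_page+0x44/0x48",
        " [] __lock_page+0x61/0x68",
        " [] filemap_fault+0xea/0x345",
        " [] btrfs_file_write+0x439/0x6fe",
    ]
    print(" <- ".join(reliable_frames(sample)))
```

Read this way, the imap trace shows btrfs_file_write taking a page fault
(filemap_fault) and then sleeping in __lock_page and io_schedule, which
would be consistent with waiting for a page to be read in.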

regards,
  Johannes

task imap:2958 blocked for more than 120 seconds

2010-01-10 Thread Johannes Hirte

I've now observed this hanging task several times. I'm not sure when this
started, but 2.6.32 is affected too, IIRC. I don't have a test pattern for
this. Dovecot imap triggers it from time to time. I've now enabled
CONFIG_DETECT_HUNG_TASK and got these two hung tasks:

INFO: task imap:2958 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
imap  D  0  2958   2653 0x
 88008caf5a28 0046  810544cf
 88008caf5998 0001 88008caf5fd8 88008caf9530
 de78 001d2700 001d2700 88008caf9530
Call Trace:
 [] ? trace_hardirqs_off+0xd/0xf
 [] ? trace_hardirqs_on_caller+0x10c/0x130
 [] ? sync_page+0x0/0x48
 [] io_schedule+0x38/0x4d
 [] sync_page+0x44/0x48
 [] __wait_on_bit_lock+0x41/0x8a
 [] __lock_page+0x61/0x68
 [] ? wake_bit_function+0x0/0x2e
 [] filemap_fault+0xea/0x345
 [] __do_fault+0x50/0x3d3
 [] handle_mm_fault+0x32f/0x65d
 [] ? do_page_fault+0xf4/0x26f
 [] ? __down_read_trylock+0x46/0x4e
 [] ? down_read_trylock+0x3f/0x49
 [] ? do_page_fault+0xf4/0x26f
 [] do_page_fault+0x257/0x26f
 [] page_fault+0x1f/0x30
 [] ? might_fault+0x57/0xa7
 [] ? btrfs_copy_from_user+0x4f/0x113
 [] ? btrfs_copy_from_user+0xde/0x113
 [] btrfs_file_write+0x439/0x6fe
 [] vfs_write+0xad/0x14e
 [] ? trace_hardirqs_on_caller+0x10c/0x130
 [] sys_pwrite64+0x55/0x74
 [] system_call_fastpath+0x16/0x1b
2 locks held by imap/2958:
 #0:  (&sb->s_type->i_mutex_key#4){+.+.+.}, at: [] btrfs_file_write+0x169/0x6fe
 #1:  (&mm->mmap_sem){++}, at: [] do_page_fault+0xf4/0x26f
INFO: task flush-btrfs-2:2783 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

Re: Still Problems with /dev/btrfs-control

2010-01-09 Thread Johannes Hirte

On Saturday, 09 January 2010, 12:05:34, Goffredo Baroncelli wrote:
> Hi Michael
> 
> On Saturday 09 January 2010, Dipl.-Ing. Michael Niederle wrote:
> > Thanks for the quick reply!
> > 
> > But I still have problems with btrfsctl:
> > > stat /dev/btrfs-control
> > > 
> >   File: `/dev/btrfs-control'
> >   Size: 0   Blocks: 0  IO Block: 4096   block special file
> > 
> > Device: ch/12d  Inode: 659848  Links: 1 Device type: a,3e
> 
> Ok, two things:
> 
> 1) btrfs-control is a *character* device and _not_ a *block device*
> 2) on my system it is allocated under 10,55 (major/minor).

This must be checked on the target machine, as the minor number is allocated 
dynamically.
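
A rough way to do that check on the target machine is to look the device up
in /proc/misc, which lists each registered misc device as "minor name". The
Python sketch below is only an illustration; the function name misc_minor is
made up, and the major number 10 is the one used for misc character devices.

```python
def misc_minor(proc_misc_text, name="btrfs-control"):
    """Return the minor number registered for 'name', or None.

    Expects the contents of /proc/misc, one "minor name" pair per line.
    """
    for line in proc_misc_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1] == name:
            return int(parts[0])
    return None


if __name__ == "__main__":
    try:
        with open("/proc/misc") as handle:
            minor = misc_minor(handle.read())
    except OSError:
        minor = None  # no /proc/misc available on this system
    if minor is None:
        print("btrfs-control is not registered (is the btrfs module loaded?)")
    else:
        # Character device ("c"), misc major 10, minor as found above.
        print(f"mknod /dev/btrfs-control c 10 {minor}")
```

The printed mknod line is only a suggestion for recreating the node by hand
when udev did not create it.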

regards,
  Johannes

[btrfs] kernel BUG at include/linux/spinlock.h:376!

2010-01-07 Thread Johannes Hirte

One of my btrfs filesystems gives the following bug message on access:

Jan  6 23:08:12 datengrab kernel: [ cut here ]
Jan  6 23:08:12 datengrab kernel: kernel BUG at include/linux/spinlock.h:376!
Jan  6 23:08:12 datengrab kernel: invalid opcode:  [#1] SMP
Jan  6 23:08:12 datengrab kernel: last sysfs file: /sys/devices/pci:00/:00:18.3/temp1_input
Jan  6 23:08:12 datengrab kernel: CPU 1
Jan  6 23:08:12 datengrab kernel: Pid: 2837, comm: btrfs-endio-wri Not tainted 2.6.33-rc3-00033-g03b7675 #12 TYAN Tiger K8W Dual AMD Opteron, S2875/To Be Filled By O.E.M.
Jan  6 23:08:12 datengrab kernel: RIP: 0010:[]  [] btrfs_assert_tree_locked+0x16/0x1c
Jan  6 23:08:12 datengrab kernel: RSP: 0018:8800237b5a50  EFLAGS: 00010246
Jan  6 23:08:12 datengrab kernel: RAX: 0404 RBX: 88011f444ea0 RCX: 8800
Jan  6 23:08:12 datengrab kernel: RDX: 0004 RSI: 88011c219000 RDI: 8800829b3c00
Jan  6 23:08:12 datengrab kernel: RBP: 8800237b5a50 R08: 0016 R09: 8800237b5a30
Jan  6 23:08:12 datengrab kernel: R10: 8800237b5a28 R11: 0191 R12: 88011c219000
Jan  6 23:08:12 datengrab kernel: R13: 000c R14: 0001 R15: 88011981e740
Jan  6 23:08:12 datengrab kernel: FS:  7f2c79ac8700() GS:88002b40() knlGS:
Jan  6 23:08:12 datengrab kernel: CS:  0010 DS:  ES:  CR0: 8005003b
Jan  6 23:08:12 datengrab kernel: CR2: 026300a0 CR3: 000116b7f000 CR4: 06f0
Jan  6 23:08:12 datengrab kernel: DR0:  DR1:  DR2: 
Jan  6 23:08:12 datengrab kernel: DR3:  DR6: 0ff0 DR7: 0400
Jan  6 23:08:12 datengrab kernel: Process btrfs-endio-wri (pid: 2837, threadinfo 8800237b4000, task 8800235037e0)
Jan  6 23:08:12 datengrab kernel: Stack:
Jan  6 23:08:12 datengrab kernel: 8800237b5ac0 81154ded 012c 000c
Jan  6 23:08:12 datengrab kernel: <0> 8816 000181150b93 0ce3 0f66
Jan  6 23:08:12 datengrab kernel: <0> 88007ff44000 8800829b3d00  88011f444ea0
Jan  6 23:08:12 datengrab kernel: Call Trace:
Jan  6 23:08:12 datengrab kernel: [] push_leaf_left+0x9f/0x158
Jan  6 23:08:12 datengrab kernel: [] btrfs_del_items+0x363/0x48f
Jan  6 23:08:12 datengrab kernel: [] btrfs_mark_extent_written+0x53b/0x55f
Jan  6 23:08:12 datengrab kernel: [] ? trace_hardirqs_on+0xd/0xf
Jan  6 23:08:12 datengrab kernel: [] ? mutex_unlock+0x9/0xb
Jan  6 23:08:12 datengrab kernel: [] btrfs_finish_ordered_io+0x176/0x247
Jan  6 23:08:12 datengrab kernel: [] ? trace_hardirqs_off+0xd/0xf
Jan  6 23:08:12 datengrab kernel: [] btrfs_writepage_end_io_hook+0x15/0x17
Jan  6 23:08:12 datengrab kernel: [] end_bio_extent_writepage+0xa9/0x154
Jan  6 23:08:12 datengrab kernel: [] ? trace_hardirqs_on_caller+0x10c/0x130
Jan  6 23:08:12 datengrab kernel: [] bio_endio+0x26/0x28
Jan  6 23:08:12 datengrab kernel: [] end_workqueue_fn+0x10c/0x11b
Jan  6 23:08:12 datengrab kernel: [] worker_loop+0x175/0x44d
Jan  6 23:08:12 datengrab kernel: [] ? worker_loop+0x0/0x44d
Jan  6 23:08:12 datengrab kernel: [] kthread+0x7a/0x82
Jan  6 23:08:12 datengrab kernel: [] kernel_thread_helper+0x4/0x10
Jan  6 23:08:12 datengrab kernel: [] ? restore_args+0x0/0x30
Jan  6 23:08:12 datengrab kernel: [] ? kthread+0x0/0x82
Jan  6 23:08:12 datengrab kernel: [] ? kernel_thread_helper+0x0/0x10
Jan  6 23:08:12 datengrab kernel: Code: c8 ff 48 81 c4 88 00 00 00 5b 41 5c 41 5d 41 5e 41 5f c9 c3 90 f6 47 38 02 55 48 89 e5 75 10 8b 47 70 89 c2 c1 fa 08 38 c2 75 04 <0f> 0b eb fe c9 c3 55 31 c0 65 48 8b 14 25 48 b5 00 00 48 89 e5
Jan  6 23:08:12 datengrab kernel: RIP  [] btrfs_assert_tree_locked+0x16/0x1c
Jan  6 23:08:12 datengrab kernel: RSP 
Jan  6 23:08:12 datengrab kernel: ---[ end trace 96d932f09da027f6 ]---

It only happens on write access. I was able to copy all the data to another 
drive without any error. The filesystem is damaged; btrfsck reports:

root 5 inode 6969680 errors 2000
found 191511994368 bytes used err is 1
total csum bytes: 186404900
total tree bytes: 629936128
total fs tree bytes: 388333568
btree space waste bytes: 146015924
file data blocks allocated: 191957340160
 referenced 190751694848
Btrfs v0.19-4-gab8fb4c

It's the btrfs-code from 2.6.33 with the following additional patches:

diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c
index 2e9e699..3a3a96d 100644
--- a/fs/btrfs/acl.c
+++ b/fs/btrfs/acl.c
@@ -111,13 +111,15 @@ static int btrfs_set_acl(struct btrfs_trans_handle *trans,
 
 	switch (type) {
 	case ACL_TYPE_ACCESS:
-		mode = inode->i_mode;
-		ret = posix_acl_equiv_mode(acl, &mode);
-		if (ret < 0)
-			return ret;
-		ret = 0;
-		inode->i_mode = mode;
 		name = POSIX_ACL_XATTR_ACCE

Re: What protection does btrfs checksumming currently give? (Was Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck))

2010-01-07 Thread Johannes Hirte

On Thursday 07 January 2010 20:29:49, jim owens wrote:
> Steve Freitas wrote:
> > Hi all,
> >
> > I was under the mistaken impression that btrfs checksumming, in its
> > current default configuration, protected your data from bitrot. It
> > appears this is not the case:
> >
> > On Wed, 2010-01-06 at 18:24 +0100, Johannes Hirte wrote:
> >> On Wednesday 06 January 2010 16:59:55, Steve Freitas wrote:
> >>> So please correct me if I have some mistaken assumptions. I thought
> >>> btrfs would be tolerant of that -- if a block failed the checksum test,
> >>> it would reconstruct and remap it.
> >>
> >> Only if enough redundancy is left. And with the default setup btrfs is
> >> only mirroring the metadata not the data.
> >
> > So can someone please tell me what the current state-of-the-art is of
> > data protection with btrfs? Does it differ with single-device versus
> > multiple-device configurations? Is it possible to enable data
> > checksumming now? Under what conditions? And will it do what a naive
> > user would expect it to do, namely, correct for diverse kinds of errors
> > in your storage subsystem? If not, what does it do? Etc...
> 
> First, understand that a checksum only says "this block is good or bad".
> 
> The checksum can not be used to "reconstruct" the data.
> 
> Checksums are present for all btrfs blocks unless you explicitly shut
> them off with mount/ioctl/fcntl options.
> 
> To have a good copy you can use as a replacement block, you must
> use either btrfs raid1 or raid10.  You can use raid1 with 1 drive,
> in a mode called "dup" where both copies are made to that device.
> 
> By default with 1 drive, btrfs uses "dup" for metadata and 1 copy
> (nodup) for file data blocks. To get file data "dup", you just use
> "mkfs.btrfs -d raid1".
> 
> If you have btrfs raid, it will find the good block on a read, but
> AFAIK we don't have tools yet to automatically reallocate the bad one.
> 
> jim
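
To make the "good or bad, but no reconstruction" point concrete, here is a small user-space sketch in C. It is not btrfs code. It assumes a CRC-32C checksum (the algorithm btrfs uses by default, as far as I know) and a made-up helper, pick_good_copy, that simply walks the available copies and returns the first one whose checksum matches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t crc32c(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t crc = 0xFFFFFFFFu;

	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
	}
	return ~crc;
}

/* Hypothetical helper: returns the index of the first copy whose checksum
 * matches `expected`, or -1 if every copy is bad. The checksum only tells
 * good from bad; the data itself always has to come from an intact copy. */
static int pick_good_copy(const uint8_t *const copies[], size_t ncopies,
			  size_t len, uint32_t expected)
{
	assert(copies != NULL);
	for (size_t i = 0; i < ncopies; i++)
		if (crc32c(copies[i], len) == expected)
			return (int)i;
	return -1;
}
```

With a single copy (the default for file data on one drive, per the above), a checksum mismatch leaves pick_good_copy with nothing to return. With two copies (raid1 or "dup"), the read can still be served from the intact one.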

Additionally, I repeat Sander's suggestion: check your drive for bad 
blocks. It sounds very likely that your drive is bad, and you will get into 
trouble again with the newly created FS. The Oops you've posted also smells 
like a bug in the btrfs code.

regards,
  Johannes

Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck)

2010-01-06 Thread Johannes Hirte

On Wednesday 06 January 2010 16:59:55, Steve Freitas wrote:
> Hi Sander,
> 
> On Wed, 2010-01-06 at 08:52 +0100, Sander wrote:
> > I don't have your original mail, but I think I remember you mentioned a
> > lot of bad sectors on that disk reported by SMART.
> >
> > If that is indeed the case it might be dificult for the people who might
> > be able to help you, to help you.
> 
> Thanks for your  response. You're correct about the bad sector warning.
> So please correct me if I have some mistaken assumptions. I thought
> btrfs would be tolerant of that -- if a block failed the checksum test,
> it would reconstruct and remap it.

Only if enough redundancy is left. With the default setup, btrfs mirrors 
only the metadata, not the data.

> (Also, I assumed that if a drive
> hadn't filled its bad sector remapping table, it could handle it at the
> hardware level, and SMART's warning was just that -- a warning, not a
> dire pronouncement of utter unsuitability -- but that's something else.)

The drive remaps bad sectors only at write time. Until a sector is 
rewritten, it is only marked as pending. Since you wrote that SMART detected 
many bad blocks, I suspect the FS is really damaged. And as btrfsck is 
limited, I don't think it can fix this.

regards,
  Johannes

[PATCH] check for NULL pointer dereference in btrfs_set_acl

2010-01-05 Thread Johannes Hirte

Check for a NULL pointer in btrfs_set_acl and skip the call to 
posix_acl_equiv_mode in that case, to avoid a NULL pointer dereference there.

Signed-off-by: Johannes Hirte 

diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c
index 2e9e699..3a3a96d 100644
--- a/fs/btrfs/acl.c
+++ b/fs/btrfs/acl.c
@@ -111,13 +111,15 @@ static int btrfs_set_acl(struct btrfs_trans_handle *trans,
 
 	switch (type) {
 	case ACL_TYPE_ACCESS:
-		mode = inode->i_mode;
-		ret = posix_acl_equiv_mode(acl, &mode);
-		if (ret < 0)
-			return ret;
-		ret = 0;
-		inode->i_mode = mode;
 		name = POSIX_ACL_XATTR_ACCESS;
+		if (acl) {
+			mode = inode->i_mode;
+			ret = posix_acl_equiv_mode(acl, &mode);
+			if (ret < 0)
+				return ret;
+			ret = 0;
+			inode->i_mode = mode;
+		}
 		break;
 	case ACL_TYPE_DEFAULT:
 		if (!S_ISDIR(inode->i_mode))
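
For readers without a kernel tree at hand, the effect of the guard can be tried in user space. The following is only a mock under stated assumptions: mock_acl, mock_inode, mock_equiv_mode and mock_set_access_acl are invented stand-ins, not the kernel types or functions, and the mode arithmetic is simplified.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types, for illustration only. */
struct mock_acl { unsigned perm; };
struct mock_inode { unsigned mode; };

/* Like posix_acl_equiv_mode() in spirit: it reads through `acl`
 * unconditionally, so the caller must never pass NULL. */
static int mock_equiv_mode(const struct mock_acl *acl, unsigned *mode)
{
	*mode = (*mode & ~0777u) | (acl->perm & 0777u);
	return 0;
}

/* Mirrors the patched ACL_TYPE_ACCESS branch: a NULL acl leaves the
 * inode mode untouched instead of being dereferenced. */
static int mock_set_access_acl(struct mock_inode *inode,
			       const struct mock_acl *acl)
{
	assert(inode != NULL);
	if (acl) {
		unsigned mode = inode->mode;
		int ret = mock_equiv_mode(acl, &mode);

		if (ret < 0)
			return ret;
		inode->mode = mode;
	}
	return 0;
}
```

Calling mock_set_access_acl(&inode, NULL) returns 0 and leaves the mode alone, which is the behaviour the patch aims for when the ACL pointer is NULL.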

Re: 2.6.33-rc2+ bug in fs/btrfs/ordered-data.c:672

2010-01-03 Thread Johannes Hirte

On Saturday 02 January 2010 23:34:17, Carlos R. Mafra wrote:
> On Sa  2.Jan'10 at 11:32:27 +0100, Carlos R. Mafra wrote:
> > I started testing btrfs for my /home a few days ago and yesterday
> > I hit a kernel bug, using 2.6.33-rc2-00187-g08d869a.
> >
> > I wasn't doing any stress test with it, I was simply watching a
> > DVD with xine while chrome was open in another workspace.
> 
> Today I hit the bug twice, and it was always with chrome. I've
> seen the other two reports in kerneloops.org involving
> btrfs_ordered_update_i_size() and chrome was there too.
> 
> And now I saved the dmesg in the ext3 partition, so now there
> is no need for the photo,
> 
> 
> [13450.613952] [ cut here ]
> [13450.613957] kernel BUG at fs/btrfs/ordered-data.c:672!
> [13450.613960] invalid opcode:  [#1] SMP
> [13450.613963] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
> [13450.613966] CPU 1
> [13450.613970] Pid: 3372, comm: chrome Not tainted 2.6.33-rc2-fs-00187-g08d869a #293 VAIO/VGN-FZ240E
> [13450.613973] RIP: 0010:[]  [] btrfs_ordered_update_i_size+0x237/0x3d0
> [13450.613982] RSP: 0018:880071355da8  EFLAGS: 00010287
> [13450.613984] RAX: 88007f0c0728 RBX:  RCX: 880077c0c6a8
> [13450.613986] RDX: 002b5000 RSI:  RDI: 880077c0c6b8
> [13450.613989] RBP: 880071355e18 R08:  R09: 
> [13450.613991] R10:  R11:  R12: 8800729dd260
> [13450.613993] R13: 002b4b24 R14:  R15: 002b4b24
> [13450.613996] FS:  7f2b3db68910() GS:880001b0() knlGS:
> [13450.613999] CS:  0010 DS:  ES:  CR0: 80050033
> [13450.614001] CR2: 7f2b3c2a4000 CR3: 71167000 CR4: 06e0
> [13450.614003] DR0:  DR1:  DR2: 
> [13450.614005] DR3:  DR6: 0ff0 DR7: 0400
> [13450.614008] Process chrome (pid: 3372, threadinfo 880071354000, task 88007d380640)
> [13450.614010] Stack:
> [13450.614011]  8800729dd0c0  f000 0fff
> [13450.614015] <0> 8800729dd0f0 88007c7cc000 8800729dd0c0 8800729dd170
> [13450.614019] <0> 880071355e18 880071355ee8 8800729dd260 
> [13450.614023] Call Trace:
> [13450.614028]  [] btrfs_setattr+0x17d/0x270
> [13450.614033]  [] notify_change+0x104/0x2e0
> [13450.614037]  [] do_truncate+0x5f/0x90
> [13450.614041]  [] ? vfs_write+0x132/0x180
> [13450.614044]  [] sys_ftruncate+0xe9/0x130
> [13450.614049]  [] system_call_fastpath+0x16/0x1b
> [13450.614050] Code: 0f 1f 40 00 eb 88 66 0f 1f 44 00 00 49 8b 84 24 38 ff ff ff 48 85 c0 74 1b 48 8b 50 98 49 39 d5 72 12 48 03 50 a8 49 39 d5 73 09 <0f> 0b eb fe 0f 1f 44 00 00 49 8b 94 24 30 ff ff ff 48 89 55 b8
> [13450.614082] RIP  [] btrfs_ordered_update_i_size+0x237/0x3d0
> [13450.614087]  RSP 
> [13450.614090] ---[ end trace 74172209f4d15206 ]---

Sounds like the bug reported here:
http://article.gmane.org/gmane.comp.file-systems.btrfs/4332/match=btrfs+fails+randomly
Can you try the patch provided in that thread?

regards,
  Johannes

Re: (Little) Patch about null dereference with acl and posix.

2009-12-11 Thread Johannes Hirte

On Wednesday 18 November 2009 22:28:27, briaeros007 wrote:
> Hello,
> For some days, i've got oops on my system and i've investigate it a bit.
> The trouble was with  "posix_acl_equiv_mode" , and for some reason
> (corrupted metadata ?) btrfs sometimes call it with "acl"==NULL
> This function doesn't like it.
> So in my patch I've first put a little error protection around the
> call, and then avoid to call btrfs_set_acl with acl=NULL.
> 
> I'm not sure if it's ok with best practice, but i've done the test
> which produce the oops, and know it doesn't (but some csum failed.
> Well if my btrfs is corrupted, it's comprehensible).
> 
> The patch is the following.
> 
> diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c
> index 3616042..f8ade24 100644
> --- a/fs/btrfs/acl.c
> +++ b/fs/btrfs/acl.c
> @@ -111,7 +111,8 @@ static int btrfs_set_acl(struct inode *inode,
> struct posix_acl *acl, int type)
> switch (type) {
> case ACL_TYPE_ACCESS:
> mode = inode->i_mode;
> -   ret = posix_acl_equiv_mode(acl, &mode);
> +   if (acl && mode)
> +   ret = posix_acl_equiv_mode(acl, &mode);
> if (ret < 0)
> return ret;
> ret = 0;
> @@ -165,12 +166,13 @@ static int btrfs_xattr_set_acl(struct inode
> *inode, int type,
> } else if (IS_ERR(acl)) {
> return PTR_ERR(acl);
> }
> +   else
> +   {
> +   ret = btrfs_set_acl(inode, acl, type);
> +   posix_acl_release(acl);
> +   }
> }
> 
> -   ret = btrfs_set_acl(inode, acl, type);
> -
> -   posix_acl_release(acl);
> -
> return ret;
>  }

Shouldn't this go upstream and into stable review?

Re: [patch 0/2] grub-0.97: btrfs support

2009-12-11 Thread Johannes Hirte

On Friday 11 December 2009 16:27:54, Edward Shishkin wrote:
> Johannes Hirte wrote:
> > On Friday 11 December 2009 12:17:29, Edward Shishkin wrote:
> >> Johannes Hirte wrote:
> >>> On Friday 11 December 2009 00:15:46, Johannes Hirte wrote:
> >>>> On Friday 25 September 2009 00:06:23, Edward Shishkin wrote:
> >>>>> Hello everyone.
> >>>>
> >>>> ...
> >>>>
> >>>>> The following patches are for Fedora 10(**).
> >>>>> The distro-independent package will be put to kernel.org a bit later.
> >>>>>
> >>>>>
> >>>>> All comments, bugreports, etc. are welcome as usual.
> >>>>
> >>>> Ok, I have another comment/bugreport *g*.
> >>>>
> >>>> I'm testing this patch with gentoo, so the grub sources are not
> >>>> identicaly the same. With this patches applied, grub is unable to
> >>>> detect JFS or XFS filesystems. XFS is reported as unknown, JFS is
> >>>> reported as btrfs. Reiserfs and ext2/3 are detected as expected.
> >>
> >> Yes, this patch is for Fedora. For other distros
> >> some issues are possible, so please be careful..
> >
> > I've also tested now with the fedora sources. There is the same bug. The
> > btrfs patch breaks the filesystem detection. All filesystems after btrfs
> > in fsys_table aren't detected. Moving btrfs to the end of fsys_table is a
> > workaround but will interfere with FFS. So this should better be fixed in
> > the btrfs-part of grub, so that it:
> >
> > a) doesn't missdetect a JFS filesystem as btrfs
> > b) doesn't break the detection for remaining filesystems in the array.
> 
> Hello.
> 
> Yes, I confirm that xfs, etc. file systems are not detected,
> but missdetection jfs as btrfs looks rather fantastic :)
> 
> Please, try the attached patch. Report if any problems.

The patch works, but the problem with the misdetected JFS filesystem 
persists. It happens if the device previously contained a btrfs filesystem. I 
assume that the JFS super block starts later on the device than the btrfs one 
does, and that jfs_mkfs doesn't clear the space ahead of the JFS super block. 
So if a JFS filesystem is created on a device that previously held btrfs, 
btrfs_mount still detects the beginning of the old btrfs super block and then 
reads garbage later on.
To avoid this, btrfs detection could be placed after JFS. Are there any 
objections to this?
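
To show why the probe order matters once a stale signature is left on the device, here is a magic-only sketch in C. It is not what grub's btrfs_mount does. The offsets are my assumptions from the commonly documented on-disk layouts (btrfs primary super block at 64 KiB with the magic "_BHRfS_M" 64 bytes into it, JFS super block at 32 KiB beginning with "JFS1"), and struct probe and detect() are made up to stand in for a front-to-back table walk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BTRFS_MAGIC_OFF (64 * 1024 + 64)	/* assumed offset of "_BHRfS_M" */
#define JFS_MAGIC_OFF   (32 * 1024)		/* assumed offset of "JFS1" */

static int has_magic(const unsigned char *img, size_t len,
		     size_t off, const char *magic)
{
	size_t n = strlen(magic);

	return off + n <= len && memcmp(img + off, magic, n) == 0;
}

static int looks_like_btrfs(const unsigned char *img, size_t len)
{
	return has_magic(img, len, BTRFS_MAGIC_OFF, "_BHRfS_M");
}

static int looks_like_jfs(const unsigned char *img, size_t len)
{
	return has_magic(img, len, JFS_MAGIC_OFF, "JFS1");
}

/* Hypothetical probe table entry. */
struct probe {
	const char *name;
	int (*match)(const unsigned char *img, size_t len);
};

/* Walks the table front to back; the first matching probe wins. */
static const char *detect(const struct probe *table, size_t n,
			  const unsigned char *img, size_t len)
{
	assert(table && img);
	for (size_t i = 0; i < n; i++)
		if (table[i].match(img, len))
			return table[i].name;
	return "unknown";
}
```

On an image that carries both signatures, a table with btrfs ahead of JFS reports "btrfs" and the reversed table reports "jfs". The reordering only hides the stale signature, though: in this sketch a real btrfs created over an old JFS would be misreported the other way round, so a stricter check in the btrfs probe would be the more robust fix.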


regards,
  Johannes

Re: [patch 0/2] grub-0.97: btrfs support

2009-12-11 Thread Johannes Hirte

On Friday 11 December 2009 12:17:29, Edward Shishkin wrote:
> Johannes Hirte wrote:
> > On Friday 11 December 2009 00:15:46, Johannes Hirte wrote:
> >
> >> On Friday 25 September 2009 00:06:23, Edward Shishkin wrote:
> >> 
> >>> Hello everyone.
> >>>   
> >> ...
> >>
> >> 
> >>> The following patches are for Fedora 10(**).
> >>> The distro-independent package will be put to kernel.org a bit later.
> >>>
> >>>
> >>> All comments, bugreports, etc. are welcome as usual.
> >>>   
> >> Ok, I have another comment/bugreport *g*.
> >>
> >> I'm testing this patch with gentoo, so the grub sources are not identicaly
> >>  the same. With this patches applied, grub is unable to detect JFS or XFS
> >>  filesystems. XFS is reported as unknown, JFS is reported as btrfs.
> >>  Reiserfs and ext2/3 are detected as expected.
> >> 
> 
> Yes, this patch is for Fedora. For other distros
> some issues are possible, so please be careful..

I've now also tested with the Fedora sources, and the same bug is there. The
btrfs patch breaks filesystem detection: none of the filesystems listed after
btrfs in fsys_table are detected. Moving btrfs to the end of fsys_table is a
workaround, but it will interfere with FFS. So this would better be fixed in
the btrfs part of grub, so that it:

a) doesn't misdetect a JFS filesystem as btrfs
b) doesn't break detection of the remaining filesystems in the array.


regards,
  Johannes

Re: [patch 0/2] grub-0.97: btrfs support

2009-12-10 Thread Johannes Hirte

On Friday 11 December 2009 00:15:46, Johannes Hirte wrote:
> On Friday 25 September 2009 00:06:23, Edward Shishkin wrote:
> > Hello everyone.
> 
> ...
> 
> > The following patches are for Fedora 10(**).
> > The distro-independent package will be put to kernel.org a bit later.
> >
> >
> > All comments, bugreports, etc. are welcome as usual.
> 
> Ok, I have another comment/bugreport *g*.
> 
> I'm testing this patch with gentoo, so the grub sources are not identicaly
>  the same. With this patches applied, grub is unable to detect JFS or XFS
>  filesystems. XFS is reported as unknown, JFS is reported as btrfs.
>  Reiserfs and ext2/3 are detected as expected.

A possible solution is to put FSYS_BTRFS at the end of struct fsys_entry 
fsys_table. I've tested with FSYS_BTRFS as the second-to-last entry; the last 
is still FFS.

diff -Nru grub-0.97-r9/stage2/disk_io.c grub-0.97-r10/stage2/disk_io.c
--- grub-0.97-r9/stage2/disk_io.c   2009-12-10 23:41:37.0 +0100
+++ grub-0.97-r10/stage2/disk_io.c  2009-12-11 00:50:51.555007247 +0100
@@ -79,6 +79,9 @@
 # ifdef FSYS_ISO9660
   {"iso9660", iso9660_mount, iso9660_read, iso9660_dir, 0, 0},
 # endif
+# ifdef FSYS_BTRFS
+  {"btrfs", btrfs_mount, btrfs_read, btrfs_dir, 0, btrfs_embed},
+# endif
   /* XX FFS should come last as it's superblock is commonly crossing tracks
  on floppies from track 1 to 2, while others only use 1.  */
 # ifdef FSYS_FFS

With this order, XFS and JFS filesystems are identified correctly. But I 
think this is just a workaround.


regards,
  Johannes

Re: [patch 2/2] grub-0.97: btrfs multidevice configuration support

2009-12-10 Thread Johannes Hirte

On Tuesday 03 November 2009 01:59:39, Edward Shishkin wrote:
> Johannes Hirte wrote:
> >  Why is the btrfs code
> > dealing with network devices at all?
> 
> Why not? :)

I don't see any possibility of getting a btrfs filesystem this way. So as 
far as I understand it, this is completely useless. The CD support doesn't 
look very useful to me either. It's possible to put a btrfs filesystem on a 
CD or DVD, but that seems rather theoretical.

regards,
  Johannes


Re: [patch 0/2] grub-0.97: btrfs support

2009-12-10 Thread Johannes Hirte

On Friday 25 September 2009 00:06:23, Edward Shishkin wrote:
> Hello everyone.
... 
> The following patches are for Fedora 10(**).
> The distro-independent package will be put to kernel.org a bit later.
> 
> 
> All comments, bugreports, etc. are welcome as usual.

Ok, I have another comment/bugreport *g*.

I'm testing this patch with Gentoo, so the grub sources are not exactly the 
same. With these patches applied, grub is unable to detect JFS or XFS 
filesystems. XFS is reported as unknown, JFS is reported as btrfs. Reiserfs 
and ext2/3 are detected as expected.

regards,
  Johannes

Re: (Little) Patch about null dereference with acl and posix.

2009-12-01 Thread Johannes Hirte

On Wednesday 18 November 2009 22:28:27, briaeros007 wrote:
> Hello,
> For some days, i've got oops on my system and i've investigate it a bit.
> The trouble was with  "posix_acl_equiv_mode" , and for some reason
> (corrupted metadata ?) btrfs sometimes call it with "acl"==NULL
> This function doesn't like it.
> So in my patch I've first put a little error protection around the
> call, and then avoid to call btrfs_set_acl with acl=NULL.
> 
> I'm not sure if it's ok with best practice, but i've done the test
> which produce the oops, and know it doesn't (but some csum failed.
> Well if my btrfs is corrupted, it's comprehensible).
> 
> The patch is the following.
> 
> diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c
> index 3616042..f8ade24 100644
> --- a/fs/btrfs/acl.c
> +++ b/fs/btrfs/acl.c
> @@ -111,7 +111,8 @@ static int btrfs_set_acl(struct inode *inode,
> struct posix_acl *acl, int type)
> switch (type) {
> case ACL_TYPE_ACCESS:
> mode = inode->i_mode;
> -   ret = posix_acl_equiv_mode(acl, &mode);
> +   if (acl && mode)
> +   ret = posix_acl_equiv_mode(acl, &mode);
> if (ret < 0)
> return ret;
> ret = 0;
> @@ -165,12 +166,13 @@ static int btrfs_xattr_set_acl(struct inode
> *inode, int type,
> } else if (IS_ERR(acl)) {
> return PTR_ERR(acl);
> }
> +   else
> +   {
> +   ret = btrfs_set_acl(inode, acl, type);
> +   posix_acl_release(acl);
> +   }
> }
> 
> -   ret = btrfs_set_acl(inode, acl, type);
> -
> -   posix_acl_release(acl);
> -
> return ret;
>  }

Thanks for this fix. I think I ran into the same bug with rdiff-backup:

Dec  1 19:09:11 datengrab kernel: BUG: unable to handle kernel NULL pointer dereference at 0004
Dec  1 19:09:11 datengrab kernel: IP: [] posix_acl_equiv_mode+0x0/0x90
Dec  1 19:09:11 datengrab kernel: PGD 3609f067 PUD 7163067 PMD 0
Dec  1 19:09:11 datengrab kernel: Oops:  [#1] SMP
Dec  1 19:09:11 datengrab kernel: last sysfs file: /sys/devices/pci:00/:00:18.3/temp1_input
Dec  1 19:09:11 datengrab kernel: CPU 0
Dec  1 19:09:11 datengrab kernel: Modules linked in: usb_storage snd_seq_midi snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss aes_x86_64 aes_generic xts gf128mul dm_crypt snd_emu10k1 snd_rawmidi snd_ac97_codec ac97_bus snd_pcm snd_seq_device snd_timer snd_page_alloc snd_util_mem snd_hwdep sr_mod snd fglrx(P) k8temp sata_sil hwmon i2c_amd756 i2c_amd8111 ohci_hcd sg uhci_hcd
Dec  1 19:09:11 datengrab kernel: Pid: 22720, comm: rdiff-backup Tainted: P 2.6.31.6-fglrx2 #2 To Be Filled By O.E.M.
Dec  1 19:09:11 datengrab kernel: RIP: 0010:[]  [] posix_acl_equiv_mode+0x0/0x90
Dec  1 19:09:11 datengrab kernel: RSP: 0018:88001e051d60  EFLAGS: 00010246
Dec  1 19:09:11 datengrab kernel: RAX: 41c0 RBX:  RCX: 
Dec  1 19:09:11 datengrab kernel: RDX: 8000 RSI: 88001e051d84 RDI: 
Dec  1 19:09:11 datengrab kernel: RBP: 880056b315e8 R08: 880056b315e8 R09: 880056b315e8
Dec  1 19:09:11 datengrab kernel: R10: 88001e051e38 R11: 4000 R12: 
Dec  1 19:09:11 datengrab kernel: R13: 8000 R14:  R15: 
Dec  1 19:09:11 datengrab kernel: FS:  7f6b1e98d700() GS:8800015e() knlGS:
Dec  1 19:09:11 datengrab kernel: CS:  0010 DS:  ES:  CR0: 80050033
Dec  1 19:09:11 datengrab kernel: CR2: 0004 CR3: 4c641000 CR4: 06f0
Dec  1 19:09:11 datengrab kernel: DR0:  DR1:  DR2: 
Dec  1 19:09:11 datengrab kernel: DR3:  DR6: 0ff0 DR7: 0400
Dec  1 19:09:11 datengrab kernel: Process rdiff-backup (pid: 22720, threadinfo 88001e05, task 88005b0f3980)
Dec  1 19:09:11 datengrab kernel: Stack:
Dec  1 19:09:11 datengrab kernel: 81171176 880056b315e8 8109d127 880013e8b033
Dec  1 19:09:11 datengrab kernel: <0> 41c01e051e98 880056b315e8 880056b315e8 8000
Dec  1 19:09:11 datengrab kernel: <0>  880056b316a0 
Dec  1 19:09:11 datengrab kernel: Call Trace:
Dec  1 19:09:11 datengrab kernel: [] ? btrfs_set_acl+0x5a/0x1ab
Dec  1 19:09:11 datengrab kernel: [] ? dput+0x2c/0x145
Dec  1 19:09:11 datengrab kernel: [] ? btrfs_xattr_set_acl+0x40/0x68
Dec  1 19:09:11 datengrab kernel: [] ? btrfs_removexattr+0x20/0x60
Dec  1 19:09:11 datengrab kernel: [] ? vfs_removexattr+0x78/0x104
Dec  1 19:09:11 datengrab kernel: [] ? removexattr+0x39/0x45
Dec  1 19:09:11 datengrab kernel: [] ? do_pat
