AuFS freezes while copying files

2015-01-27 Thread Bogdan Kiselitsa

   Hi,

   I'm trying to use AuFS to pool storage on Arch Linux running under
   Hyper-V. Unfortunately, writes freeze when I copy many files via Samba or
   rsync. The write operations eventually hang and the kernel reports that
   kworker or smbd tasks have been blocked for more than 120 seconds; only a
   hard reset of the VM resolves the problem.

   I've tried several filesystems under AuFS, including xfs and ext4, and
   the problem seems to be confined to writes via AuFS with more than one
   drive in the pool.

   I'm happy to provide more details or debug further with some direction.

   Thanks!

   Details follow:

   # cat /sys/module/aufs/version

   3.18.1+-20150119
   # uname -a
   Linux bohdy-store 3.18.2-2-aufs_friendly #1 SMP PREEMPT Sat Jan 24 12:59:54 AEDT 2015 x86_64 GNU/Linux
   # Systemd mount
   [Mount]
   What = none
   Where = /mnt/store
   Type = aufs
   Options = br:/mnt/disk1=rw:/mnt/disk2=rw:/mnt/disk3=rw:/mnt/disk4=rw:/mnt/disk5=rw,sum,create=mfs
   # cat /proc/mounts
   rootfs / rootfs rw 0 0
   proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
   sys /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
   dev /dev devtmpfs rw,nosuid,relatime,size=1536700k,nr_inodes=384175,mode=755 0 0
   run /run tmpfs rw,nosuid,nodev,relatime,mode=755 0 0
   /dev/sda1 / ext4 rw,relatime,data=ordered 0 0
   securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
   tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
   devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
   tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
   cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
   pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
   cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
   cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
   cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
   cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
   cgroup /sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 0 0
   cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
   cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
   systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=22,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0
   tmpfs /tmp tmpfs rw 0 0
   hugetlbfs /dev/hugepages hugetlbfs rw,relatime 0 0
   debugfs /sys/kernel/debug debugfs rw,relatime 0 0
   mqueue /dev/mqueue mqueue rw,relatime 0 0
   configfs /sys/kernel/config configfs rw,relatime 0 0
   /dev/sdf1 /mnt/disk5 ext4 rw,relatime,data=ordered 0 0
   /dev/sde1 /mnt/disk4 ext4 rw,relatime,data=ordered 0 0
   /dev/sdc1 /mnt/disk2 ext4 rw,relatime,data=ordered 0 0
   /dev/sdd1 /mnt/disk3 ext4 rw,relatime,data=ordered 0 0
   /dev/sdb1 /mnt/disk1 ext4 rw,relatime,data=ordered 0 0
   none /mnt/store aufs rw,relatime,si=2973d8b31caf023d,create=mfs,sum 0 0
   none /srv/Fileshare aufs rw,relatime,si=2973d8b31caf023d,create=mfs,sum 0 0
   /dev/sdg1 /mnt/parity1 ext4 rw,relatime,data=ordered 0 0
   tmpfs /run/user/0 tmpfs rw,nosuid,nodev,relatime,size=307876k,mode=700 0 0
   /dev/sdh1 /mnt/bak ntfs ro,relatime,uid=0,gid=0,fmask=0177,dmask=077,nls=utf8,errors=continue,mft_zone_multiplier=1 0 0
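
For readers who want to try the same pool with fewer or more branches (for example, to check the "more than 1 drive" observation), the option string from the mount unit above can be generated rather than typed by hand. This is only a sketch: the function name `aufs_branch_opts` is made up for illustration, and it assumes every branch is read-write and that the `sum` and `create=mfs` options from the unit above are wanted.

```shell
#!/bin/sh
# Build an aufs option string of the form used in the mount unit above:
#   br:<dir1>=rw:<dir2>=rw,...,sum,create=mfs
# aufs_branch_opts is a hypothetical helper name, not part of aufs-util.
aufs_branch_opts() {
    opts=""
    for d in "$@"; do
        # Append ":<dir>=rw", omitting the leading colon for the first branch.
        opts="${opts:+$opts:}$d=rw"
    done
    printf 'br:%s,sum,create=mfs\n' "$opts"
}

# Example (not executed here):
#   mount -t aufs -o "$(aufs_branch_opts /mnt/disk1 /mnt/disk2)" none /mnt/store
```

Mounting with a single branch and then with two makes it easier to confirm whether the hang really depends on the number of branches.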
--
Dive into the World of Parallel Programming. The Go Parallel Website,
sponsored by Intel and developed in partnership with Slashdot Media, is your
hub for all things parallel software development, from weekly thought
leadership blogs to news, videos, case studies, tutorials and more. Take a
look and join the conversation now. http://goparallel.sourceforge.net/

Re: AuFS freezes while copying files

2015-01-27 Thread sfjro

Hello Bogdan,

Bogdan Kiselitsa:
> I'm trying to use AuFS to pool storage with Arch linux running on HyperV.
> Unfortunately I'm getting freezes when writing many files via samba or
> rsync. The write ops eventually freeze and I get kernel messages that
> kworkers or smbd have been blocked for 120 seconds, necessitating a hard
> reset of the vm to resolve the problem.
>
> I've tried using multiple filesystems under AuFS including xfs and ext4 and
> it seems like the problem is confined to writes via AuFS with more than 1
> drive in the pool.


Currently I don't know what is wrong.
For the investigation, I'd like to stick to a single case, which is
- all branches are ext4.
- write to aufs by rsync.
  $ rsync /somewhere/else /your/aufs

Because I am not very familiar with samba, xfs and HyperV, it is better to
exclude them.

Does this "ext4 + rsync" case reliably reproduce the problem?
If so, would you try strace to identify the system call where aufs hangs
for a long time? It can be the rsync process itself, its forked child
process, or the rsyncd process. I want to know which process hangs.
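
One possible way to gather this is sketched below. It is an illustration only, not a procedure prescribed in this thread: the helper names, the log path /tmp/rsync.strace and the `-a` flag added to rsync are assumptions. Processes stuck in uninterruptible sleep show a STAT beginning with "D" in ps, which is the usual state behind "blocked for more than 120 seconds" messages.

```shell
#!/bin/sh
# Keep only processes in uninterruptible sleep (STAT starts with "D").
# Expects `ps -eo pid,stat,wchan:32,comm` output on stdin, header included.
filter_blocked() {
    awk 'NR > 1 && $2 ~ /^D/'
}

# List blocked processes on the running system.
list_blocked() {
    ps -eo pid,stat,wchan:32,comm | filter_blocked
}

# Run rsync under strace, following child processes (-f), with timestamps
# (-tt) and the time spent in each system call (-T).
# The log path and the -a flag are assumptions made for this sketch.
trace_rsync() {
    strace -f -tt -T -o /tmp/rsync.strace rsync -a "$1" "$2"
}

# Example (not executed here):
#   trace_rsync /somewhere/else /your/aufs
#   list_blocked        # from a second terminal, once the copy stalls
```

The last lines of the strace log for the PID reported by `list_blocked` should show the system call that never returned. An already running process can be traced with `strace -f -p <pid>` instead.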

And more importantly, would you explain more specifically what you meant
by "the problem is confined to writes via AuFS with more than 1 drive in
the pool"?


J. R. Okajima



Stale NFS file handle when using aufs on top of btrfs?

2015-01-27 Thread Dan Kegel
I've been using aufs on top of ext4 reliably for some time.

Recently I tried it on top of btrfs (stock Ubuntu 14.04), and my
builds are failing in rm -rf with:

/bin/rm: cannot remove
`/tmp/temp-lintian-lab-h2hbg82XOW/pool/o/oobleck.../unpacked':
Directory not empty
internal error: failed to remove unpacked directory of oobleck
...
/bin/rm: fts_read failed: Stale NFS file handle

Has anybody else seen this?

(A superficially similar problem was mentioned five years ago:
http://comments.gmane.org/gmane.linux.file-systems.aufs.user/2437
Likewise, there are superficially similar posts from docker users,
but they may be from user error.)

I see that Ubuntu 14.04's using kernel 3.13.mumble, and aufs no longer
supports 3.13 per
http://aufs.sourceforge.net/
Not sure how easy it'd be for me to run vanilla kernels on this box.
My best move is probably to drop back to ext4 for now.



Re: Stale NFS file handle when using aufs on top of btrfs?

2015-01-27 Thread Michael Johnson - MJ

   I see very similar behavior (and always have with AUFS), and it is easily
   reproducible. Here is how to reproduce it 100% of the time:

   $ cd /aufsdir
   $ mkdir -p dir1/dir2
   $ rm -rf dir1
   $ ls
   ls: cannot open directory .: Stale file handle

   To make the error go away:

   $ cd .

   After that, ls works flawlessly.

   Basically, any time you have a directory with one or more files or
   directories inside it, and you recursively remove it while the working
   directory is its parent, subsequent attempts to access files inside the
   parent fail with the 'Stale file handle' error.

   For the most part this does not affect me, because most tools do not
   work on files or directories that are immediate descendants of the
   current working directory. But it would be nice if this problem did not
   occur.

   Hope this helps you at least understand the problem better, and possibly
   provides you with a workaround.
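
The steps above can be wrapped in a small function so that the same check is easy to run on different mounts (an aufs mount, a plain ext4 or btrfs directory) and compare. This is a sketch: the name `repro_stale` and the guard against an existing dir1 are additions for illustration and are not part of the original steps.

```shell
#!/bin/sh
# Run the reproduction steps inside directory $1 and report whether the
# final `ls` of the working directory still succeeds.
# repro_stale is a hypothetical helper name used only in this sketch.
repro_stale() {
    (
        cd "$1" || exit 2
        # Safety guard (not in the original steps): never delete an
        # existing dir1 that belongs to someone else.
        if [ -e dir1 ]; then
            echo "dir1 already exists, refusing to run" >&2
            exit 2
        fi
        mkdir -p dir1/dir2 || exit 2
        rm -rf dir1
        if ls >/dev/null 2>&1; then
            echo "ls ok"
        else
            echo "ls failed"
        fi
    )
}

# Example (not executed here):
#   repro_stale /aufsdir
```

On a filesystem without the problem the function prints "ls ok"; on an affected aufs mount it would be expected to print "ls failed", matching the 'Stale file handle' error shown above.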

   --
   Michael Johnson - MJ

Re: Stale NFS file handle when using aufs on top of btrfs?

2015-01-27 Thread Dan Kegel
Thanks for the short repro script.
This prevents debian packaging tools from working; seems like it'd be
good to fix it in aufs rather than adding a workaround into dpkg and
friends.

I believe this is bug #1: http://sourceforge.net/p/aufs/bugs/1/
- Dan



Re: Stale NFS file handle when using aufs on top of btrfs?

2015-01-27 Thread Dan Kegel
... although if it's bug #1, I'm not sure why it just showed up for me
with btrfs and not ext4...



Re: Stale NFS file handle when using aufs on top of btrfs?

2015-01-27 Thread sfjro

Hello Dan and Michael,

Dan Kegel:
> Thanks for the short repro script.

I will try Michael's steps to reproduce the problem in my test environment.


> I believe this is bug #1: http://sourceforge.net/p/aufs/bugs/1/

It is doubtful, since the versions, options, and branch fs-types are all
different from yours.


J. R. Okajima



Re: Stale NFS file handle when using aufs on top of btrfs?

2015-01-27 Thread sfjro

Michael Johnson - MJ:
> $ cd /aufsdir;
> $ mkdir -p dir1/dir2;
> $ rm -rf dir1
> $ ls
> ls: cannot open directory .: Stale file handle

At least, these steps succeeded on my test machine.

$ mkdir -p dir1/dir2
$ rm -ir dir1
rm: descend into directory `dir1'? y
rm: remove directory `dir1/dir2'? y
aufs au_new_inode:423:rm[4413]: Warning: Un-notified UDBA or repeatedly renamed dir, b0, btrfs, dir1, hi274, i11.
rm: remove directory `dir1'? y
$ ls
b_dst empty  f_src  mv501_a/  p_src|   sleep*
:::
(shows everything)

- latest aufs3.14
- u = rw + ro
/dev/ram1 /run/shm/ro ext2 ro,relatime,errors=continue,user_xattr,acl 0 0
/dev/ram0 /run/shm/rw btrfs rw,relatime,space_cache 0 0
none /run/shm/u aufs rw,relatime,si=a8fd9d47e41c7918 0 0

An interesting thing is that a warning about UDBA was produced. This means
that removing dir2 changed something about dir1 unexpectedly, and aufs
detected it (and produced the warning). But I am not sure whether this is
related to your problem.

Anyway, it would be a good help for my investigation if you posted the
following information.

- /proc/mounts (instead of the output of mount(8))
- /sys/module/aufs/*
- /sys/fs/aufs/* (if you have them)
- /debug/aufs/* (if you have them)
- linux kernel version
  if your kernel is not plain, for example modified by a distributor,
  the URL where I can download its source is necessary too.
- aufs version which was printed when loading the module or booting the
  system, instead of the date you downloaded it.
- configuration (define/undefine CONFIG_AUFS_xxx)
- kernel configuration or /proc/config.gz (if you have it)
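
Most of the items in this list can be gathered into a single report file with a short script. The following is a sketch only: the function name `collect_aufs_info` and the default file name are made up, and /sys/kernel/debug/aufs is added as an assumption for systems where debugfs is mounted at /sys/kernel/debug instead of /debug. The aufs version printed at module load time is not covered and still has to be taken from the kernel log.

```shell
#!/bin/sh
# Collect the requested information into one text file.
# collect_aufs_info and the default output name are hypothetical.
collect_aufs_info() {
    out=${1:-aufs-report.txt}
    {
        echo "== uname -a =="
        uname -a
        echo "== /proc/mounts =="
        cat /proc/mounts 2>/dev/null || echo "(not available)"
        # Dump every readable regular file under the aufs sysfs/debugfs
        # trees. /sys/kernel/debug/aufs is an assumed alternative location.
        find /sys/module/aufs /sys/fs/aufs /debug/aufs /sys/kernel/debug/aufs \
            -type f 2>/dev/null |
        while IFS= read -r f; do
            echo "== $f =="
            cat "$f" 2>/dev/null || echo "(unreadable)"
        done
        echo "== CONFIG_AUFS_* =="
        if [ -r /proc/config.gz ]; then
            zcat /proc/config.gz | grep AUFS || echo "(no AUFS options found)"
        else
            echo "(/proc/config.gz not available)"
        fi
    } > "$out"
}

# Example (not executed here):
#   collect_aufs_info /tmp/aufs-report.txt
```

Some entries under /sys/module/aufs may not be plain text, so it is worth reviewing and trimming the report before posting it.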


By the way, I have posted before about some unusual behaviours of btrfs.
http://www.mail-archive.com/aufs-users@lists.sourceforge.net/msg02430.html
Again I am beginning to think that btrfs is still not usable as an aufs
branch.


J. R. Okajima



Re: Stale NFS file handle when using aufs on top of btrfs?

2015-01-27 Thread sfjro

Dan Kegel:
> I see that Ubuntu 14.04's using kernel 3.13.mumble, and aufs no longer

Is this your Ubuntu 14.04?
2d22fc7 2014-10-09 UBUNTU: Ubuntu-3.13.0-38.65

According to git://kernel.ubuntu.com/ubuntu/ubuntu-trusty.git, it uses
aufs3.13-20140303. As always, ubuntu uses a version old enough to make it
very hard for me to support. But I am trying, as always, to support and
help users.

Anyway, I will use MJ's short steps to test this problem because it looks
the same as yours.


J. R. Okajima



Re: Stale NFS file handle when using aufs on top of btrfs?

2015-01-27 Thread sfjro

> According to git://kernel.ubuntu.com/ubuntu/ubuntu-trusty.git, it uses
> aufs3.13-20140303. As always, ubuntu uses a version old enough to make it
> very hard for me to support. But I am trying, as always, to support and
> help users.

Ah, I might be confused, sorry.
If ubuntu-trusty.git was released in April 2014, then aufs3.13-20140303 is
not bad. It would have been new at that time.


J. R. Okajima



Re: Stale NFS file handle when using aufs on top of btrfs?

2015-01-27 Thread sfjro

sf...@users.sourceforge.net:
> Michael Johnson - MJ:
> > $ cd /aufsdir;
> > $ mkdir -p dir1/dir2;
> > $ rm -rf dir1
> > $ ls
> > ls: cannot open directory .: Stale file handle
>
> At least, these steps succeeded on my test machine.

I've tested the same steps on
- plain linux-3.13
- aufs3.13-20140303 (as Dan's ubuntu kernel)
- plus my local debug patch.

The result is unchanged: the steps succeeded.
I am afraid some external factors may be involved, for example kernel
config, aufs-util in userspace, options for btrfs, etc...

As a first step, it is important to make the differences clear and
reproduce the problem on my side. Otherwise, I will have to ask you to
repeatedly insert a debug print, rebuild, and test.


J. R. Okajima
