Re: Kernel memory allocation failures involving large aufs directories (4.2-20151102)
On 26/01/16 05:50, sf...@users.sourceforge.net wrote: > Sorry for my late reply. > > OmegaPhil: >> Suddenly the VM doesn't error anymore in the particular test I set up, >> so back on the server, I fiddled with the rsync init.d script and ran >> the daemon via 'strace -fv'. One EINVAL hit in the resulting file, here >> is it with some context: > > Thanks. > The trace shows that > - pid1293 opens "/omega1-home/" > - confirms the dir is aufs > - and then issues ioctl(2). > > It is good. > And the readdir(3) call by rsync is converted to ioctl(AUFS_CTL_RDU_INO) > or ioctl(AUFS_CTL_RDU) by libau.so, and it gets EINVAL. Still I don't > know why. But I'd ask you one more thing. Give strace '-e raw=ioctl'. It > will give us the raw parameters which will be helpful to debug libau.so > and aufs kernel module. > > Also I'd suggest you to try a tiny test for libau. Try > $ LD_PRELOAD=libau.so LIBAU=all strace -o /tmp/s -e raw=ioctl ls > /your/aufs/dir > and see the differences between those two strace outputs, one is by your > rsync and the other is by ls. Here is the raw ioctl trace: [pid 10412] chdir("/omega1-home/") = 0 [pid 10412] socketpair(PF_LOCAL, SOCK_STREAM, 0, [4, 6]) = 0 [pid 10412] fcntl(4, F_GETFL) = 0x2 (flags O_RDWR) [pid 10412] fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0 [pid 10412] fcntl(6, F_GETFL) = 0x2 (flags O_RDWR) [pid 10412] fcntl(6, F_SETFL, O_RDWR|O_NONBLOCK) = 0 [pid 10412] clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f563688c9d0) = 10413 Process 10413 attached [pid 10412] close(6 [pid 10413] set_robust_list(0x7f563688c9e0, 24 [pid 10412] <... close resumed> ) = 0 [pid 10413] <... set_robust_list resumed> ) = 0 [pid 10412] lstat(".", [pid 10413] close(4 [pid 10412] <... lstat resumed> {st_dev=makedev(0, 34), st_ino=13, st_mode=S_IFDIR|0755, st_nlink=4, st_uid=0, st_gid =0, st_blksize=4096, st_blocks=0, st_size=66, st_atime=2016/01/25-21:10:15, st_mtime=2014/09/13-12:11:23, st_ctime=20 15/01/07-07:46:30}) = 0 [pid 10413] <... close resumed> ) = 0 [pid 10412] openat(AT_FDCWD, ".", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC [pid 10413] select(6, [5], [], [5], {60, 0} [pid 10412] <... openat resumed> ) = 6 [pid 10412] brk(0x557f96466000) = 0x557f96466000 [pid 10412] fstatfs(6, {f_type=0x61756673, f_bsize=4096, f_blocks=3418641366, f_bfree=654794207, f_bavail=651742965, f_files=0, f_ffree=0, f_fsid={0, 0}, f_namelen=242, f_frsize=4096}) = 0 [pid 10412] ioctl(0x6, 0xc0404100, 0x74878360) = -1 (errno 22) [pid 10412] sendto(3, "<28>Jan 26 19:19:39 rsyncd[10412"..., 104, MSG_NOSIGNAL, NULL, 0) = 104 [pid 10412] fstatfs(6, {f_type=0x61756673, f_bsize=4096, f_blocks=3418641366, f_bfree=654794207, f_bavail=651742965, f_files=0, f_ffree=0, f_fsid={0, 0}, f_namelen=242, f_frsize=4096}) = 0 [pid 10412] futex(0x7f5635ab50a8, FUTEX_WAKE_PRIVATE, 2147483647) = 0 [pid 10412] close(6)= 0 [pid 10412] lstat(".", {st_dev=makedev(0, 34), st_ino=13, st_mode=S_IFDIR|0755, st_nlink=4, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=66, st_atime=2016/01/25-21:10:15, st_mtime=2014/09/13-12:11:23, st_ctime=2015/01/07-07:46:30}) = 0 See 'errno 22'. Sorry, was obsessing about rsync - ls also fails in the same way with the environment variables used... stat("/mnt/backups/omega1-home", {st_mode=S_IFDIR|0755, st_size=66, ...}) = 0 openat(AT_FDCWD, "/mnt/backups/omega1-home", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3 fstatfs(3, {f_type=0x61756673, f_bsize=4096, f_blocks=3418641366, f_bfree=654792161, f_bavail=651740917, f_files=0, f _ffree=0, f_fsid={0, 0}, f_namelen=242, f_frsize=4096}) = 0 ioctl(0x3, 0xc0404100, 0x7ffd38fe6220) = -1 (errno 22) open("/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 4 fstat(4, {st_mode=S_IFREG|0644, st_size=2492, ...}) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc076108000 read(4, "# Locale name alias data base.\n#"..., 4096) = 2492 read(4, "", 4096) = 0 close(4)= 0 munmap(0x7fc076108000, 4096)= 0 > -- > >> Minor update - rsync/aufs has suddenly decided to fail all over the >> place with this supposed kernel memory problem, so even SSH+rsync is >> somewhat unusable currently. > > Do you mean the problem is not related to a specific kernel version? > > > J. R. Okajima That was referring to the previous kernel memory allocation failures, suddenly they take over everything rather than just a few directories (when I converted my backup script to go from rsync daemon
Re: Kernel memory allocation failures involving large aufs directories (4.2-20151102)
OmegaPhil: > Here is the raw ioctl trace: ::: > [pid 10412] ioctl(0x6, 0xc0404100, 0x74878360) =3D -1 (errno 22) ::: Ok, the value 0xc0404100 is correct. Just to make sure, I'd ask you to check your /sys/fs/aufs/config. Does it show you 'CONFIG_AUFS_RDU=y' line? J. R. Okajima -- Site24x7 APM Insight: Get Deep Visibility into Application Performance APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month Monitor end-to-end web transactions and take corrective actions now Troubleshoot faster and improve end-user experience. Signup Now! http://pubads.g.doubleclick.net/gampad/clk?id=267308311=/4140
Re: Kernel memory allocation failures involving large aufs directories (4.2-20151102)
On 26/01/16 21:23, sf...@users.sourceforge.net wrote: > OmegaPhil: >> Here is the raw ioctl trace: > ::: >> [pid 10412] ioctl(0x6, 0xc0404100, 0x74878360) =3D -1 (errno 22) > ::: > > Ok, the value 0xc0404100 is correct. > Just to make sure, I'd ask you to check your /sys/fs/aufs/config. Does > it show you 'CONFIG_AUFS_RDU=y' line? > > > J. R. Okajima Well... that was it. What had happened was that I had started the libau testing with the previous kernel version aufs driver build, and technically as libau was still being 'tested', I hadn't written up the changes. So inbetween there is a kernel upgrade, and naturally I follow the normal notes to build the aufs driver... which have the standard Debian state of commented out CONFIG_AUFS_RDU... Its been a month or so so there was no way that'd come to mind as I build aufs (as far as my brain is concerned, libau is just aufs-util stuff), thus this screwup. I've just donated, sorry for wasting your time. Currently testing a single normal rsync daemon call atm, bound to succeed. Thanks signature.asc Description: OpenPGP digital signature -- Site24x7 APM Insight: Get Deep Visibility into Application Performance APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month Monitor end-to-end web transactions and take corrective actions now Troubleshoot faster and improve end-user experience. Signup Now! http://pubads.g.doubleclick.net/gampad/clk?id=267308311=/4140
Re: Kernel memory allocation failures involving large aufs directories (4.2-20151102)
OmegaPhil: > Well... that was it. What had happened was that I had started the libau ::: I'm glad that your problem is solved! > I've just donated, sorry for wasting your time. Oh, thank you very much. Donation will surely make the life of aufs (and me) longer. I will add your name to the list in aufs README file. By the way, I'm really surprised that you wrote Japanese language!! and it made me smile. :-) J. R. Okajima -- Site24x7 APM Insight: Get Deep Visibility into Application Performance APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month Monitor end-to-end web transactions and take corrective actions now Troubleshoot faster and improve end-user experience. Signup Now! http://pubads.g.doubleclick.net/gampad/clk?id=267308311=/4140
Re: Kernel memory allocation failures involving large aufs directories (4.2-20151102)
Sorry for my late reply. OmegaPhil: > Suddenly the VM doesn't error anymore in the particular test I set up, > so back on the server, I fiddled with the rsync init.d script and ran > the daemon via 'strace -fv'. One EINVAL hit in the resulting file, here > is it with some context: Thanks. The trace shows that - pid1293 opens "/omega1-home/" - confirms the dir is aufs - and then issues ioctl(2). It is good. And the readdir(3) call by rsync is converted to ioctl(AUFS_CTL_RDU_INO) or ioctl(AUFS_CTL_RDU) by libau.so, and it gets EINVAL. Still I don't know why. But I'd ask you one more thing. Give strace '-e raw=ioctl'. It will give us the raw parameters which will be helpful to debug libau.so and aufs kernel module. Also I'd suggest you to try a tiny test for libau. Try $ LD_PRELOAD=libau.so LIBAU=all strace -o /tmp/s -e raw=ioctl ls /your/aufs/dir and see the differences between those two strace outputs, one is by your rsync and the other is by ls. -- > Minor update - rsync/aufs has suddenly decided to fail all over the > place with this supposed kernel memory problem, so even SSH+rsync is > somewhat unusable currently. Do you mean the problem is not related to a specific kernel version? J. R. Okajima -- Site24x7 APM Insight: Get Deep Visibility into Application Performance APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month Monitor end-to-end web transactions and take corrective actions now Troubleshoot faster and improve end-user experience. Signup Now! http://pubads.g.doubleclick.net/gampad/clk?id=267308311=/4140
Re: Kernel memory allocation failures involving large aufs directories (4.2-20151102)
On 21/01/16 19:39, OmegaPhil wrote: > On 20/01/16 22:01, sf...@users.sourceforge.net wrote: >> OmegaPhil: >>> It has now been some time since I got the kernel memory allocation >>> failures, so clearly the libau hack has fixed it - thanks. >> >> Glad to hear that! >> (Honestly speaking, I totally forgot about this issue) >> >> >>> In the manpage, please can you change 'If you have a directory which has >>> millions of files' to say 'tens of thousands of files', and it would be >>> useful to mention 'page allocation failure' somehow so that its easy for >> ::: >> >> How about the attached diff? > > The diff looks good, however for normal users it might be useful to > force them to think 'syslog', since normal programs will probably throw > a useless generic 'I/O error': > > 'You may meet "out of memory" message or "page allocation failure" due > to the memory fragmentation or real starvation' > > V > > 'A program using the directory may throw an "out of memory" error and/or > the kernel may output a "page allocation failure" associated with the > program in the syslog, due to memory fragmentation or real starvation' > > >>> rsync: readdir("/omega1-storage-4/." (in backups)): Invalid argument (22)= >> >> Hmm, won't you investigate it a little more? >> - which systemcall returned EINVAL(22)? >> - what parameter did rsync pass to the systemcall (or readdir)? >> >> And is your $LIBAU set to "all"? > > I did look into it on the rsync side, didn't look useful - see > https://download.samba.org/pub/unpacked/rsync/flist.c:send_directory, > the readdir is called on line 1739, with the error reported on 1771. > > Suddenly the VM doesn't error anymore in the particular test I set up, > so back on the server, I fiddled with the rsync init.d script and ran > the daemon via 'strace -fv'. One EINVAL hit in the resulting file, here > is it with some context: > > === > > [pid 1293] stat("/omega1-home/", {st_dev=makedev(0, 34), st_ino=273972, > st_mode=S_IFDIR|0755, st_nlink=4, st_uid=0, st_gid=0, st_blksize=4096, > st_blocks=0, st_size=66, st_atime=2016/01/20-20:44:12, > st_mtime=2014/09/13-12:11:23, st_ctime=2015/01/07-07:46:30}) = 0 > [pid 1293] chdir("/omega1-home/") = 0 > [pid 1293] socketpair(PF_LOCAL, SOCK_STREAM, 0, [4, 6]) = 0 > [pid 1293] fcntl(4, F_GETFL) = 0x2 (flags O_RDWR) > [pid 1293] fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0 > [pid 1293] fcntl(6, F_GETFL) = 0x2 (flags O_RDWR) > [pid 1293] fcntl(6, F_SETFL, O_RDWR|O_NONBLOCK) = 0 > [pid 1293] clone(child_stack=0, > flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, > child_tidptr=0x7f7b499149d0) = 1294 > Process 1294 attached > [pid 1293] close(6 > [pid 1294] set_robust_list(0x7f7b499149e0, 24 > [pid 1293] <... close resumed> ) = 0 > [pid 1294] <... set_robust_list resumed> ) = 0 > [pid 1293] lstat(".", > [pid 1294] close(4 > [pid 1293] <... lstat resumed> {st_dev=makedev(0, 34), st_ino=273972, > st_mode=S_IFDIR|0755, st_nlink=4, st_uid=0, st_gid=0, st_blksize=4096, > st_blocks=0, st_size=66, st_atime=2016/01/20-20:44:12, > st_mtime=2014/09/13-12:11:23, st_ctime=2015/01/07-07:46:30}) = 0 > [pid 1294] <... close resumed> ) = 0 > [pid 1293] openat(AT_FDCWD, ".", > O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC > [pid 1294] select(6, [5], [], [5], {60, 0} > [pid 1293] <... openat resumed> ) = 6 > [pid 1293] brk(0x564d6bf68000) = 0x564d6bf68000 > [pid 1293] fstatfs(6, {f_type=0x61756673, f_bsize=4096, > f_blocks=3418641366, f_bfree=652846041, f_bavail=649635649, f_files=0, > f_ffree=0, f_fsid={0, 0}, f_namelen=242, f_frsize=4096}) = 0 > [pid 1293] ioctl(6, _IOC(_IOC_READ|_IOC_WRITE, 0x41, 0x00, 0x40), > 0x7ffc3f394810) = -1 EINVAL (Invalid argument) > [pid 1293] sendto(3, "<28>Jan 21 19:11:31 rsyncd[1293]"..., 103, > MSG_NOSIGNAL, NULL, 0) = 103 > [pid 1293] fstatfs(6, {f_type=0x61756673, f_bsize=4096, > f_blocks=3418641366, f_bfree=652846041, f_bavail=649635649, f_files=0, > f_ffree=0, f_fsid={0, 0}, f_namelen=242, f_frsize=4096}) = 0 > [pid 1293] futex(0x7f7b48b3d0a8, FUTEX_WAKE_PRIVATE, 2147483647) = 0 > [pid 1293] close(6)= 0 > [pid 1293] lstat(".", {st_dev=makedev(0, 34), st_ino=273972, > st_mode=S_IFDIR|0755, st_nlink=4, st_uid=0, st_gid=0, st_blksize=4096, > st_blocks=0, st_size=66, st_atime=2016/01/20-20:44:12, > st_mtime=2014/09/13-12:11:23, st_ctime=2015/01/07-07:46:30}) = 0 > > === > > After lstating '.', rsync appears to go on and lstat the subdirectories. > I'm guessing that due to the failure being an ioctl call, it didn't > appear in the usual '-e trace=file' invocation? > > >>> This appears to have happened after I upgraded the kernel to v4.3.3-5, >> >> Is this version debian kernel pkg's? >> According to your post in last year, your system is >> 4.2.0-1-amd64 #1 SMP Debian 4.2.5-1
Re: Kernel memory allocation failures involving large aufs directories (4.2-20151102)
On 20/01/16 22:01, sf...@users.sourceforge.net wrote: > OmegaPhil: >> It has now been some time since I got the kernel memory allocation >> failures, so clearly the libau hack has fixed it - thanks. > > Glad to hear that! > (Honestly speaking, I totally forgot about this issue) > > >> In the manpage, please can you change 'If you have a directory which has >> millions of files' to say 'tens of thousands of files', and it would be >> useful to mention 'page allocation failure' somehow so that its easy for > ::: > > How about the attached diff? The diff looks good, however for normal users it might be useful to force them to think 'syslog', since normal programs will probably throw a useless generic 'I/O error': 'You may meet "out of memory" message or "page allocation failure" due to the memory fragmentation or real starvation' V 'A program using the directory may throw an "out of memory" error and/or the kernel may output a "page allocation failure" associated with the program in the syslog, due to memory fragmentation or real starvation' >> rsync: readdir("/omega1-storage-4/." (in backups)): Invalid argument (22)= > > Hmm, won't you investigate it a little more? > - which systemcall returned EINVAL(22)? > - what parameter did rsync pass to the systemcall (or readdir)? > > And is your $LIBAU set to "all"? I did look into it on the rsync side, didn't look useful - see https://download.samba.org/pub/unpacked/rsync/flist.c:send_directory, the readdir is called on line 1739, with the error reported on 1771. Suddenly the VM doesn't error anymore in the particular test I set up, so back on the server, I fiddled with the rsync init.d script and ran the daemon via 'strace -fv'. One EINVAL hit in the resulting file, here is it with some context: === [pid 1293] stat("/omega1-home/", {st_dev=makedev(0, 34), st_ino=273972, st_mode=S_IFDIR|0755, st_nlink=4, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=66, st_atime=2016/01/20-20:44:12, st_mtime=2014/09/13-12:11:23, st_ctime=2015/01/07-07:46:30}) = 0 [pid 1293] chdir("/omega1-home/") = 0 [pid 1293] socketpair(PF_LOCAL, SOCK_STREAM, 0, [4, 6]) = 0 [pid 1293] fcntl(4, F_GETFL) = 0x2 (flags O_RDWR) [pid 1293] fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0 [pid 1293] fcntl(6, F_GETFL) = 0x2 (flags O_RDWR) [pid 1293] fcntl(6, F_SETFL, O_RDWR|O_NONBLOCK) = 0 [pid 1293] clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f7b499149d0) = 1294 Process 1294 attached [pid 1293] close(6 [pid 1294] set_robust_list(0x7f7b499149e0, 24 [pid 1293] <... close resumed> ) = 0 [pid 1294] <... set_robust_list resumed> ) = 0 [pid 1293] lstat(".", [pid 1294] close(4 [pid 1293] <... lstat resumed> {st_dev=makedev(0, 34), st_ino=273972, st_mode=S_IFDIR|0755, st_nlink=4, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=66, st_atime=2016/01/20-20:44:12, st_mtime=2014/09/13-12:11:23, st_ctime=2015/01/07-07:46:30}) = 0 [pid 1294] <... close resumed> ) = 0 [pid 1293] openat(AT_FDCWD, ".", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC [pid 1294] select(6, [5], [], [5], {60, 0} [pid 1293] <... openat resumed> ) = 6 [pid 1293] brk(0x564d6bf68000) = 0x564d6bf68000 [pid 1293] fstatfs(6, {f_type=0x61756673, f_bsize=4096, f_blocks=3418641366, f_bfree=652846041, f_bavail=649635649, f_files=0, f_ffree=0, f_fsid={0, 0}, f_namelen=242, f_frsize=4096}) = 0 [pid 1293] ioctl(6, _IOC(_IOC_READ|_IOC_WRITE, 0x41, 0x00, 0x40), 0x7ffc3f394810) = -1 EINVAL (Invalid argument) [pid 1293] sendto(3, "<28>Jan 21 19:11:31 rsyncd[1293]"..., 103, MSG_NOSIGNAL, NULL, 0) = 103 [pid 1293] fstatfs(6, {f_type=0x61756673, f_bsize=4096, f_blocks=3418641366, f_bfree=652846041, f_bavail=649635649, f_files=0, f_ffree=0, f_fsid={0, 0}, f_namelen=242, f_frsize=4096}) = 0 [pid 1293] futex(0x7f7b48b3d0a8, FUTEX_WAKE_PRIVATE, 2147483647) = 0 [pid 1293] close(6)= 0 [pid 1293] lstat(".", {st_dev=makedev(0, 34), st_ino=273972, st_mode=S_IFDIR|0755, st_nlink=4, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=66, st_atime=2016/01/20-20:44:12, st_mtime=2014/09/13-12:11:23, st_ctime=2015/01/07-07:46:30}) = 0 === After lstating '.', rsync appears to go on and lstat the subdirectories. I'm guessing that due to the failure being an ioctl call, it didn't appear in the usual '-e trace=file' invocation? >> This appears to have happened after I upgraded the kernel to v4.3.3-5, > > Is this version debian kernel pkg's? > According to your post in last year, your system is > 4.2.0-1-amd64 #1 SMP Debian 4.2.5-1 (2015-10-27) x86_64 > GNU/Linux - Debian Testing standard kernel. > > If this problem is specific to debian v4.3.3-5 kernel, then I will try > finding the changes made in > 1. vanilla v4.3.3 > 2. debian v4.3.3-5 >
Re: Kernel memory allocation failures involving large aufs directories (4.2-20151102)
On 24/11/15 12:59, OmegaPhil wrote: > On 24/11/15 05:27, sf...@users.sourceforge.net wrote: >> OmegaPhil: >> ::: >>> repos, so I'm using this - currently the rsync init.d script has been >>> edited to export the right LD_PRELOAD and LIBAU values, and I've >>> confirmed the library has been loaded via: >>> >>> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= >>> =3D=3D=3D >>> >>> lsof -a -c rsync +D /usr/lib >>> >>> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= >>> =3D=3D=3D >>> >>> Looking at kern logs the problem happened twice so far this month and 4 >>> times in October, so a month's test should demonstrate this working. >> >> You don't have to wait a month. >> You already have a large dir. Try reproducing the problem by "ls >> /large_dir", "rsync --dry-run ..." or something. And then confirm >> setting LD_PRELOAD and LIBAU solves the problem. >> This simple test has an effect to detect something wrong in setting >> LD_PRELOAD and LIBAU (hopefully). >> I'd suggest you to try a simple test. Why? Because I have ever made >> mistakes in setting LD_PRELOAD and LIBAU. :-) >> >> >> J. R. Okajima > > > Ha, that would be nice - the problem is intermittent, only in the most > serious case (where ls was also affected) could it be reliably repeated > (but that situation has long since been cleaned up, haven't had it as > bad since). So saying that it failed twice in a month means that the > rsync daily backup worked ~28 times in the same period (for reference > the backup will cover ~4.3 million files/directories according to > locate). No issues so far with the current setup. > > I'm happy to sit and wait for the real issue to crop up, if it does then > I can be more aggressive with a test case. > > Unless you want me to try and force the issue? It has now been some time since I got the kernel memory allocation failures, so clearly the libau hack has fixed it - thanks. In the manpage, please can you change 'If you have a directory which has millions of files' to say 'tens of thousands of files', and it would be useful to mention 'page allocation failure' somehow so that its easy for others to search on (the problem affects programs interacting with aufs resulting in that message in the syslog, its not obvious what it means/who is responsible etc). Ironically I now have a separate issue with rsync running in daemon mode, which appears to be due to using libau: rsync: readdir("/omega1-storage-4/." (in backups)): Invalid argument (22) Almost every start of an rsync operation fails with this (presumably reading the base directory of the rsync-shared location immediately fails) - commenting out the libau stuff in the '/etc/init.d/rsync' script gets rid of the problem, but naturally I then hit up against the memory issues. It just affects aufs volumes. This appears to have happened after I upgraded the kernel to v4.3.3-5, and aufs at the same time (v4.3-20160111) - I was running off the aufs-tools package from Debian (1:3.2+20130722-1.1), so I built and installed my own aufs-util package from the latest source, however the problem still occurs. rsync hasn't changed since 7.03.15 (v3.1.1-3), and only using it as a daemon (i.e. with the rsync protocol) does the failure trigger - rsync over SSH works fine. Has anyone else had such problems? signature.asc Description: OpenPGP digital signature -- Site24x7 APM Insight: Get Deep Visibility into Application Performance APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month Monitor end-to-end web transactions and take corrective actions now Troubleshoot faster and improve end-user experience. Signup Now! http://pubads.g.doubleclick.net/gampad/clk?id=267308311=/4140
Re: Kernel memory allocation failures involving large aufs directories (4.2-20151102)
On 20/01/16 20:41, OmegaPhil wrote: > On 24/11/15 12:59, OmegaPhil wrote: >> On 24/11/15 05:27, sf...@users.sourceforge.net wrote: >>> OmegaPhil: >>> ::: repos, so I'm using this - currently the rsync init.d script has been edited to export the right LD_PRELOAD and LIBAU values, and I've confirmed the library has been loaded via: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D lsof -a -c rsync +D /usr/lib =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D Looking at kern logs the problem happened twice so far this month and 4 times in October, so a month's test should demonstrate this working. >>> >>> You don't have to wait a month. >>> You already have a large dir. Try reproducing the problem by "ls >>> /large_dir", "rsync --dry-run ..." or something. And then confirm >>> setting LD_PRELOAD and LIBAU solves the problem. >>> This simple test has an effect to detect something wrong in setting >>> LD_PRELOAD and LIBAU (hopefully). >>> I'd suggest you to try a simple test. Why? Because I have ever made >>> mistakes in setting LD_PRELOAD and LIBAU. :-) >>> >>> >>> J. R. Okajima >> >> >> Ha, that would be nice - the problem is intermittent, only in the most >> serious case (where ls was also affected) could it be reliably repeated >> (but that situation has long since been cleaned up, haven't had it as >> bad since). So saying that it failed twice in a month means that the >> rsync daily backup worked ~28 times in the same period (for reference >> the backup will cover ~4.3 million files/directories according to >> locate). No issues so far with the current setup. >> >> I'm happy to sit and wait for the real issue to crop up, if it does then >> I can be more aggressive with a test case. >> >> Unless you want me to try and force the issue? > > > It has now been some time since I got the kernel memory allocation > failures, so clearly the libau hack has fixed it - thanks. > > In the manpage, please can you change 'If you have a directory which has > millions of files' to say 'tens of thousands of files', and it would be > useful to mention 'page allocation failure' somehow so that its easy for > others to search on (the problem affects programs interacting with aufs > resulting in that message in the syslog, its not obvious what it > means/who is responsible etc). > > Ironically I now have a separate issue with rsync running in daemon > mode, which appears to be due to using libau: > > > > rsync: readdir("/omega1-storage-4/." (in backups)): Invalid argument (22) > > > > Almost every start of an rsync operation fails with this (presumably > reading the base directory of the rsync-shared location immediately > fails) - commenting out the libau stuff in the '/etc/init.d/rsync' > script gets rid of the problem, but naturally I then hit up against the > memory issues. It just affects aufs volumes. > > This appears to have happened after I upgraded the kernel to v4.3.3-5, > and aufs at the same time (v4.3-20160111) - I was running off the > aufs-tools package from Debian (1:3.2+20130722-1.1), so I built and > installed my own aufs-util package from the latest source, however the > problem still occurs. > > rsync hasn't changed since 7.03.15 (v3.1.1-3), and only using it as a > daemon (i.e. with the rsync protocol) does the failure trigger - rsync > over SSH works fine. > > Has anyone else had such problems? Sorry forgot to mention - I set up a trivial rsync daemon + aufs setup in a Debian Testing VM, and it fails in the same way, so it isn't something peculiar to the normal Debian Testing server. signature.asc Description: OpenPGP digital signature -- Site24x7 APM Insight: Get Deep Visibility into Application Performance APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month Monitor end-to-end web transactions and take corrective actions now Troubleshoot faster and improve end-user experience. Signup Now! http://pubads.g.doubleclick.net/gampad/clk?id=267308311=/4140
Re: Kernel memory allocation failures involving large aufs directories (4.2-20151102)
OmegaPhil: > It has now been some time since I got the kernel memory allocation > failures, so clearly the libau hack has fixed it - thanks. Glad to hear that! (Honestly speaking, I totally forgot about this issue) > In the manpage, please can you change 'If you have a directory which has > millions of files' to say 'tens of thousands of files', and it would be > useful to mention 'page allocation failure' somehow so that its easy for ::: How about the attached diff? > rsync: readdir("/omega1-storage-4/." (in backups)): Invalid argument (22)= Hmm, won't you investigate it a little more? - which systemcall returned EINVAL(22)? - what parameter did rsync pass to the systemcall (or readdir)? And is your $LIBAU set to "all"? > This appears to have happened after I upgraded the kernel to v4.3.3-5, Is this version debian kernel pkg's? According to your post in last year, your system is 4.2.0-1-amd64 #1 SMP Debian 4.2.5-1 (2015-10-27) x86_64 GNU/Linux - Debian Testing standard kernel. If this problem is specific to debian v4.3.3-5 kernel, then I will try finding the changes made in 1. vanilla v4.3.3 2. debian v4.3.3-5 particulary around ioctl(2). J. R. Okajima diff --git a/aufs.in.5 b/aufs.in.5 index 38b0884..f2e4875 100644 --- a/aufs.in.5 +++ b/aufs.in.5 @@ -304,7 +304,7 @@ The default value is \*[AUFS_DIRWH_DEF]. Specifies a size of internal VDIR block which is allocated at a time in byte. The VDIR block will be allocated several times when necessary. If your -directory has millions of files, you may want to expand this size. +directory has tens of thousands of files, you may want to expand this size. The default value is defined as \*[AUFS_RDBLK_DEF]. The size has to be lager than NAME_MAX (usually 255) and kmalloc\-able (the maximum limit depends on your system. at least 128KB is available @@ -324,7 +324,7 @@ Specifies a size of internal VDIR hash table which is used to compare the file names under the same named directory on multiple branches. The VDIR hash table will be allocated in readdir(3)/getdents(2), rmdir(2) and rename(2) for the existing target directory. If your -directory has millions of files, you may want to expand this size. +directory has tens of thousands of files, you may want to expand this size. The default value is defined as \*[AUFS_RDHASH_DEF]. The size has to be lager than zero, and it will be multiplied by 4 or 8 (for 32\-bit and 64\-bit respectively, currently). The result must be @@ -1307,7 +1307,7 @@ During building dir blocks, aufs creates hash list (hashed and divided by the entry is whiteout-ed by its upper branch or already listed. These values are suitable for normal environments. But you may have -millions of files or very long filenames under a single directory. For +tens of tousands of files or very long filenames under a single directory. For such cases, you may need to customize these values by specifying rdblk= and rdhash= aufs mount options. @@ -1330,7 +1330,7 @@ not, the number of comparison will be 4. 2 memory allocations and 4 comparison costs low (even if the directory is opened for a long time). So you do not need to customize. -If your directory has millions of files, the you will need to specify +If your directory has tens of thousands of files, the you will need to specify rdblk= and rdhash=. .nf @@ -1381,8 +1381,9 @@ If you use pathconf(3)/fpathconf(3) with _PC_LINK_MAX for aufs, you need to use libau.so. .SS VDIR/readdir(3) in user\-space (RDU) -If you have a directory which has millions of files, aufs VDIR consumes -much memory. You may meet "out of memory" message due to +If you have a directory which has tens of thousands of files, aufs VDIR consumes +much memory. You may meet "out of memory" message or "page allocation +failure" due to the memory fragmentation or real starvation. In this case, RDU (readdir(3) in user\-space) may help you. Because the kernel memory space cannot be swappable and consuming much -- Site24x7 APM Insight: Get Deep Visibility into Application Performance APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month Monitor end-to-end web transactions and take corrective actions now Troubleshoot faster and improve end-user experience. Signup Now! http://pubads.g.doubleclick.net/gampad/clk?id=267308311=/4140
Re: Kernel memory allocation failures involving large aufs directories (4.2-20151102)
On 24/11/15 05:27, sf...@users.sourceforge.net wrote: > OmegaPhil: > ::: >> repos, so I'm using this - currently the rsync init.d script has been >> edited to export the right LD_PRELOAD and LIBAU values, and I've >> confirmed the library has been loaded via: >> >> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= >> =3D=3D=3D >> >> lsof -a -c rsync +D /usr/lib >> >> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= >> =3D=3D=3D >> >> Looking at kern logs the problem happened twice so far this month and 4 >> times in October, so a month's test should demonstrate this working. > > You don't have to wait a month. > You already have a large dir. Try reproducing the problem by "ls > /large_dir", "rsync --dry-run ..." or something. And then confirm > setting LD_PRELOAD and LIBAU solves the problem. > This simple test has an effect to detect something wrong in setting > LD_PRELOAD and LIBAU (hopefully). > I'd suggest you to try a simple test. Why? Because I have ever made > mistakes in setting LD_PRELOAD and LIBAU. :-) > > > J. R. Okajima Ha, that would be nice - the problem is intermittent, only in the most serious case (where ls was also affected) could it be reliably repeated (but that situation has long since been cleaned up, haven't had it as bad since). So saying that it failed twice in a month means that the rsync daily backup worked ~28 times in the same period (for reference the backup will cover ~4.3 million files/directories according to locate). No issues so far with the current setup. I'm happy to sit and wait for the real issue to crop up, if it does then I can be more aggressive with a test case. Unless you want me to try and force the issue? signature.asc Description: OpenPGP digital signature -- Go from Idea to Many App Stores Faster with Intel(R) XDK Give your users amazing mobile app experiences with Intel(R) XDK. Use one codebase in this all-in-one HTML5 development environment. Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs. http://pubads.g.doubleclick.net/gampad/clk?id=254741551=/4140
Re: Kernel memory allocation failures involving large aufs directories (4.2-20151102)
OmegaPhil: ::: > repos, so I'm using this - currently the rsync init.d script has been > edited to export the right LD_PRELOAD and LIBAU values, and I've > confirmed the library has been loaded via: > > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= > =3D=3D=3D > > lsof -a -c rsync +D /usr/lib > > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= > =3D=3D=3D > > Looking at kern logs the problem happened twice so far this month and 4 > times in October, so a month's test should demonstrate this working. You don't have to wait a month. You already have a large dir. Try reproducing the problem by "ls /large_dir", "rsync --dry-run ..." or something. And then confirm setting LD_PRELOAD and LIBAU solves the problem. This simple test has an effect to detect something wrong in setting LD_PRELOAD and LIBAU (hopefully). I'd suggest you to try a simple test. Why? Because I have ever made mistakes in setting LD_PRELOAD and LIBAU. :-) J. R. Okajima -- Go from Idea to Many App Stores Faster with Intel(R) XDK Give your users amazing mobile app experiences with Intel(R) XDK. Use one codebase in this all-in-one HTML5 development environment. Design, debug & build mobile apps & 2D/3D high-impact games for multiple OSs. http://pubads.g.doubleclick.net/gampad/clk?id=254741551=/4140
Re: Kernel memory allocation failures involving large aufs directories (4.2-20151102)
On 20/11/15 23:42, sf...@users.sourceforge.net wrote: > Hello OmegaPhil, > > OmegaPhil: >> For some time now (across kernel versions, aufs versions, machines, etc) >> I have noticed kernel memory allocation failures intermittently when >> something lists a large aufs directory (the most recent example having >> 20,000 files in it). My aufs usage is to have multiple equally-important > ::: > > For those who has such many files in a single dir, aufs provides > readdir-in-userspace feature. In other words, the memory allocation is > done in user-space instead of kernel-space. > Try enabling CONFIG_AUFS_RDU, rebuild your kernel, and in userspace set > LD_PRELOAD and LIBAU environment variables. See aufs manual and > aufs-util manual for details. > > > J. R. Okajima Thanks - have just read this in the manpage - I don't have 'millions of files' in one directory (although there'll be over a million files in the whole volume), and the box itself has ~4GB of RAM (2GB used, which includes the kernel slab) so there shouldn't be any excuse of actually running out of memory. Anyway, I'm happy to look into readdir-in-userspace regardless - currently I use the stock Debian kernels (CONFIG_AUFS_RDU isn't part of that source which is already patched to run aufs) so I don't have to rebuild the kernel, but this should just be recompiling the aufs driver which is fine. I'll get on with that now, thanks. signature.asc Description: OpenPGP digital signature --
Re: Kernel memory allocation failures involving large aufs directories (4.2-20151102)
On 21/11/15 14:25, OmegaPhil wrote: > On 20/11/15 23:42, sf...@users.sourceforge.net wrote: >> Hello OmegaPhil, >> >> OmegaPhil: >>> For some time now (across kernel versions, aufs versions, machines, etc) >>> I have noticed kernel memory allocation failures intermittently when >>> something lists a large aufs directory (the most recent example having >>> 20,000 files in it). My aufs usage is to have multiple equally-important >> ::: >> >> For those who has such many files in a single dir, aufs provides >> readdir-in-userspace feature. In other words, the memory allocation is >> done in user-space instead of kernel-space. >> Try enabling CONFIG_AUFS_RDU, rebuild your kernel, and in userspace set >> LD_PRELOAD and LIBAU environment variables. See aufs manual and >> aufs-util manual for details. >> >> >> J. R. Okajima > > > Thanks - have just read this in the manpage - I don't have 'millions of > files' in one directory (although there'll be over a million files in > the whole volume), and the box itself has ~4GB of RAM (2GB used, which > includes the kernel slab) so there shouldn't be any excuse of actually > running out of memory. > > Anyway, I'm happy to look into readdir-in-userspace regardless - > currently I use the stock Debian kernels (CONFIG_AUFS_RDU isn't part of > that source which is already patched to run aufs) so I don't have to > rebuild the kernel, but this should just be recompiling the aufs driver > which is fine. I'll get on with that now, thanks. libau.so comes with the Debian aufs-tools package which is still in the repos, so I'm using this - currently the rsync init.d script has been edited to export the right LD_PRELOAD and LIBAU values, and I've confirmed the library has been loaded via: lsof -a -c rsync +D /usr/lib Looking at kern logs the problem happened twice so far this month and 4 times in October, so a month's test should demonstrate this working. signature.asc Description: OpenPGP digital signature --
Re: Kernel memory allocation failures involving large aufs directories (4.2-20151102)
Hello OmegaPhil, OmegaPhil: > For some time now (across kernel versions, aufs versions, machines, etc) > I have noticed kernel memory allocation failures intermittently when > something lists a large aufs directory (the most recent example having > 20,000 files in it). My aufs usage is to have multiple equally-important ::: For those who has such many files in a single dir, aufs provides readdir-in-userspace feature. In other words, the memory allocation is done in user-space instead of kernel-space. Try enabling CONFIG_AUFS_RDU, rebuild your kernel, and in userspace set LD_PRELOAD and LIBAU environment variables. See aufs manual and aufs-util manual for details. J. R. Okajima --