On 20/01/16 20:41, OmegaPhil wrote:
> On 24/11/15 12:59, OmegaPhil wrote:
>> On 24/11/15 05:27, sf...@users.sourceforge.net wrote:
>>> OmegaPhil:
>>> :::
>>>> repos, so I'm using this - currently the rsync init.d script has been
>>>> edited to export the right LD_PRELOAD and LIBAU values, and I've
>>>> confirmed the library has been loaded via:
>>>>
>>>> ============================
>>>>
>>>> lsof -a -c rsync +D /usr/lib
>>>>
>>>> ============================
>>>>
>>>> Looking at kern logs the problem happened twice so far this month and 4
>>>> times in October, so a month's test should demonstrate this working.
>>>
>>> You don't have to wait a month.
>>> You already have a large dir. Try reproducing the problem by "ls
>>> /large_dir", "rsync --dry-run ..." or something. And then confirm
>>> setting LD_PRELOAD and LIBAU solves the problem.
>>> This simple test has an effect to detect something wrong in setting
>>> LD_PRELOAD and LIBAU (hopefully).
>>> I'd suggest you to try a simple test. Why? Because I have ever made
>>> mistakes in setting LD_PRELOAD and LIBAU. :-)
>>>
>>>
>>> J. R. Okajima
>>
>>
>> Ha, that would be nice - the problem is intermittent, only in the most
>> serious case (where ls was also affected) could it be reliably repeated
>> (but that situation has long since been cleaned up, haven't had it as
>> bad since). So saying that it failed twice in a month means that the
>> rsync daily backup worked ~28 times in the same period (for reference
>> the backup will cover ~4.3 million files/directories according to
>> locate). No issues so far with the current setup.
>>
>> I'm happy to sit and wait for the real issue to crop up, if it does then
>> I can be more aggressive with a test case.
>>
>> Unless you want me to try and force the issue?
>
>
> It has now been some time since I got the kernel memory allocation
> failures, so clearly the libau hack has fixed it - thanks.
>
> In the manpage, please can you change 'If you have a directory which has
> millions of files' to say 'tens of thousands of files', and it would be
> useful to mention 'page allocation failure' somehow so that its easy for
> others to search on (the problem affects programs interacting with aufs
> resulting in that message in the syslog, its not obvious what it
> means/who is responsible etc).
>
> Ironically I now have a separate issue with rsync running in daemon
> mode, which appears to be due to using libau:
>
> ====================================================================
>
> rsync: readdir("/omega1-storage-4/." (in backups)): Invalid argument (22)
>
> ====================================================================
>
> Almost every start of an rsync operation fails with this (presumably
> reading the base directory of the rsync-shared location immediately
> fails) - commenting out the libau stuff in the '/etc/init.d/rsync'
> script gets rid of the problem, but naturally I then hit up against the
> memory issues. It just affects aufs volumes.
>
> This appears to have happened after I upgraded the kernel to v4.3.3-5,
> and aufs at the same time (v4.3-20160111) - I was running off the
> aufs-tools package from Debian (1:3.2+20130722-1.1), so I built and
> installed my own aufs-util package from the latest source, however the
> problem still occurs.
>
> rsync hasn't changed since 7.03.15 (v3.1.1-3), and only using it as a
> daemon (i.e. with the rsync protocol) does the failure trigger - rsync
> over SSH works fine.
>
> Has anyone else had such problems?

Sorry, I forgot to mention - I set up a trivial rsync daemon + aufs configuration in a Debian Testing VM, and it fails in the same way, so the problem isn't something peculiar to the normal Debian Testing server.