Eddie Horng:
> It seems all tasks are trying to lock sbinfo->si_xib_mtx, but log shows
> nobody is holding it. I also got similar problem in my codefs, but it's
> very rare, not like this case now I can almost always reproduce it. To
> isolate issue, I changed ro branch from codefs to ext4 and still can
> reproduce.
> List reproduce step as below, maybe can give you some hint:
> - The codebase is an android source package
> - make -j128 (found -j32 can also repro)
> - ctrl-c to interrupt build
> - the build system -- ninja will kill all subprocesses
> - In most case a clang++.real process will left running forever with high
>   cpu%

It looks like a livelock (rather than a deadlock).
I still cannot see the whole scenario, but I've found a suspicious, ah no,
"related" commit in linux-v4.10-rc7:

5abf186 2017-02-03 mm, fs: check for fatal signals in do_generic_file_read()

For aufs, the side effect of this commit is very similar to your problem.
It causes a livelock in aufs, and it happens when reading from aufs XINO
files.
There was a similar commit in linux-v4.3, but that was the "write to XINO"
case instead of the "read from XINO" case:

296291c 2015-10-23 mm: make sendfile(2) killable

and I fixed it by the commit

5e439ff 2016-01-05 aufs: for 4.3, XINO handles EINTR from the dying process

which made aufs4.3 retry "write to XINO" in another context after EINTR.

Hmm, if the cause of your problem is the endless loop after EINTR in
"read from XINO", then it should be solved by the same approach as in
aufs4.3. It requires a small conversion from "write to XINO" to
"read from XINO".

The conditions to reproduce the problem should be like these.
- linux-v4.10-rc7 and later
- aufs has many files (over 32K)
- the process is killed by SIGKILL

I've tried, but I could not reproduce the problem in the beginning...
I'd ask you to test this patch. If you can, please try.


J. R. Okajima
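
As a rough illustration of the idea above (retrying in another context after EINTR), here is a small user-space model in Python. It is not the aufs patch and not kernel code: `Task`, `xino_read`, `read_with_naive_retry` and `read_with_delegation` are hypothetical names invented for this sketch, and the thread pool only stands in for "another context" that has no fatal signal pending.

```python
import errno
from concurrent.futures import ThreadPoolExecutor


class Task:
    """Hypothetical stand-in for a process context."""

    def __init__(self, dying=False):
        # dying=True models "a fatal signal (SIGKILL) is pending".
        self.dying = dying


def xino_read(task, table, pos):
    """Model of a read that bails out with -EINTR for a dying caller.

    'table' is a plain list of positive inode numbers (an assumption
    made for this sketch, not the real XINO file layout).
    """
    if task.dying:
        return -errno.EINTR
    return table[pos]


def read_with_naive_retry(task, table, pos, max_spins=1000):
    """Retry in the same context. A dying task never gets past EINTR,
    so this only stops because of the artificial max_spins bound."""
    for _ in range(max_spins):
        ret = xino_read(task, table, pos)
        if ret != -errno.EINTR:
            return ret, False
    return None, True  # True: we were spinning without progress


def read_with_delegation(task, table, pos, pool):
    """On EINTR, hand the read over to another context that is not dying."""
    ret = xino_read(task, table, pos)
    if ret == -errno.EINTR:
        helper = Task(dying=False)
        ret = pool.submit(xino_read, helper, table, pos).result()
    return ret


if __name__ == "__main__":
    table = [100, 101, 102]
    killed = Task(dying=True)
    print(read_with_naive_retry(killed, table, 1, max_spins=50))
    with ThreadPoolExecutor(max_workers=1) as pool:
        print(read_with_delegation(killed, table, 1, pool))
```

The point of the sketch is only that the retry has to run somewhere the fatal signal is not pending. Retrying in the dying task itself gets the same EINTR every time.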
a.patch.bz2
Description: BZip2 compressed data