Test blacklisted for older kernels

** Changed in: ubuntu-kernel-tests
       Status: New => Fix Released

** Changed in: ubuntu-kernel-tests
     Assignee: (unassigned) => Po-Hsu Lin (cypressyew)

You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs

  fanotify07 in LTP syscall test generates kernel trace with T/X kernel

Status in ubuntu-kernel-tests:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Trusty:
  Won't Fix
Status in linux source package in Xenial:
  Won't Fix

Bug description:
  BugLink: https://bugs.launchpad.net/bugs/1775165


  When userspace tasks which are processing fanotify permission events act 
  incorrectly, the fsnotify_mark_srcu SRCU is held indefinitely which causes
  the whole notification subsystem to hang. 

  This has been seen in production, and it can also be seen when running the 
  Linux Test Project testsuite, specifically fanotify07. 


  Instead of holding the SRCU lock while waiting for userspace to respond,
  which may never happen or may not happen in the order we expect, we drop the
  fsnotify_mark_srcu SRCU lock before waiting for the userspace response, and
  then reacquire it when userspace responds.

  The fixes are from a series of upstream commits:

  05f0e38724e8449184acd8fbf0473ee5a07adc6c (cherry-pick)
  9385a84d7e1f658bb2d96ab798393e4b16268aaa (backport)
  abc77577a669f424c5d0c185b9994f2621c52aa4 (backport)

  The following upstream commits are prerequisites for the fixes:

  35e481761cdc688dbee0ef552a13f49af8eba6cc (backport)
  0918f1c309b86301605650c836ddd2021d311ae2 (cherry-pick)


  You can reproduce the problem pretty quickly with the Linux Test Project
  testsuite.

  Steps (with root):
    1. sudo apt-get install git xfsprogs -y
    2. git clone --depth=1 https://github.com/linux-test-project/ltp.git
    3. cd ltp
    4. make autotools
    5. ./configure
    6. make; make install
    7. cd /opt/ltp
    8. echo -e "fanotify07 fanotify07 \nfanotify08 fanotify08" > /tmp/jobs
    9. ./runltp -f /tmp/jobs
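
  Step 8 relies on "echo -e", which not every shell supports. A minimal
  sketch of a more portable way to build the same command file, assuming the
  usual "tag command" line format that runltp reads; make_jobs is a
  hypothetical helper name, not part of LTP:

```shell
# Print an LTP command file on stdout: one "tag command" pair per line.
# Here the tag and the command are the same word, as in step 8.
make_jobs() {
    for t in "$@"; do
        printf '%s %s\n' "$t" "$t"
    done
}
```

  Usage: make_jobs fanotify07 fanotify08 > /tmp/jobs
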
  On a stock Xenial kernel, the system will hang, and the testcase output will
  look like this:

  tag=fanotify07 stime=1554326200
  cmdline="fanotify07 "
  tst_test.c:1096: INFO: Timeout per run is 0h 05m 00s
  Test timeouted, sending SIGKILL!
  Test timeouted, sending SIGKILL!
  Test timeouted, sending SIGKILL!
  Test timeouted, sending SIGKILL!
  Test timeouted, sending SIGKILL!
  Test timeouted, sending SIGKILL!
  Test timeouted, sending SIGKILL!
  Test timeouted, sending SIGKILL!
  Test timeouted, sending SIGKILL!
  Test timeouted, sending SIGKILL!
  Test timeouted, sending SIGKILL!
  Cannot kill test processes!
  Congratulation, likely test hit a kernel bug.
  Exitting uncleanly...
  duration=350 termination_type=exited termination_id=1 corefile=no
  cutime=0 cstime=0
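
  When the output is saved to a file, the hang can be spotted without reading
  the whole log. A small sketch that looks for the "Cannot kill test
  processes!" line shown above; ltp_hang_detected is a hypothetical helper
  name and the log path in the usage line is an assumption:

```shell
# Succeed (exit 0) if the LTP output on stdin shows that the harness
# could not kill a timed-out test, which is what the hang looks like.
ltp_hang_detected() {
    grep -q 'Cannot kill test processes!'
}
```

  Usage: ltp_hang_detected < /tmp/ltp-output.log && echo "hang detected"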

  Looking at dmesg, we see the following call stack:

  [  790.772792] LTP: starting fanotify07 (fanotify07 )
  [  960.140455] INFO: task fsnotify_mark:36 blocked for more than 120 seconds.
  [  960.140867]       Not tainted 4.4.0-142-generic #168-Ubuntu
  [  960.141185] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [  960.141498] fsnotify_mark   D ffff8800b6703c98     0    36      2 
  [  960.141516]  ffff8800b6703c98 ffff88013a558a00 ffff8800b7797000 
  [  960.141524]  ffff8800b6704000 7fffffffffffffff ffff8800b6703de0 
  [  960.141528]  0000000000000000 ffff8800b6703cb0 ffffffff8185cb45 
  [  960.141532] Call Trace:
  [  960.141580]  [<ffffffff8185cb45>] schedule+0x35/0x80
  [  960.141588]  [<ffffffff818600f4>] schedule_timeout+0x1b4/0x270
  [  960.141617]  [<ffffffff810f57ac>] ? mod_timer+0x10c/0x240
  [  960.141621]  [<ffffffff8185c60d>] ? __schedule+0x30d/0x810
  [  960.141625]  [<ffffffff8185d652>] wait_for_completion+0xb2/0x190
  [  960.141636]  [<ffffffff810b1f10>] ? wake_up_q+0x70/0x70
  [  960.141641]  [<ffffffff810eb140>] __synchronize_srcu+0x100/0x1a0
  [  960.141645]  [<ffffffff810ea400>] ? 
  [  960.141664]  [<ffffffff81260870>] ? fsnotify_put_mark+0x40/0x40
  [  960.141669]  [<ffffffff810eb204>] synchronize_srcu+0x24/0x30
  [  960.141672]  [<ffffffff812608f4>] fsnotify_mark_destroy+0x84/0x130
  [  960.141680]  [<ffffffff810ca000>] ? wake_atomic_t_function+0x60/0x60
  [  960.141691]  [<ffffffff810a6227>] kthread+0xe7/0x100
  [  960.141694]  [<ffffffff8185c601>] ? __schedule+0x301/0x810
  [  960.141699]  [<ffffffff810a6140>] ? kthread_create_on_node+0x1e0/0x1e0
  [  960.141703]  [<ffffffff818618e5>] ret_from_fork+0x55/0x80
  [  960.141706]  [<ffffffff810a6140>] ? kthread_create_on_node+0x1e0/0x1e0

  The vanilla 4.4 kernel also shows the same call stack.

  On a patched kernel, the test will pass successfully, and there will be no
  messages in dmesg. 
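
  To check dmesg for this kind of report after a run, a minimal sketch that
  pulls the task name and pid out of each hung-task line. It assumes the
  message format shown in the trace above; hung_tasks is a hypothetical
  helper name:

```shell
# Read kernel log text on stdin and print "name:pid" for every task
# reported by the hung-task detector. Prints nothing if there are none.
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\) blocked for more than [0-9]* seconds\..*/\1/p'
}
```

  Usage: dmesg | hung_tasks
  Empty output is what a patched kernel is expected to give after fanotify07.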

  [Regression Potential]

  This modifies how locking is performed in fsnotify / fanotify, so there
  may be some cause for regression. Running all fanotify Linux Test Project
  tests shows that the patches cause no extra failures, and fewer failures
  are seen due to the bugfix.

  The entire Linux Test Project testsuite now runs to completion, which does
  not happen on an unpatched kernel since it hangs on the fanotify07 test.

  The patches are taken from upstream, and all necessary commits have been taken
  into account, so I am happy with the potential risks and that testing has been

To manage notifications about this bug go to:

Mailing list: https://launchpad.net/~group.of.nepali.translators
Post to     : group.of.nepali.translators@lists.launchpad.net
Unsubscribe : https://launchpad.net/~group.of.nepali.translators
More help   : https://help.launchpad.net/ListHelp
