http://bugzilla.kernel.org/show_bug.cgi?id=14235
Summary: SRP initiator lockup
Product: Drivers
Version: 2.5
Platform: All
OS/Version: Linux
Tree: Mainline
Status: NEW
Severity: high
Priority: P1
Component: Infiniband/RDMA
AssignedTo: drivers_infiniband-r...@kernel-bugs.osdl.org
ReportedBy: bart.vanass...@gmail.com
Regression: No

If an SRP target processes SRP I/O slow enough, the SRP initiator locks up. This issue is 100% reproducible with the following setup:

Target:
* Kernel 2.6.30.4 with SCST patches applied and kernel debugging enabled.
* SCST r1153 with EXTRA_CFLAGS += -DCONFIG_SCST_TRACING -DCONFIG_SCST_DEBUG -g added in srpt/src/Makefile and with EXTRA_CFLAGS += -DCONFIG_SCST_TRACING added in scst/src/Makefile.
* ib_srpt loaded with kernel module parameters thread=0 and processing_delay_in_us=500.

Initiator:
* Kernel 2.6.31.1 with kernel debugging enabled.
* SRP login has been performed as follows:
  rmmod ib_srp; modprobe ib_srp; ibsrpdm -c | while read target_info; do echo "${target_info}"; echo "${target_info}" > /sys/class/infiniband_srp/srp-mlx4_0-1/add_target; done
* After SRP login succeeded the following fio command was started:
  fio --rw=rw --bs=64M --rwmixread=100 --numjobs=1 --iodepth=1 --sync=0 --direct=1 --ioengine=sync --filename=/dev/${srp_initiator_device} --name=test --loops=1000 --runtime=600 --size=2G

After a few minutes fio locked up (I/O rate dropped from 1500 MB/s to 0 MB/s) and the following kernel message started appearing periodically:

INFO: task fio:6389 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio           D 0000000000000000     0  6389   6388 0x00000000
 ffff880071dc5bd8 0000000000000046 ffff880071dc5b08 000000018107764d
 0000000000012cc0 000000000000de20 0000000000000001 ffff880070cd8000
 ffff880070cd83b0 0000000100000000 000000010001193e ffff88007fb99050
Call Trace:
 [<ffffffff812ec5e5>] ? _spin_unlock_irqrestore+0x65/0x80
 [<ffffffff812e9b37>] io_schedule+0x37/0x50
 [<ffffffff8110cff2>] __blockdev_direct_IO+0x692/0xd80
 [<ffffffff810e0357>] ? get_super+0x27/0xc0
 [<ffffffff8110b169>] blkdev_direct_IO+0x49/0x50
 [<ffffffff8110a1f0>] ? blkdev_get_blocks+0x0/0xc0
 [<ffffffff810a1799>] generic_file_aio_read+0x679/0x690
 [<ffffffff810dc35a>] ? __dentry_open+0x13a/0x340
 [<ffffffff810de091>] do_sync_read+0xf1/0x140
 [<ffffffff810775ed>] ? trace_hardirqs_on_caller+0x14d/0x1a0
 [<ffffffff810662f0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff810775ed>] ? trace_hardirqs_on_caller+0x14d/0x1a0
 [<ffffffff8107764d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff810ded28>] vfs_read+0xc8/0x180
 [<ffffffff810deed0>] sys_read+0x50/0x90
 [<ffffffff8100be6b>] system_call_fastpath+0x16/0x1b
no locks held by fio/6389.
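
When repeating the reproduction, it may help to detect the hung-task warning automatically instead of watching the console. The following is only a sketch and not part of the original report: the function name hung_tasks is hypothetical, and it assumes the warning keeps the "INFO: task NAME:PID blocked for more than N seconds." form quoted in this report. It reads kernel log text on standard input and prints the task name, PID and timeout for each warning it finds.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not from the bug report):
# extract "name pid seconds" from every hung-task warning on stdin.
# Lines that do not match the warning format are silently skipped.
hung_tasks() {
    sed -n 's/.*INFO: task \([^:]*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds.*/\1 \2 \3/p'
}
```

One possible use on the initiator while fio is running is `dmesg | hung_tasks`, which for the message in this report would print the fields fio, 6389 and 120.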