Thank you for reporting this.

At first look I'm very surprised, in that the libiscsi error handling
seems to have always had this non-irqsafe session locking going on.

I'm looking at it, but no answers right away I'm afraid.

Thanks,
Chris

On Wed, Jan 31, 2018 at 04:42:19PM -0800, Andrei Vagin wrote:
> Hello,
> 
> Today I found this warning in a kernel log. Intially I saw this warning on
> rhel7 kernel, but then I reproduced it on the 4.15 upsteam kernel too.
> 
> [ 2528.738454] =====================================================
> [ 2528.744679] WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
> [ 2528.749891] 4.15.0-00006-g4548e6f42022-dirty #418 Not tainted
> [ 2528.754356] -----------------------------------------------------
> [ 2528.759643] kworker/0:1H/100 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
> [ 2528.764608]  (&(&session->frwd_lock)->rlock){+.-.}, at: 
> [<0000000097d8de0f>] iscsi_eh_cmd_timed_out+0x3d/0x2b0
> [ 2528.769164] 
>                and this task is already holding:
> [ 2528.771313]  (&(&q->__queue_lock)->rlock){-.-.}, at: [<00000000a105bf06>] 
> blk_timeout_work+0x22/0x120
> [ 2528.773825] which would create a new lock dependency:
> [ 2528.775216]  (&(&q->__queue_lock)->rlock){-.-.} -> 
> (&(&session->frwd_lock)->rlock){+.-.}
> [ 2528.777071] 
>                but this new dependency connects a HARDIRQ-irq-safe lock:
> [ 2528.778945]  (&(&q->__queue_lock)->rlock){-.-.}
> [ 2528.778948] 
>                ... which became HARDIRQ-irq-safe at:
> [ 2528.781204]   _raw_spin_lock_irqsave+0x3c/0x4b
> [ 2528.781966]   cfq_idle_slice_timer+0x2f/0x100
> [ 2528.782705]   __hrtimer_run_queues+0xdc/0x440
> [ 2528.783448]   hrtimer_interrupt+0xb1/0x210
> [ 2528.784149]   smp_apic_timer_interrupt+0x6d/0x260
> [ 2528.784954]   apic_timer_interrupt+0xac/0xc0
> [ 2528.785679]   native_safe_halt+0x2/0x10
> [ 2528.786280]   default_idle+0x1f/0x180
> [ 2528.786806]   do_idle+0x166/0x1e0
> [ 2528.787288]   cpu_startup_entry+0x6f/0x80
> [ 2528.787874]   start_secondary+0x186/0x1e0
> [ 2528.788448]   secondary_startup_64+0xa5/0xb0
> [ 2528.789070] 
>                to a HARDIRQ-irq-unsafe lock:
> [ 2528.789913]  (&(&session->frwd_lock)->rlock){+.-.}
> [ 2528.789915] 
>                ... which became HARDIRQ-irq-unsafe at:
> [ 2528.791548] ...
> [ 2528.791553]   _raw_spin_lock_bh+0x34/0x40
> [ 2528.792366]   iscsi_conn_setup+0x166/0x220
> [ 2528.792960]   iscsi_tcp_conn_setup+0x10/0x40
> [ 2528.793526]   iscsi_sw_tcp_conn_create+0x1b/0x160
> [ 2528.794111]   iscsi_if_rx+0xf9f/0x1580
> [ 2528.794542]   netlink_unicast+0x1f7/0x2f0
> [ 2528.795105]   netlink_sendmsg+0x2c9/0x3c0
> [ 2528.795636]   sock_sendmsg+0x30/0x40
> [ 2528.796068]   ___sys_sendmsg+0x269/0x2c0
> [ 2528.796544]   __sys_sendmsg+0x51/0x90
> [ 2528.797028]   entry_SYSCALL_64_fastpath+0x25/0x9c
> [ 2528.797595] 
>                other info that might help us debug this:
> 
> [ 2528.798868]  Possible interrupt unsafe locking scenario:
> 
> [ 2528.799923]        CPU0                    CPU1
> [ 2528.801121]        ----                    ----
> [ 2528.801769]   lock(&(&session->frwd_lock)->rlock);
> [ 2528.802509]                                local_irq_disable();
> [ 2528.803450]                                
> lock(&(&q->__queue_lock)->rlock);
> [ 2528.804599]                                
> lock(&(&session->frwd_lock)->rlock);
> [ 2528.806565]   <Interrupt>
> [ 2528.807321]     lock(&(&q->__queue_lock)->rlock);
> [ 2528.808699] 
>                 *** DEADLOCK ***
> 
> [ 2528.810112] 3 locks held by kworker/0:1H/100:
> [ 2528.811200]  #0:  ((wq_completion)"kblockd"){+.+.}, at: 
> [<00000000bab60c8d>] process_one_work+0x1a1/0x640
> [ 2528.812832]  #1:  ((work_completion)(&q->timeout_work)#2){+.+.}, at: 
> [<00000000bab60c8d>] process_one_work+0x1a1/0x640
> [ 2528.814187]  #2:  (&(&q->__queue_lock)->rlock){-.-.}, at: 
> [<00000000a105bf06>] blk_timeout_work+0x22/0x120
> [ 2528.815396] 
>                the dependencies between HARDIRQ-irq-safe lock and the holding 
> lock:
> [ 2528.816514] -> (&(&q->__queue_lock)->rlock){-.-.} ops: 5426 {
> [ 2528.817357]    IN-HARDIRQ-W at:
> [ 2528.817903]                     _raw_spin_lock_irqsave+0x3c/0x4b
> [ 2528.818792]                     cfq_idle_slice_timer+0x2f/0x100
> [ 2528.820365]                     __hrtimer_run_queues+0xdc/0x440
> [ 2528.821161]                     hrtimer_interrupt+0xb1/0x210
> [ 2528.821882]                     smp_apic_timer_interrupt+0x6d/0x260
> [ 2528.822690]                     apic_timer_interrupt+0xac/0xc0
> [ 2528.823436]                     native_safe_halt+0x2/0x10
> [ 2528.824120]                     default_idle+0x1f/0x180
> [ 2528.824800]                     do_idle+0x166/0x1e0
> [ 2528.825427]                     cpu_startup_entry+0x6f/0x80
> [ 2528.826162]                     start_secondary+0x186/0x1e0
> [ 2528.826870]                     secondary_startup_64+0xa5/0xb0
> [ 2528.827603]    IN-SOFTIRQ-W at:
> [ 2528.828015]                     _raw_spin_lock_irqsave+0x3c/0x4b
> [ 2528.828780]                     scsi_end_request+0xea/0x1e0
> [ 2528.829530]                     scsi_io_completion+0x22f/0x610
> [ 2528.830300]                     blk_done_softirq+0xa4/0xd0
> [ 2528.831047]                     __do_softirq+0xc5/0x4a4
> [ 2528.831728]                     irq_exit+0xf1/0x100
> [ 2528.832358]                     do_IRQ+0x89/0x100
> [ 2528.832997]                     ret_from_intr+0x0/0x22
> [ 2528.833671]                     native_safe_halt+0x2/0x10
> [ 2528.834375]                     default_idle+0x1f/0x180
> [ 2528.835045]                     do_idle+0x166/0x1e0
> [ 2528.835686]                     cpu_startup_entry+0x6f/0x80
> [ 2528.836419]                     start_kernel+0x44a/0x46a
> [ 2528.837121]                     secondary_startup_64+0xa5/0xb0
> [ 2528.838296]    INITIAL USE at:
> [ 2528.839025]                    _raw_spin_lock_irq+0x31/0x40
> [ 2528.840332]                    blkcg_init_queue+0x85/0x1a0
> [ 2528.841301]                    blk_alloc_queue_node+0x2dd/0x340
> [ 2528.842254]                    blk_mq_init_queue+0x1b/0x60
> [ 2528.843727]                    loop_add+0x122/0x310
> [ 2528.844386]                    loop_init+0x12d/0x167
> [ 2528.845060]                    do_one_initcall+0x4b/0x18c
> [ 2528.845750]                    kernel_init_freeable+0x1fb/0x296
> [ 2528.846526]                    kernel_init+0xa/0x100
> [ 2528.847226]                    ret_from_fork+0x3a/0x50
> [ 2528.847886]  }
> [ 2528.848109]  ... key      at: [<000000006e6fba44>] __key.54024+0x0/0x8
> [ 2528.848934]  ... acquired at:
> [ 2528.849316]    _raw_spin_lock+0x25/0x30
> [ 2528.849804]    iscsi_eh_cmd_timed_out+0x3d/0x2b0
> [ 2528.850421]    scsi_times_out+0x6e/0x200
> [ 2528.850961]    blk_rq_timed_out+0x1b/0x50
> [ 2528.851501]    blk_timeout_work+0xd4/0x120
> [ 2528.852030]    process_one_work+0x21d/0x640
> [ 2528.852581]    worker_thread+0x2e/0x380
> [ 2528.853086]    kthread+0x111/0x130
> [ 2528.853538]    ret_from_fork+0x3a/0x50
> 
> [ 2528.854283] 
>                the dependencies between the lock to be acquired
> [ 2528.854284]  and HARDIRQ-irq-unsafe lock:
> [ 2528.855708] -> (&(&session->frwd_lock)->rlock){+.-.} ops: 2734 {
> [ 2528.856473]    HARDIRQ-ON-W at:
> [ 2528.856884]                     _raw_spin_lock_bh+0x34/0x40
> [ 2528.857588]                     iscsi_conn_setup+0x166/0x220
> [ 2528.858309]                     iscsi_tcp_conn_setup+0x10/0x40
> [ 2528.859049]                     iscsi_sw_tcp_conn_create+0x1b/0x160
> [ 2528.859837]                     iscsi_if_rx+0xf9f/0x1580
> [ 2528.860516]                     netlink_unicast+0x1f7/0x2f0
> [ 2528.861223]                     netlink_sendmsg+0x2c9/0x3c0
> [ 2528.861931]                     sock_sendmsg+0x30/0x40
> [ 2528.862581]                     ___sys_sendmsg+0x269/0x2c0
> [ 2528.863278]                     __sys_sendmsg+0x51/0x90
> [ 2528.863923]                     entry_SYSCALL_64_fastpath+0x25/0x9c
> [ 2528.864684]    IN-SOFTIRQ-W at:
> [ 2528.865078]                     _raw_spin_lock+0x25/0x30
> [ 2528.865725]                     iscsi_check_transport_timeouts+0x21/0x1a0
> [ 2528.866557]                     call_timer_fn+0x96/0x300
> [ 2528.867211]                     run_timer_softirq+0x450/0x560
> [ 2528.867918]                     __do_softirq+0xc5/0x4a4
> [ 2528.868557]                     irq_exit+0xf1/0x100
> [ 2528.869159]                     smp_apic_timer_interrupt+0x8e/0x260
> [ 2528.869924]                     apic_timer_interrupt+0xac/0xc0
> [ 2528.870645]                     native_safe_halt+0x2/0x10
> [ 2528.871366]                     default_idle+0x1f/0x180
> [ 2528.872043]                     do_idle+0x166/0x1e0
> [ 2528.872700]                     cpu_startup_entry+0x6f/0x80
> [ 2528.873418]                     start_secondary+0x186/0x1e0
> [ 2528.874127]                     secondary_startup_64+0xa5/0xb0
> [ 2528.874833]    INITIAL USE at:
> [ 2528.875219]                    _raw_spin_lock_bh+0x34/0x40
> [ 2528.875930]                    iscsi_conn_setup+0x166/0x220
> [ 2528.876607]                    iscsi_tcp_conn_setup+0x10/0x40
> [ 2528.877310]                    iscsi_sw_tcp_conn_create+0x1b/0x160
> [ 2528.878066]                    iscsi_if_rx+0xf9f/0x1580
> [ 2528.878702]                    netlink_unicast+0x1f7/0x2f0
> [ 2528.879374]                    netlink_sendmsg+0x2c9/0x3c0
> [ 2528.880047]                    sock_sendmsg+0x30/0x40
> [ 2528.880662]                    ___sys_sendmsg+0x269/0x2c0
> [ 2528.881323]                    __sys_sendmsg+0x51/0x90
> [ 2528.881951]                    entry_SYSCALL_64_fastpath+0x25/0x9c
> [ 2528.882698]  }
> [ 2528.882916]  ... key      at: [<00000000926a7aad>] __key.69241+0x0/0x8
> [ 2528.883707]  ... acquired at:
> [ 2528.884299]    _raw_spin_lock+0x25/0x30
> [ 2528.885178]    iscsi_eh_cmd_timed_out+0x3d/0x2b0
> [ 2528.886286]    scsi_times_out+0x6e/0x200
> [ 2528.887127]    blk_rq_timed_out+0x1b/0x50
> [ 2528.887756]    blk_timeout_work+0xd4/0x120
> [ 2528.888413]    process_one_work+0x21d/0x640
> [ 2528.889147]    worker_thread+0x2e/0x380
> [ 2528.890235]    kthread+0x111/0x130
> [ 2528.890674]    ret_from_fork+0x3a/0x50
> 
> [ 2528.891405] 
>                stack backtrace:
> [ 2528.892009] CPU: 0 PID: 100 Comm: kworker/0:1H Not tainted 
> 4.15.0-00006-g4548e6f42022-dirty #418
> [ 2528.893148] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> 1.10.2-1.fc26 04/01/2014
> [ 2528.894245] Workqueue: kblockd blk_timeout_work
> [ 2528.894840] Call Trace:
> [ 2528.895191]  dump_stack+0x85/0xc5
> [ 2528.895625]  check_usage+0x4b3/0x4c0
> [ 2528.896098]  ? __update_load_avg_se.isra.26+0x1b6/0x1c0
> [ 2528.896756]  ? __lock_acquire+0x1062/0x1310
> [ 2528.897272]  __lock_acquire+0x1062/0x1310
> [ 2528.897764]  ? ring_buffer_lock_reserve+0x10a/0x390
> [ 2528.898367]  ? scsi_times_out+0x6e/0x200
> [ 2528.898871]  ? lock_acquire+0x9f/0x1e0
> [ 2528.899348]  lock_acquire+0x9f/0x1e0
> [ 2528.899803]  ? iscsi_eh_cmd_timed_out+0x3d/0x2b0
> [ 2528.900399]  _raw_spin_lock+0x25/0x30
> [ 2528.900908]  ? iscsi_eh_cmd_timed_out+0x3d/0x2b0
> [ 2528.901494]  iscsi_eh_cmd_timed_out+0x3d/0x2b0
> [ 2528.902059]  scsi_times_out+0x6e/0x200
> [ 2528.902539]  blk_rq_timed_out+0x1b/0x50
> [ 2528.903016]  blk_timeout_work+0xd4/0x120
> [ 2528.903516]  process_one_work+0x21d/0x640
> [ 2528.904033]  worker_thread+0x2e/0x380
> [ 2528.904487]  ? process_one_work+0x640/0x640
> [ 2528.905025]  kthread+0x111/0x130
> [ 2528.905441]  ? kthread_create_worker_on_cpu+0x70/0x70
> [ 2528.906090]  ret_from_fork+0x3a/0x50
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "open-iscsi" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to open-iscsi+unsubscr...@googlegroups.com.
> To post to this group, send email to open-iscsi@googlegroups.com.
> Visit this group at https://groups.google.com/group/open-iscsi.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to open-iscsi+unsubscr...@googlegroups.com.
To post to this group, send email to open-iscsi@googlegroups.com.
Visit this group at https://groups.google.com/group/open-iscsi.
For more options, visit https://groups.google.com/d/optout.

Reply via email to