Re: mpt3sas: sleeping function called from invalid context

2018-03-19 Thread Jaco Kroon
Hi All,

On 14/03/2018 03:29, Bart Van Assche wrote:
> (+Jaco)
Bart, thanks for adding me.
>
> On Tue, 2018-03-13 at 16:18 +0530, Suganath Prabu Subramani wrote:
>> We have root-caused the issue, and it is the same as you mentioned.
>> _scsih_get_enclosure_logicalid_chassis_slot() is called with interrupts
>> disabled, and that function in turn calls _config_request(), which
>> takes a mutex via mutex_lock().
>>
>> We have a patch ready, along with a few other changes, and we'll be
>> posting it by tomorrow after covering BST.
Has there been any progress?  We're currently seeing our server going
down again, and we'd like to eliminate this as the cause.  Currently IO
is still flowing but some IO has started to deadlock.

Kind Regards,
Jaco


Re: mpt3sas: sleeping function called from invalid context

2018-03-13 Thread Bart Van Assche
(+Jaco)

On Tue, 2018-03-13 at 16:18 +0530, Suganath Prabu Subramani wrote:
> We have root-caused the issue, and it is the same as you mentioned.
> _scsih_get_enclosure_logicalid_chassis_slot() is called with interrupts
> disabled, and that function in turn calls _config_request(), which
> takes a mutex via mutex_lock().
>
> We have a patch ready, along with a few other changes, and we'll be
> posting it by tomorrow after covering BST.




Re: mpt3sas: sleeping function called from invalid context

2018-03-13 Thread Suganath Prabu Subramani
Hi Bart,

We have root-caused the issue, and it is the same as you mentioned.
_scsih_get_enclosure_logicalid_chassis_slot() is called with interrupts
disabled, and that function in turn calls _config_request(), which
takes a mutex via mutex_lock().

We have a patch ready, along with a few other changes, and we'll be
posting it by tomorrow after covering BST.
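
The root cause described above can be illustrated with a small userspace
model in C. This is only a sketch under stated assumptions: every name in
it (model_lock_irqsave, model_config_read, refresh_under_lock,
refresh_prefetched, and so on) is hypothetical, nothing here is mpt3sas or
kernel API, and it does not claim to be the patch mentioned in this
message. It shows one common way to restructure such code: do the
operation that may sleep before entering the atomic section, then only
copy the results while the lock is held.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are mpt3sas or kernel APIs. */
#define MODEL_MAX_DEVS 8

static int atomic_depth; /* > 0 models "spinlock held, interrupts disabled" */

static void model_lock_irqsave(void)
{
    atomic_depth++;
}

static void model_unlock_irqrestore(void)
{
    assert(atomic_depth > 0);
    atomic_depth--;
}

struct model_dev {
    unsigned int handle;
    unsigned long long logical_id;
    int responding;
};

/*
 * Models a config read that takes a mutex and may therefore sleep.
 * Returns -1 (and writes nothing) when called in atomic context.
 */
static int model_config_read(unsigned int handle, unsigned long long *logical_id)
{
    if (atomic_depth > 0)
        return -1;
    *logical_id = 0x5000000000000000ULL + handle; /* made-up value */
    return 0;
}

/* Problem pattern: the sleeping read is issued while the lock is held. */
static int refresh_under_lock(struct model_dev *devs, size_t n)
{
    int violations = 0;
    size_t i;

    model_lock_irqsave();
    for (i = 0; i < n; i++) {
        devs[i].responding = 1;
        if (model_config_read(devs[i].handle, &devs[i].logical_id) != 0)
            violations++;
    }
    model_unlock_irqrestore();
    return violations;
}

/* One possible restructure: read first, then apply under the lock. */
static int refresh_prefetched(struct model_dev *devs, size_t n)
{
    unsigned long long ids[MODEL_MAX_DEVS];
    size_t i;

    if (n > MODEL_MAX_DEVS)
        return -1;

    /* Phase 1: sleeping reads, no lock held. */
    for (i = 0; i < n; i++)
        if (model_config_read(devs[i].handle, &ids[i]) != 0)
            return -1;

    /* Phase 2: only non-sleeping updates inside the atomic section. */
    model_lock_irqsave();
    for (i = 0; i < n; i++) {
        devs[i].responding = 1;
        devs[i].logical_id = ids[i];
    }
    model_unlock_irqrestore();
    return 0;
}
```

In this model refresh_under_lock() reports one violation per device, while
refresh_prefetched() reports none. A real driver would also have to cope
with the device list changing between the two phases, which this sketch
ignores.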

Thanks,
Suganath Prabu S

On Mon, Mar 12, 2018 at 11:53 PM, Bart Van Assche wrote:
> Hello,
>
> For the first I/O request after boot that is sent to a disk attached to an
> mpt3sas adapter I see the below complaint appearing in the kernel log. This
> occurs at least with kernels v4.16-rc4 and v4.16-rc5.
>
> What I see in the mpt3sas source code is that
> _scsih_get_enclosure_logicalid_chassis_slot() is called with interrupts
> disabled and also that a function called by that function, namely
> _config_request(), calls mutex_lock().
>
> Can someone who is more familiar than I with the mpt3sas adapter have a look
> at this and propose a fix?
>
> Thanks,
>
> Bart.
>
> BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
> in_atomic(): 1, irqs_disabled(): 1, pid: 2389, name: kworker/u64:1
> INFO: lockdep is turned off.
> irq event stamp: 278
> hardirqs last  enabled at (277): [<32c577ec>] _raw_spin_unlock_irq+0x24/0x50
> hardirqs last disabled at (278): [<6082e2fa>] __schedule+0x120/0x1010
> softirqs last  enabled at (0): [<8c2eb285>] copy_process.part.45+0x930/0x3470
> softirqs last disabled at (0): [<  (null)>]   (null)
> Preemption disabled at:
> [<>]   (null)
> CPU: 3 PID: 2389 Comm: kworker/u64:1 Tainted: G        W        4.16.0-rc5-dbg+ #1
> Workqueue: poll_mpt3sas0_statu _base_fault_reset_work [mpt3sas]
> Call Trace:
> dump_stack+0x67/0x90
> ___might_sleep+0x1da/0x2c0
> __mutex_lock+0xb9/0xbb0
> _config_request.constprop.5+0xa3/0xe70 [mpt3sas]
> mpt3sas_config_get_enclosure_pg0+0xb3/0x110 [mpt3sas]
> _scsih_get_enclosure_logicalid_chassis_slot+0xf8/0x160 [mpt3sas]
> mpt3sas_scsih_reset_handler+0x3f6/0xb30 [mpt3sas]
> mpt3sas_base_hard_reset_handler+0x49a/0x7c0 [mpt3sas]
> _base_fault_reset_work+0x1bb/0x260 [mpt3sas]
> process_one_work+0x441/0xa50
> worker_thread+0x76/0x6c0
> kthread+0x1b2/0x1d0
> ret_from_fork+0x24/0x30
>


mpt3sas: sleeping function called from invalid context

2018-03-12 Thread Bart Van Assche
Hello,

For the first I/O request after boot that is sent to a disk attached to an
mpt3sas adapter I see the below complaint appearing in the kernel log. This
occurs at least with kernels v4.16-rc4 and v4.16-rc5.

What I see in the mpt3sas source code is that
_scsih_get_enclosure_logicalid_chassis_slot() is called with interrupts
disabled and also that a function called by that function, namely
_config_request(), calls mutex_lock().

Can someone who is more familiar than I with the mpt3sas adapter have a look
at this and propose a fix?

Thanks,

Bart.

BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
in_atomic(): 1, irqs_disabled(): 1, pid: 2389, name: kworker/u64:1
INFO: lockdep is turned off.
irq event stamp: 278
hardirqs last  enabled at (277): [<32c577ec>] _raw_spin_unlock_irq+0x24/0x50
hardirqs last disabled at (278): [<6082e2fa>] __schedule+0x120/0x1010
softirqs last  enabled at (0): [<8c2eb285>] copy_process.part.45+0x930/0x3470
softirqs last disabled at (0): [<  (null)>]   (null)
Preemption disabled at:
[<>]   (null)
CPU: 3 PID: 2389 Comm: kworker/u64:1 Tainted: G        W        4.16.0-rc5-dbg+ #1
Workqueue: poll_mpt3sas0_statu _base_fault_reset_work [mpt3sas]
Call Trace:
dump_stack+0x67/0x90
___might_sleep+0x1da/0x2c0
__mutex_lock+0xb9/0xbb0
_config_request.constprop.5+0xa3/0xe70 [mpt3sas]
mpt3sas_config_get_enclosure_pg0+0xb3/0x110 [mpt3sas]
_scsih_get_enclosure_logicalid_chassis_slot+0xf8/0x160 [mpt3sas]
mpt3sas_scsih_reset_handler+0x3f6/0xb30 [mpt3sas]
mpt3sas_base_hard_reset_handler+0x49a/0x7c0 [mpt3sas]
_base_fault_reset_work+0x1bb/0x260 [mpt3sas]
process_one_work+0x441/0xa50
worker_thread+0x76/0x6c0
kthread+0x1b2/0x1d0
ret_from_fork+0x24/0x30
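
The first two lines of the report above can be read as the result of a
context check made before a lock that may sleep is taken. The following
is a minimal userspace sketch in C of that idea, not the kernel's
implementation: struct ctx_model, may_sleep() and model_mutex_lock() are
hypothetical names introduced for illustration, and the printed text
merely imitates the format seen in the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct ctx_model {
    int  preempt_depth; /* > 0 models in_atomic() == 1 */
    bool irqs_off;      /* true models irqs_disabled() == 1 */
};

/* A sleeping call is only legal when neither condition holds. */
static bool may_sleep(const struct ctx_model *c)
{
    assert(c->preempt_depth >= 0);
    return c->preempt_depth == 0 && !c->irqs_off;
}

/*
 * Models taking a lock that may sleep. Instead of blocking, it reports
 * a violation in a format resembling the log and returns -1.
 */
static int model_mutex_lock(const struct ctx_model *c, const char *where)
{
    if (!may_sleep(c)) {
        printf("BUG: sleeping function called from invalid context at %s\n",
               where);
        printf("in_atomic(): %d, irqs_disabled(): %d\n",
               c->preempt_depth > 0, (int)c->irqs_off);
        return -1;
    }
    return 0;
}
```

With preempt_depth set to 1 and irqs_off set to true, which corresponds
to the "in_atomic(): 1, irqs_disabled(): 1" line in the trace,
model_mutex_lock() returns -1; with both cleared it returns 0.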