Re: mpt3sas: sleeping function called from invalid context

2018-03-19 Thread Jaco Kroon
Hi All,

On 14/03/2018 03:29, Bart Van Assche wrote:
> (+Jaco)
Bart, thanks for adding me.
>
> On Tue, 2018-03-13 at 16:18 +0530, Suganath Prabu Subramani wrote:
>> We have root-caused the issue, and it is the same as you mentioned:
>> _scsih_get_enclosure_logicalid_chassis_slot() is called with interrupts
>> disabled, and it in turn calls _config_request(), which calls
>> mutex_lock().
>>
>> We have a patch ready along with a few other changes, and we'll be posting
>> it by tomorrow after covering BST.
Has there been any progress?  We're currently seeing our server going
down again, and we'd like to eliminate this as the cause.  Currently IO
is still flowing but some IO has started to deadlock.

Kind Regards,
Jaco


Re: mpt3sas: sleeping function called from invalid context

2018-03-13 Thread Bart Van Assche
(+Jaco)

On Tue, 2018-03-13 at 16:18 +0530, Suganath Prabu Subramani wrote:
> We have root-caused the issue, and it is the same as you mentioned:
> _scsih_get_enclosure_logicalid_chassis_slot() is called with interrupts
> disabled, and it in turn calls _config_request(), which calls
> mutex_lock().
> 
> We have a patch ready along with a few other changes, and we'll be posting
> it by tomorrow after covering BST.




Re: mpt3sas: sleeping function called from invalid context

2018-03-13 Thread Suganath Prabu Subramani
Hi Bart,

We have root-caused the issue, and it is the same as you mentioned:
_scsih_get_enclosure_logicalid_chassis_slot() is called with interrupts
disabled, and it in turn calls _config_request(), which calls
mutex_lock().

We have a patch ready along with a few other changes, and we'll be posting
it by tomorrow after covering BST.

Thanks,
Suganath Prabu S

On Mon, Mar 12, 2018 at 11:53 PM, Bart Van Assche wrote:
> Hello,
>
> For the first I/O request after boot that is sent to a disk attached to an
> mpt3sas adapter I see the below complaint appearing in the kernel log. This
> occurs at least with kernels v4.16-rc4 and v4.16-rc5.
>
> What I see in the mpt3sas source code is that
> _scsih_get_enclosure_logicalid_chassis_slot() is called with interrupts
> disabled and also that a function called by that function, namely
> _config_request(), calls mutex_lock().
>
> Can someone who is more familiar than I with the mpt3sas adapter have a look
> at this and propose a fix?
>
> Thanks,
>
> Bart.
>
> BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
> in_atomic(): 1, irqs_disabled(): 1, pid: 2389, name: kworker/u64:1
> INFO: lockdep is turned off.
> irq event stamp: 278
> hardirqs last  enabled at (277): [<32c577ec>] _raw_spin_unlock_irq+0x24/0x50
> hardirqs last disabled at (278): [<6082e2fa>] __schedule+0x120/0x1010
> softirqs last  enabled at (0): [<8c2eb285>] copy_process.part.45+0x930/0x3470
> softirqs last disabled at (0): [<  (null)>]   (null)
> Preemption disabled at:
> [<>]   (null)
> CPU: 3 PID: 2389 Comm: kworker/u64:1 Tainted: GW 4.16.0-rc5-dbg+ #1
> Workqueue: poll_mpt3sas0_statu _base_fault_reset_work [mpt3sas]
> Call Trace:
> dump_stack+0x67/0x90
> ___might_sleep+0x1da/0x2c0
> __mutex_lock+0xb9/0xbb0
> _config_request.constprop.5+0xa3/0xe70 [mpt3sas]
> mpt3sas_config_get_enclosure_pg0+0xb3/0x110 [mpt3sas]
> _scsih_get_enclosure_logicalid_chassis_slot+0xf8/0x160 [mpt3sas]
> mpt3sas_scsih_reset_handler+0x3f6/0xb30 [mpt3sas]
> mpt3sas_base_hard_reset_handler+0x49a/0x7c0 [mpt3sas]
> _base_fault_reset_work+0x1bb/0x260 [mpt3sas]
> process_one_work+0x441/0xa50
> worker_thread+0x76/0x6c0
> kthread+0x1b2/0x1d0
> ret_from_fork+0x24/0x30
>
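
The pattern discussed in this thread (a function that may sleep being
called while interrupts are disabled) can be illustrated with a small
userspace model in C. This is only a sketch under stated assumptions:
every name below (model_irq_disable(), model_might_sleep(),
model_config_request(), lookup_buggy(), lookup_fixed()) is hypothetical
and is not mpt3sas or kernel code, and the reordering in lookup_fixed()
is just one possible way to avoid the warning, not necessarily the
approach taken in the patch mentioned above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these are kernel or mpt3sas APIs. */
static int atomic_depth;      /* > 0 models "interrupts disabled" */
static int sleep_violations;  /* sleeping calls made in atomic context */

static void model_irq_disable(void)
{
    atomic_depth++;
}

static void model_irq_enable(void)
{
    assert(atomic_depth > 0);
    atomic_depth--;
}

/* Models a might_sleep()-style check: records a violation when the
 * caller is in atomic context. */
static void model_might_sleep(const char *who)
{
    if (atomic_depth > 0) {
        sleep_violations++;
        fprintf(stderr,
                "model: sleeping function called from invalid context in %s\n",
                who);
    }
}

/* Models a config request that takes a mutex and may therefore sleep. */
static int model_config_request(void)
{
    model_might_sleep("model_config_request");
    return 42; /* placeholder for enclosure data */
}

/* Buggy shape: the sleeping call is made inside the atomic section. */
static int lookup_buggy(int *cache)
{
    model_irq_disable();
    *cache = model_config_request();
    model_irq_enable();
    return *cache;
}

/* One possible fix shape: make the sleeping call first, then enter the
 * atomic section only to publish the result. */
static int lookup_fixed(int *cache)
{
    int value = model_config_request();

    model_irq_disable();
    *cache = value;
    model_irq_enable();
    return *cache;
}
```

In this model, lookup_buggy() records one violation, mirroring the
warning in the trace above, while lookup_fixed() records none because
the sleeping call happens before the atomic section is entered.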