On Tue, 2018-02-27 at 15:09 +0800, chenxiang (M) wrote:
> On 2018/2/26 23:25, Bart Van Assche wrote:
> > On Mon, 2018-02-26 at 17:37 +0800, chenxiang (M) wrote:
> > > While testing kernel 4.16-rc1 I found an issue: when running IO on a SATA
> > > disk and then disabling the disk through the
> > > sysfs interface (echo 0 > /sys/class/sas_phy/phy-1:0:0/enable), IO
> > > hangs and never enters SCSI EH. The issue
> > > reproduces every time.
> > >
> > > I added some prints to the code and found that those IOs time out after
> > > 30s, and they all enter
> > > function scsi_eh_scmd_add, but only some of them enter function
> > > scsi_eh_inc_host_failed. So the host never
> > > enters SCSI EH. I suspect this is related to commit
> > > 3bd6f43f5cb371 ("scsi: core: Ensure that the
> > > SCSI error handler gets woken up"). Please have a look.
> >
> > Hello chenxiang,
> >
> > > Have you already noticed patch "[PATCH v2] Avoid that ATA error handling can
> > trigger a kernel hang or oops"? If not, can you apply that patch to your
> > kernel and verify whether it fixes this behavior? See also
> > https://www.mail-archive.com/[email protected]/msg71189.html or
> > https://patchwork.kernel.org/patch/10236213/.
>
> After applying your patch, the issue I reported seems to be solved.

Thanks for having tested that patch!

> But when I run a long test (disabling/enabling the disk while running IO) with
> this testcase, a NULL pointer dereference occurs.
> It does not seem related to the current issue, but I am not sure.
> I ran the testcase for a long time before on kernel 4.15-rc5, and it was okay.
>
> Part of the log follows; the full log is attached to this email:
>
> [ 485.716578] pc : blk_abort_request+0x14/0x68

(+Tejun)

Hello chenxiang,

Please check whether the following patch fixes the kernel crash you ran into:
https://marc.info/?l=linux-block&m=151895951207014

Thanks,

Bart.