scsi_eh_scmd_add() currently fails only if no
error handler thread is started (which will never be the
case) or if the state machine encounters an illegal transition.
But if we're encountering an invalid state transition
chances are we cannot fix things up with the error handler.
So better add
If sd_eh_action() decides to take the device offline there is
no point in returning FAILED, as taking the device offline
is the ultimate step in SCSI EH anyway.
So further escalation via SCSI EH is not likely to make a
difference and we can as well return SUCCESS.
Cc: Benjamin Block
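A minimal userspace sketch of the decision described above, not the actual sd.c code. The struct, its field layout and demo_eh_action() are hypothetical stand-ins; only the SUCCESS/FAILED disposition values are restated from the SCSI headers.

```c
#include <assert.h>
#include <stddef.h>

/* Disposition codes, restated here so the sketch builds outside the kernel. */
#define SUCCESS 0x2002
#define FAILED  0x2003

/* Hypothetical stand-in for the per-disk state the sd driver keeps. */
struct demo_disk {
	unsigned int medium_access_timed_out;    /* timeouts counted so far */
	unsigned int max_medium_access_timeouts; /* threshold for offlining */
	int offline;                             /* 1 once taken offline */
};

/*
 * Model of the EH action: when the timeout threshold is reached the disk
 * is taken offline and SUCCESS is returned, because escalating further
 * cannot do more than offlining already did.
 */
static int demo_eh_action(struct demo_disk *d, int timed_out, int eh_disp)
{
	assert(d != NULL);

	if (d->offline || !timed_out)
		return eh_disp;

	if (++d->medium_access_timed_out >= d->max_medium_access_timeouts) {
		d->offline = 1;
		return SUCCESS;
	}
	return eh_disp;
}
```

Below the threshold the incoming disposition is passed through unchanged; only the offlining path overrides it.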
Hi all,
this is a resend of a small patchset for cleaning up SCSI EH.
Primary goal is to make asynchronous aborts mandatory; there hasn't
been a single report so far where asynchronous abort won't work, so
the 'no_async_abort' flag has never been used and will be removed
with this patchset.
If a failed command is retried and fails again we need
to enter SCSI EH, otherwise we will never be able to
recover the command.
To detect this situation we must not clear scmd->eh_eflags
when EH finishes but rather make it persistent throughout
the lifetime of the command.
Signed-off-by: Hannes
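A rough sketch of why the flag has to survive EH, assuming a single "abort already scheduled" bit. The names demo_cmd, DEMO_EH_ABORT_SCHEDULED, demo_on_failure() and demo_eh_finish() are made up for illustration and do not mirror the real scsi_error.c code.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_EH_ABORT_SCHEDULED 0x0002 /* hypothetical flag bit */

/* Hypothetical command with per-lifetime EH flags. */
struct demo_cmd {
	unsigned int eh_eflags;
};

enum demo_action { DEMO_SEND_ABORT, DEMO_ENTER_EH };

/* First failure: try an abort. Failure after an abort: escalate to EH. */
static enum demo_action demo_on_failure(struct demo_cmd *cmd)
{
	assert(cmd != NULL);

	if (cmd->eh_eflags & DEMO_EH_ABORT_SCHEDULED)
		return DEMO_ENTER_EH;

	cmd->eh_eflags |= DEMO_EH_ABORT_SCHEDULED;
	return DEMO_SEND_ABORT;
}

/* Finishing recovery deliberately leaves eh_eflags untouched. */
static void demo_eh_finish(struct demo_cmd *cmd)
{
	assert(cmd != NULL);
	/* No "cmd->eh_eflags = 0" here: the flag persists for the retry. */
}
```

If demo_eh_finish() cleared the flags, the retried command would look like a first-time failure and would be aborted again instead of entering EH.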
The current medium access timeout counter will be increased for
each command, so if there are enough failed commands we'll hit
the medium access timeout for even a single device failure and
the following kernel message is displayed:
sd H:C:T:L: [sdXY] Medium access timeout failure. Offlining
There haven't been any reports for HBAs where asynchronous abort
would not work, so we should make it mandatory and remove
the fallback.
Signed-off-by: Hannes Reinecke
Reviewed-by: Johannes Thumshirn
Reviewed-by: Bart Van Assche
The block layer always calls the timeout function from a workqueue
context, so there is no need to have yet another workqueue for
running command aborts.
Signed-off-by: Hannes Reinecke
---
drivers/scsi/scsi.c | 2 --
drivers/scsi/scsi_error.c | 83
When a command has timed out we always should be sending an
abort; with the previous code a failed abort might signal
SCSI EH to start, and all other timed out commands will
never be aborted, even though they might belong to a
different ITL nexus.
Cc: Benjamin Block
From: Christoph Hellwig
We now first try to call ->eh_abort_handler from a work queue, but libsas
was always failing that for no good reason. Allow async aborts.
Reviewed-by: Johannes Thumshirn
Reviewed-by: Hannes Reinecke
Signed-off-by: Christoph
On Thu, Apr 06, 2017 at 09:58:58AM +0200, Christoph Hellwig wrote:
> Ok, the version below simply skips the function split entirely:
>
> ---
> From 7c9ca58f1d8cf53b42f14a51e02d0f3d0f12ab45 Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig
> Date: Thu, 12 Jan 2017 11:17:29 +0100
>
On 05/04/17 11:33 PM, Sagi Grimberg wrote:
>
>>> Note that the nvme completion queues are still on the host memory, so
>>> this means we have lost the ordering between data and completions as
>>> they go to different pcie targets.
>>
>> Hmm, in this simple up/down case with a switch, I think it
Mauricio,
> The commit 08024885a2a3 ("ses: Add power_status to SES device slot")
> introduced the 'power_status' attribute to enclosure components and
> the associated callbacks.
Applied to 4.12/scsi-queue, thanks!
--
Martin K. Petersen Oracle Linux Engineering
Martin Wilck writes:
Martin,
> I noticed that the following commits
>
> eb94588dabec scsi: hpsa: fix volume offline state
> 2ef288498087 scsi: hpsa: do not timeout reset operations
> 87b9e6aa87d9 scsi: hpsa: limit outstanding rescans
> 85b29008d8af scsi: hpsa: update check for
Hannes Reinecke writes:
Hannes,
> this is a resend of a small patchset for cleaning up SCSI EH. Primary
> goal is to make asynchronous aborts mandatory; there hasn't been a
> single report so far where asynchronous abort won't work, so the
> 'no_async_abort' flag has never been
We want our own clearly defined error field for NVMe passthrough commands,
and the request errors field is going away in its current form.
Just store the status and result field in the nvme_request field from
hardirq completion context (using a new helper) and then generate a
Linux errno for the
This driver was added in 2008, but as far as I can tell we never had a
single platform that actually registered resources for the platform driver.
It's also been unmaintained for a long time and apparently has an ATA mode
that can be driven using the IDE/libata subsystem.
Signed-off-by:
Currently the request structure has an errors field that is used in
various different ways. The oldest drivers use it as an error count,
blk-mq and the generic timeout code assume that it holds a Linux
errno for block completions, and various drivers use it for internal
status values, often
Signed-off-by: Christoph Hellwig
---
drivers/block/virtio_blk.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index eaf99022bdc6..dbc4e80680b1 100644
--- a/drivers/block/virtio_blk.c
+++
Signed-off-by: Christoph Hellwig
---
drivers/block/floppy.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index ce102ec47ef2..60d4c7653178 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@
This is for the legacy floppy and ataflop drivers that currently abuse
->errors for this purpose. It's stashed away in a union to not grow
the struct size; the other fields are either used by modern drivers
for different purposes or by the I/O scheduler before queueing the I/O
to drivers.
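To illustrate the union trick, here is a small sketch with a made-up layout. The struct names and the sched_data/error_count members are hypothetical and are not the real struct request fields; the point is only that members which are never live at the same time can share storage.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: the legacy counter shares storage with another field. */
struct demo_request {
	unsigned long flags;
	union {
		void *sched_data; /* used by the I/O scheduler before dispatch */
		int error_count;  /* used only by legacy drivers after dispatch */
	};
};

/* The same layout without the extra counter, for size comparison. */
struct demo_request_plain {
	unsigned long flags;
	void *sched_data;
};

/* Adding the counter through the union costs no extra bytes. */
static_assert(sizeof(struct demo_request) == sizeof(struct demo_request_plain),
	      "union member must not grow the struct");

/* Legacy-driver style use: bump and return the per-request error count. */
static int demo_count_error(struct demo_request *rq)
{
	assert(rq != NULL);
	return ++rq->error_count;
}
```

This is only safe as long as the two union members are never needed at the same time, which is the property the message relies on.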
This passes on the scsi_cmnd result field to users of passthrough
requests. Currently we abuse req->errors for this purpose, but that
field will go away in its current form.
Note that the old IDE code abuses the errors field in very creative
ways and stores all kinds of different values in it.
dm never uses rq->errors, so there is no need to pass an error argument
to blk_mq_complete_request.
Signed-off-by: Christoph Hellwig
---
drivers/md/dm-rq.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index
We'll get all proper errors reported through ->end_io and ->errors will
go away soon.
Signed-off-by: Christoph Hellwig
---
drivers/md/dm-mpath.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index
nvme_complete_async_event expects the little endian status code
including the phase bit, and a new completion handler I plan to
introduce will do so as well.
Change the status variable into the little endian format with the
phase bit used in the NVMe CQE to fix / enable this.
Signed-off-by:
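For reference, a small sketch of how the 16-bit CQE status halfword splits into phase bit and status field once converted from little endian. The helper names are invented for this example; the kernel uses its own le16_to_cpu() and does this inline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a little-endian 16-bit value from raw bytes on any host. */
static uint16_t demo_le16_to_cpu(const uint8_t *raw)
{
	assert(raw != NULL);
	return (uint16_t)(raw[0] | (raw[1] << 8));
}

/* Bit 0 of the status halfword is the phase tag. */
static unsigned int demo_cqe_phase(uint16_t status)
{
	return status & 1u;
}

/* The remaining 15 bits carry the status field itself. */
static unsigned int demo_cqe_status(uint16_t status)
{
	return status >> 1;
}
```

Keeping the variable in wire format means the shift and the endian conversion happen in one place, at the point where the value is interpreted.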
The function only returns -EIO if rq->errors is non-zero, which is not
very useful and lets a large number of callers ignore the return value.
Just let the callers figure out their error themselves.
Signed-off-by: Christoph Hellwig
---
block/blk-exec.c | 8 +---
Currently error is always 0 for non-passthrough requests when reaching the
scsi_noretry_cmd check in scsi_io_completion, which effectively disables
all fastfail logic. Fix this by having a single call to
__scsi_error_from_host_byte at the beginning of the function and always
having a valid error
In truth I've just audited which blk-mq drivers don't currently have a
complete callback, but I think this change is at least borderline useful.
Signed-off-by: Christoph Hellwig
---
drivers/block/loop.c | 30 ++
drivers/block/loop.h | 1 +
2 files
Signed-off-by: Christoph Hellwig
---
drivers/block/null_blk.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/block/null_blk.c b/drivers/block/null_blk.c
index f93906ff31e8..24ca85a70fd8 100644
--- a/drivers/block/null_blk.c
+++ b/drivers/block/null_blk.c
Remove passing req->errors (which at that point is always 0) to
blk_mq_complete_requestq, and rely on the virtio status code for the
serial number passthrough request.
Signed-off-by: Christoph Hellwig
---
drivers/block/virtio_blk.c | 10 +++---
1 file changed, 3 insertions(+),
Currently it's used by the lightnvm passthrough ioctl, but we'd like to make
it private in preparation for block layer specific error codes. Lightnvm already
returns the real NVMe status anyway, so I think we can just limit it to
returning -EIO for any status set.
This will need a careful audit from
On Thu, Apr 06, 2017 at 08:33:38AM +0300, Sagi Grimberg wrote:
>
> >>Note that the nvme completion queues are still on the host memory, so
> >>this means we have lost the ordering between data and completions as
> >>they go to different pcie targets.
> >
> >Hmm, in this simple up/down case with a
Hey Sagi,
On 05/04/17 11:47 PM, Sagi Grimberg wrote:
> Because the user can get it wrong, and its our job to do what we can in
> order to prevent the user from screwing itself.
Well, "screwing" themselves seems a bit strong. It wouldn't be much
different from a lot of other tunables in the
Bart Van Assche writes:
> Now that all scsi_device_get() callers check the return value of this
> function, make checking that return value mandatory.
Applied to 4.12/scsi-queue.
--
Martin K. Petersen Oracle Linux Engineering
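A userspace sketch of what "mandatory" means here, assuming GCC or Clang. demo_must_check spells out the warn_unused_result attribute that the kernel's __must_check expands to; demo_device, demo_device_get() and demo_open() are hypothetical and only model a get that can fail.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace spelling of the kernel's __must_check annotation. */
#define demo_must_check __attribute__((warn_unused_result))

/* Hypothetical device with a removal flag and a reference count. */
struct demo_device {
	int deleted;  /* 1 once removal has started */
	int refcount;
};

/* Take a reference, or fail with -ENXIO if the device is going away. */
static int demo_must_check demo_device_get(struct demo_device *dev)
{
	assert(dev != NULL);

	if (dev->deleted)
		return -ENXIO;
	dev->refcount++;
	return 0;
}

/* A caller that checks the result instead of assuming success. */
static int demo_open(struct demo_device *dev)
{
	int ret = demo_device_get(dev);

	if (ret)
		return ret;
	/* ... use the device; the reference is dropped elsewhere ... */
	return 0;
}
```

With the attribute in place, a bare "demo_device_get(dev);" statement makes the compiler warn about the ignored result.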
Bart Van Assche writes:
Bart,
> scsi_device_get() can fail. Hence check its return value.
Applied to 4.12/scsi-queue.
--
Martin K. Petersen Oracle Linux Engineering
Bart Van Assche writes:
>> We previously made sure that the reported disk capacity was less than
>> 0x blocks when the kernel was not compiled with large sector_t
>> support (CONFIG_LBDAF). However, this check assumed that the capacity
>> was reported in units
xen-blkfront is the last user of rq->errors for passing errors back to
blk-mq, and I'd like to get rid of that. In the longer run the driver
should be moving more of the completion processing into .complete, but
this is the minimal change to move forward for now.
Signed-off-by: Christoph
Instead of using req->errors, which will go away.
Signed-off-by: Christoph Hellwig
---
drivers/block/mtip32xx/mtip32xx.c | 16 +---
drivers/block/mtip32xx/mtip32xx.h | 1 +
2 files changed, 10 insertions(+), 7 deletions(-)
diff --git
Add a nbd-specific field instead.
Signed-off-by: Christoph Hellwig
---
drivers/block/nbd.c | 28 ++--
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 03ae72985c79..4f045fab9659 100644
---
Signed-off-by: Christoph Hellwig
---
drivers/block/ataflop.c | 12 +++-
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/block/ataflop.c b/drivers/block/ataflop.c
index 2104b1b4ccda..fa69ecd52cb5 100644
--- a/drivers/block/ataflop.c
+++
Now that we always have a ->complete callback we can remove the direct
call to blk_mq_end_request, as well as the error argument to
blk_mq_complete_request.
Signed-off-by: Christoph Hellwig
---
block/blk-mq.c| 14 +++---
drivers/block/loop.c
Signed-off-by: Christoph Hellwig
---
drivers/block/swim3.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/block/swim3.c b/drivers/block/swim3.c
index 61b3ffa4f458..ba4809c9bdba 100644
--- a/drivers/block/swim3.c
+++ b/drivers/block/swim3.c
@@ -343,8
Signed-off-by: Christoph Hellwig
---
include/trace/events/block.h | 44 ++--
kernel/trace/blktrace.c | 9 -
2 files changed, 10 insertions(+), 43 deletions(-)
diff --git a/include/trace/events/block.h
Signed-off-by: Christoph Hellwig
---
block/blk-core.c | 14 +-
block/blk-exec.c | 3 +--
block/blk-mq.c | 10 +++---
block/blk-timeout.c | 1 -
include/linux/blkdev.h | 2 --
include/trace/events/block.h | 17
The driver never sets req->errors
---
drivers/block/paride/pd.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/block/paride/pd.c b/drivers/block/paride/pd.c
index 82c6d02193ae..3b0ab214fe74 100644
--- a/drivers/block/paride/pd.c
+++ b/drivers/block/paride/pd.c
@@
Merge blk_mq_ipi_complete_request and blk_mq_stat_add into their only
caller.
Signed-off-by: Christoph Hellwig
---
block/blk-mq.c | 21 ++---
1 file changed, 6 insertions(+), 15 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index
Christoph Hellwig writes:
> Ok, the version below simply skips the function split entirely:
Applied to 4.12/scsi-queue.
--
Martin K. Petersen Oracle Linux Engineering
Nicholas Mc Guire writes:
> The redundant init_completion() here seems to be a cut error as
> struct scsi_qla_host only has 4 completion elements to initialize,
> thus the duplicate init_completion(disable_acb_comp) is simply
> removed.
Applied to 4.12/scsi-queue.
--
Martin
David Buckley writes:
David,
> As I mentioned previously, I'm fairly certain that the issue I'm
> seeing is due to the fact that while NetApp LUNs are presented as 512B
> logical/4K physical disks for compatibility, they actually don't
> support requests smaller than 4K
On Thu, Apr 06, 2017 at 05:39:21PM +0200, Christoph Hellwig wrote:
> The function only returns -EIO if rq->errors is non-zero, which is not
> very useful and lets a large number of callers ignore the return value.
>
> Just let the callers figure out their error themselves.
>
> Signed-off-by:
On Thu, Apr 06, 2017 at 05:39:22PM +0200, Christoph Hellwig wrote:
> nvme_complete_async_event expects the little endian status code
> including the phase bit, and a new completion handler I plan to
> introduce will do so as well.
>
> Change the status variable into the little endian format with
When calling min_not_zero, both arguments should have the same type.
Otherwise the compiler will raise a warning:
CC drivers/scsi/sd.o
In file included from ./include/linux/list.h:8:0,
from ./include/linux/module.h:9,
from drivers/scsi/sd.c:35:
On Thu, Apr 06, 2017 at 05:39:19PM +0200, Christoph Hellwig wrote:
> Currently the request structure has an errors field that is used in
> various different ways. The oldest drivers use it as an error count,
> blk-mq and the generic timeout code assume that it holds a Linux
> errno for block
On Thu, Apr 06, 2017 at 05:39:23PM +0200, Christoph Hellwig wrote:
> We want our own clearly defined error field for NVMe passthrough commands,
> and the request errors field is going away in its current form.
>
> Just store the status and result field in the nvme_request field from
> hardirq
On Thu, Apr 06, 2017 at 05:39:25PM +0200, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig
> ---
Fair enough,
Reviewed-by: Johannes Thumshirn
--
Johannes Thumshirn Storage
jthumsh...@suse.de
On Thu, Apr 06, 2017 at 05:39:26PM +0200, Christoph Hellwig wrote:
> Remove passing req->errors (which at that point is always 0) to
> blk_mq_complete_requestq, and rely on the virtio status code for the
blk_mq_complete_request ^
> serial number passthrough request.
>
> Signed-off-by: Christoph
Martin,
I'm rather surprised nobody else has previously reported this as well,
especially as NetApp hadn't received any reports. The only probable
explanation I could think of is that EL 7 is still based on a 3.10
kernel so is too old to be affected, and that is likely to be what
most NetApp
> On Apr 6, 2017, at 11:39 AM, Christoph Hellwig wrote:
>
> Add a nbd-specific field instead.
>
> Signed-off-by: Christoph Hellwig
This is fine with me, you can add,
Reviewed-by: Josef Bacik
Thanks,
Josef
On Thu, 6 Apr 2017, 4:19am, Colin King wrote:
> From: Colin Ian King
>
> There are several local or function parameter pointers that are
> being assigned NULL after a kfree where and these have no effect
> and hence can be removed.
>
> Fixes various cppcheck
> From: Guilherme G. Piccoli [mailto:gpicc...@linux.vnet.ibm.com]
> Sent: Thursday, April 06, 2017 3:12 PM
> To: dl-esc-Aacraid Linux Driver
> Cc: gpicc...@linux.vnet.ibm.com; linux-scsi@vger.kernel.org; Raghava Aditya
> Renukunta
>
During a PCI error recovery, if aac_check_health() is not aware that
a PCI error happened and we have an offline PCI channel, it might
trigger some errors (like NULL pointer dereference) and prevent the
error recovery process from completing.
This patch makes the health check procedure aware of PCI
On Wed, Apr 05, 2017 at 01:30:31PM +0200, Michal Hocko wrote:
> On Wed 05-04-17 09:46:59, Vlastimil Babka wrote:
> > We now have memalloc_noreclaim_{save,restore} helpers for robust setting and
> > clearing of PF_MEMALLOC. Let's convert the code which was using the generic
> > tsk_restore_flags().
Some compilers don't like BLK_DEF_MAX_SECTORS being an enum (int) when
expanding min_not_zero. Cast it to sector_t so it matches the type of
the other operand, logical_to_sectors().
Signed-off-by: Fam Zheng
---
drivers/scsi/sd.c | 2 +-
1 file changed, 1 insertion(+), 1
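A simplified userspace sketch of the type check behind that warning. It assumes GCC or Clang, since it uses statement expressions and __typeof__. The demo_ macros are cut-down stand-ins for the kernel's min() and min_not_zero(), and the enum value is illustrative only.

```c
#include <assert.h>

typedef unsigned long long sector_t;  /* stand-in for the kernel type */

enum { DEMO_DEF_MAX_SECTORS = 2560 }; /* illustrative value only */

/* The enum constant has type int, which is narrower than sector_t here. */
static_assert(sizeof(sector_t) > sizeof(int), "operand types differ");

/* The pointer comparison warns when the two operand types differ. */
#define demo_min(x, y) ({                                   \
	__typeof__(x) _min1 = (x);                          \
	__typeof__(y) _min2 = (y);                          \
	(void)(&_min1 == &_min2);                           \
	_min1 < _min2 ? _min1 : _min2; })

#define demo_min_not_zero(x, y) ({                          \
	__typeof__(x) __x = (x);                            \
	__typeof__(y) __y = (y);                            \
	__x == 0 ? __y : ((__y == 0) ? __x : demo_min(__x, __y)); })

/* Casting the constant makes both operands sector_t, so no warning. */
static sector_t demo_limit(sector_t dev_max)
{
	return demo_min_not_zero((sector_t)DEMO_DEF_MAX_SECTORS, dev_max);
}
```

Dropping the (sector_t) cast in demo_limit() gives the macro an int and a sector_t, and the pointer comparison then triggers a "comparison of distinct pointer types" diagnostic.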
On 04/05/2017 07:21 PM, Christoph Hellwig wrote:
> From: "Martin K. Petersen"
>
> Now that zeroout and discards are distinct operations we need to
> separate the policy of choosing the appropriate command. Create a
> zeroing_mode which can be one of:
>
> write:
On 05/04/17 14:39, Vlastimil Babka wrote:
> On 04/05/2017 01:36 PM, Richard Weinberger wrote:
>> Michal,
>>
>> Am 05.04.2017 um 13:31 schrieb Michal Hocko:
>>> On Wed 05-04-17 09:47:00, Vlastimil Babka wrote:
Nandsim has its own functions set_memalloc() and clear_memalloc() for robust
On Thu, Apr 06, 2017 at 12:30:43AM +, Bart Van Assche wrote:
> On Thu, 2017-04-06 at 08:27 +0800, kbuild test robot wrote:
> > All warnings (new ones prefixed by >>):
> >
> >drivers//scsi/osd/osd_uld.c: In function 'osd_probe':
> > > > drivers//scsi/osd/osd_uld.c:467:2: warning: ignoring
On Wed, Apr 05, 2017 at 09:52:50AM -0700, Bart Van Assche wrote:
> Now that all scsi_device_get() callers check the return value of this
> function, make checking that return value mandatory.
>
> Signed-off-by: Bart Van Assche
> Cc: Hannes Reinecke
>
On 04/05/2017 07:21 PM, Christoph Hellwig wrote:
> From: "Martin K. Petersen"
>
> Separating discards and zeroout operations allows us to remove the LBPRZ
> block zeroing constraints from discards and honor the device preferences
> for UNMAP commands.
>
> If
Ok, the version below simply skips the function split entirely:
---
From 7c9ca58f1d8cf53b42f14a51e02d0f3d0f12ab45 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig
Date: Thu, 12 Jan 2017 11:17:29 +0100
Subject: csiostor: switch to pci_alloc_irq_vectors
And get automatic MSI-X
Hi Hannes,
Thanks for taking a crack at the issue. My comments below..
On Tue, 4 Apr 2017, 5:07am, Hannes Reinecke wrote:
> Most legacy HBAs have a tagset per HBA, not per queue. To map
> these devices onto block-mq this patch implements a new tagset
> flag BLK_MQ_F_GLOBAL_TAGS, which will
On Thu 06-04-17 09:33:44, Adrian Hunter wrote:
> On 05/04/17 14:39, Vlastimil Babka wrote:
> > On 04/05/2017 01:36 PM, Richard Weinberger wrote:
> >> Michal,
> >>
> >> Am 05.04.2017 um 13:31 schrieb Michal Hocko:
> >>> On Wed 05-04-17 09:47:00, Vlastimil Babka wrote:
> Nandsim has own
On Thu, Apr 06, 2017 at 04:35:56PM +0800, 廖亨权 wrote:
> Hi, Guys,
> I want to ask if there is any plan to plant the NVMe driver to
> Vxworks OS? Thank you so much.
---end quoted text---
The Linux NVMe team has no plans for a Vxworks NVMe driver at the moment.
From: Colin Ian King
There are several local or function parameter pointers that are
being assigned NULL after a kfree; these assignments have no effect
and hence can be removed.
Fixes various cppcheck warnings:
"Assignment of function parameter has no effect outside the
On Thu, Apr 06, 2017 at 08:38:10AM +0200, Wouter Verhelst wrote:
> On Wed, Apr 05, 2017 at 01:30:31PM +0200, Michal Hocko wrote:
> > On Wed 05-04-17 09:46:59, Vlastimil Babka wrote:
> > > We now have memalloc_noreclaim_{save,restore} helpers for robust setting
> > > and
> > > clearing of
On 04/06/2017 08:27 AM, Arun Easi wrote:
> Hi Hannes,
>
> Thanks for taking a crack at the issue. My comments below..
>
> On Tue, 4 Apr 2017, 5:07am, Hannes Reinecke wrote:
>
>> Most legacy HBAs have a tagset per HBA, not per queue. To map
>> these devices onto block-mq this patch implements a
Hi Martin,
I noticed that the following commits
eb94588dabec scsi: hpsa: fix volume offline state
2ef288498087 scsi: hpsa: do not timeout reset operations
87b9e6aa87d9 scsi: hpsa: limit outstanding rescans
85b29008d8af scsi: hpsa: update check for logical volume status
are included in
On Wed, Apr 05, 2017 at 07:18:08PM +0200, Christoph Hellwig wrote:
> ->retries is counting the number of times a command is resubmitted, and
> be cleared on the first time we see the command. We currently don't do
> that for non-PCIe command, which is easily fixed by moving the setup
> to common