[PATCH] ipr: do not set DID_PASSTHROUGH on CHECK CONDITION

2017-04-11 Thread Mauricio Faria de Oliveira
On a dual controller setup with multipath enabled, some MEDIUM ERRORs caused both paths to be failed, thus I/O got queued/blocked since the 'queue_if_no_path' feature is enabled by default on IPR controllers. This example disabled 'queue_if_no_path' so the I/O failure is seen at the sg_dd

Re: Race to power off harming SATA SSDs

2017-04-11 Thread Henrique de Moraes Holschuh
On Tue, 11 Apr 2017, Martin Steigerwald wrote: > I do have a Crucial M500 and I do have an increase of that counter: > > martin@merkaba:~[…]/Crucial-M500> grep "^174" smartctl-a-201* > smartctl-a-2014-03-05.txt:174 Unexpect_Power_Loss_Ct 0x0032 100 100 > 000 > Old_age Always

Re: [PATCH] ipr: do not set DID_PASSTHROUGH on CHECK CONDITION

2017-04-11 Thread Mauricio Faria de Oliveira
This is the PATCH v2. Sorry for the wrong subject line. On 04/11/2017 11:46 AM, Mauricio Faria de Oliveira wrote: Signed-off-by: Mauricio Faria de Oliveira Acked-by: Brian King --- v2: - use the scsi_cmd local variable rather than

Re: [PATCH] ipr: do not set DID_PASSTHROUGH on CHECK CONDITION

2017-04-11 Thread Mauricio Faria de Oliveira
On 04/11/2017 11:34 AM, Brian King wrote: Looks good to me. Thanks for the detailed analysis. One minor nit below and you can add: Acked-by: Brian King Cool, thanks. I'll submit a v2. Since we already have a scsi_cmd local, you don't need to go back to the

Re: [PATCH] ipr: do not set DID_PASSTHROUGH on CHECK CONDITION

2017-04-11 Thread Brian King
Mauricio, Looks good to me. Thanks for the detailed analysis. One minor nit below and you can add: Acked-by: Brian King On 04/10/2017 09:28 PM, Mauricio Faria de Oliveira wrote: > diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c > index b29afafc2885..1012674d9dc5

Re: RFC: remove REQ_OP_WRITE_SAME

2017-04-11 Thread Mike Christie
On 04/10/2017 11:07 AM, Christoph Hellwig wrote: > Now that we are using REQ_OP_WRITE_ZEROES for all zeroing needs in the > kernel there is very little use left for REQ_OP_WRITE_SAME. We only > have two callers left, and both just export optional protocol features > to remote systems: DRBD and

Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-11 Thread Bart Van Assche
On Tue, 2017-04-11 at 12:09 -0400, Mike Snitzer wrote: > This has no place in dm-mq (or any blk-mq > driver). If it is needed it should be elevated to blk-mq core to > trigger blk_mq_delay_run_hw_queue() when BLK_MQ_RQ_QUEUE_BUSY is > returned from blk_mq_ops' .queue_rq. Hello Mike, If the

Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-11 Thread Mike Snitzer
On Fri, Apr 07 2017 at 2:16pm -0400, Bart Van Assche wrote: > While running the srp-test software I noticed that request > processing stalls sporadically at the beginning of a test, namely > when mkfs is run against a dm-mpath device. Every time when that > happened

Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-11 Thread Mike Snitzer
On Tue, Apr 11 2017 at 12:26pm -0400, Bart Van Assche wrote: > On Tue, 2017-04-11 at 12:09 -0400, Mike Snitzer wrote: > > This has no place in dm-mq (or any blk-mq > > driver). If it is needed it should be elevated to blk-mq core to > > trigger

Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-11 Thread Mike Snitzer
On Tue, Apr 11 2017 at 1:51pm -0400, Bart Van Assche wrote: > On Tue, 2017-04-11 at 13:47 -0400, Mike Snitzer wrote: > > Other drivers will very likely be caught out by > > this blk-mq quirk in the future. > > Hello Mike, > > Are you aware that the requirement

Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-11 Thread Bart Van Assche
On Tue, 2017-04-11 at 14:03 -0400, Mike Snitzer wrote: > Rather than working so hard to use DM code against me, your argument > should be: "blk-mq drivers X, Y and Z rerun the hw queue; this is a well > established pattern" > > I see drivers/nvme/host/fc.c:nvme_fc_start_fcp_op() does. But that

[PATCH 6/6] scsi: Implement blk_mq_ops.show_rq()

2017-04-11 Thread Bart Van Assche
Show the SCSI CDB, .eh_eflags and .result for pending SCSI commands in /sys/kernel/debug/block/*/mq/*/dispatch and */rq_list. Signed-off-by: Bart Van Assche Cc: Martin K. Petersen Cc: James Bottomley

Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-11 Thread Bart Van Assche
On Tue, 2017-04-11 at 13:47 -0400, Mike Snitzer wrote: > Other drivers will very likely be caught out by > this blk-mq quirk in the future. Hello Mike, Are you aware that the requirement that blk-mq drivers rerun the queue after having returned BLK_MQ_RQ_QUEUE_BUSY is a requirement that is

Re: problem with discard granularity in sd

2017-04-11 Thread David Buckley
(re-sending as I somehow lost the subject on my previous reply) Martin, I'm rather surprised nobody else has previously reported this as well, especially as NetApp hadn't received any reports. The only probable explanation I could think of is that EL 7 is still based on a 3.10 kernel so is too

Re: [PATCH RESEND] scsi: return correct blkprep status code in case scsi_init_io() fails.

2017-04-11 Thread Bart Van Assche
On Tue, 2017-04-11 at 09:46 +0200, Johannes Thumshirn wrote: > When instrumenting the SCSI layer to run into the > !blk_rq_nr_phys_segments(rq) case the following warning is emitted from the > block layer: > > blk_peek_request: bad return=-22 > > This happens because since commit fd3fc0b4d730

Re: problem with discard granularity in sd

2017-04-11 Thread Martin K. Petersen
David, > I am wondering if part of the issue is that in my use case, UNMAP and > WRITE SAME zeros result in very different results. With thin > provisioned LUNs, UNMAP requests result in the blocks being freed and > thus reduces the actual size of the LUN allocation on disk. If WRITE > SAME

Re: [PATCH] scsi: qla2xxx: remove some redundant pointer assignments

2017-04-11 Thread Martin K. Petersen
Colin King writes: > There are several local or function parameter pointers that are > being assigned NULL after a kfree, and these assignments have no effect > and hence can be removed. Applied to 4.12/scsi-queue. -- Martin K. Petersen Oracle Linux Engineering

Re: [PATCH] scsi: libfc: directly call ELS request handlers

2017-04-11 Thread Martin K. Petersen
Johannes Thumshirn writes: > Directly call ELS request handler functions in fc_lport_recv_els_req > instead of saving the pointer to the handler's receive function and then > later dereferencing this pointer. Applied to 4.12/scsi-queue. -- Martin K. Petersen Oracle

Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically

2017-04-11 Thread Ming Lei
On Tue, Apr 11, 2017 at 06:18:36PM +, Bart Van Assche wrote: > On Tue, 2017-04-11 at 14:03 -0400, Mike Snitzer wrote: > > Rather than working so hard to use DM code against me, your argument > > should be: "blk-mq drivers X, Y and Z rerun the hw queue; this is a well > > established pattern" >

Re: linux-next: manual merge of the scsi-mkp tree with the char-misc tree

2017-04-11 Thread Martin K. Petersen
Bart Van Assche writes: > Sorry that I had not yet noticed Logan's patch series. Should my two > patches that conflict with Logan's patch series be dropped and > reworked after Logan's patches are upstream? Obviously things break the minute you go on vacation. I'm

Re: [PATCH] scsi: aacraid: fix PCI error recovery path

2017-04-11 Thread Martin K. Petersen
"Guilherme G. Piccoli" writes: > During a PCI error recovery, if aac_check_health() is not aware that > a PCI error happened and we have an offline PCI channel, it might > trigger some errors (like NULL pointer dereference) and inhibit the > error recovery process to

Re: [PATCH 1/2] scsi: sd: Separate zeroout and discard command choices

2017-04-11 Thread Martin K. Petersen
Bart Van Assche writes: Bart, > characters then zeroing_mode_show() will truncate it. Since all > strings in the zeroing_mode[] array are short, have you considered to > use sprintf() instead? And if you do not want to use sprintf(), how > about using snprintf(buf,

Re: [PATCHv4 0/6] sanitize sg

2017-04-11 Thread Martin K. Petersen
Hannes, > the infamous syzkaller uncovered some more issues in the sg driver. > This patchset fixes those two issues (and adds a fix for yet another > potential issue; checking for a NULL dxferp when dxfer_len is not 0). > It also removes handling of the SET_FORCE_LOW_DMA ioctl, which never >

Re: [PATCH] ipr: do not set DID_PASSTHROUGH on CHECK CONDITION

2017-04-11 Thread Martin K. Petersen
Mauricio Faria de Oliveira writes: > On a dual controller setup with multipath enabled, some MEDIUM ERRORs > caused both paths to be failed, thus I/O got queued/blocked since the > 'queue_if_no_path' feature is enabled by default on IPR controllers. Applied to

Re: [PATCH 0/6] hisi_sas: v2 hw SoC bug workarounds

2017-04-11 Thread Martin K. Petersen
John, > This patchset introduces some v2 hw bug workarounds. Mostly they are > related to SATA devices, but there is also a workaround for a scenario > when internal abort command may timeout. > > The general rule for implementing workarounds was to do it in the hw > layer, as the next hw

Re: [PATCH 2/8] target: remove iblock WRITE_SAME passthrough support

2017-04-11 Thread Nicholas A. Bellinger
On Mon, 2017-04-10 at 18:08 +0200, Christoph Hellwig wrote: > Use the pscsi driver to support arbitrary command passthrough > instead. > The people who are actively using iblock_execute_write_same_direct() are doing so in the context of ESX VAAI BlockZero, together with EXTENDED_COPY and

Re: [PATCH 2/8] target: remove iblock WRITE_SAME passthrough support

2017-04-11 Thread Christoph Hellwig
Hi Nic, this patch looks fine, and I'll include it for the next post. I'll move some of the explanation in this mail into the patch, though.

[GIT PULL] target fixes for v4.11-rc7

2017-04-11 Thread Nicholas A. Bellinger
Hi Linus, Here are the outstanding target-pending bug-fixes for v4.11-rc7. Please go ahead and pull from: git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git master There has been work in a number of different areas over the last weeks, including: - Fix target-core-user

[PATCH RESEND] scsi: return correct blkprep status code in case scsi_init_io() fails.

2017-04-11 Thread Johannes Thumshirn
When instrumenting the SCSI layer to run into the !blk_rq_nr_phys_segments(rq) case the following warning is emitted from the block layer: blk_peek_request: bad return=-22 This happens because since commit fd3fc0b4d730 ('scsi: don't BUG_ON() empty DMA transfers') we return the wrong error value

[Bug 195285] qla2xxx FW immediatly crashing after target start

2017-04-11 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=195285 --- Comment #2 from Anthony (anthony.blood...@gmail.com) --- All my test and production environments are the same: on target: RAID controller (HP SmartArray / Adaptec 65xx) BCache in writeback mode on Intel SSD NVME QLE2564 in target mode on

Re: [PATCH 09/12] hpsa: separate monitor events from heartbeat worker

2017-04-11 Thread Martin Wilck
On Fri, 2017-04-07 at 15:06 -0500, Don Brace wrote: > From: Scott Teel > > create new worker thread to monitor controller events >  - detect controller events more frequently. >  - leave heartbeat check at 30 seconds. > > Reviewed-by: Scott Benesh

Re: [PATCH 07/12] hpsa: cleanup reset handler

2017-04-11 Thread Martin Wilck
On Fri, 2017-04-07 at 15:06 -0500, Don Brace wrote: >  - mark device state sooner. > > Reviewed-by: Scott Benesh > Reviewed-by: Scott Teel > Reviewed-by: Kevin Barnett > Signed-off-by: Don Brace

[BUG][next-20170410][PPC] WARNING: CPU: 22 PID: 0 at block/blk-core.c:2655 .blk_update_request+0x4f8/0x500

2017-04-11 Thread Abdul Haleem
Hi, Warning while booting next-20170410 on PowerPC. We did not see warnings with next-20170407. In the meantime I will update with the bad commit once my automated bisect run finishes. Machine type: Power7 LPAR Kernel : 4.11.0-rc6-next-20170410 Config : file attached. IPv6: ADDRCONF(NETDEV_UP):

Re: Race to power off harming SATA SSDs

2017-04-11 Thread Martin Steigerwald
Am Dienstag, 11. April 2017, 08:52:06 CEST schrieb Tejun Heo: > > Evidently, how often the SSD will lose the race depends on a platform > > and SSD combination, and also on how often the system is powered off. > > A sluggish firmware that takes its time to cut power can save the day... > > > > >