Re: Debugging scsi abort handling ?
Il 23/08/2014 16:52, Hans de Goede ha scritto: Hi All, Now that the UAS driver is no longer marked as CONFIG_BROKEN, I'm getting quite a few bug reports about issues with UAS drives. One if the issues is that there might be a number of bugs in the abort handling path, as I don't think that was ever tested properly. So I'm wondering is there a way to test the abort path with a real drive? E.G. submit some command which is known to take a significant amount of time, and then abort it right after submitting ? You could have some luck with QEMU's UAS emulation. If you set QEMU's I/O throttling options low enough (e.g. 100KB/sec), and then use dd with iflag=direct and a largish block size (a few MBs), the guest should abort its I/O soon enough. Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2] drivers: scsi: #define missing include guards
The four files aha1542.h, aha1740.h, gvp11.h and mvme147.h under drivers/scsi/ contain two-thirds of an include guard, but do not #define the macro. Add those #defines. git grep says the macro names are not defined elsewhere. Signed-off-by: Rasmus Villemoes li...@rasmusvillemoes.dk --- For good measure, here's a version with a non-broken commit message. drivers/scsi/aha1542.h | 1 + drivers/scsi/aha1740.h | 1 + drivers/scsi/gvp11.h | 1 + drivers/scsi/mvme147.h | 1 + 4 files changed, 4 insertions(+) diff --git a/drivers/scsi/aha1542.h b/drivers/scsi/aha1542.h index b871d2b..8f4e07a 100644 --- a/drivers/scsi/aha1542.h +++ b/drivers/scsi/aha1542.h @@ -1,4 +1,5 @@ #ifndef _AHA1542_H +#define _AHA1542_H /* $Id: aha1542.h,v 1.1 1992/07/24 06:27:38 root Exp root $ * diff --git a/drivers/scsi/aha1740.h b/drivers/scsi/aha1740.h index af23fd6..7ea484f 100644 --- a/drivers/scsi/aha1740.h +++ b/drivers/scsi/aha1740.h @@ -1,4 +1,5 @@ #ifndef _AHA1740_H +#define _AHA1740_H /* $Id$ * diff --git a/drivers/scsi/gvp11.h b/drivers/scsi/gvp11.h index 852913c..5aabe90 100644 --- a/drivers/scsi/gvp11.h +++ b/drivers/scsi/gvp11.h @@ -1,4 +1,5 @@ #ifndef GVP11_H +#define GVP11_H /* $Id: gvp11.h,v 1.4 1997/01/19 23:07:12 davem Exp $ * diff --git a/drivers/scsi/mvme147.h b/drivers/scsi/mvme147.h index bfd4566..479e9b4 100644 --- a/drivers/scsi/mvme147.h +++ b/drivers/scsi/mvme147.h @@ -1,4 +1,5 @@ #ifndef MVME147_H +#define MVME147_H /* $Id: mvme147.h,v 1.4 1997/01/19 23:07:10 davem Exp $ * -- 2.0.4 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Debugging scsi abort handling ?
Hi, On 08/25/2014 09:20 AM, Paolo Bonzini wrote: Il 23/08/2014 16:52, Hans de Goede ha scritto: Hi All, Now that the UAS driver is no longer marked as CONFIG_BROKEN, I'm getting quite a few bug reports about issues with UAS drives. One if the issues is that there might be a number of bugs in the abort handling path, as I don't think that was ever tested properly. So I'm wondering is there a way to test the abort path with a real drive? E.G. submit some command which is known to take a significant amount of time, and then abort it right after submitting ? You could have some luck with QEMU's UAS emulation. If you set QEMU's I/O throttling options low enough (e.g. 100KB/sec), and then use dd with iflag=direct and a largish block size (a few MBs), the guest should abort its I/O soon enough. Thanks for the suggestion, that is an interesting idea. The problem is though, that qemu's uas implementation is modeled after the kernel uas driver (including some bugs which I've ended up fixing at both ends), I want to see how real hardware deals with abort commands (e.g. does it only acknowledge the abort, or does it also sends a sense code for the actual command). Regards, Hans -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 13/17] arcmsr: fix ioctl data read/write error for adapter type C
On Fri, 2014-08-22 at 18:00 +0200, Tomas Henzl wrote: On 08/19/2014 09:17 AM, Ching Huang wrote: From: Ching Huang ching2...@areca.com.tw Rewrite ioctl entry and its relate function. This patch fix ioctl data read/write error and change data I/O access from byte to Dword. Signed-off-by: Ching Huang ching2...@areca.com.tw --- diff -uprN a/drivers/scsi/arcmsr/arcmsr_attr.c b/drivers/scsi/arcmsr/arcmsr_attr.c --- a/drivers/scsi/arcmsr/arcmsr_attr.c 2014-02-06 17:47:24.0 +0800 +++ b/drivers/scsi/arcmsr/arcmsr_attr.c 2014-04-29 17:10:42.0 +0800 @@ -70,40 +70,75 @@ static ssize_t arcmsr_sysfs_iop_message_ struct AdapterControlBlock *acb = (struct AdapterControlBlock *) host-hostdata; uint8_t *pQbuffer,*ptmpQbuffer; int32_t allxfer_len = 0; + unsigned long flags; if (!capable(CAP_SYS_ADMIN)) return -EACCES; /* do message unit read. */ ptmpQbuffer = (uint8_t *)buf; - while ((acb-rqbuf_firstindex != acb-rqbuf_lastindex) -(allxfer_len 1031)) { + spin_lock_irqsave(acb-rqbuffer_lock, flags); + if (acb-rqbuf_firstindex != acb-rqbuf_lastindex) { Hi - does this condition (acb-rqbuf_firstindex == acb-rqbuf_lastindex) mean we could just release the spinlock and return ? NO. We have to check the input buffer that may have message data come from IOP. pQbuffer = acb-rqbuffer[acb-rqbuf_firstindex]; - memcpy(ptmpQbuffer, pQbuffer, 1); - acb-rqbuf_firstindex++; - acb-rqbuf_firstindex %= ARCMSR_MAX_QBUFFER; - ptmpQbuffer++; - allxfer_len++; + if (acb-rqbuf_firstindex acb-rqbuf_lastindex) { + if ((ARCMSR_MAX_QBUFFER - acb-rqbuf_firstindex) = 1032) { + memcpy(ptmpQbuffer, pQbuffer, 1032); + acb-rqbuf_firstindex += 1032; + acb-rqbuf_firstindex %= ARCMSR_MAX_QBUFFER; + allxfer_len = 1032; + } else { + if (((ARCMSR_MAX_QBUFFER - acb-rqbuf_firstindex) + + acb-rqbuf_lastindex) 1032) { + memcpy(ptmpQbuffer, pQbuffer, + ARCMSR_MAX_QBUFFER + - acb-rqbuf_firstindex); + ptmpQbuffer += ARCMSR_MAX_QBUFFER + - acb-rqbuf_firstindex; + memcpy(ptmpQbuffer, acb-rqbuffer, 1032 + - (ARCMSR_MAX_QBUFFER - + acb-rqbuf_firstindex)); This code looks like you were copying some data from a ring buffer, in that case - shouldn't be acb-rqbuf_lastindex used instead of firstindex? Yes, there copying data from a ring buffer. firstindex and lastindex are bad name. For readability, I rename the firstindex to getIndex, lastindex to putIndex. What does the 1032 mean is that a hw. limit, actually could you explain the code should do? Maybe I'm just wrong with my assumptions. 1032 is the API data buffer limitation. Thanks, Tomas + acb-rqbuf_firstindex = 1032 - + (ARCMSR_MAX_QBUFFER - + acb-rqbuf_firstindex); + allxfer_len = 1032; + } else { + memcpy(ptmpQbuffer, pQbuffer, + ARCMSR_MAX_QBUFFER - + acb-rqbuf_firstindex); + ptmpQbuffer += ARCMSR_MAX_QBUFFER - + acb-rqbuf_firstindex; + memcpy(ptmpQbuffer, acb-rqbuffer, + acb-rqbuf_lastindex); + allxfer_len = ARCMSR_MAX_QBUFFER - + acb-rqbuf_firstindex + + acb-rqbuf_lastindex; + acb-rqbuf_firstindex = + acb-rqbuf_lastindex; + } + } + } else { + if ((acb-rqbuf_lastindex - acb-rqbuf_firstindex) 1032) { + memcpy(ptmpQbuffer, pQbuffer, 1032); + acb-rqbuf_firstindex += 1032; + allxfer_len = 1032; + } else { + memcpy(ptmpQbuffer, pQbuffer, acb-rqbuf_lastindex + - acb-rqbuf_firstindex); +
Re: [Xen-devel] [PATCH V5 3/5] Introduce xen-scsifront module
On 08/23/2014 12:25 AM, Konrad Rzeszutek Wilk wrote: --- /dev/null +++ b/drivers/scsi/xen-scsifront.c @@ -0,0 +1,1017 @@ +/* + * Xen SCSI frontend driver + * + * Copyright (c) 2008, FUJITSU Limited + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License version 2 + * as published by the Free Software Foundation; or, when distributed + * separately from the Linux kernel or incorporated into other + * software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the Software), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#define DEBUG :-) I think you don't want that in the driver. Correct. :-) + +#include linux/module.h +#include linux/kernel.h +#include linux/device.h +#include linux/wait.h +#include linux/interrupt.h +#include linux/spinlock.h +#include linux/sched.h +#include linux/blkdev.h +#include linux/pfn.h +#include linux/slab.h + +#include scsi/scsi_cmnd.h +#include scsi/scsi_device.h +#include scsi/scsi.h +#include scsi/scsi_host.h + +#include xen/xen.h +#include xen/xenbus.h +#include xen/grant_table.h +#include xen/events.h +#include xen/page.h + +#include xen/interface/grant_table.h +#include xen/interface/io/vscsiif.h +#include xen/interface/io/protocols.h + +#include asm/xen/hypervisor.h + + +#define GRANT_INVALID_REF 0 + +#define VSCSIFRONT_OP_ADD_LUN 1 +#define VSCSIFRONT_OP_DEL_LUN 2 Shouldn't those be defined in the vscsiff.h file? No, they are private for the frontend. + +#define DEFAULT_TASK_COMM_LEN TASK_COMM_LEN Not sure why you need the DEFAULT_ ? Could you use TASK_COMM_LEN? Sure. + +/* tuning point*/ Missing stop and a space after the 't': /* Tuning point. */ Okay. +#define VSCSIIF_DEFAULT_CMD_PER_LUN 10 +#define VSCSIIF_MAX_TARGET 64 +#define VSCSIIF_MAX_LUN 255 + +#define VSCSIIF_RING_SIZE __CONST_RING_SIZE(vscsiif, PAGE_SIZE) +#define VSCSIIF_MAX_REQS VSCSIIF_RING_SIZE + +#define vscsiif_grants_sg(_sg) (PFN_UP((_sg) * \ + sizeof(struct scsiif_request_segment))) + +struct vscsifrnt_shadow { + /* command between backend and frontend */ + unsigned char act; act? How about 'op' ? Or 'cmd_op'? I wanted to use the same name as the related element in vscsiif.h + uint16_t rqid; rqid? Could you have a comment explaining what that acronym is? Oh wait - request id? How about just req_id? Same again. It's called rqid in vscsiiif.h + + /* Number of pieces of scatter-gather */ + unsigned int nr_grants; s/nr_grants/nr_sg/ ? I'll update the comment, as this is really the number of grants. + struct scsiif_request_segment *sg; + + /* do reset or abort function */ Full stop missing. + wait_queue_head_t wq_reset; /* reset work queue */ + int wait_reset; /* reset work queue condition */ + int32_t rslt_reset; /* reset response status */ + /* (SUCESS or FAILED) */ Full stop missing. s/SUCESS/SUCCESS/ Okay. + + /* requested struct scsi_cmnd is stored from kernel */ Full stop missing. Okay. + struct scsi_cmnd *sc; + int gref[vscsiif_grants_sg(SG_ALL) + SG_ALL]; +}; + +struct vscsifrnt_info { + struct xenbus_device *dev; + + struct Scsi_Host *host; + int host_active; This 'host_active' seems to be a guard to call 'scsi_host_remove' through either the disconnect (so backend told us to disconnect) or the .remove (so XenStore keys are moved). Either way I think it is possible to run _both_ of them and this being just an 'int' is not sufficient. I would recommend you change this to an ref_count. Hmm, I think a lock is required here. + + spinlock_t shadow_lock; Could you move this spinlock below - to where you have tons of of 'shadow' values please? Okay. + unsigned int evtchn; + unsigned
Re: [PATCH v3 13/17] arcmsr: fix ioctl data read/write error for adapter type C
On 08/25/2014 07:59 PM, Ching Huang wrote: On Fri, 2014-08-22 at 18:00 +0200, Tomas Henzl wrote: On 08/19/2014 09:17 AM, Ching Huang wrote: From: Ching Huang ching2...@areca.com.tw Rewrite ioctl entry and its relate function. This patch fix ioctl data read/write error and change data I/O access from byte to Dword. Signed-off-by: Ching Huang ching2...@areca.com.tw --- diff -uprN a/drivers/scsi/arcmsr/arcmsr_attr.c b/drivers/scsi/arcmsr/arcmsr_attr.c --- a/drivers/scsi/arcmsr/arcmsr_attr.c 2014-02-06 17:47:24.0 +0800 +++ b/drivers/scsi/arcmsr/arcmsr_attr.c 2014-04-29 17:10:42.0 +0800 @@ -70,40 +70,75 @@ static ssize_t arcmsr_sysfs_iop_message_ struct AdapterControlBlock *acb = (struct AdapterControlBlock *) host-hostdata; uint8_t *pQbuffer,*ptmpQbuffer; int32_t allxfer_len = 0; + unsigned long flags; if (!capable(CAP_SYS_ADMIN)) return -EACCES; /* do message unit read. */ ptmpQbuffer = (uint8_t *)buf; - while ((acb-rqbuf_firstindex != acb-rqbuf_lastindex) -(allxfer_len 1031)) { + spin_lock_irqsave(acb-rqbuffer_lock, flags); + if (acb-rqbuf_firstindex != acb-rqbuf_lastindex) { Hi - does this condition (acb-rqbuf_firstindex == acb-rqbuf_lastindex) mean we could just release the spinlock and return ? NO. We have to check the input buffer that may have message data come from IOP. pQbuffer = acb-rqbuffer[acb-rqbuf_firstindex]; - memcpy(ptmpQbuffer, pQbuffer, 1); - acb-rqbuf_firstindex++; - acb-rqbuf_firstindex %= ARCMSR_MAX_QBUFFER; - ptmpQbuffer++; - allxfer_len++; + if (acb-rqbuf_firstindex acb-rqbuf_lastindex) { + if ((ARCMSR_MAX_QBUFFER - acb-rqbuf_firstindex) = 1032) { + memcpy(ptmpQbuffer, pQbuffer, 1032); + acb-rqbuf_firstindex += 1032; + acb-rqbuf_firstindex %= ARCMSR_MAX_QBUFFER; + allxfer_len = 1032; + } else { + if (((ARCMSR_MAX_QBUFFER - acb-rqbuf_firstindex) + + acb-rqbuf_lastindex) 1032) { + memcpy(ptmpQbuffer, pQbuffer, + ARCMSR_MAX_QBUFFER + - acb-rqbuf_firstindex); + ptmpQbuffer += ARCMSR_MAX_QBUFFER + - acb-rqbuf_firstindex; + memcpy(ptmpQbuffer, acb-rqbuffer, 1032 + - (ARCMSR_MAX_QBUFFER - + acb-rqbuf_firstindex)); This code looks like you were copying some data from a ring buffer, in that case - shouldn't be acb-rqbuf_lastindex used instead of firstindex? Yes, there copying data from a ring buffer. firstindex and lastindex are bad name. For readability, I rename the firstindex to getIndex, lastindex to putIndex. My comment is not about names, but in this path '(ARCMSR_MAX_QBUFFER - acb-rqbuf_firstindex)+ acb-rqbuf_lastindex) 1032)' you copy something twice and in both cases the 'firstindex' is used and never the 'lastindex'. Is this correct? What does the 1032 mean is that a hw. limit, actually could you explain the code should do? Maybe I'm just wrong with my assumptions. 1032 is the API data buffer limitation. Thanks, Tomas + acb-rqbuf_firstindex = 1032 - + (ARCMSR_MAX_QBUFFER - + acb-rqbuf_firstindex); + allxfer_len = 1032; + } else { + memcpy(ptmpQbuffer, pQbuffer, + ARCMSR_MAX_QBUFFER - + acb-rqbuf_firstindex); + ptmpQbuffer += ARCMSR_MAX_QBUFFER - + acb-rqbuf_firstindex; + memcpy(ptmpQbuffer, acb-rqbuffer, + acb-rqbuf_lastindex); + allxfer_len = ARCMSR_MAX_QBUFFER - + acb-rqbuf_firstindex + + acb-rqbuf_lastindex; + acb-rqbuf_firstindex = + acb-rqbuf_lastindex; + } + } + } else { + if ((acb-rqbuf_lastindex - acb-rqbuf_firstindex) 1032) { + memcpy(ptmpQbuffer, pQbuffer, 1032); + acb-rqbuf_firstindex += 1032; + allxfer_len = 1032; +
Re: Debugging scsi abort handling ?
On 08/25/14 10:47, Hans de Goede wrote: I want to see how real hardware deals with abort commands (e.g. does it only acknowledge the abort, or does it also sends a sense code for the actual command). The SCSI specs define whether a reply should be sent if a SCSI command has been aborted. From SAM-5: 5.3.1 Status codes [ ... ] TASK ABORTED. This status shall be returned when a command is aborted by a command or task management function on another I_T nexus and the Control mode page TAS bit is set to one (see 5.6). 5.6 Aborting commands - A command is aborted when a SCSI device condition (see 6.3), command, or task management function causes termination of the command prior to its completion by the device server. After a command is aborted and TASK ABORTED status, if any, is returned, the SCSI target device shall send no further requests or responses for that command. From SPC-4: 7.5.8 Control mode page [ ... ] A task aborted status (TAS) bit set to zero specifies that aborted commands shall be terminated by the device server without any response to the application client. A TAS bit set to one specifies that commands aborted by the actions of an I_T nexus other than the I_T nexus on which the command was received shall be completed with TASK ABORTED status (see SAM-5). Bart. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Debugging scsi abort handling ?
Il 25/08/2014 12:28, Bart Van Assche ha scritto: From SPC-4: 7.5.8 Control mode page [ ... ] A task aborted status (TAS) bit set to zero specifies that aborted commands shall be terminated by the device server without any response to the application client. A TAS bit set to one specifies that commands aborted by the actions of an I_T nexus other than the I_T nexus on which the command was received shall be completed with TASK ABORTED status (see SAM-5). Note the aborted by the actions of an I_T nexus other than the I_T nexus on which the command was received. In practice, this means that TASK ABORTED should only happen if you use the CLEAR TASK SET tmf and TST is not set to 001b (i.e. _not_ to per I_T nexus) in the Control mode page. It should never happen for a pen drive. Setting TASK ABORTED aside, the important part is that an abort can do one of two things: - complete the command, and then eh_abort should return after the driver has noticed the completion and called the -scsi_done callback for the Scsi_Cmnd*. - abort the command, and then the driver should never call the -scsi_done callback for the Scsi_Cmnd*. Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Debugging scsi abort handling ?
Hi, On 08/25/2014 01:15 PM, Paolo Bonzini wrote: Il 25/08/2014 12:28, Bart Van Assche ha scritto: From SPC-4: 7.5.8 Control mode page [ ... ] A task aborted status (TAS) bit set to zero specifies that aborted commands shall be terminated by the device server without any response to the application client. A TAS bit set to one specifies that commands aborted by the actions of an I_T nexus other than the I_T nexus on which the command was received shall be completed with TASK ABORTED status (see SAM-5). Note the aborted by the actions of an I_T nexus other than the I_T nexus on which the command was received. In practice, this means that TASK ABORTED should only happen if you use the CLEAR TASK SET tmf and TST is not set to 001b (i.e. _not_ to per I_T nexus) in the Control mode page. It should never happen for a pen drive. Setting TASK ABORTED aside, the important part is that an abort can do one of two things: - complete the command, and then eh_abort should return after the driver has noticed the completion and called the -scsi_done callback for the Scsi_Cmnd*. - abort the command, and then the driver should never call the -scsi_done callback for the Scsi_Cmnd*. Thanks Bart and Paolo, your insights into this are greatly appreciated. So with uas there are separate usb transaction for cmd, data in, data out and sense for each tag. At the time of abort, usually one of data in / data out and a sense usb transaction will be outstanding. There already is logic in the driver to kill the data in / out transactions if a sense gets returned (usually with an error) before they are done. So if I'm reading this correctly, then on a successful abort, the sense transaction (if not already completed by the target) should be cancelled as it will never complete, correct ? I wish the uas spec contained some more details on this, but it is very vague wrt task management (well it is vague in general, task management just is extra hard to test). Regards, Hans -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: scsi logging future directions, was Re: [RFC PATCH -logging 00/10] scsi/constants: Output continuous error messages on trace
On 08/24/2014 10:44 PM, Christoph Hellwig wrote: On Fri, Aug 22, 2014 at 12:39:59AM +, Elliott, Robert (Server Storage) wrote: If you trigger hundreds of errors (e.g., hot remove a device during heavy IO), then all the prints to the linux serial console bog down the system, causing timeouts in commands to other devices and soft lockups for applications. Some changes that would help are: 1. Put them under SCSI logging level control 2. Use printk_ratelimited so an excessive number are trimmed Would you like to include something like this in your patch set? I think we should come to an agreement where we want to go with scsi logging first before doing various smaller adjustments. (Although your example is one that's urgent enough that I'd like to put it in ASAP, I had issues with it a few times). I had a chat with Martin at Linuxcon about these issues, and we were both in favor of getting rid of the old scsi logging mechansisms and instead replace it by an extended version of the scsi tracepoints that cover all places, and dump all data from the old logging mechanism that people find useful. In a few places we'd still want to log normal dev_printk style errors, and the I/O completion is one of them, even if they really need to be ratelimited and condensed. If someone has arguments in favour of keeping the old logging code I'd love to hear them, but in practive the traceevent code has huge benefits: - almost zero overhead if disabled - can easily be used without any tools through configs, but can be used even better with tools like trace-cmd or perf - allows both fine and coarse grained selections of events to trace - allows to capture statistics on each trace point without event enabling the output - doesn't have any of the console lockup problems. I've already been working on updating scsi logging infrastructure, removing old cludges and streamlining it. I'm all in favour of moving things over to scsi tracing; in fact I've already moved all the current SCSI_ML_XXX statements to tracepoints in my current patchset. Unfortunately I haven't found time to test things out there, and there's the patchset from Yoshihiro which needs review and integration. As of now I've treated this as rather low priority as no-one seemed to mind and the patchsets will be touching each and every driver. I'll be updating the patchset and send it for review. Cheers, Hannes -- Dr. Hannes Reinecke zSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] bnx2fc: do not add shared skbs to the fcoe_rx_list
On Fri, 22 Aug 2014, Eddie Wai wrote: On Fri, 2014-07-25 at 10:12 +0200, Maurizio Lombardi wrote: Hi Eddie, just want to add you to the CC list. Some days ago the bnx2fc's maintainer email address has been updated, this should be the new one: qlogic-storage-upstr...@qlogic.com I tried to send this patch to the new address but I received the following delivery failure notification: Delivery has failed to these recipients or groups: dept_linux...@qlogic.commailto:dept_linux...@qlogic.com Your message can't be delivered because delivery to this address is restricted. is this still a problem? The mail reflector seems to work for me... This issue has been fixed. On 07/25/2014 10:02 AM, Maurizio Lombardi wrote: In some cases, the fcoe_rx_list may contains multiple instances of the same skb (the so called shared skbs). the bnx2fc_l2_rcv thread is a loop that extracts a skb from the list, modifies (and destroys) its content and the proceed to the next one. The problem is that if the skb is shared, the remaining instances will be corrupted. The solution is to use skb_share_check() before adding the skb to the fcoe_rx_list. [ 6286.808725] [ cut here ] [ 6286.808729] WARNING: at include/scsi/fc_frame.h:173 bnx2fc_l2_rcv_thread+0x425/0x450 [bnx2fc]() [ 6286.808748] Modules linked in: bnx2x(-) mdio dm_service_time bnx2fc cnic uio fcoe libfcoe 8021q garp stp mrp libfc llc scsi_transport_fc scsi_tgt sg iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel e1000e ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper ptp cryptd hpilo serio_raw hpwdt lpc_ich pps_core ipmi_si pcspkr mfd_core ipmi_msghandler shpchp pcc_cpufreq mperf nfsd auth_rpcgss nfs_acl lockd sunrpc dm_multipath xfs libcrc32c ata_generic pata_acpi sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit ata_piix drm_kms_helper ttm drm libata i2c_core hpsa dm_mirror dm_region_hash dm_log dm_mod [last unloaded: mdio] [ 6286.808750] CPU: 3 PID: 1304 Comm: bnx2fc_l2_threa Not tainted 3.10.0-121.el7.x86_64 #1 [ 6286.808750] Hardware name: HP ProLiant DL120 G7, BIOS J01 07/01/2013 [ 6286.808752] 0b36e715 8800deba1e00 815ec0ba [ 6286.808753] 8800deba1e38 8105dee1 a05618c0 8801e4c81888 [ 6286.808754] e8663868 8801f402b180 8801f56bc000 8800deba1e48 [ 6286.808754] Call Trace: [ 6286.808759] [815ec0ba] dump_stack+0x19/0x1b [ 6286.808762] [8105dee1] warn_slowpath_common+0x61/0x80 [ 6286.808763] [8105e00a] warn_slowpath_null+0x1a/0x20 [ 6286.808765] [a054f415] bnx2fc_l2_rcv_thread+0x425/0x450 [bnx2fc] [ 6286.808767] [a054eff0] ? bnx2fc_disable+0x90/0x90 [bnx2fc] [ 6286.808769] [81085aef] kthread+0xcf/0xe0 [ 6286.808770] [81085a20] ? kthread_create_on_node+0x140/0x140 [ 6286.808772] [815fc76c] ret_from_fork+0x7c/0xb0 [ 6286.808773] [81085a20] ? kthread_create_on_node+0x140/0x140 [ 6286.808774] ---[ end trace c6cdb939184ccb4e ]--- Signed-off-by: Maurizio Lombardi mlomb...@redhat.com --- drivers/scsi/bnx2fc/bnx2fc_fcoe.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c index 785d0d7..a190ab6 100644 --- a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c +++ b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c @@ -411,6 +411,7 @@ static int bnx2fc_rcv(struct sk_buff *skb, struct net_device *dev, struct fc_frame_header *fh; struct fcoe_rcv_info *fr; struct fcoe_percpu_s *bg; + struct sk_buff *tmp_skb; unsigned short oxid; interface = container_of(ptype, struct bnx2fc_interface, @@ -423,6 +424,12 @@ static int bnx2fc_rcv(struct sk_buff *skb, struct net_device *dev, goto err; } + tmp_skb = skb_share_check(skb, GFP_ATOMIC); + if (!tmp_skb) + goto err; + + skb = tmp_skb; + if (unlikely(eth_hdr(skb)-h_proto != htons(ETH_P_FCOE))) { printk(KERN_ERR PFX bnx2fc_rcv: Wrong FC type frame\n); goto err; Seems logical, but this patch requires some testing which ought to be verified by the Qlogic team. Thanks. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 13/17] arcmsr: fix ioctl data read/write error for adapter type C
On 08/25/2014 12:29 PM, Tomas Henzl wrote: On 08/25/2014 07:59 PM, Ching Huang wrote: On Fri, 2014-08-22 at 18:00 +0200, Tomas Henzl wrote: On 08/19/2014 09:17 AM, Ching Huang wrote: From: Ching Huang ching2...@areca.com.tw Rewrite ioctl entry and its relate function. This patch fix ioctl data read/write error and change data I/O access from byte to Dword. Signed-off-by: Ching Huang ching2...@areca.com.tw --- diff -uprN a/drivers/scsi/arcmsr/arcmsr_attr.c b/drivers/scsi/arcmsr/arcmsr_attr.c --- a/drivers/scsi/arcmsr/arcmsr_attr.c2014-02-06 17:47:24.0 +0800 +++ b/drivers/scsi/arcmsr/arcmsr_attr.c2014-04-29 17:10:42.0 +0800 @@ -70,40 +70,75 @@ static ssize_t arcmsr_sysfs_iop_message_ struct AdapterControlBlock *acb = (struct AdapterControlBlock *) host-hostdata; uint8_t *pQbuffer,*ptmpQbuffer; int32_t allxfer_len = 0; + unsigned long flags; if (!capable(CAP_SYS_ADMIN)) return -EACCES; /* do message unit read. */ ptmpQbuffer = (uint8_t *)buf; - while ((acb-rqbuf_firstindex != acb-rqbuf_lastindex) - (allxfer_len 1031)) { + spin_lock_irqsave(acb-rqbuffer_lock, flags); + if (acb-rqbuf_firstindex != acb-rqbuf_lastindex) { Hi - does this condition (acb-rqbuf_firstindex == acb-rqbuf_lastindex) mean we could just release the spinlock and return ? NO. We have to check the input buffer that may have message data come from IOP. pQbuffer = acb-rqbuffer[acb-rqbuf_firstindex]; - memcpy(ptmpQbuffer, pQbuffer, 1); - acb-rqbuf_firstindex++; - acb-rqbuf_firstindex %= ARCMSR_MAX_QBUFFER; - ptmpQbuffer++; - allxfer_len++; + if (acb-rqbuf_firstindex acb-rqbuf_lastindex) { + if ((ARCMSR_MAX_QBUFFER - acb-rqbuf_firstindex) = 1032) { + memcpy(ptmpQbuffer, pQbuffer, 1032); + acb-rqbuf_firstindex += 1032; + acb-rqbuf_firstindex %= ARCMSR_MAX_QBUFFER; + allxfer_len = 1032; + } else { + if (((ARCMSR_MAX_QBUFFER - acb-rqbuf_firstindex) + + acb-rqbuf_lastindex) 1032) { + memcpy(ptmpQbuffer, pQbuffer, + ARCMSR_MAX_QBUFFER + - acb-rqbuf_firstindex); + ptmpQbuffer += ARCMSR_MAX_QBUFFER + - acb-rqbuf_firstindex; + memcpy(ptmpQbuffer, acb-rqbuffer, 1032 + - (ARCMSR_MAX_QBUFFER - + acb-rqbuf_firstindex)); This code looks like you were copying some data from a ring buffer, in that case - shouldn't be acb-rqbuf_lastindex used instead of firstindex? Yes, there copying data from a ring buffer. firstindex and lastindex are bad name. For readability, I rename the firstindex to getIndex, lastindex to putIndex. My comment is not about names, but in this path '(ARCMSR_MAX_QBUFFER - acb-rqbuf_firstindex)+ acb-rqbuf_lastindex) 1032)' you copy something twice and in both cases the 'firstindex' is used and never the 'lastindex'. Is this correct? And when it is copying from a ring buffer, maybe it could be made in a simpler way? What do you think about this (not even compile tested, just an idea): spin_lock_irqsave(acb-rqbuffer_lock, flags); unsigned int tail = acb-rqbuf_firstindex; unsigned int head = acb-rqbuf_lastindex; unsigned int cnt_to_end = CIRC_CNT_TO_END(head, tail, ARCMSR_MAX_QBUFFER); allxfer_len = CIRC_CNT(head, tail, ARCMSR_MAX_QBUFFER); if (allxfer_len 1032) allxfer_len = 1032; if (allxfer_len = cnt_to_end) memcpy(buf, acb-rqbuffer + tail, allxfer_len); else { memcpy(buf, acb-rqbuffer + tails, cnt_to_end); memcpy(buf + cnt_to_end, acb-rqbuffer, allxfer_len - cnt_to_end); } acb-rqbuf_firstindex = (acb-rqbuf_firstindex + allxfer_len) % ARCMSR_MAX_QBUFFER; What does the 1032 mean is that a hw. limit, actually could you explain the code should do? Maybe I'm just wrong with my assumptions. 1032 is the API data buffer limitation. Thanks, Tomas + acb-rqbuf_firstindex = 1032 - + (ARCMSR_MAX_QBUFFER - + acb-rqbuf_firstindex); + allxfer_len = 1032; + } else { + memcpy(ptmpQbuffer, pQbuffer, + ARCMSR_MAX_QBUFFER - + acb-rqbuf_firstindex); + ptmpQbuffer +=
Re: AS2105-based enclosure size issues with 2TB HDDs
On Mon, 2014-08-25 at 10:58 +, Alfredo Dal Ava Junior wrote: - 1TB and 2TB: READ_CAPACITY_10 returns correct size value - 3TB and 4TB: READ_CAPACITY_10 returns size in a 2TB modulus If we fix capacity size by reporting (READ_CAPACITY_10 + MODULO_2TB), the result will be invalid when user plugs a 2TB HDD. An idea (bring by Oliver) is: It is worse than that. Pretty soon a 4.7TB disk will also be plausible. first guess reading last sector using modulus result to check if size is valid. It is necessary that a virgin disk also be handled correctly. We cannot use the partition table (besides it being a layering violation) It would propose (in pseudo code) if (read_capacity_16(device) 0) { lower_limit = read_capacity_10(device); for (i = 1; i++; i SANITY_LIMIT) { err = read_sector(device, lower_limit + i * 2TB-SAFETY); if (err == OUT_OF_RANGE) break; } if (i SANITY_LIMIT) return (i - 1) * 2TB + lower_limit; else return ERROR; Regards Oliver -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][SCSI] scsi-mq: fix requests that use a separate CDB buffer
On 08/23/2014 03:09 PM, Douglas Gilbert wrote: For inclusion in 3.17 only. May want to check if blk-mq work in lk 3.16 and earlier breaks the bsg driver's capability to send SCSI cdbs that are longer than 16 bytes. I think 3.16 and earlier are OK. The problem was caused by scsi_mq_prep_fn(), which is new in 3.17. Thanks for testing. Tony -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: AS2105-based enclosure size issues with 2TB HDDs
On Mon, 25 Aug 2014, Oliver Neukum wrote: On Mon, 2014-08-25 at 10:58 +, Alfredo Dal Ava Junior wrote: - 1TB and 2TB: READ_CAPACITY_10 returns correct size value - 3TB and 4TB: READ_CAPACITY_10 returns size in a 2TB modulus If we fix capacity size by reporting (READ_CAPACITY_10 + MODULO_2TB), the result will be invalid when user plugs a 2TB HDD. An idea (bring by Oliver) is: It is worse than that. Pretty soon a 4.7TB disk will also be plausible. first guess reading last sector using modulus result to check if size is valid. It is necessary that a virgin disk also be handled correctly. We cannot use the partition table (besides it being a layering violation) It would propose (in pseudo code) if (read_capacity_16(device) 0) { lower_limit = read_capacity_10(device); for (i = 1; i++; i SANITY_LIMIT) { err = read_sector(device, lower_limit + i * 2TB-SAFETY); if (err == OUT_OF_RANGE) break; } if (i SANITY_LIMIT) return (i - 1) * 2TB + lower_limit; else return ERROR; Don't forget that lots of disks go crazy if you try to read from a nonexistent block, that is, one beyond the end of the disk. IMO, this bug cannot be worked around in any reasonable manner. The device simply cannot handle disks larger than 2 TB. Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
scsi-mq and 3.17rc1
Two scsi-mq tips: 1. Several people have been wondering how to enable scsi-mq. Add this to your kernel command line (e.g., in /boot/grub/grub.conf if using grub-1): scsi_mod.use_blk_mq=Y One way to tell it is enabled is to notice these directories being created in sysfs: /sys/block/sda/mq 2. If you're running kernel-3.17.0rc1 and scsi-mq, you may want to pick up this patch from Tejun in Jens' linux-block tree: http://git.kernel.dk/?p=linux-block.git;a=commit;h=cddd5d17642cc6881352732693c2ae6930e9ce65 to avoid errors like this on device removal: [ 1454.516816] [ cut here ] [ 1454.518348] WARNING: CPU: 0 PID: 4157 at lib/percpu-refcount.c:179 percpu_ref_kill_and_confirm+0x69/0x80() [ 1454.521464] percpu_ref_kill() called more than once! ... [ 1454.553623] Call Trace: [ 1454.554707] [815af44f] dump_stack+0x49/0x62 [ 1454.556349] [81055f6c] warn_slowpath_common+0x8c/0xc0 [ 1454.558487] [81056056] warn_slowpath_fmt+0x46/0x50 [ 1454.560565] [812aa1b9] percpu_ref_kill_and_confirm+0x69/0x80 [ 1454.562654] [8127a753] blk_mq_freeze_queue+0x53/0xf0 [ 1454.565179] [8127ae56] blk_mq_free_queue+0x26/0x140 [ 1454.567572] [81272da8] blk_release_queue+0x88/0xe0 [ 1454.569465] [81299972] kobject_cleanup+0x82/0x1c0 [ 1454.571316] [81299abd] kobject_release+0xd/0x10 [ 1454.573289] [81299af1] kobject_put+0x31/0x70 [ 1454.574937] [8126e8d5] blk_put_queue+0x15/0x20 [ 1454.576651] [813dfaa4] scsi_device_dev_release_usercontext+0xc4/0x120 [ 1454.579026] [8106cc7d] execute_in_process_context+0x5d/0x70 [ 1454.581442] [813df9dc] scsi_device_dev_release+0x1c/0x20 [ 1454.583502] [813a46d8] device_release+0x38/0xb0 [ 1454.585483] [81299972] kobject_cleanup+0x82/0x1c0 [ 1454.587547] [81299abd] kobject_release+0xd/0x10 [ 1454.589307] [81299af1object_put+0x31/0x70 [ 1454.691222] [813a42f7] put_device+0x17/0x20 [ 1454.692843] [813ce8f5] scsi_device_put+0x45/0x60 Both Douglas Gilbert and I ran into that. --- Rob ElliottHP Server Storage -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bug 80711] [PATCH]SG_FLAG_LUN_INHIBIT is no longer implemented and there's not way to prevent the kernel from using the 2nd cdb byte for the LUN
On Sun, 24 Aug 2014, Christoph Hellwig wrote: On Fri, Aug 22, 2014 at 01:29:32PM -0400, Alan Stern wrote: Other than this, I'm fine with the code ... you can add the acked by from me when we resolve the above question. Okay. It's true that this issue is only tangentially related to the main point of the patch. It could be removed and addressed later. Just make it a separate patch and send it along.. All right. But I still want to know first whether the patch really fixes the original problem. Tiziano, do you intend to test this patch? James, can you explain how the INQUIRY command in scsi_probe_lun() managed to work back in the days when multi-lun SCSI-2 devices were common? sdev-scsi_level doesn't get set when sdev is allocated, so it initially contains 0, so the LUN bits won't get filled in when the first INQUIRY command is sent. Then how could the target know which logical unit the INQUIRY was meant for? Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 80711] [PATCH]SG_FLAG_LUN_INHIBIT is no longer implemented and there's not way to prevent the kernel from using the 2nd cdb byte for the LUN
https://bugzilla.kernel.org/show_bug.cgi?id=80711 --- Comment #18 from Alan Stern st...@rowland.harvard.edu --- On Sun, 24 Aug 2014, Christoph Hellwig wrote: On Fri, Aug 22, 2014 at 01:29:32PM -0400, Alan Stern wrote: Other than this, I'm fine with the code ... you can add the acked by from me when we resolve the above question. Okay. It's true that this issue is only tangentially related to the main point of the patch. It could be removed and addressed later. Just make it a separate patch and send it along.. All right. But I still want to know first whether the patch really fixes the original problem. Tiziano, do you intend to test this patch? James, can you explain how the INQUIRY command in scsi_probe_lun() managed to work back in the days when multi-lun SCSI-2 devices were common? sdev-scsi_level doesn't get set when sdev is allocated, so it initially contains 0, so the LUN bits won't get filled in when the first INQUIRY command is sent. Then how could the target know which logical unit the INQUIRY was meant for? Alan Stern -- You are receiving this mail because: You are the assignee for the bug. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: scsi-mq and 3.17rc1
On Mon, Aug 25, 2014 at 02:31:58PM +, Elliott, Robert (Server Storage) wrote: Two scsi-mq tips: 1. Several people have been wondering how to enable scsi-mq. Add this to your kernel command line (e.g., in /boot/grub/grub.conf if using grub-1): scsi_mod.use_blk_mq=Y Btw, I heard a few complaints about this, and one suggestion would be to add a config option to enable it by default. Would you or others like this? -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Debugging scsi abort handling ?
On Mon, 2014-08-25 at 13:39 +0200, Paolo Bonzini wrote: Il 25/08/2014 13:26, Hans de Goede ha scritto: Thanks Bart and Paolo, your insights into this are greatly appreciated. So with uas there are separate usb transaction for cmd, data in, data out and sense for each tag. At the time of abort, usually one of data in / data out and a sense usb transaction will be outstanding. There already is logic in the driver to kill the data in / out transactions if a sense gets returned (usually with an error) before they are done. So if I'm reading this correctly, then on a successful abort, the sense transaction (if not already completed by the target) should be cancelled as it will never complete, correct ? Yes. More precisely, once the response IU comes back for the abort, there should be no more IUs for that task. Figure 13 (Multiple Command Example) in the UAS spec actually shows that. At least, that's what it should be like in theory. I suspect firmware bugs will abound in this area, but at least you shouldn't be actively expecting an IU for an aborted task. Just a note on this. The abort area in all devices is where we have scope for spec compliance issues. Also in the old days abort itself could trigger a firmware crash on some devices (including arrays). The problem is actually that the system is usually groaning because it's out of memory within the controller. That actually means that sending in another task (the abort) causes greater pressure. For this reason, some device drivers choose to skip the abort step and head straight to reset. For reset, you guarantee that all outstanding tasks for the unit are killed. James -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] libsas: modify SATA error handler
Some more comments below. [..] + + pmp = sata_srst_pmp(link); + + msecs = 0; + now = jiffies; + if (time_after(deadline, now)) + msecs = jiffies_to_msecs(deadline - now); + + memset(tf, 0, sizeof(struct ata_taskfile)); + tf.ctl = ATA_SRST; + tf.device = ATA_DEVICE_OBS; + + if (sas_ata_exec_polled_cmd(ap, tf, pmp, msecs)) { + ret = -EIO; + goto fail; + } + + tf.ctl = ~ATA_SRST; + sas_ata_exec_polled_cmd(ap, tf, pmp, msecs); What about lldd's that do not know how to handle ATA_SRST? I think we need preparation patches to make SRST support opt-in, or teach all lldds how to handle these SRST sas_tasks. I think lldds have different actions to handle SRST because there is no unified standard. But it should be easy to support it. Later, I'll submit a mvsas patch to show how to support it. Right, but what about the other lldd's? Libsas needs to check whether the lldd supports SRST before attempting to submit. [..] +/* According sata 3.0 spec 13.15.4.2, enable the device port */ +static int sas_ata_pmp_hard_reset(struct ata_link *link, unsigned int *class, + unsigned long deadline) { + struct ata_port *ap = link-ap; + struct domain_device *dev = ap-private_data; + struct sas_ha_struct *sas_ha = dev-port-ha; + struct Scsi_Host *host = sas_ha-core.shost; + struct sas_internal *i = to_sas_internal(host-transportt); + int ret = -1; + u32 scontrol = 0; + + set_bit(SAS_DEV_RESET, dev-state); + + ret = sata_scr_read(link, SCR_CONTROL, scontrol); I think a comment is needed clarifying that these reads generate sas_tasks to a pmp and are not trying to read/write local SCR registers that do not exist on a SAS hba. I think the situation can't happen here. Right, that's why we need a comment, because by normally libsas lldd's do not support scr-register accesses. + if (ret) + goto error; + + /* enable device port */ + scontrol = 0x1; + ret = sata_scr_write(link, SCR_CONTROL, scontrol); + if (ret) + goto error; + + ata_msleep(ap, 1); + + scontrol = 0x0; + ret = sata_scr_write(link, SCR_CONTROL, scontrol); + if (ret) + goto error; + + ata_msleep(ap, 10); + + /* check device link status */ + if (ata_link_offline(link)) { + SAS_DPRINTK(link is offline.\n); + goto error; + } + + /* clear X bit */ + scontrol = 0x; + ret = sata_scr_write(link, SCR_ERROR, scontrol); + if (ret) + goto error; + + /* may be need to wait for device sig */ + ata_msleep(ap, 3); + + /* check device class */ + if (i-dft-lldd_dev_classify) + *class = i-dft-lldd_dev_classify(dev); + + return 0; + +error: + SAS_DPRINTK(failed to hard reset.\n); + return ret; +} + /* * notify the lldd to forget the sas_task for this internal ata command * that bypasses scsi-eh @@ -551,8 +771,12 @@ void sas_ata_end_eh(struct ata_port *ap) static struct ata_port_operations sas_sata_ops = { .prereset = ata_std_prereset, .hardreset = sas_ata_hard_reset, + .softreset = sas_ata_soft_reset, + .pmp_hardreset = sas_ata_pmp_hard_reset, + .freeze = sas_ata_freeze, + .thaw = sas_ata_thaw, .postreset = ata_std_postreset, - .error_handler = ata_std_error_handler, + .error_handler = sata_pmp_error_handler, .post_internal_cmd = sas_ata_post_internal, .qc_defer = ata_std_qc_defer, .qc_prep= ata_noop_qc_prep, diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h index ef7872c..a26466a 100644 --- a/include/scsi/libsas.h +++ b/include/scsi/libsas.h @@ -689,6 +689,12 @@ struct sas_domain_function_template { /* GPIO support */ int (*lldd_write_gpio)(struct sas_ha_struct *, u8 reg_type, u8 reg_index, u8 reg_count, u8 *write_data); + + /* ATA EH functions */ + void (*lldd_dev_freeze)(struct domain_device *); Why do we need to pass this all the way through to the lldd? Can we get away with emulating this in sas_ata.c. Because SAS HBAs spec haven't a unified standard, not like AHCI. But I think we can delete the interface because we don't disable interrupt during EH now. Ok. + void (*lldd_dev_thaw)(struct domain_device *); Same note as lldd_dev_freeze + int (*lldd_wait_task_done)(struct sas_task *); Should not be needed. + int (*lldd_dev_classify)(struct domain_device *); I think this belongs in a
[PATCH 2/5] SES: generate KOBJ_CHANGE on enclosure attach
From: Song Liu [mailto:songliubrav...@fb.com] Sent: Monday, August 25, 2014 10:26 AM To: Song Liu Cc: Dan Williams; Hannes Reinecke Subject: [PATCH 2/5] SES: generate KOBJ_CHANGE on enclosure attach From: Dan Williams dan.j.willi...@intel.com In support of a /dev/disk/by-slot populated with data from the enclosure and ses modules udev needs notification when the new interface files/links are available. Otherwise, any udev rules specified for the disk cannot assume that the enclosure topology has settled. Signed-off-by: Dan Williams dan.j.willi...@intel.com Signed-off-by: Song Liu songliubrav...@fb.com Reviewed-by: Jens Axboe ax...@fb.com Cc: Hannes Reinecke h...@suse.de --- drivers/scsi/ses.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c index c2e8a98..8f0a62a 100644 --- a/drivers/scsi/ses.c +++ b/drivers/scsi/ses.c @@ -349,7 +349,8 @@ static int ses_enclosure_find_by_addr(struct enclosure_device *edev, if (scomp-addr != efd-addr) continue; - enclosure_add_device(edev, i, efd-dev); + if (enclosure_add_device(edev, i, efd-dev) == 0) + kobject_uevent(efd-dev-kobj, KOBJ_CHANGE); return 1; } return 0; -- 1.8.1 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/5] SES: add reliable slot attribute
From: Song Liu [mailto:songliubrav...@fb.com] Sent: Monday, August 25, 2014 10:26 AM To: Song Liu Cc: Dan Williams; Hannes Reinecke Subject: [PATCH 4/5] SES: add reliable slot attribute From: Dan Williams dan.j.willi...@intel.com The name provided by firmware is in a vendor specific format, publish the slot number to have a reliable mechanism for identifying slots across firmware implementations. If the enclosure does not provide a slot number fallback to the component number which is guaranteed unique, and usually mirrors the slot number. Cleaned up the unused ses_component.desc in the process. Signed-off-by: Dan Williams dan.j.willi...@intel.com Signed-off-by: Song Liu songliubrav...@fb.com Reviewed-by: Jens Axboe ax...@fb.com Cc: Hannes Reinecke h...@suse.de --- drivers/misc/enclosure.c | 20 +++- drivers/scsi/ses.c| 17 - include/linux/enclosure.h | 1 + 3 files changed, 32 insertions(+), 6 deletions(-) diff --git a/drivers/misc/enclosure.c b/drivers/misc/enclosure.c index 646068a..de335bc 100644 --- a/drivers/misc/enclosure.c +++ b/drivers/misc/enclosure.c @@ -145,8 +145,10 @@ enclosure_register(struct device *dev, const char *name, int components, if (err) goto err; - for (i = 0; i components; i++) + for (i = 0; i components; i++) { edev-component[i].number = -1; + edev-component[i].slot = -1; + } mutex_lock(container_list_lock); list_add_tail(edev-node, container_list); @@ -552,6 +554,20 @@ static ssize_t get_component_type(struct device *cdev, return snprintf(buf, 40, %s\n, enclosure_type[ecomp-type]); } +static ssize_t get_component_slot(struct device *cdev, + struct device_attribute *attr, char *buf) { + struct enclosure_component *ecomp = to_enclosure_component(cdev); + int slot; + + /* if the enclosure does not override then use 'number' as a stand-in */ + if (ecomp-slot = 0) + slot = ecomp-slot; + else + slot = ecomp-number; + + return snprintf(buf, 40, %d\n, slot); } static DEVICE_ATTR(fault, S_IRUGO | S_IWUSR, get_component_fault, set_component_fault); @@ -562,6 +578,7 @@ static DEVICE_ATTR(active, S_IRUGO | S_IWUSR, get_component_active, static DEVICE_ATTR(locate, S_IRUGO | S_IWUSR, get_component_locate, set_component_locate); static DEVICE_ATTR(type, S_IRUGO, get_component_type, NULL); +static DEVICE_ATTR(slot, S_IRUGO, get_component_slot, NULL); static struct attribute *enclosure_component_attrs[] = { dev_attr_fault.attr, @@ -569,6 +586,7 @@ static struct attribute *enclosure_component_attrs[] = { dev_attr_active.attr, dev_attr_locate.attr, dev_attr_type.attr, + dev_attr_slot.attr, NULL }; ATTRIBUTE_GROUPS(enclosure_component); diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c index 61deb4e..bafa301 100644 --- a/drivers/scsi/ses.c +++ b/drivers/scsi/ses.c @@ -47,7 +47,6 @@ struct ses_device { struct ses_component { u64 addr; - unsigned char *desc; }; static int ses_probe(struct device *dev) @@ -307,19 +306,26 @@ static void ses_process_descriptor(struct enclosure_component *ecomp, int invalid = desc[0] 0x80; enum scsi_protocol proto = desc[0] 0x0f; u64 addr = 0; + int slot = -1; struct ses_component *scomp = ecomp-scratch; unsigned char *d; - scomp-desc = desc; - if (invalid) return; switch (proto) { + case SCSI_PROTOCOL_FCP: + if (eip) { + d = desc + 4; + slot = d[3]; + } + break; case SCSI_PROTOCOL_SAS: - if (eip) + if (eip) { + d = desc + 4; + slot = d[3]; d = desc + 8; - else + } else d = desc + 4; /* only take the phy0 addr */ addr = (u64)d[12] 56 | @@ -335,6 +341,7 @@ static void ses_process_descriptor(struct enclosure_component *ecomp, /* FIXME: Need to add more protocols than just SAS */ break; } + ecomp-slot = slot; scomp-addr = addr; } diff --git a/include/linux/enclosure.h b/include/linux/enclosure.h index 807622b..0f826c1 100644 --- a/include/linux/enclosure.h +++ b/include/linux/enclosure.h @@ -92,6 +92,7 @@ struct enclosure_component { int fault; int active; int locate; + int slot; enum enclosure_status status; }; -- 1.8.1 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 5/5] SES: Add power_status to SES enclosure component
From: Song Liu [mailto:songliubrav...@fb.com] Sent: Monday, August 25, 2014 10:26 AM To: Song Liu Cc: Hannes Reinecke Subject: [PATCH 5/5] SES: Add power_status to SES enclosure component Add power_status to SES enclosure component, so we can power on/off the HDDs behind the enclosure. Check firmware status in ses_set_* before sending control pages to firmware. Signed-off-by: Song Liu songliubrav...@fb.com Acked-by: Dan Williams dan.j.willi...@intel.com Reviewed-by: Jens Axboe ax...@fb.com Cc: Hannes Reinecke h...@suse.de --- drivers/misc/enclosure.c | 29 ++ drivers/scsi/ses.c| 98 ++- include/linux/enclosure.h | 6 +++ 3 files changed, 124 insertions(+), 9 deletions(-) diff --git a/drivers/misc/enclosure.c b/drivers/misc/enclosure.c index de335bc..0331dfe 100644 --- a/drivers/misc/enclosure.c +++ b/drivers/misc/enclosure.c @@ -148,6 +148,7 @@ enclosure_register(struct device *dev, const char *name, int components, for (i = 0; i components; i++) { edev-component[i].number = -1; edev-component[i].slot = -1; + edev-component[i].power_status = 1; } mutex_lock(container_list_lock); @@ -546,6 +547,31 @@ static ssize_t set_component_locate(struct device *cdev, return count; } +static ssize_t get_component_power_status(struct device *cdev, + struct device_attribute *attr, + char *buf) +{ + struct enclosure_device *edev = to_enclosure_device(cdev-parent); + struct enclosure_component *ecomp = to_enclosure_component(cdev); + + if (edev-cb-get_power_status) + edev-cb-get_power_status(edev, ecomp); + return snprintf(buf, 40, %d\n, ecomp-power_status); } + +static ssize_t set_component_power_status(struct device *cdev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct enclosure_device *edev = to_enclosure_device(cdev-parent); + struct enclosure_component *ecomp = to_enclosure_component(cdev); + int val = simple_strtoul(buf, NULL, 0); + + if (edev-cb-set_power_status) + edev-cb-set_power_status(edev, ecomp, val); + return count; +} + static ssize_t get_component_type(struct device *cdev, struct device_attribute *attr, char *buf) { @@ -577,6 +603,8 @@ static DEVICE_ATTR(active, S_IRUGO | S_IWUSR, get_component_active, set_component_active); static DEVICE_ATTR(locate, S_IRUGO | S_IWUSR, get_component_locate, set_component_locate); +static DEVICE_ATTR(power_status, S_IRUGO | S_IWUSR, get_component_power_status, + set_component_power_status); static DEVICE_ATTR(type, S_IRUGO, get_component_type, NULL); static DEVICE_ATTR(slot, S_IRUGO, get_component_slot, NULL); @@ -585,6 +613,7 @@ static struct attribute *enclosure_component_attrs[] = { dev_attr_status.attr, dev_attr_active.attr, dev_attr_locate.attr, + dev_attr_power_status.attr, dev_attr_type.attr, dev_attr_slot.attr, NULL diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c index bafa301..ea6b262 100644 --- a/drivers/scsi/ses.c +++ b/drivers/scsi/ses.c @@ -67,6 +67,20 @@ static int ses_probe(struct device *dev) #define SES_TIMEOUT (30 * HZ) #define SES_RETRIES 3 +static void init_device_slot_control(unsigned char *dest_desc, +struct enclosure_component *ecomp, +unsigned char *status) +{ + memcpy(dest_desc, status, 4); + dest_desc[0] = 0; + /* only clear byte 1 for ENCLOSURE_COMPONENT_DEVICE */ + if (ecomp-type == ENCLOSURE_COMPONENT_DEVICE) + dest_desc[1] = 0; + dest_desc[2] = 0xde; + dest_desc[3] = 0x3c; +} + + static int ses_recv_diag(struct scsi_device *sdev, int page_code, void *buf, int bufflen) { @@ -178,14 +192,22 @@ static int ses_set_fault(struct enclosure_device *edev, struct enclosure_component *ecomp, enum enclosure_component_setting val) { - unsigned char desc[4] = {0 }; + unsigned char desc[4]; + unsigned char *desc_ptr; + + desc_ptr = ses_get_page2_descriptor(edev, ecomp); + + if (!desc_ptr) + return -EIO; + + init_device_slot_control(desc, ecomp, desc_ptr); switch (val) { case ENCLOSURE_SETTING_DISABLED: - /* zero is disabled */ + desc[3] = 0xdf; break; case ENCLOSURE_SETTING_ENABLED: - desc[3] = 0x20; + desc[3] |= 0x20; break; default: /* SES doesn't do the SGPIO blink settings */ @@ -219,14
[PATCH 0/5] Feature enhancements for ses module
From: Song Liu [mailto:songliubrav...@fb.com] Sent: Monday, August 25, 2014 10:26 AM To: Song Liu Subject: [PATCH 0/5] Feature enhancements for ses module These patches include a few enhancements to ses module: 1: close potential race condition by at enclosure race condition 2,3,4: add enclosure id and device slot, so we can create symlink in /dev/disk/by-slot: # ls -d /dev/disk/by-slot/* /dev/disk/by-slot/enclosure-0x5000ae41fc1310ff-slot0 5: add ability to power on/off device with power_status file in sysfs. Dan Williams (4): SES: close potential registration race SES: generate KOBJ_CHANGE on enclosure attach SES: add enclosure logical id SES: add reliable slot attribute Song Liu (1): SES: Add power_status to SES enclosure component drivers/misc/enclosure.c | 98 +++ drivers/scsi/ses.c| 145 +++--- include/linux/enclosure.h | 13 - 3 files changed, 220 insertions(+), 36 deletions(-) -- 1.8.1 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/5] SES: add enclosure logical id
From: Song Liu [mailto:songliubrav...@fb.com] Sent: Monday, August 25, 2014 10:26 AM To: Song Liu Cc: Dan Williams; Hannes Reinecke Subject: [PATCH 3/5] SES: add enclosure logical id From: Dan Williams dan.j.willi...@intel.com Export the NAA logical id for the enclosure. This is optionally available from the sas_transport_class, but it is really a property of the enclosure. Signed-off-by: Dan Williams dan.j.willi...@intel.com Signed-off-by: Song Liu songliubrav...@fb.com Reviewed-by: Jens Axboe ax...@fb.com Cc: Hannes Reinecke h...@suse.de --- drivers/misc/enclosure.c | 13 + drivers/scsi/ses.c| 9 + include/linux/enclosure.h | 1 + 3 files changed, 23 insertions(+) diff --git a/drivers/misc/enclosure.c b/drivers/misc/enclosure.c index 15faf61..646068a 100644 --- a/drivers/misc/enclosure.c +++ b/drivers/misc/enclosure.c @@ -395,8 +395,21 @@ static ssize_t components_show(struct device *cdev, } static DEVICE_ATTR_RO(components); +static ssize_t id_show(struct device *cdev, +struct device_attribute *attr, +char *buf) +{ + struct enclosure_device *edev = to_enclosure_device(cdev); + + if (edev-cb-show_id) + return edev-cb-show_id(edev, buf); + return 0; +} +static DEVICE_ATTR_RO(id); + static struct attribute *enclosure_class_attrs[] = { dev_attr_components.attr, + dev_attr_id.attr, NULL, }; ATTRIBUTE_GROUPS(enclosure_class); diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c index 8f0a62a..61deb4e 100644 --- a/drivers/scsi/ses.c +++ b/drivers/scsi/ses.c @@ -258,6 +258,14 @@ static int ses_set_active(struct enclosure_device *edev, return ses_set_page2_descriptor(edev, ecomp, desc); } +static int ses_show_id(struct enclosure_device *edev, char *buf) { + struct ses_device *ses_dev = edev-scratch; + unsigned long long id = get_unaligned_be64(ses_dev-page1+8+4); + + return sprintf(buf, %#llx\n, id); +} + static struct enclosure_component_callbacks ses_enclosure_callbacks = { .get_fault = ses_get_fault, .set_fault = ses_set_fault, @@ -265,6 +273,7 @@ static struct enclosure_component_callbacks ses_enclosure_callbacks = { .get_locate = ses_get_locate, .set_locate = ses_set_locate, .set_active = ses_set_active, + .show_id= ses_show_id, }; struct ses_host_edev { diff --git a/include/linux/enclosure.h b/include/linux/enclosure.h index a835d33..807622b 100644 --- a/include/linux/enclosure.h +++ b/include/linux/enclosure.h @@ -79,6 +79,7 @@ struct enclosure_component_callbacks { int (*set_locate)(struct enclosure_device *, struct enclosure_component *, enum enclosure_component_setting); + int (*show_id)(struct enclosure_device *, char *buf); }; -- 1.8.1 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] bnx2fc: fix incorrect DMA memory mapping in bnx2fc_map_sg()
On Fri, 22 Aug 2014, Eddie Wai wrote: On Fri, 2014-08-22 at 20:12 +0200, Maurizio Lombardi wrote: Hi Chad, On 08/22/2014 02:08 PM, Chad Dupuis wrote: Eddie, Maurizio, Since it looks like there can be a difference in the pdev would it make sense instead to convert the unmap function to use the bare DMA API so we're consistent in our use of hba-pcidev-dev? Maybe something like this: I looked at what the bnx2i driver code does, it has a hba-pcidev pointer too but makes use of scsi_map_sg()/scsi_unmap_sg(). So, from my point of view, it is the bnx2fc's map operation (that bypasses scsi_map_sg() and uses dma_map_sg()) to look like a suspicious exception and I decided to change it. So far, I didn't get any error or strange behaviour after this change. Eddie, what do you think about it? Regards, Maurizio Lombardi diff --git a/drivers/scsi/bnx2fc/bnx2fc_io.c b/drivers/scsi/bnx2fc/bnx2fc_io.c index 32a5e0a..8b4adcf 100644 --- a/drivers/scsi/bnx2fc/bnx2fc_io.c +++ b/drivers/scsi/bnx2fc/bnx2fc_io.c @@ -1700,9 +1700,12 @@ static int bnx2fc_build_bd_list_from_sg(struct bnx2fc_cmd *io_req) static void bnx2fc_unmap_sg_list(struct bnx2fc_cmd *io_req) { struct scsi_cmnd *sc = io_req-sc_cmd; + struct bnx2fc_interface *interface = io_req-port-priv; + struct bnx2fc_hba *hba = interface-hba; - if (io_req-bd_tbl-bd_valid sc) { - scsi_dma_unmap(sc); + if (io_req-bd_tbl-bd_valid sc scsi_sg_count(sc)) { + dma_unmap_sg(hba-pcidev-dev, scsi_sglist(sc), + scsi_sg_count(sc), sc-sc_data_direction); io_req-bd_tbl-bd_valid = 0; } } On Fri, 22 Aug 2014, Maurizio Lombardi wrote: Hi Eddie, On 08/20/2014 07:35 PM, Eddie Wai wrote: On Mon, 2014-08-04 at 10:20 +0200, Maurizio Lombardi wrote: In the bnx2fc_map_sg() function, the original behaviour is to allocate the DMA memory by directly calling dma_map_sg() instead of using scsi_dma_map(). In contrast, bnx2fc_unmap_sg_list() calls scsi_dma_unmap(). The problem is that bnx2fc_map_sg() passes to the dma_map_sg() function the pointer to the device structure taken from pci_dev and not from Scsi_Host (as scsi_dma_map/unmap() do). Great find, Maurizio, Thanks again. Thanks :) In some circumstances, the two devices may be different and the DMA library will be unable to find the correct entry when freeing the memory. I'm curious at how the hba-pcidev-dev in the dma_map_sg call could end up being different from the scsi_cmnd's device-host-dma_dev... Can you share the test scenario? It happens every time you set up an fcoe interface, so you just have to compile the kernel with the DMA API debug option. It is 100% reproducible with RHEL6, RHEL7 and the latest mainline kernel also. This got me thinking that the scsi_host's dma_dev must have been changed recently and I think I found the culprit: http://www.spinics.net/lists/linux-scsi/msg65225.html So back when, we changed the scsi_host's parent from hba-pcidev-dev to the fcoe_ctlr_device's dev to basically align with the soft fcoe code to fix a fcoemon expectation. Although this was changed correctly, the sg DMA mapping routine did not adapt to it; hence the mismatch. Prior to this change, both sg dma map and unmap were done to the hba-pcidev-dev along with all other DMA mapping routines. I also see that even though bnx2i uses the scsi_map_sg/scsi_unmap_sg pairs, ultimately, they were operating on the hba-pcidev-dev device. My opinion is that it would probably be in our best interest to have all DMA mapping/unmapping routines continue to operate on the hba-pcidev-dev directly to prevent any unforeseen DMA problems. Thanks Eddie. Would my original patch work then (I added comments per Christoph's request): diff --git a/drivers/scsi/bnx2fc/bnx2fc_io.c b/drivers/scsi/bnx2fc/bnx2fc_io.c index 32a5e0a..6a61a0d 100644 --- a/drivers/scsi/bnx2fc/bnx2fc_io.c +++ b/drivers/scsi/bnx2fc/bnx2fc_io.c @@ -1651,6 +1651,10 @@ static int bnx2fc_map_sg(struct bnx2fc_cmd *io_req) u64 addr; int i; + /* +* Use dma_map_sg directly to ensure we're using the correct +* dev struct off of pcidev. +*/ sg_count = dma_map_sg(hba-pcidev-dev, scsi_sglist(sc), scsi_sg_count(sc), sc-sc_data_direction); scsi_for_each_sg(sc, sg, sg_count, i) { @@ -1700,9 +1704,16 @@ static int bnx2fc_build_bd_list_from_sg(struct bnx2fc_cmd *io_req) static void bnx2fc_unmap_sg_list(struct bnx2fc_cmd *io_req) { struct scsi_cmnd *sc = io_req-sc_cmd; + struct bnx2fc_interface *interface = io_req-port-priv; + struct bnx2fc_hba *hba = interface-hba; - if (io_req-bd_tbl-bd_valid sc) { - scsi_dma_unmap(sc); + /* +* Use dma_unmap_sg directly to ensure we're using the correct +* dev struct off of pcidev. +*/ + if (io_req-bd_tbl-bd_valid sc scsi_sg_count(sc)) { +
Re: RES: AS2105-based enclosure size issues with 2TB HDDs
On Mon, 2014-08-25 at 18:48 +, Alfredo Dal Ava Junior wrote: On Mon, 25 Aug 2014, Alan Stern wrote: Don't forget that lots of disks go crazy if you try to read from a nonexistent block, that is, one beyond the end of the disk. IMO, this bug cannot be worked around in any reasonable manner. The device simply cannot handle disks larger than 2 TB. This device works well on Windows 7 if HDD is already partitioned. So how did the partition get on there at the correct size in the first place? Even under windows partition managers believe the disk size they get from the system if the disk is blank. Sounds like Win7 gnores the READ_CAPACITY value on a partitioned HDD. It shows 4TB on disk manager, but will fall back to 1.8TB if I remove the partition. I assume for those of us on linux-scsi who don't have the start of this thread, you already tried read capacity(16) and it has this same problem? Could we do the same? Hm, allowing users to set desired capacity by overriding the partition size ... I'm sure that's not going to cause support problems ... Would be possible to signalize to upper layers that capacity is not accurate (or return zero) on this device, and tell GPT handlers to bypass it's partition_size vs disk size consistency check? If we can find a heuristic to set the correct capacity in the kernel, then we may be able to fix this ... but as Alain says, it looks hard. We can't ask userspace to hack tools for broken devices. James -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 3/3] ahci_xgene: Fix the link down in first attempt for the APM X-Gene SoC AHCI SATA host controller driver.
On Sun, Aug 24, 2014 at 12:07:27AM +0530, Suman Tripathi wrote: This patch addresses two HW erratas as described below by retrying the COMRESET: 1. During speed negotiation, controller is not able to detect ALIGN at GEN3(6Gbps) within 54.6us and results in a timeout. This issue can be recovered by issuing a COMRESET. 2. Although ALIGN detection is successful, 8b/10b and disparity bit errors can occur which result in the signature FIS not received successfully by the Host controller. Due to this, the PHY communication between the host and drive is not established because of CDR(clock and data recovery) circuit doesn't lock. This issue can be recoverd by issuing a COMRESET. The above retries are issued only if the port status register PXSTATUS reports device presence detected but PHY communication not established. The maximum retry attempts are 3. Didn't I ask you to update the comment to explain what's going on? Or is the existing comment already sufficient? Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RES: AS2105-based enclosure size issues with 2TB HDDs
On Mon, 25 Aug 2014, Alfredo Dal Ava Junior wrote: On Mon, 25 Aug 2014, Alan Stern wrote: Don't forget that lots of disks go crazy if you try to read from a nonexistent block, that is, one beyond the end of the disk. IMO, this bug cannot be worked around in any reasonable manner. The device simply cannot handle disks larger than 2 TB. This device works well on Windows 7 if HDD is already partitioned. Sounds like Win7 gnores the READ_CAPACITY value on a partitioned HDD. It shows 4TB on disk manager, but will fall back to 1.8TB if I remove the partition. That's right. I don't know why Windows behaves that way. Could we do the same? Would be possible to signalize to upper layers that capacity is not accurate (or return zero) on this device, and tell GPT handlers to bypass it's partition_size vs disk size consistency check? There is no way to do this, as far as I know. But I'm not an expert in this area. Maybe you can figure out a way to add this capability. (But then what happens if the size stored in the partition table really is larger than the disk's capacity?) Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
FW: [GIT PULL] First round of SCSI updates for the 3.15+ merge window
Hey James, Is is possible to include attached patch? Regards, Hiral On 6/9/14, 8:02 AM, James Bottomley james.bottom...@hansenpartnership.com wrote: This patch consists of the usual driver updates (qla2xxx, qla4xxx, lpfc, be2iscsi, fnic, ufs, NCR5380) The NCR5380 is the addition to maintained status of a long neglected driver for older hardware. In addition there are a lot of minor fixes and cleanups and some more updates to make scsi mq ready. The patch is available here: git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-for-linus The short changelog is: Adheer Chandravanshi (2): qla4xxx: Fix smatch warning in func qla4xxx_conn_get_param qla4xxx: Fix smatch warning in func qla4xxx_get_ep_param Alexey Khoroshilov (1): bfa: allocate memory with GFP_ATOMIC in spinlock context Armen Baloyan (3): qla2xxx: Adjust adapter reset routine to the changes in firmware specification for ISPFx00. qla2xxx: Decrease pci access for response queue processing for ISPFX00. qla2xxx: Change copyright year to 2014 in all the source files. Atul Deshmukh (3): qla2xxx: IOCB data should be copied to I/O mem using memcpy_toio. qla2xxx: Include delay.h file for msleep declartion in qla_nx2.c file. qla2xxx: Use proper log message for flash lock failed error. Ben Hutchings (1): mvsas: Recognise device/subsystem 9485/9485 as 88SE9485 Benoit Taine (2): qla2xxx: Use kmemdup instead of kmalloc + memcpy qla4xxx: Use kmemdup instead of kmalloc + memcpy Chad Dupuis (7): qla2xxx: Remove wait for online from host reset handler. qla2xxx: Do logins from a chip reset in DPC thread instead of the error handler thread. qla2xxx: Reduce the time we wait for a command to complete during SCSI error handling. qla2xxx: Clear loop_id for ports that are marked lost during fabric scanning. qla2xxx: Avoid escalating the SCSI error handler if the command is not found in firmware. qla2xxx: Remove unnecessary printk_ratelimited from qla_nx2.c qla2xxx: Do not schedule reset when one is already active when receiving an invalid status handle. Christoph Hellwig (7): Revert be2iscsi: Fix processing cqe for cxn whose endpoint is freed scsi_debug: simple short transfer injection virtio_scsi: use cmd_size scsi: handle command allocation failure in scsi_reset_provider scsi: reintroduce scsi_driver.init_command scsi: remove scsi_end_request scsi: explicitly release bidi buffers Dan Carpenter (1): qla2xxx: fix incorrect debug printk David Jeffery (1): sd: medium access timeout counter fails to reset Fabian Frederick (1): include/scsi/osd_protocol.h: remove unnecessary __constant Finn Thain (14): scsi/NCR5380: dprintk macro scsi/NCR5380: merge sun3_scsi_vme.c into sun3_scsi.c scsi/NCR5380: reduce depth of sun3_scsi nested includes scsi/NCR5380: remove unused macro definitions scsi/NCR5380: fix and standardize NDEBUG macros scsi/NCR5380: adopt dprintk() scsi/NCR5380: adopt NCR5380_dprint() and NCR5380_dprint_phase() scsi/NCR5380: fix dprintk macro usage and definition scsi/NCR5380: fix build failures when debugging is enabled scsi/NCR5380: use NCR5380_dprint() instead of NCR5380_print() scsi/NCR5380: remove old CVS keywords scsi/NCR5380: remove redundant HOSTS_C macro tests scsi/NCR5380: remove unused BOARD_NORMAL and BOARD_NCR53C400 MAINTAINERS: add an entry for all the NCR5380 drivers Giridhar Malavali (3): qla2xxx: Check for peg alive counter and clear any outstanding mailbox command. qla2xxx: Issue abort command for outstanding commands during cleanup when only firmware is alive. qla2xxx: Log when device state is moved to failed state. Hannes Reinecke (1): scsi: set correct completion code in scsi_send_eh_cmnd() Himanshu Madani (1): qla2xxx: Fix beacon blink logic for ISP26xx/83xx. Himanshu Madhani (1): qla2xxx: Remove mapped vp index iterator macro dead code. Hiral Patel (5): qla2xxx: Check the QLA8044_CRB_DRV_ACTIVE_INDEX register when we are not the owner of the reset. qla2xxx: Enable fw_dump_size for ISP8044. qla2xxx: Introduce fw_dump_flag to track fw dump progress. qla2xxx: Remove unnecessary delays from fw dump code path. qla2xxx: Track the process when the ROM_LOCK failure happens Hiral Shah (3): fnic: fnic Control Path Trace Utility fnic: Failing to queue aborts due to Q full cause terminate driver timeout fnic: NoFIP solicitation frame in NONFIP mode and changed IO Throttle count James Smart (1): lpfc: Add iotag memory barrier Jayamohan Kallickal (8): be2iscsi: Bump the driver version be2iscsi: Fix processing cqe for cxn whose endpoint is freed be2iscsi: Fix destroy MCC-CQ before MCC-EQ is destroyed be2iscsi: Fix memory corruption in MBX path
Re: [Bug 80711] [PATCH]SG_FLAG_LUN_INHIBIT is no longer implemented and there's not way to prevent the kernel from using the 2nd cdb byte for the LUN
On Mon, 2014-08-25 at 10:44 -0400, Alan Stern wrote: James, can you explain how the INQUIRY command in scsi_probe_lun() managed to work back in the days when multi-lun SCSI-2 devices were common? sdev-scsi_level doesn't get set when sdev is allocated, so it initially contains 0, so the LUN bits won't get filled in when the first INQUIRY command is sent. Then how could the target know which logical unit the INQUIRY was meant for? Best guess, some patches over the course of time altered the way we do this and no-one noticed. I think it was probably the introduction of the unknown SCSI data level that caused the breakage. Historically, the LUN in CMD bits is left over from SCSI-1; it was incorporated into SCSI-2 for backward compatibility (even though SCSI-2 moved the LUN specification to the identify message). In SCSI-3 and beyond, those bits were obsoleted and transports took sole responsibility for LUN handling. I'm fairly certain all the SCSI-1 devices relying on this behaviour have long ago migrated to the great data centre in the sky. Alan's fix looks reasonable because we probe LUN 0 first (for SCSI-1 and 2 which has parallel scanning), which is why it doesn't matter (the bits are set to zero) and once we have LUN 0 we propagate the data to the target and make it the basis for future checks. I'd like to see a comment explaining this in the code, though ... James -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RESEND][PATCH 09/10][SCSI]mpt2sas: Added module parameter 'unblock_io' to unblock IO's during disk addition
Let me try to answer this as I had worked on this defect in the async release. Martin This really sounds like a scenario you should be able to handle in Martin general (without special don't-be-broken module parameters). In the async release, we wanted this fix to be tried, tested and vetted by customers, before making this as the default behaviour. We wanted to make sure, this change doesn't cause any data corruption inadvertently. Martin Also, shouldn't your internal task management be able to deal with this? Martin Why does the sdev's state during probe affect your ability to make Martin forward progress? The FW informs the driver to add a new disk and we add that through the SAS transport layer (through a workqueue). Before the SCSI mid layer could finish the probe and add the disk at its layer, FW identifies a link down and informs the driver (DELAY_NOT_RESPONDING). As per the current design, the driver blocks any further I/O to that disk. Now, the SCSI mid layer couldn't move forward with the addition because it couldn't send down Report_Luns/TUR to the disk. The FW in the meantime, would either sense the link up (RC_PHY_CHANGED) or disk completely removed (TARGET_NOT_RESPONDING) and send up the event to the driver. As per the current design, the driver would push the processing of those events in the same workqueue behind the new disk addition work (which is blocked). So, the disk addition code waits for the unblock to happen, while the RC_PHY_CHANGED work waits in the queue behind the disk addition for its chance to unblock the disk. The fix is basically to perform the unblock for RC_PHY_CHANGED in the interrupt context, so that the disk addition work could proceed. The FW has I/O missing delay timer device missing delay timer. If we don't block I/Os upon receiving DELAY_NOT_RESPONDING, there is possibility of I/O missing delay timer expiring and SCSI mid layer exhausting the no of retries leading to I/O failure which the customers do not want to happen for the link down case. Regards, Praveen -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RES: RES: AS2105-based enclosure size issues with 2TB HDDs
On Mon, 15 Aug 2014 James Bottomley wrote: So how did the partition get on there at the correct size in the first place? Even under windows partition managers believe the disk size they get from the system if the disk is blank. The HDD can be partitioned outside the enclosure, when connected directly to one SATA port on motherboard. READ_CAPACITY(16) will return properly when talking directly to the HDD. I assume for those of us on linux-scsi who don't have the start of this thread, you already tried read capacity(16) and it has this same problem? Sorry, I forgot to include linux-scsi. On this device, READ_CAPACITY_16 fails 100% of times as unsupported command, then READ_CAPACITY_10 has a distinct behavior depending on HDD size: 1TB and 2TB: READ_CAPACITY_10 returns correct value 3TB and 4TB: READ_CAPACITY_10 returns in a 2TB modulus Hm, allowing users to set desired capacity by overriding the partition size ... I'm sure that's not going to cause support problems ... Well, it is causing problems anyway... from user perspective, it's a Linux compatibility issue, as it works fine on Windows. I'm not an expert, but I'm wondering that if usb-storage could set capacity as UNDETERMINED/ Zero (or keep using the readcapacity_10 as it as with some flag signalizing it as inaccurate), EFI partition check would be able to ignore size and look for secondary GPT where it really is. []'s Alfredo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bug 80711] [PATCH]SG_FLAG_LUN_INHIBIT is no longer implemented and there's not way to prevent the kernel from using the 2nd cdb byte for the LUN
On Mon, 25 Aug 2014, James Bottomley wrote: On Mon, 2014-08-25 at 10:44 -0400, Alan Stern wrote: James, can you explain how the INQUIRY command in scsi_probe_lun() managed to work back in the days when multi-lun SCSI-2 devices were common? sdev-scsi_level doesn't get set when sdev is allocated, so it initially contains 0, so the LUN bits won't get filled in when the first INQUIRY command is sent. Then how could the target know which logical unit the INQUIRY was meant for? Best guess, some patches over the course of time altered the way we do this and no-one noticed. I think it was probably the introduction of the unknown SCSI data level that caused the breakage. Heh. The change was made by commit 4d7db04a7a69 ([SCSI] add SCSI_UNKNOWN and LUN transfer limit restrictions) back in 2006 -- the 2.6.17 kernel. If nobody has complained in all this time then it's probably not worth changing. Historically, the LUN in CMD bits is left over from SCSI-1; it was incorporated into SCSI-2 for backward compatibility (even though SCSI-2 moved the LUN specification to the identify message). In SCSI-3 and beyond, those bits were obsoleted and transports took sole responsibility for LUN handling. I'm fairly certain all the SCSI-1 devices relying on this behaviour have long ago migrated to the great data centre in the sky. Alan's fix looks reasonable because we probe LUN 0 first (for SCSI-1 and 2 which has parallel scanning), which is why it doesn't matter (the bits are set to zero) and once we have LUN 0 we propagate the data to the target and make it the basis for future checks. I'd like to see a comment explaining this in the code, though ... If you think it would be a good idea, I could put it into a separate patch with an explanatory comment. At the moment, I'm inclined to forget about it. Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 80711] [PATCH]SG_FLAG_LUN_INHIBIT is no longer implemented and there's not way to prevent the kernel from using the 2nd cdb byte for the LUN
https://bugzilla.kernel.org/show_bug.cgi?id=80711 --- Comment #19 from Alan Stern st...@rowland.harvard.edu --- On Mon, 25 Aug 2014, James Bottomley wrote: On Mon, 2014-08-25 at 10:44 -0400, Alan Stern wrote: James, can you explain how the INQUIRY command in scsi_probe_lun() managed to work back in the days when multi-lun SCSI-2 devices were common? sdev-scsi_level doesn't get set when sdev is allocated, so it initially contains 0, so the LUN bits won't get filled in when the first INQUIRY command is sent. Then how could the target know which logical unit the INQUIRY was meant for? Best guess, some patches over the course of time altered the way we do this and no-one noticed. I think it was probably the introduction of the unknown SCSI data level that caused the breakage. Heh. The change was made by commit 4d7db04a7a69 ([SCSI] add SCSI_UNKNOWN and LUN transfer limit restrictions) back in 2006 -- the 2.6.17 kernel. If nobody has complained in all this time then it's probably not worth changing. Historically, the LUN in CMD bits is left over from SCSI-1; it was incorporated into SCSI-2 for backward compatibility (even though SCSI-2 moved the LUN specification to the identify message). In SCSI-3 and beyond, those bits were obsoleted and transports took sole responsibility for LUN handling. I'm fairly certain all the SCSI-1 devices relying on this behaviour have long ago migrated to the great data centre in the sky. Alan's fix looks reasonable because we probe LUN 0 first (for SCSI-1 and 2 which has parallel scanning), which is why it doesn't matter (the bits are set to zero) and once we have LUN 0 we propagate the data to the target and make it the basis for future checks. I'd like to see a comment explaining this in the code, though ... If you think it would be a good idea, I could put it into a separate patch with an explanatory comment. At the moment, I'm inclined to forget about it. Alan Stern -- You are receiving this mail because: You are the assignee for the bug. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RES: RES: AS2105-based enclosure size issues with 2TB HDDs
On Mon, 25 Aug 2014 Alan Stern wrote: On Mon, 25 Aug 2014, Alfredo Dal Ava Junior wrote: That's right. I don't know why Windows behaves that way. Please look this output from diskpart (Windows): DISKPART list partition Partition ### Type Size Offset - --- --- Partition 1Primary 3726 GB17 KB DISKPART list disk Disk ### Status Size Free Dyn Gpt - --- --- --- --- Disk 0Online 298 GB 0 B * Disk 1Online 1678 GB 0 B* DISKPART select disk 1 Disk 1 is now the selected disk. DISKPART list partition Partition ### Type Size Offset - --- --- Partition 1Primary 3726 GB17 KB Could we do the same? Would be possible to signalize to upper layers that capacity is not accurate (or return zero) on this device, and tell GPT handlers to bypass it's partition_size vs disk size consistency check? There is no way to do this, as far as I know. But I'm not an expert in this area. Maybe you can figure out a way to add this capability. (But then what happens if the size stored in the partition table really is larger than the disk's capacity?) It's the first time I touch this code, but I will learn from the code and try to find it out... but still in hope to find a clean solution... If size stored on partition table is really larger than disk, my guess it will cause I/O errors, and that some disks may get crazy like you pointed. Do you think disk could cause kernel instability? I think it is acceptable if no consequences to kernel stability, since it is a specific-device workaround. []'s Alfredo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RES: RES: AS2105-based enclosure size issues with 2TB HDDs
On Mon, 25 Aug 2014, Alfredo Dal Ava Junior wrote: Well, it is causing problems anyway... from user perspective, it's a Linux compatibility issue, as it works fine on Windows. I'm not an expert, but I'm wondering that if usb-storage could set capacity as UNDETERMINED/ Zero (or keep using the readcapacity_10 as it as with some flag signalizing it as inaccurate), EFI partition check would be able to ignore size and look for secondary GPT where it really is. Part of the problem is that usb-storage has no way to know that anything strange is going on. It's normal for READ CAPACITY(16) to fail (this depend on the SCSI level), and it's normal for the READ CAPACITY(10) to report a value less than 2 TB. Really there is only one way to know whether the actual capacity is larger than the reported capacity, and that is by trying to read blocks beyond the reported capacity -- a dangerous test that many drives do not like. (And in most cases a futile test. If a device doesn't support READ CAPACITY(16), how likely is it to support READ(16)?) Yes, in theory you can believe the value in the partition table in preference to the reported capacity. But what if that value is wrong? And how do you tell partition-manager programs what the capacity should be when they modify or set up the initial partition table? Attaching the drive over a SATA connection when you want to partition it isn't a very satisfactory solution. After all, if you have a SATA connection available then why would you be using a USB enclosure in the first place? Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RES: RES: RES: AS2105-based enclosure size issues with 2TB HDDs
On Mon, 25 Aug 2014 Alan Stern wrote: Part of the problem is that usb-storage has no way to know that anything strange is going on. It's normal for READ CAPACITY(16) to fail (this depend on the SCSI level), and it's normal for the READ CAPACITY(10) to report a value less than 2 TB. Really there is only one way to know whether the actual capacity is larger than the reported capacity, and that is by trying to read blocks beyond the reported capacity -- a dangerous test that many drives do not like. (And in most cases a futile test. If a device doesn't support READ CAPACITY(16), how likely is it to support READ(16)?) Yes, in theory you can believe the value in the partition table in preference to the reported capacity. But what if that value is wrong? And how do you tell partition-manager programs what the capacity should be when they modify or set up the initial partition table? We may add this device to UNUSUAL_DEV list, to keep using READ_CAPACITY(10) and pass some flag to bypass EFI GPT partition check. I'm almost sure that kernel will be able to mount the partition if we can make it available on /proc/partition. This would make this device behaves like it does on Windows 7. I see this as best effort workaround for a cheap buggy hardware, like many others on UNUSUAL_DEV list Attaching the drive over a SATA connection when you want to partition it isn't a very satisfactory solution. After all, if you have a SATA connection available then why would you be using a USB enclosure in the first place? You may want do a backup or plug it in a laptop, for example. []'s Alfredo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 80711] [PATCH]SG_FLAG_LUN_INHIBIT is no longer implemented and there's not way to prevent the kernel from using the 2nd cdb byte for the LUN
https://bugzilla.kernel.org/show_bug.cgi?id=80711 --- Comment #20 from Alan Stern st...@rowland.harvard.edu --- On Mon, 25 Aug 2014, Alan Stern wrote: On Mon, 25 Aug 2014, James Bottomley wrote: On Mon, 2014-08-25 at 10:44 -0400, Alan Stern wrote: James, can you explain how the INQUIRY command in scsi_probe_lun() managed to work back in the days when multi-lun SCSI-2 devices were common? sdev-scsi_level doesn't get set when sdev is allocated, so it initially contains 0, so the LUN bits won't get filled in when the first INQUIRY command is sent. Then how could the target know which logical unit the INQUIRY was meant for? Best guess, some patches over the course of time altered the way we do this and no-one noticed. I think it was probably the introduction of the unknown SCSI data level that caused the breakage. Heh. The change was made by commit 4d7db04a7a69 ([SCSI] add SCSI_UNKNOWN and LUN transfer limit restrictions) back in 2006 -- the 2.6.17 kernel. If nobody has complained in all this time then it's probably not worth changing. It turns out the code is already there, and I didn't realize because I was looking at the wrong source file. scsi_sysfs_device_initialize() already does: sdev-scsi_level = starget-scsi_level; Here's the update to the patch, adding an appropriate comment and setting the new sdev-lun_in_sdb flag properly: Alan Stern Index: usb-3.16/drivers/scsi/scsi_sysfs.c === --- usb-3.16.orig/drivers/scsi/scsi_sysfs.c +++ usb-3.16/drivers/scsi/scsi_sysfs.c @@ -1238,7 +1238,19 @@ void scsi_sysfs_device_initialize(struct sdev-sdev_dev.class = sdev_class; dev_set_name(sdev-sdev_dev, %d:%d:%d:%d, sdev-host-host_no, sdev-channel, sdev-id, sdev-lun); +/* + * Get a default scsi_level from the target (derived from sibling + * devices). This is the best we can do for guessing how to set + * sdev-lun_in_cdb for the initial INQUIRY command. For LUN 0 the + * setting doesn't matter, because all the bits are zero anyway. + * But it does matter for higher LUNs. + */ sdev-scsi_level = starget-scsi_level; +if (sdev-scsi_level = SCSI_2 +sdev-scsi_level != SCSI_UNKNOWN +!shost-no_scsi2_lun_in_cdb) +sdev-lun_in_cdb = 1; + transport_setup_device(sdev-sdev_gendev); spin_lock_irqsave(shost-host_lock, flags); list_add_tail(sdev-same_target_siblings, starget-devices); -- You are receiving this mail because: You are the assignee for the bug. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RES: RES: AS2105-based enclosure size issues with 2TB HDDs
On Mon, 2014-08-25 at 16:21 -0400, Alan Stern wrote: On Mon, 25 Aug 2014, Alfredo Dal Ava Junior wrote: Well, it is causing problems anyway... from user perspective, it's a Linux compatibility issue, as it works fine on Windows. I'm not an expert, but I'm wondering that if usb-storage could set capacity as UNDETERMINED/ Zero (or keep using the readcapacity_10 as it as with some flag signalizing it as inaccurate), EFI partition check would be able to ignore size and look for secondary GPT where it really is. Part of the problem is that usb-storage has no way to know that anything strange is going on. It's normal for READ CAPACITY(16) to fail (this depend on the SCSI level), and it's normal for the READ CAPACITY(10) to report a value less than 2 TB. Just set US_FL_NEEDS_CAP16. If READ CAPACITY(16) fails in that case, it is clear that something is wrong. It must be set or READ CAPACITY(10) alone would be taken as giving a valid answer. At that time we are sure that the drive will be unusable unless we determine the correct capacity, so we don't make matters worse if we crash it. Is there an easy way for Alfredo to determine what happens if we read beyond the end? Regards Oliver -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bug 80711] [PATCH]SG_FLAG_LUN_INHIBIT is no longer implemented and there's not way to prevent the kernel from using the 2nd cdb byte for the LUN
On Mon, 2014-08-25 at 17:19 -0400, Alan Stern wrote: On Mon, 25 Aug 2014, Alan Stern wrote: On Mon, 25 Aug 2014, James Bottomley wrote: On Mon, 2014-08-25 at 10:44 -0400, Alan Stern wrote: James, can you explain how the INQUIRY command in scsi_probe_lun() managed to work back in the days when multi-lun SCSI-2 devices were common? sdev-scsi_level doesn't get set when sdev is allocated, so it initially contains 0, so the LUN bits won't get filled in when the first INQUIRY command is sent. Then how could the target know which logical unit the INQUIRY was meant for? Best guess, some patches over the course of time altered the way we do this and no-one noticed. I think it was probably the introduction of the unknown SCSI data level that caused the breakage. Heh. The change was made by commit 4d7db04a7a69 ([SCSI] add SCSI_UNKNOWN and LUN transfer limit restrictions) back in 2006 -- the 2.6.17 kernel. If nobody has complained in all this time then it's probably not worth changing. It turns out the code is already there, and I didn't realize because I was looking at the wrong source file. scsi_sysfs_device_initialize() already does: sdev-scsi_level = starget-scsi_level; Here's the update to the patch, adding an appropriate comment and setting the new sdev-lun_in_sdb flag properly: Looks good. Add your signed-off-by and you can add my acked-by as well. James Index: usb-3.16/drivers/scsi/scsi_sysfs.c === --- usb-3.16.orig/drivers/scsi/scsi_sysfs.c +++ usb-3.16/drivers/scsi/scsi_sysfs.c @@ -1238,7 +1238,19 @@ void scsi_sysfs_device_initialize(struct sdev-sdev_dev.class = sdev_class; dev_set_name(sdev-sdev_dev, %d:%d:%d:%d, sdev-host-host_no, sdev-channel, sdev-id, sdev-lun); + /* + * Get a default scsi_level from the target (derived from sibling + * devices). This is the best we can do for guessing how to set + * sdev-lun_in_cdb for the initial INQUIRY command. For LUN 0 the + * setting doesn't matter, because all the bits are zero anyway. + * But it does matter for higher LUNs. + */ sdev-scsi_level = starget-scsi_level; + if (sdev-scsi_level = SCSI_2 + sdev-scsi_level != SCSI_UNKNOWN + !shost-no_scsi2_lun_in_cdb) + sdev-lun_in_cdb = 1; + transport_setup_device(sdev-sdev_gendev); spin_lock_irqsave(shost-host_lock, flags); list_add_tail(sdev-same_target_siblings, starget-devices); -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html