Re: cdparanoia not setting count and/or reply_len properly
Stefan Richter wrote:
> DervishD wrote at lkml:
>> Hi all :)
>>
>> I know, this has been treated on the list before (year 2005) but
>> without any real solution I'm aware of. I'm running kernel
>> 2.6.20.14, and I have an ATAPI DVD writer that I use with an
>> IDE-to-USB adapter, so it appears as a SCSI drive to the kernel.
>> Anytime I rip anything with it, the log fills with the same
>> message: some numbers about a certain number of bytes and the old
>> friend message that I've put in the subject.
>>
>> I assume that the warning makes sense, but the fact is that my log
>> is full of the same message, the ripping is correct (so cdparanoia
>> is working OK WRT ripping) and if it weren't for printk_ratelimit,
>> the system would freeze. I don't know if cdparanoia should be
>> fixed, but certainly the warning could be issued only if
>> CONFIG_SCSI_VERBOSE is set. This way you will have the message if
>> something goes wrong and you want more info, but in cases where
>> the warning is harmless your log will be clean...
>>
>> Anyway, this message is not for making suggestions, but for asking
>> for information: why is this warning happening? Naughty
>> cdparanoia? Naughty kernel? I'm a bit confused and I want to use
>> my external DVD drive for ripping from time to time, to exercise
>> it...
>>
>> Thanks a lot in advance :)
>>
>> Raúl Núñez de Arenas Coronado
>
> This question is better asked at lsml. (Therefore I'm quoting in
> full.)

In Fedora 7 I see this:

  # cdparanoia --version
  cdparanoia III release 9.8 (March 23, 2001)
  (C) 2001 Monty [EMAIL PROTECTED] and Xiphophorus
  Report bugs to [EMAIL PROTECTED]
  http://www.xiph.org/paranoia/

So, given that date, lk 2.4.2 was out but it was probably a bit early
to start using the sg version 3 interface, which first appeared in
lk 2.4.1. So that "let's annoy the user" message was added by someone
who got burnt by the old sg version 2 interface and decided people
needed to be warned.

The warning comes from this code in sg.c:

    /*
     * SG_DXFER_TO_FROM_DEV is functionally equivalent to SG_DXFER_FROM_DEV,
     * but it is possible that the app intended SG_DXFER_TO_DEV, because
     * there is a non-zero input_size, so emit a warning.
     */
    if (hp->dxfer_direction == SG_DXFER_TO_FROM_DEV)
        if (printk_ratelimit())
            printk(KERN_WARNING
                   "sg_write: data in/out %d/%d bytes for SCSI command 0x%x-- guessing data in;\n"
                   KERN_WARNING
                   "   program %s not setting count and/or reply_len properly\n",
                   old_hdr.reply_len - (int)SZ_SG_HEADER, input_size,
                   (unsigned int) cmnd[0], current->comm);

That code wasn't written by me and I would gladly remove it. For
anyone who has read the sg driver documentation, SG_DXFER_TO_FROM_DEV
implies a _read_ from the device.

The reason SG_DXFER_TO_FROM_DEV exists is for backward compatibility
with the sg version 1 interface. It was a hack to get around the fact
that the SCSI subsystem didn't report short reads (what folks should
use 'resid' for) back in those days.

It is probably about time that cdparanoia was updated ...

Doug Gilbert
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
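For readers wondering what "updated" might look like in an application, here is a minimal sketch, assuming the sg version 3 interface (struct sg_io_hdr from <scsi/sg.h>, submitted with the SG_IO ioctl). The helper names sg3_prepare_read() and sg3_bytes_received() are hypothetical and are not taken from cdparanoia or the sg driver. The point is only that the direction is stated explicitly as SG_DXFER_FROM_DEV and that a short read is detected from 'resid' rather than guessed.

```c
#include <string.h>
#include <scsi/sg.h>

/*
 * Hypothetical helper: fill an sg v3 header for a data-in (read)
 * command. The direction is given explicitly, so the driver never has
 * to guess it from count/reply_len as the v1/v2 write() path does.
 */
void sg3_prepare_read(struct sg_io_hdr *hdr,
                      unsigned char *cdb, unsigned char cdb_len,
                      void *buf, unsigned int buf_len,
                      unsigned char *sense, unsigned char sense_len,
                      unsigned int timeout_ms)
{
    memset(hdr, 0, sizeof(*hdr));
    hdr->interface_id = 'S';                  /* marks the v3 interface */
    hdr->dxfer_direction = SG_DXFER_FROM_DEV; /* device to application */
    hdr->cmdp = cdb;
    hdr->cmd_len = cdb_len;
    hdr->dxferp = buf;
    hdr->dxfer_len = buf_len;
    hdr->sbp = sense;
    hdr->mx_sb_len = sense_len;
    hdr->timeout = timeout_ms;                /* milliseconds */
}

/*
 * Hypothetical helper: after ioctl(fd, SG_IO, hdr) returns, resid is
 * the number of requested bytes that were NOT transferred.
 */
unsigned int sg3_bytes_received(const struct sg_io_hdr *hdr)
{
    if (hdr->resid <= 0)
        return hdr->dxfer_len;
    if ((unsigned int)hdr->resid >= hdr->dxfer_len)
        return 0;
    return hdr->dxfer_len - (unsigned int)hdr->resid;
}
```

A caller would fill the header, issue ioctl(fd, SG_IO, &hdr) on an open sg device, and then check the status fields before trusting sg3_bytes_received().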
Re: [patch 0/3] clean gendisk out of scsi ULD structs
James Bottomley wrote:
> On Thu, 2007-07-05 at 14:06 -0700, Kristen Carlson Accardi wrote:
>> Since gendisk will now become part of struct scsi_device, we don't
>> need to store this value in any private data structs where they
>> already store scsi_device. This series cleans up a few drivers
>> which did this.
>
> Actually, as Al pointed out, we do have lifetime rules issues with
> doing this. The problem is that gendisk itself always has a shorter
> lifetime than scsi_device (not much shorter, usually, but if you
> execute a legal ULD unbind manoeuvre you'll end up with a dangling
> gendisk pointer).

What about having short-lived scsi_device objects? For example: one
that lives long enough for a pass-through to send a SCSI command (and
receive its response) to one of a target's well known logical units.

> The other problem with taking gendisk out of the ULD structure and
> putting it into the scsi_device is that for the sg driver, we have
> two of them (one for the attached ULD and one for the sg driver).

Add the bsg driver and that would make three of them. Or, if the lu's
peripheral device type was not of interest to sd, st, sr, and osst,
back to two gendisk objects (i.e. one each for sg and bsg).

> The fundamental issue seems to be that the gendisk is the holder of
> all the other info (queue, ULD etc) not vice versa ... and this
> patch is trying to reverse that relationship.

A minor issue is the name gendisk ... unless, of course, you go and
look at its definition in linux/genhd.h, in which case the name looks
somewhat appropriate. It looks like a mess [queue, ULD name,
major/minor(s), partitions, capacity, disk_stats, kobjects, etc].
That is a considerable amount of superfluous information for just a
tag for requests coming into (a) given queue when that queue leads to
a non-block device.

Doug Gilbert
Re: [patch 0/3] clean gendisk out of scsi ULD structs
Kristen Carlson Accardi wrote:
> Since gendisk will now become part of struct scsi_device, we don't
> need to store this value in any private data structs where they
> already store scsi_device. This series cleans up a few drivers
> which did this.

Since a scsi_device object is usually a SCSI logical unit, one wonders
why it would contain a gendisk object. Logical units aren't
necessarily disks; they might be enclosures or just place holders that
respond to an INQUIRY (e.g. lun=0 when the enclosing target has other
active lus whose lun!=0).

Doug Gilbert
Re: Low-level reformat with different sector size: ?
Matthias Urlichs wrote:
> Hello,
>
> Yesterday I managed to buy a couple of SCA disks with a sector size
> of ... *drumroll* ... 524.
>
> What's the easiest way to re-format these to use 512 bytes?
> Preferably without screwing up anything else on these things?

Umm, I hope you don't consider losing all the previous data on the
disks when a re-format is performed as screwing up?

Doug Gilbert
Re: Low-level reformat with different sector size: ?
James Bottomley wrote:
> On Fri, 2007-06-29 at 15:34 +, Matthias Urlichs wrote:
>> Yesterday I managed to buy a couple of SCA disks with a sector size
>> of ... *drumroll* ... 524.
>>
>> What's the easiest way to re-format these to use 512 bytes?
>> Preferably without screwing up anything else on these things?
>
> We use this program to reformat 520 sector size disks back to 512:
>
> http://parisc-linux.org/~jejb/blk512-linux.c
>
> It should work for your device as well. Beware it requires a
> complete low level format to achieve this, which can take a very
> long time.

I might mention at this point that sg_format is derived from
blk512-linux.c. Both should be able to format recent SCSI disks (e.g.
manufactured in this millennium). sformat is an older program.

All of them invoke the SCSI FORMAT command. If the SCSI FORMAT command
is examined in SCSI-2, SBC, SBC-2 and SBC-3 then it can be seen as
quite complex. Over the 15 year period spanned by those standards
(SBC-3 is still a draft) it has become more complex and changed
somewhat.

The first terabyte SCSI disk was announced this week. I wonder how
long it takes to format.

Doug Gilbert
Re: Very slow writes with mptsas
[EMAIL PROTECTED] wrote:
> On Tue, 05 Jun 2007, [EMAIL PROTECTED] wrote:
>> Hello
>>
>> I'm seeing very slow writes on a Dell Precision 690 with the Dell
>> SAS5 adapter, serving a RAID1 array of SATA-II disks. It's very
>> similar to the problem in FreeBSD, described here:
>>
>> http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2007-03/msg00756.html
>>
>> I'm running FC6 with the latest kernel. Reads are quite fast,
>> writes terribly slow.
>
> Thanks to all who replied to this query, especially the very
> detailed response from Eric Moore at LSI.
>
> The first important facet is that we need to operate on the two
> hidden physical disks, not the RAID device. lsscsi differentiates
> them:
>
>   # lsscsi
>   [0:0:0:0]  disk  ATA   WDC WD5000KS-75M  2E08  -
>   [0:0:1:0]  disk  ATA   HDS725050KLA360   AB5A  -
>   [0:1:0:0]  disk  Dell  VIRTUAL DISK      1028  /dev/sda
>
> sg_map gives the generic device numbers:

Using 'lsscsi -g' would also give you the generic device numbers. It
is interesting that the above ATA disks do not have corresponding
/dev/sd* device names.

>   # sg_map -i -x
>   /dev/sg0  0 0 0 0  0  ATA  WDC WD5000KS-75M  2E08
>   /dev/sg1  0 0 1 0  0  ATA  HDS725050KLA360   AB5A
>
> The write cache can then be enabled using sdparm:
>
>   sdparm -s WCE=1 -S /dev/sg0
>
> and the result checked with
>
>   # sdparm -g WCE /dev/sg1
>       /dev/sg1: ATA  HDS725050KLA360  AB5A
>   WCE  1  [cha: y]
>
> This seems to make the write performance much better.

Good.

> The question for Dell is why their version of the BIOS doesn't set
> the write cache in the first place or allow it to be altered by the
> user.

The mechanism for doing this was only formalized recently with the SAT
standard, so it may take a while for BIOSes and other infrastructure
to catch up.

Doug Gilbert
Re: doubts about sg driver
Parav Pandit wrote:
> Hi,
>
> A few basic questions on the sg driver:
>
> 1. Are there any hooks that a low level HBA driver needs to
> implement to provide support for the SG (SCSI generic) driver? Or
> does SG always interact with scsi_mod, so that it is transparent to
> the HBA drivers? From the tldp How-to and sg.c it looks like it
> doesn't directly talk with the low level HBA driver, but I want to
> confirm.

The sg driver talks to the scsi mid level (and the block layer,
strangely enough) but not directly to LLDs.

> 2. Can applications talk with a SCSI RAID controller device (some
> targets expose LUN-0 as controller) through the sg interface, or is
> it only for storage devices?

The sg driver is useful for any SCSI device (logical unit) that is
exposed by the scsi mid level. Apart from direct access (i.e. disk)
devices, that might include cd/dvd drives, tape drives, scsi
enclosures, saf-te controllers (which have processor peripheral
device type) and well known logical units.

> 3. How is the mapping between /dev/sda /dev/sdb etc to /dev/sg0
> /dev/sg1 etc? Is this information accessible via a procfs or sysfs
> interface?

In the lk 2.6 series the mapping can be found in sysfs (see lsscsi,
specifically 'lsscsi -g').

In the sg3_utils package the sg_map utility shows the mapping. That
may be helpful in the lk 2.4 series since there is no sysfs.

Doug Gilbert
Re: scsi_debug fault-injection question
Randy Dunlap wrote:
> Hi Doug,
>
> scsi_debug.c says:
>
>   MODULE_PARM_DESC(every_nth, "timeout every nth command(def=100)");
>
> I don't see where the default of 100 is set.
>
>   #define DEF_EVERY_NTH 0
>   ...
>   static int scsi_debug_every_nth = DEF_EVERY_NTH;
>
> Can you clarify for me, please?

Randy,
s/100/0/

The string in MODULE_PARM_DESC is wrong. The support web page,
http://www.torque.net/sg/sdebug26.html, is accurate: it states the
default is 0 and notes "for error injection: 0 - off".

Doug Gilbert
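One way this kind of drift between a default and its description can be avoided is to build the description string from the same macro that sets the default. The following is a user-space sketch only, not what scsi_debug does; the STR()/STR_() macros and the every_nth_desc variable are made up for illustration (the kernel has its own __stringify() helper for this purpose).

```c
#include <stdio.h>

#define DEF_EVERY_NTH 0

/* Two-level stringification so DEF_EVERY_NTH is expanded before '#'. */
#define STR_(x) #x
#define STR(x)  STR_(x)

/* The value and the help text now come from one definition. */
int scsi_debug_every_nth = DEF_EVERY_NTH;
const char every_nth_desc[] =
    "timeout every nth command(def=" STR(DEF_EVERY_NTH) ")";

/* Print the parameter description as a module info dump might. */
void print_every_nth_desc(void)
{
    printf("every_nth: %s\n", every_nth_desc);
}
```

Changing DEF_EVERY_NTH then updates both the initial value and the text shown to the user.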
Re: [PATCH] scsi_debug: correct parameter default text
Randy Dunlap wrote:
> From: Randy Dunlap [EMAIL PROTECTED]
>
> Correct the module info text for the default value of every_nth
> to 0.
>
> Signed-off-by: Randy Dunlap [EMAIL PROTECTED]

Signed-off-by: Douglas Gilbert [EMAIL PROTECTED]

Doug Gilbert

> ---
>  drivers/scsi/scsi_debug.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- linux-2622-rc4.orig/drivers/scsi/scsi_debug.c
> +++ linux-2622-rc4/drivers/scsi/scsi_debug.c
> @@ -2405,7 +2405,7 @@ MODULE_PARM_DESC(add_host, 0..127 hosts
>  MODULE_PARM_DESC(delay, "# of jiffies to delay response(def=1)");
>  MODULE_PARM_DESC(dev_size_mb, "size in MB of ram shared by devs(def=8)");
>  MODULE_PARM_DESC(dsense, "use descriptor sense format(def=0 - fixed)");
> -MODULE_PARM_DESC(every_nth, "timeout every nth command(def=100)");
> +MODULE_PARM_DESC(every_nth, "timeout every nth command(def=0)");
>  MODULE_PARM_DESC(fake_rw, "fake reads/writes instead of copying (def=0)");
>  MODULE_PARM_DESC(max_luns, "number of LUNs per target to simulate(def=1)");
>  MODULE_PARM_DESC(no_lun_0, "no LU number 0 (def=0 - have lun 0)");
Re: MEDIUM FORMAT CORRUPTED error
sandip shete wrote:
> Hi,
>
> I received the following ASC/ASCQ values as Additional Sense data:
> 31/00. I looked it up and found that it stands for MEDIUM FORMAT
> CORRUPTED.
>
> Does it mean that the target disk has bad sectors?

That error may be reported after a disk is reset during a FORMAT
operation. Another related case is a media access after a MODE SELECT
is used to change the sector size (e.g. from 512 to 528 bytes) and
prior to a FORMAT command which actually reformats the disk to 528
byte sectors.

So if a disk is reporting that ASC/ASCQ sequence for every media
access, then you need to format it. In that case sg_format in
sg3_utils may be useful.

Doug Gilbert
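For anyone decoding the sense buffer themselves rather than looking the codes up by hand, here is a minimal sketch of pulling the sense key, ASC and ASCQ out of sense data. It assumes the usual SPC layouts: fixed format (response code 0x70/0x71) keeps the key in byte 2 and ASC/ASCQ in bytes 12/13, descriptor format (0x72/0x73) keeps them in bytes 1, 2 and 3. The struct and function names are hypothetical, not from sg3_utils.

```c
#include <stdbool.h>
#include <stddef.h>

struct sense_info {
    unsigned char key;   /* sense key, e.g. 0x3 = MEDIUM ERROR */
    unsigned char asc;   /* additional sense code */
    unsigned char ascq;  /* additional sense code qualifier */
};

/* Returns true and fills *out if the buffer holds decodable sense data. */
bool decode_sense(const unsigned char *sb, size_t len, struct sense_info *out)
{
    if (sb == NULL || out == NULL || len < 1)
        return false;

    switch (sb[0] & 0x7f) {          /* response code, VALID bit masked */
    case 0x70:
    case 0x71:                       /* fixed format */
        if (len < 14)
            return false;
        out->key = sb[2] & 0x0f;
        out->asc = sb[12];
        out->ascq = sb[13];
        return true;
    case 0x72:
    case 0x73:                       /* descriptor format */
        if (len < 4)
            return false;
        out->key = sb[1] & 0x0f;
        out->asc = sb[2];
        out->ascq = sb[3];
        return true;
    default:
        return false;
    }
}

/* 31/00 is MEDIUM FORMAT CORRUPTED, the case discussed above. */
bool is_medium_format_corrupted(const struct sense_info *s)
{
    return s->asc == 0x31 && s->ascq == 0x00;
}
```

If is_medium_format_corrupted() is true for every media access, that matches the "needs a FORMAT" situation described in the reply.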
Re: Very slow writes with mptsas
Matthew Jacob wrote:
> The FreeBSD problem was fixed by Scott Long a couple of days ago by
> doing some cut through SAS stuff that enabled Write Cache for SATA
> drives. Why LSI-Logic couldn't just blitheringly synthesize mode
> page 8 is beyond me, but okay.
>
> I dunno whether the issue here is the same one Scott tackled-
> probably given the messages. Eric- you listening in on this?

Matt,
Just in case Eric doesn't answer, I suspect that if the HBA firmware
can be upgraded (from Dell or LSI?) then WCE (write cache enable) in
the caching mode page will be supported. It is one of the few mode
page settings that is required to be implemented in SAT. The other
field that should be changeable is DRA (disable read ahead). Both
work on my LSI SAS HBA.

Doug Gilbert
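As a rough illustration of where those two fields live, here is a sketch that reads and sets them in a caching mode page (page code 0x08), assuming the SBC layout in which WCE is bit 2 of byte 2 and DRA is bit 5 of byte 12. The function names are invented for this example. The pointer is assumed to address the mode page itself, i.e. after the mode parameter header and any block descriptors returned by MODE SENSE.

```c
#include <stdbool.h>
#include <stddef.h>

#define CACHING_PAGE_CODE 0x08
#define WCE_BYTE 2
#define WCE_MASK 0x04   /* byte 2, bit 2: write cache enable */
#define DRA_BYTE 12
#define DRA_MASK 0x20   /* byte 12, bit 5: disable read ahead */

/* Is this buffer a caching mode page long enough to hold 'last_byte'? */
static bool is_caching_page(const unsigned char *pg, size_t len,
                            size_t last_byte)
{
    return pg != NULL && len > last_byte &&
           (pg[0] & 0x3f) == CACHING_PAGE_CODE;
}

/* Returns 1 if WCE is set, 0 if clear, -1 if not a caching page. */
int caching_page_wce(const unsigned char *pg, size_t len)
{
    if (!is_caching_page(pg, len, WCE_BYTE))
        return -1;
    return (pg[WCE_BYTE] & WCE_MASK) ? 1 : 0;
}

/* Returns 1 if DRA is set, 0 if clear, -1 if not a caching page. */
int caching_page_dra(const unsigned char *pg, size_t len)
{
    if (!is_caching_page(pg, len, DRA_BYTE))
        return -1;
    return (pg[DRA_BYTE] & DRA_MASK) ? 1 : 0;
}

/* Set or clear WCE in place, ready for a MODE SELECT. 0 on success. */
int caching_page_set_wce(unsigned char *pg, size_t len, bool enable)
{
    if (!is_caching_page(pg, len, WCE_BYTE))
        return -1;
    if (enable)
        pg[WCE_BYTE] |= WCE_MASK;
    else
        pg[WCE_BYTE] &= (unsigned char)~WCE_MASK;
    return 0;
}
```

In practice sdparm does this bookkeeping (and the MODE SENSE / MODE SELECT round trip) for you; the sketch only shows which bits are involved.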
Re: SMART support for SATA drives in SAS enclosures
Pim Zandbergen wrote:
> Is SMART support available for SATA drives in SAS enclosures?
>
> I'm testing this setup:
>
>   LSI Logic SAS3800X PCI-X SAS controller (mptsas driver)
>   Promise V-Trak J300S SAS/SATA enclosure/expander
>   12x Seagate ST3500630NS
>   Linux kernel 2.6.21.1 x86_64
>   smartmontools-5.37-1.1.fc6 from Fedora Core 6
>
> smartctl -i -d sat /dev/sdc gives me
>
>   Smartctl: Device Read Identity Failed

I presume /dev/sdc is an actual disk rather than a RAID device made
up of several disks. The SAT standard (and smartmontools) don't have
a general way of addressing individual disks behind RAID
infrastructure.

With a recent smartmontools (version 5.37) and an MPT Fusion SAS HBA
this should work if /dev/sdc is a SATA disk. Your HBA may need a
firmware upgrade. You might fetch sg3_utils version 1.24 and try:

  sg_sat_identify /dev/sdc

That needs to work before smartctl has a hope.

> Same with -d ata.
>
> If I treat the disk as SCSI (-d scsi), the command will not fail,
> but will only retrieve the serial number.

With MPT Fusion SAS hardware (that I have seen) the SAT layer is in
the HBA firmware. Only later versions of the firmware support the
SCSI ATA PASS-THROUGH command.

Doug Gilbert
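To make "the SCSI ATA PASS-THROUGH command" a little more concrete, here is a sketch of the 16 byte CDB that carries an ATA IDENTIFY DEVICE through a SAT layer. It is an illustration under the assumption of the SAT ATA PASS-THROUGH (16) layout (opcode 0x85, PIO data-in protocol, one 512 byte block), not a copy of what sg_sat_identify or smartctl sends; build_sat_identify_cdb() is a made-up name.

```c
#include <string.h>

#define SAT_ATA_PASS_THROUGH16  0x85
#define SAT_PROTO_PIO_DATA_IN   4      /* PROTOCOL field value */
#define ATA_CMD_IDENTIFY_DEVICE 0xec

/* Fill a 16 byte ATA PASS-THROUGH CDB requesting IDENTIFY DEVICE. */
void build_sat_identify_cdb(unsigned char cdb[16])
{
    memset(cdb, 0, 16);
    cdb[0] = SAT_ATA_PASS_THROUGH16;
    cdb[1] = SAT_PROTO_PIO_DATA_IN << 1;  /* PROTOCOL in bits 4:1, EXTEND=0 */
    cdb[2] = (1 << 3)                     /* T_DIR=1: device to application */
           | (1 << 2)                     /* BYT_BLOK=1: length in blocks */
           | 0x2;                         /* T_LENGTH=2: use sector count */
    cdb[6] = 1;                           /* sector count (7:0): one block */
    cdb[14] = ATA_CMD_IDENTIFY_DEVICE;    /* ATA command register */
}
```

If the HBA firmware has no SAT layer, or an old one, a CDB like this is rejected before it ever reaches the disk, which is consistent with the "Device Read Identity Failed" symptom above.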
Re: Sg_ses question
Haefliger, Juerg wrote:
> Hi,
>
> Not sure if this is the right list for my question but I couldn't
> find a more suitable place to ask it.
>
> I'm trying to set the locator light of a disk in a SAS enclosure
> using sg_ses but I'm not getting anywhere. I'm dumping the
> enclosure status diagnostic page using
>
>   sg_ses --page=2 --raw /dev/sgXX > page
>
> and then set the SELECT and RQST IDENT bits of the array device
> element in question and write it back doing
>
>   sg_ses --control --page=2 --data=- /dev/sgXX < tmp
>
> The command completes without error but unfortunately, nothing
> happens. When I read the page back, the IDENT bit is still cleared
> and the light on the enclosure remains turned off.
>
> Am I doing something wrong or am I missing something? Can't I use
> sg_ses to achieve this?

The procedure looks correct. I haven't had any (other) reports of
sg_ses not working lately.

The only suggestion I can make is to ask whether you have selected
the individual element control rather than the overall control.

Doug Gilbert
hdparm 7.3 supports SAT
Mark Lord's hdparm version 7.3 supports the SCSI to ATA Translation
(SAT) pass-through. So if SAT is supported, this allows hdparm to
access ATA (e.g. SATA disks) and ATAPI (e.g. cd/dvd drives) devices
behind SCSI transports. Note that the SAT layer may be in the kernel
(e.g. libata), in a HBA's firmware (MPT Fusion SAS HBAs) or external.
Also one of those SCSI transports might be SATA.

See http://sourceforge.net/projects/hdparm/

Both sdparm and two utilities in sg3_utils used some tortured syntax
to pipe through ATA IDENTIFY (PACKET) DEVICE responses to
'hdparm --Istdin' prior to hdparm 7.x. That, in most cases, should no
longer be needed.

Doug Gilbert
Re: [PATCH 4/4] bidi support: bidirectional request
Jens Axboe wrote:
> On Mon, Apr 30 2007, Benny Halevy wrote:
>> Jens Axboe wrote:
>>> On Sun, Apr 29 2007, James Bottomley wrote:
>>>> On Sun, 2007-04-29 at 18:48 +0300, Boaz Harrosh wrote:
>>>>> FUJITA Tomonori wrote:
>>>>>> From: Boaz Harrosh [EMAIL PROTECTED]
>>>>>> Subject: [PATCH 4/4] bidi support: bidirectional request
>>>>>> Date: Sun, 15 Apr 2007 20:33:28 +0300
>>>>>>
>>>>>>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>>>>>>> index 645d24b..16a02ee 100644
>>>>>>> --- a/include/linux/blkdev.h
>>>>>>> +++ b/include/linux/blkdev.h
>>>>>>> @@ -322,6 +322,7 @@ struct request {
>>>>>>>  	void *end_io_data;
>>>>>>>
>>>>>>>  	struct request_io_part uni;
>>>>>>> +	struct request_io_part bidi_read;
>>>>>>>  };
>>>>>>
>>>>>> Would be more straightforward to have:
>>>>>>
>>>>>>   struct request_io_part in;
>>>>>>   struct request_io_part out;
>>>>>
>>>>> Yes I wish I could do that. For bidi supporting drivers this is
>>>>> the most logical. But for the 99.9% of uni-directional drivers,
>>>>> calling rq_uni(), and being somewhat on the hotish paths, this
>>>>> means we will need a pointer to a uni request_io_part. This is
>>>>> bad because:
>>>>>
>>>>> 1st- There is no defined stage in a request life where to
>>>>> definitely set that pointer, especially in the preparation
>>>>> stages.
>>>>>
>>>>> 2nd- hacks like scsi_error.c/scsi_send_eh_cmnd() will not work
>>>>> at all. Now this is a very bad spot already, and I have a short
>>>>> term fix for it in the SCSI-bidi patches (not sent yet) but a
>>>>> more long term solution is needed. Once such hacks are cleaned
>>>>> up we can do what you say. This is exactly why I use the access
>>>>> functions rq_uni/rq_io/rq_in/rq_out and not open code access.
>>>>
>>>> I'm still not really convinced about this approach. The primary
>>>> job of the block layer is to manage and merge READ and WRITE
>>>> requests. It serves a beautiful secondary function of queueing
>>>> for arbitrary requests it doesn't understand (REQ_TYPE_BLOCK_PC
>>>> or REQ_TYPE_SPECIAL ... or indeed any non REQ_TYPE_FS).
>>>>
>>>> bidirectional requests fall into the latter category (there's
>>>> nothing really we can do to merge them ... they're just
>>>> transported by the block layer). The only unusual feature is
>>>> that they carry two bios. I think the drivers that actually
>>>> support bidirectional will be a rarity, so it might even be
>>>> advisable to add it to the queue capability (refuse
>>>> bidirectional requests at the top rather than perturbing all the
>>>> drivers to process them).
>>>>
>>>> So, what about REQ_TYPE_BIDIRECTIONAL rather than REQ_BIDI? That
>>>> will remove it from the standard path and put it on the special
>>>> command type path where we can process it specially.
>>>> Additionally, if you take this approach, you can probably simply
>>>> chain the second bio through req->special as an additional
>>>> request in the stream. The only thing that would then need
>>>> modification would be the dequeue of the block driver (it would
>>>> have to dequeue both requests and prepare them) and that needs
>>>> to be done only for drivers handling bidirectional requests.
>>>
>>> I agree, I'm really not crazy about shuffling the entire request
>>> setup around just for something as exotic as bidirectional
>>> commands. How about just keeping it simple - have a second
>>> request linked off the first one for the second data phase? So
>>> keep it completely separate, not just overload ->special for the
>>> 2nd bio list.
>>>
>>> So basically just add a struct request pointer, so you can do
>>> rq = rq->next_rq or something for the next data phase. I bet this
>>> would be a LOT less invasive as well, and we can get by with a
>>> few helpers to support it.
>>>
>>> And it should definitely be a request type.
>>
>> I'm a bit confused since what you both suggest is very similar to
>> what we've proposed back in October 2006 and the impression we got
>> was that it will be better to support bidirectional block requests
>> natively (yet to be honest, James, you wanted a linked request all
>> along).
>
> It still has to be implemented natively at the block layer, just
> differently like described above. So instead of messing all over
> the block layer adding rq_uni() stuff, just add that struct request
> pointer to the request structure for the 2nd data phase. You can
> then relatively easily modify the block layer helpers to support
> mapping and setup of such requests.
>
>> Before we go on that route again, how do you see the support for
>> bidi at the scsi mid-layer done? Again, we prefer to support that
>> officially using two struct scsi_cmnd_buff instances in struct
>> scsi_cmnd and not as a one-off feature, using special-purpose
>> state and logic (e.g. a linked struct scsi_cmd for the bidi_read
>> sg list).
>
> The SCSI part is up to James, that can be done as either inside a
> single scsi command, or as linked scsi commands as well. I don't
> care too much about that bit, just the block layer parts :-). And
> the proposed block layer design can be used both ways by the scsi
> layer.

Linked SCSI commands have been obsolete since SPC-4 rev 6
(18 July 2006) after proposal 06-259r1 was accepted. That proposal
starts: "The reasons for linked commands have been overtaken by time
and events". I haven't seen anyone mourning their demise on
Re: [PATCH 0/4] bidi support: block layer bidirectional io.
Boaz Harrosh wrote:
> Following are 4 (large) patches for support of bidirectional block
> I/O in kernel (not including SCSI-ml or iSCSI).
>
> The submitted work is against the linux-2.6-block tree as of
> 2007/04/15, and will only cleanly apply in succession.
>
> The patches are based on the RFC I sent 3 months ago. They only
> cover the block layer at this point. I suggest they get included in
> Morton's tree until they reach the kernel so they can get compiled
> on all architectures/platforms. There is still a chance that
> architectures I did not compile were not fully converted. (FWIW, my
> search for use of struct request members failed to find them.) If
> you find such a case, please send me the file name and I will fix
> it ASAP.
>
> Patches summary:
>
> 1. [PATCH 1/4] bidi support: request dma_data_direction
>    - Convert REQ_RW bit flag to a dma_data_direction member like in
>      SCSI-ml use.
>    - removed rq_data_dir() and added other APIs for querying
>      request's direction.
>    - fix usage of rq_data_dir() and peeking at
>      req->cmd_flags & REQ_RW to using new api.
>    - clean-up bad usage of DMA_BIDIRECTIONAL and bzero of non-queue
>      requests, to use the new blk_rq_init_unqueued_req()
>
> 2. [PATCH 2/4] bidi support: fix req->cmd == INT cases
>    - Digging into all these old drivers, I have found traces of
>      past life where request->cmd was the command type. This patch
>      fixes some of these places. All drivers touched by this patch
>      are clear indication of drivers that were not used for a
>      while. Should we remove them from the kernel? These are:
>        drivers/acorn/block/fd1772.c, drivers/acorn/block/mfmhd.c,
>        drivers/block/nbd.c, drivers/cdrom/aztcd.c,
>        drivers/cdrom/cm206.c, drivers/cdrom/gscd.c,
>        drivers/cdrom/mcdx.c, drivers/cdrom/optcd.c,
>        drivers/cdrom/sjcd.c, drivers/ide/legacy/hd.c,
>        drivers/block/amiflop.c
>
> 3. [PATCH 3/4] bidi support: request_io_part
>    - extract io related fields in struct request into struct
>      request_io_part in preparation to full bidi support.
>    - new rq_uni() API to access the sub-structure. (Please read
>      below comment on why an API and not open code the access)
>    - Convert all users to new API.
>
> 4. [PATCH 4/4] bidi support: bidirectional block layer
>    - add one more request_io_part member for bidi support in struct
>      request.
>    - add block layer API functions for mapping and accessing bidi
>      data buffers and for ending a block request as a whole
>      (end_that_request_block())
>
> Developer comments:
>
> patch 1/4: Borrow from struct scsi_cmnd use of enum
> dma_data_direction. Further work (in progress) is the removal of
> the corresponding member from struct scsi_cmnd and converting all
> users to directly access rq_dma_dir(sc->req).
>
> patch 3/4: The reasons for introducing the rq_uni(req) API rather
> than directly accessing req->uni are:
>
> * WARN(!bidi_dir(req)) is life saving when developing bidi enabled
>   paths. Once we, bidi users, start to push bidi requests down the
>   kernel paths, we immediately get warned of paths we did not
>   anticipate. Otherwise, they will be very hard to find, and will
>   hurt kernel stability.
>
> * A cleaner and saner future implementation could be in/out members
>   rather than uni/bidi_read. This way the dma_direction member can
>   be deprecated and the uni sub-structure can be maintained using a
>   pointer in struct req. With this API we are free to change the
>   implementation in the future without touching any users of the
>   API. We can also experiment with what's best. Also, with the API
>   it is much easier to convert uni-directional drivers for bidi
>   (look in ll_rw_block.c in patch 4/4).
>
> * Note that internal uses inside the block layer access req->uni
>   directly, as they will need to be changed if the implementation
>   of req->{uni, bidi_read} changes.

Boaz,
Recently I have been looking at things from the perspective of a SAS
target and thinking about bidi commands. Taking XDWRITEREAD(10) in
sbc3r09.pdf (section 5.44) as an example, with DISABLE_WRITE=0, the
device server in the target should do the following:

  a) decode the cdb **
  b) read from storage [lba, transfer_length]
  c) fetch data_out from initiator [transfer_length] ***
  d) XOR data from (b) and (c) and place result in (z)
  e) write the data from (c) to storage [lba, transfer_length]
  f) send (z) in data_in to initiator [transfer_length]
  g) send SCSI completion status to initiator

Logically a) must occur first and g) last. The b) to f) sequence
could be repeated (perhaps) by the device server subdividing the
transfer_length (i.e. it may not be reasonable for the OS to assume
that the data_out transfer will be complete before there is any
data_in transfer).

With this command (and with most other bidi commands
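The data movement in steps b) to f) for XDWRITEREAD(10) with DISABLE_WRITE=0 can be sketched as a small in-memory simulation. This is only an illustration of the XOR semantics under the assumption that storage, data_out and data_in are plain byte buffers; the name xdwriteread_sim() is made up, and a real device server would of course work in blocks and might interleave the two data phases.

```c
#include <stddef.h>

/*
 * Simulate XDWRITEREAD on a byte range:
 *   b) read the old data from storage
 *   d) XOR it with data_out (received in step c) into the result
 *   e) write data_out to storage
 *   f) the result is what goes back to the initiator as data_in
 * Returns 0 on success, -1 on a NULL argument.
 */
int xdwriteread_sim(unsigned char *storage,
                    const unsigned char *data_out,
                    unsigned char *data_in,
                    size_t len)
{
    size_t i;

    if (storage == NULL || data_out == NULL || data_in == NULL)
        return -1;

    for (i = 0; i < len; i++) {
        unsigned char old = storage[i];    /* b) read from storage */
        data_in[i] = old ^ data_out[i];    /* d) XOR into (z) */
        storage[i] = data_out[i];          /* e) write new data */
    }
    return 0;
}
```

Because each byte of data_in depends only on the matching bytes of storage and data_out, the loop could be run over sub-ranges in any order, which is the subdivision possibility mentioned above.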
[ANNOUNCE] sdparm 1.01
sdparm is a command line utility designed to get and set SCSI device
parameters (cf hdparm for ATA disks). The parameters are held in mode
pages. Apart from SCSI devices (e.g. disks, tapes and enclosures)
sdparm can be used on any device that uses a SCSI command set.
Virtually all CD/DVD drives use the SCSI MMC set irrespective of the
transport. sdparm also can decode VPD pages including the device
identification page. Commands to start and stop the media, load and
unload removable media and some other housekeeping functions are
supported.

sdparm supports both the linux kernel 2.4 and 2.6 series with ports
to FreeBSD and Windows.

ChangeLog for sdparm-1.01 [20070405]
  - add element address assignment mode page (smc)
  - improve error handling in lk 2.4 series mapping to sg devices
  - add configure.ac rule for mingw (Windows)
  - include inttypes.h to use PRIx64 instead of %llx
  - add LUICLR bit to extended inquiry VPD page
  - correct some headers for C++ inclusion
  - fix some C code to compile under C++
  - fix bug when unusual transport or vendor given
  - add a Fujitsu vendor mode page
  - add initial priority to control extension mpage
  - add disconnect-reconnect mpage to generic list; there are still
    transport specific versions
  - extend block limits VPD page (sbc3r09)
  - sync with sg3_utils-1.24 pass-through code

For more information and downloads see:
http://www.torque.net/sg/sdparm.html

A release announcement has been sent to freshmeat.net.

Doug Gilbert
Re: Oops in scsi_send_eh_cmnd 2.6.21-rc5-git6,7,10,13
James Bottomley wrote:
> On Fri, 2007-04-06 at 08:51 -0700, Andrew Burgess wrote:
>> James Bottomley wrote:
>>> It's actually a long standing bug in the 3w- driver. Apparently
>>> it assumes request sense is always the use_sg == 0 case. This is
>>> what it does on a request sense:
>>>
>>> static int tw_scsiop_request_sense(TW_Device_Extension *tw_dev, int request_id)
>>> {
>>> 	dprintk(KERN_NOTICE "3w-: tw_scsiop_request_sense()\n");
>>>
>>> 	/* For now we just zero the request buffer */
>>> 	memset(tw_dev->srb[request_id]->request_buffer, 0, tw_dev->srb[request_id]->request_bufflen);
>>> 	tw_dev->state[request_id] = TW_S_COMPLETED;
>>> 	tw_state_request_finish(tw_dev, request_id);
>>>
>>> Note that it's clearing the request buffer, which is actually
>>> zeroing the scatterlist, hence the problem.
>>
>> OK. Is there a quick workaround or should I just wait for Adam
>> Company to make a patch?
>
> Try this ... I think it's roughly the correct fix.
>
>> You said your earlier patch would hide it, and then said you had a
>> length wrong in it and I'm not sure what length you mean.
>
> It's the length specifier in the error handler request sense
> command ... I'll fix it up and redo my patch through scsi-misc,
> since it's not going to fix the root cause of the problem.
>
> James
>
> diff --git a/drivers/scsi/3w-.c b/drivers/scsi/3w-.c
> index bf5d63e..6b303ba 100644
> --- a/drivers/scsi/3w-.c
> +++ b/drivers/scsi/3w-.c
> @@ -1864,10 +1864,17 @@ static int tw_scsiop_read_write(TW_Device_Extension *tw_dev, int request_id)
>  /* This function will handle the request sense scsi command */
>  static int tw_scsiop_request_sense(TW_Device_Extension *tw_dev, int request_id)
>  {
> +	char request_buffer[18];
> +
>  	dprintk(KERN_NOTICE "3w-: tw_scsiop_request_sense()\n");
>
> -	/* For now we just zero the request buffer */
> -	memset(tw_dev->srb[request_id]->request_buffer, 0, tw_dev->srb[request_id]->request_bufflen);
> +	memset(request_buffer, 0, sizeof(request_buffer));
> +	request_buffer[0] = 0x70; /* Immediate fixed format */
> +	request_buffer[7] = 11; /* minimum size per SPC: 18 bytes */

James,
That last line should be:

	request_buffer[7] = 10; /* minimum size per SPC: 18 bytes */

Doug Gilbert
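The reason for 10 rather than 11 is that byte 7 of fixed format sense data is the ADDITIONAL SENSE LENGTH, which counts the bytes that follow byte 7. An 18 byte buffer has bytes 0 to 17, so 17 - 7 = 10. A small stand-alone sketch of that arithmetic follows; the function and macro names are invented for this example and do not come from the driver.

```c
#include <string.h>

#define FIXED_SENSE_MIN_LEN 18   /* minimum fixed format size per SPC */

/* Build an empty (no sense) fixed format sense buffer of 18 bytes. */
void build_empty_fixed_sense(unsigned char buf[FIXED_SENSE_MIN_LEN])
{
    memset(buf, 0, FIXED_SENSE_MIN_LEN);
    buf[0] = 0x70;                     /* current error, fixed format */
    /* additional sense length: number of bytes after byte 7 */
    buf[7] = FIXED_SENSE_MIN_LEN - 8;  /* 18 - 8 = 10 */
}
```

A reader of the buffer recovers the total length as buf[7] + 8, which gives 18 here and would give 19 with the value 11.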
Re: [PATCH] SG: cap reserved_size values at max_sectors
Alan Stern wrote: This patch (as857) modifies the SG_GET_RESERVED_SIZE and SG_SET_RESERVED_SIZE ioctls in the sg driver, capping the values at the device's request_queue's max_sectors value. This will permit cdrecord to obtain a legal value for the maximum transfer length, fixing Bugzilla #7026. The patch also caps the initial reserved_size value. There's no reason to have a reserved buffer larger than max_sectors, since it would be impossible to use the extra space. The corresponding ioctls in the block layer are modified similarly, and the initial value for the reserved_size is set as large as possible. This will effectively make it default to max_sectors. Note that the actual value is meaningless anyway, since block devices don't have a reserved buffer. Finally, the BLKSECTGET ioctl is added to sg, so that there will be a uniform way for users to determine the actual max_sectors value for any raw SCSI transport. Signed-off-by: Alan Stern [EMAIL PROTECTED] Alan, I have voiced my concerns about this earlier but I will now sign off to unblock the process (and deal with the consequences to sg users, if any). 
Signed-off-by: Douglas Gilbert [EMAIL PROTECTED]

---

Index: usb-2.6/drivers/scsi/sg.c
===================================================================
--- usb-2.6.orig/drivers/scsi/sg.c
+++ usb-2.6/drivers/scsi/sg.c
@@ -917,6 +917,8 @@ sg_ioctl(struct inode *inode, struct fil
 			return result;
 		if (val < 0)
 			return -EINVAL;
+		val = min_t(int, val,
+				sdp->device->request_queue->max_sectors * 512);
 		if (val != sfp->reserve.bufflen) {
 			if (sg_res_in_use(sfp) || sfp->mmap_called)
 				return -EBUSY;
@@ -925,7 +927,8 @@ sg_ioctl(struct inode *inode, struct fil
 		}
 		return 0;
 	case SG_GET_RESERVED_SIZE:
-		val = (int) sfp->reserve.bufflen;
+		val = min_t(int, sfp->reserve.bufflen,
+				sdp->device->request_queue->max_sectors * 512);
 		return put_user(val, ip);
 	case SG_SET_COMMAND_Q:
 		result = get_user(val, ip);
@@ -1061,6 +1064,9 @@ sg_ioctl(struct inode *inode, struct fil
 		if (sdp->detached)
 			return -ENODEV;
 		return scsi_ioctl(sdp->device, cmd_in, p);
+	case BLKSECTGET:
+		return put_user(sdp->device->request_queue->max_sectors * 512,
+				ip);
 	default:
 		if (read_only)
 			return -EPERM;	/* don't know so take safe approach */
@@ -2339,6 +2345,7 @@ sg_add_sfp(Sg_device * sdp, int dev)
 {
 	Sg_fd *sfp;
 	unsigned long iflags;
+	int bufflen;
 
 	sfp = kzalloc(sizeof(*sfp), GFP_ATOMIC | __GFP_NOWARN);
 	if (!sfp)
@@ -2369,7 +2376,9 @@ sg_add_sfp(Sg_device * sdp, int dev)
 	if (unlikely(sg_big_buff != def_reserved_size))
 		sg_big_buff = def_reserved_size;
 
-	sg_build_reserve(sfp, sg_big_buff);
+	bufflen = min_t(int, sg_big_buff,
+			sdp->device->request_queue->max_sectors * 512);
+	sg_build_reserve(sfp, bufflen);
 	SCSI_LOG_TIMEOUT(3, printk("sg_add_sfp: bufflen=%d, k_use_sg=%d\n",
 			   sfp->reserve.bufflen, sfp->reserve.k_use_sg));
 	return sfp;
Index: usb-2.6/block/ll_rw_blk.c
===================================================================
--- usb-2.6.orig/block/ll_rw_blk.c
+++ usb-2.6/block/ll_rw_blk.c
@@ -1925,6 +1925,8 @@ blk_init_queue_node(request_fn_proc *rfn
 	blk_queue_max_hw_segments(q, MAX_HW_SEGMENTS);
 	blk_queue_max_phys_segments(q, MAX_PHYS_SEGMENTS);
 
+	q->sg_reserved_size = INT_MAX;
+
 	/*
 	 * all done
 	 */
Index: usb-2.6/block/scsi_ioctl.c
===================================================================
--- usb-2.6.orig/block/scsi_ioctl.c
+++ usb-2.6/block/scsi_ioctl.c
@@ -78,7 +78,9 @@ static int sg_set_timeout(request_queue_
 static int sg_get_reserved_size(request_queue_t *q, int __user *p)
 {
-	return put_user(q->sg_reserved_size, p);
+	unsigned val = min(q->sg_reserved_size, q->max_sectors << 9);
+
+	return put_user(val, p);
 }
 
 static int sg_set_reserved_size(request_queue_t *q, int __user *p)
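The clamp applied throughout the patch is the same in every hunk: the reserved size may not exceed max_sectors counted in 512-byte sectors, and `max_sectors * 512` and `max_sectors << 9` are the same quantity. A stand-alone sketch of that rule, assuming a made-up function name (cap_reserved_size) and returning -1 where the driver returns -EINVAL:

```c
#include <limits.h>

/*
 * Hypothetical user-space restatement of the clamp in the patch.
 * max_sectors is in units of 512-byte sectors.
 */
static int cap_reserved_size(int requested, unsigned int max_sectors)
{
    unsigned long long limit = (unsigned long long)max_sectors << 9;

    if (requested < 0)
        return -1;                /* the driver rejects this with -EINVAL */
    if (limit > INT_MAX)
        limit = INT_MAX;          /* result must still fit in an int */
    return (unsigned long long)requested < limit ? requested : (int)limit;
}
```

With max_sectors = 240, for example, a request for 1 MiB comes back as 122880 bytes, while a 4096-byte request is returned unchanged.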
Re: Linux tape drivers
Kai Makisara wrote:
On Tue, 3 Apr 2007, Andrew Morton wrote:
(cc's added, with permission)
On Tue, 3 Apr 2007 15:08:37 +0200 Kern Sibbald [EMAIL PROTECTED] wrote:

Hello, I am the project manager for Bacula, an Open Source network backup program that runs on all popular OSes. After your presentation at FOSDEM in February, we briefly talked about Linux tape driver problems I am encountering, and you offered to put me in touch with the appropriate kernel developers. I would much appreciate any help in this.

Since the problems concern all tape drivers, I provide a very brief outline of what I would like to discuss. First, I must mention that the Linux SCSI driver works perfectly fine with Bacula; it is simply a question of possible improvements, under item 2 below.

Issues for discussion:

1. Bugs:

a. Other than the OSST driver, apparently no IDE/SATA tape driver works with Bacula. I don't have such a drive (working on it), but from user reports, it appears to me that there are problems with permitting variable length blocks and, more seriously, when writing to the end of the tape, either the logical end of tape indicator is ignored or, when it is encountered, all further I/O is prohibited -- including a WEOF. This makes reliable writing of multiple reel tapes impossible. By the way, these IDE/SATA drives work with Bacula using the same source code cross-compiled with GNU C++ on Linux, then run on Windows machines, so it is most likely a driver issue rather than anything in Bacula or the hardware.

Others have already answered this and I agree with their view. All of the tape drives seem to use the SSC command set or something close to that. One high-level driver should be enough to implement the user semantics. Libata should be able to drive the SATA/IDE drives, and the drives are then visible as SCSI devices in Linux. In future there should be no real need for ide-scsi. Probably very few people have tried libata with tapes and there may be some problems to fix.
Someone should test this with real devices and report the problems back to libata maintainers.

2. Usability of the current tape driver API (not bugs)

a. With the new O_NONBLOCK flag introduced in kernel 2.5.x, opening a tape drive and finding out if a volume is mounted is much more complicated. It is really inconvenient and requires a lot more code than in prior kernels. This should be an item for discussion.

The reasons for the change were:
1. To be compatible with the Unix standards, and
2. To be compatible with other Unix tape driver semantics.
Because of these reasons the changes should probably not be reversed but there may be something to improve in the implementation. Suggestions?

Kai,
Perhaps an ignore_nonblock sysfs attribute or driver option could be added for the old semantics. As I have found in the past, programs that scan for devices by opening device nodes don't play well with drivers that hang on open.

b. There is no simple way to determine if a tape is in a drive -- it is at least 20 or 30 lines of C code to do it right.

Why not use GMT_ONLINE() with MTIOCGET? The definition from the st man page is:

    GMT_ONLINE(x): The last open() found the drive with a tape in place and ready for operation.

If it does not work correctly, it can be fixed. (Of course, if you want to see if a tape is in a drive but not loaded, it is more difficult.)

Sounds like a TEST UNIT READY is all that is needed. They could call out to a utility like sg_turs or sdparm and check the exit status. They could also build with sg3_utils-libs and call sg_ll_test_unit_ready(). [All sg3_utils code is C++ friendly.]

Doug Gilbert
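For reference, the GMT_ONLINE() check suggested above comes to well under 20 lines. This is only a sketch under assumptions: the function name tape_is_online is invented here, the device path is supplied by the caller, and how reliably the online bit is set after a non-blocking open depends on the st driver version in use.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mtio.h>
#include <unistd.h>

/* Returns 1 if a tape is loaded and ready, 0 if not, -1 on any error. */
static int tape_is_online(const char *dev)
{
    struct mtget status;
    int fd, online;

    /* O_NONBLOCK so that open() does not wait for the drive to become ready. */
    fd = open(dev, O_RDONLY | O_NONBLOCK);
    if (fd < 0)
        return -1;
    if (ioctl(fd, MTIOCGET, &status) < 0) {
        close(fd);
        return -1;
    }
    online = GMT_ONLINE(status.mt_gstat) ? 1 : 0;
    close(fd);
    return online;
}
```

A caller would pass a node such as /dev/nst0 and treat -1 as "could not ask the drive" rather than "no tape".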
sg_v4 interface, release 1.3
Attached is the SCSI generic version 4 interface, release 1.3

ChangeLog for release 1.3 [20070404]
  - increase tag size to 64 bits to comply with SAM-4 and SRP
  - add request_extra and spare_out2 for alignment

Doug Gilbert

SCSI Generic version 4 interface structure
==========================================
Release 1.3

Goals:
  - handle both generalized request/response and data_out/data_in independently in same invocation (i.e. synchronous usage).
  - alternatively the request and data_out could be instigated in one invocation with pointers given for the incoming response and data_in. Then a second invocation (as a result of polling or asynchronous notification) reports the response and/or data_in is done, plus provides error/resid/timing information. This is asynchronous usage. This allows for the most complicated SCSI commands: tagged, variable length cdbs with bidirectional data transfers.
  - support multiple protocols. If they are generalized request-response protocols then they can choose either the request/response part of the interface or the data_out/data_in part.
  - layered error/condition reporting: (OS) driver, transport and device (logical unit). Method used to present this struct to OS (e.g. ioctl()) may also report error (e.g. EPERM).
  - allow for auxiliary information to be passed back for the application client to consider
  - same structure can be used for a synchronous (e.g. interruptible ioctl) or asynchronous (e.g. ioctl()/read() ) pass through.
  - leave device (lu) or target addressing issues to some other mechanism (what SCSI standards call the I_T_L or the I_T nexus respectively) as they are transport dependent. However do include the tag level (the _Q part of a I_T_L_Q nexus).
  - stay close enough to struct sg_io_hdr (sg version 3 interface) to use with existing SG_IO ioctls, current implementations expect 'S' in 'guard'

Comments:
  - unsigned 64 bit integers used as pointer carriers to ease 32/64 bit code interworking (e.g. 32 bit app on 64 bit kernel)
  - should there be more (or less) spare fields?
  - the write() usage in the sg driver's asynchronous interface has caused problems when mistakenly applied to a block device node rather than a sg device node. Using an ioctl(flag_async) followed by a read() for asynchronous work offers similar functionality and is safer. Using ioctl(flag_async_start) and ioctl(flag_async_finish) is another possibility.
  - rather than have a separate ATA pass through mechanism, the SAT defined ATA PASS THROUGH SCSI commands could be used with the driver implementation routing the ATA commands to their subsystem. This could be flagged so it didn't preclude a SAT layer in a SCSI transport (e.g. MPT SAS HBA firmware).
  - if SAM/SPC does not define an enumeration for lesser used input fields, then use the value 0 for inert/off/don't_care .

ChangeLog for release 1.3 [20070404]
  - increase tag size to 64 bits to comply with SAM-4 and SRP
  - add request_extra and spare_out2 for alignment

ChangeLog for release 1.2 [20070314]
  - add dout_resid
  - re-arrange uint64_t types (i.e. pointer carriers) to be on a 8 byte boundary
  - reinstate dout_iovec_count and din_iovec_count (they were in release 1.1 but bsg dropped them)
  - change name: response_len_wr to response_len
  - pick up some descriptions from bsg

ChangeLog for release 1.1 [20061106]
  - was called sg version 4 interface, version 1.1 so change the second version to release

---

#include <stdint.h>

struct sg_io_v4 {
    int32_t guard;              /* [i] 'Q' to differentiate from v3 */
    uint32_t protocol;          /* [i] 0 -> SCSI , */
    uint32_t subprotocol;       /* [i] 0 -> SCSI command, 1 -> SCSI task
                                   management function, */

    uint32_t request_len;       /* [i] in bytes {SCSI: cdb length} */
    uint64_t request;           /* [i], [*i] {SCSI: cdb} */
    uint64_t request_tag;       /* [i] {SCSI: task tag (only if flagged)} */
    uint32_t request_attr;      /* [i] {SCSI: task attribute} */
    uint32_t request_priority;  /* [i] {SCSI: task priority} */
    uint32_t request_extra;     /* [i] {spare, for padding} */
    uint32_t max_response_len;  /* [i] in bytes */
    uint64_t response;          /* [i], [*o] {SCSI: (auto)sense data} */

    /* dout_: data out (to device); din_: data in (from device) */
    uint32_t dout_iovec_count;  /* [i] 0 -> flat dout transfer */
                                /* else dout_xfer points to array of iovec */
    uint32_t dout_xfer_len;     /* [i] bytes to be transferred to device */
    uint32_t din_iovec_count;   /* [i] 0 -> flat din transfer */
    uint32_t din_xfer_len;
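As a side note on the "pointer carriers" comment: a 64 bit integer field holds a user-space address regardless of whether the application is built for 32 or 64 bits, so the structure layout is identical in both cases. A minimal sketch of the conversions an application would use; the two helper names are invented for illustration and are not part of the proposed interface:

```c
#include <stdint.h>

/* Hypothetical helper: store a user-space pointer in a 64-bit carrier field. */
static uint64_t ptr_to_carrier(const void *p)
{
    /* Go through uintptr_t so the conversion is well defined on 32-bit builds. */
    return (uint64_t)(uintptr_t)p;
}

/* Hypothetical helper: recover the pointer from a carrier field. */
static void *carrier_to_ptr(uint64_t v)
{
    return (void *)(uintptr_t)v;
}
```

An application would fill a field such as request with ptr_to_carrier(cdb) before issuing the pass-through.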
Re: SMP pass through interface via bsg
James Smart wrote:
James Bottomley wrote:

-- each SAS object (host, device, expander, etc) has its own bsg device

I think so; probably attached via the transport class.

FYI - I understand the idea of a bsg device per object, but really, for something that is used rarely, it's a bunch of overhead. Objects, data structures, etc - more udev/kobject mgmt.

I believe I prefer the approach of a shared distribution point - e.g. one bsg device at the transport globally, or perhaps one at the host (actually the outbound port aka host/channel) supporting the transport - followed by headers in the messages that direct flow after that. This kinda follows the model we have today for I/O - w/ queuecommand for the host, and addressing in the SCSI command.

James,
I fully agree.

Additionally, I've always had some concern that we had to create an object for everything in the SAN (every phy!), and have that view replicated per host (for multi-initiator/multi-path SANs). I always believed there were some sets of things that you would want to talk to that just don't justify a new object (for example - do we start talking to process associators in FC ?). Another reason to move toward a transport-specific addressing header.

Yes, seldom used things like well known logical units and virtual SMP targets (there is one on every MPT Fusion SAS HBA) that don't make the cut in the devices for everything model become invisible to Linux users. It is exactly these types of things that specialized user space programs use a pass-through interface for. So if the kernel can't find a use for it, then you, the owner of the hardware, won't be able to use it either. Hard to describe that approach as open software.

My other concern with using bsg and the i/o path for transport management functions is they compete with i/o for things like the can_queue values. Should they ? Should they have higher priority ?
sg v4 adds priority control mechanisms but there still remain possibilities for conflict, some of which may cause problems. I can see that a state based driver like st may want to stop a pass-through getting to a logical unit most of the time (and mechanisms could be added). However even st may want to use a pass through to the transport to reset the target (hard reset) if it can't get the LU RESET task management function to work.

I'd really rather not go this route unless the one device per object approach becomes untenable.

Understood, but building things until they topple is not a great idea as there will be backward-compatibility issues w/ user-space/sysfs and the tools built around it. If you start with the shared distribution point, you can always support both (eventually) if it's a good idea. Harder to do that in the reverse if it's toppling.

We are talking about the SAS Management Protocol (SMP) in this thread and in SAS-1 and SAS-1.1 discovery is done by every SAS initiator, for every ripple in the topology. In large topologies this approach can cause an SMP storm that can temporarily drop SAS bandwidth to SCSI-1 figures. Today discovery is done in the LLD or firmware (Adaptec and LSI respectively) so they can magically make devices appear.

The approach in SAS-2 is to devolve SAS discovery to expanders and use more efficient SMP functions. Current generation SAS HBAs (and some LLDs) will need to alter or stop their current SAS discovery techniques. The user space may need to get involved, for zoning and associated security.

Only allowing the SMP pass-through to talk to devices that the kernel thinks are SAS expanders has some shortcomings:
  - how can user space SAS topology discovery be done?
  - what about SMP targets that are not on expanders
  - disabling the phy that connects an expander to the SAS domain is problematic when the file descriptor you are using notionally represents that expander.
Note: discovery of a SAS topology is a different process from finding logical units within SCSI targets. In the context of SAS, the latter process can stay in the kernel and can be done for each SSP target found, preferably after the SAS topology has been fully discovered.

The patch adds a hook into sas transport class. sas_host_setup calls bsg_register_queue. Then, the request_fn calls smp_execute_task to send a smp request and get the response. It doesn't look good to link the sas transport class with libsas. In addition, the mpt driver handles smp request/response in a very different way. Any suggestion to bind SMP pass through via bsg to aic94xx and mpt cleanly?

bind in the transport class, not the driver ...

Agree - the trick for libsas is to get an interface into the driver that both drivers can support.

For LSI MPT Fusion it should be almost trivial to map the host,phy_id,sas_address (Tomo's hacky approach) through to LSI's ioc_num and SMP pass-through structures. The aic94xx must have a similar structure. How else could it implement a SMP
Re: aic94xx driver woes
James Bottomley wrote:
On Sun, 2007-04-01 at 16:29 -0400, Douglas Gilbert wrote:

...
sas: phy3 added to port0, phy_mask:0x8
sas: DOING DISCOVERY on port 0, pid:2110
aic94xx: scb:0x80 timed out

This might be the problem. I see this periodically when a phy goes out to lunch on my system ... with me, it always seems to be phy0 of a port containing phy0-4 ... so phy1-3 still function to get messages. Can you try sending a link reset to phy3? It should be something like

    echo 1 > /sys/class/sas_phy/phy-X:3/link_reset

and see if it just produces

    aic94xx: scb:0x80 timed out

Yes it does.

Again?

It is repeatable. Also when I connect to phy 0 it works (both direct connect and expander). However phys 1 and 2 react like phy 3 shown above.

Doug Gilbert
Re: aic94xx driver woes
James Bottomley wrote:
On Sat, 2007-03-31 at 15:05 -0400, Douglas Gilbert wrote:

James, note the SAS address of the first expander.

Thanks, just checking ... what happens when you directly attach a disk?

Then I get what I term as udev hell. That is when FC6 gets to the point during boot-up of saying "Starting udev:" and hangs for about 5 minutes and then continues. I don't think my log records what happens in that elongated pause. Later attempts to talk to the single SAS disk (one port only connected) during boot-up are shown below starting from the first sign of trouble. The SAS address of the disk port is 0x5000c50001b02139 .

Or even try the other expander?

Same as yesterday's report:

    sas: RG to ex 500605b00af0 failed:0xff06

If I fiddle with the cabling long enough (i.e. shorten it) then it will work some of the time. But how come the card POST, Luben's driver and Adaptec's for Windows have no problem with exactly the same wiring all of the time?

I suspect that either the HBA's phys are not being set up properly or, the first blemish (e.g. loss of dword synchronization) on the link, knocks the production driver off its perch, while the other drivers recover and continue.

Doug Gilbert

...
sas: phy3 added to port0, phy_mask:0x8
sas: DOING DISCOVERY on port 0, pid:2110
aic94xx: scb:0x80 timed out
last message repeated 6 times
sas: command 0xf57d5edc, task 0xf527bea8, timed out: EH_NOT_HANDLED
sas: Enter sas_scsi_recover_host
sas: trying to find task 0xf527bea8
sas: sas_scsi_find_task: aborting task 0xf527bea8
aic94xx: tmf timed out
aic94xx: tmf came back
aic94xx: task not done, clearing nexus
aic94xx: asd_clear_nexus_index: PRE
aic94xx: asd_clear_nexus_index: POST
aic94xx: asd_clear_nexus_index: clear nexus posted, waiting...
aic94xx: asd_clear_nexus_timedout: here
aic94xx: came back from clear nexus
aic94xx: task not done, clearing nexus
aic94xx: asd_clear_nexus_index: PRE
aic94xx: asd_clear_nexus_index: POST
aic94xx: asd_clear_nexus_index: clear nexus posted, waiting...
aic94xx: asd_clear_nexus_timedout: here
aic94xx: came back from clear nexus
aic94xx: task 0xf527bea8 aborted, res: 0x5
sas: sas_scsi_find_task: querying task 0xf527bea8
aic94xx: tmf timed out
sas: sas_scsi_find_task: task 0xf527bea8 failed to abort
sas: task 0xf527bea8 is not at LU: I_T recover
sas: I_T nexus reset for dev 5000c50001b02139
sas: clearing nexus for port:0
aic94xx: asd_clear_nexus_port: PRE
aic94xx: asd_clear_nexus_port: POST
aic94xx: asd_clear_nexus_port: clear nexus posted, waiting...
aic94xx: asd_clear_nexus_timedout: here
sas: clear nexus ha
aic94xx: asd_clear_nexus_ha: PRE
aic94xx: asd_clear_nexus_ha: POST
aic94xx: asd_clear_nexus_ha: clear nexus posted, waiting...
aic94xx: asd_clear_nexus_timedout: here
sas: error from device 5000c50001b02139, LUN 0 couldn't be recovered in any way
sas: --- Exit sas_eh_handle_sas_errors -- clear_q
sas: --- Exit sas_scsi_recover_host
sas: command 0xf57d5edc, task 0xf527bea8, timed out: EH_NOT_HANDLED
sas: Enter sas_scsi_recover_host
sas: trying to find task 0xf527bea8
sas: sas_scsi_find_task: aborting task 0xf527bea8
aic94xx: tmf timed out
aic94xx: tmf came back
aic94xx: task not done, clearing nexus
aic94xx: asd_clear_nexus_index: PRE
aic94xx: asd_clear_nexus_index: POST
aic94xx: asd_clear_nexus_index: clear nexus posted, waiting...
aic94xx: asd_clear_nexus_timedout: here
aic94xx: came back from clear nexus
aic94xx: task not done, clearing nexus
aic94xx: asd_clear_nexus_index: PRE
aic94xx: asd_clear_nexus_index: POST
aic94xx: asd_clear_nexus_index: clear nexus posted, waiting...
aic94xx: asd_clear_nexus_timedout: here
aic94xx: came back from clear nexus
aic94xx: task 0xf527bea8 aborted, res: 0x5
sas: sas_scsi_find_task: querying task 0xf527bea8
aic94xx: tmf timed out
sas: sas_scsi_find_task: task 0xf527bea8 failed to abort
sas: task 0xf527bea8 is not at LU: I_T recover
sas: I_T nexus reset for dev 5000c50001b02139
sas: clearing nexus for port:0
aic94xx: asd_clear_nexus_port: PRE
aic94xx: asd_clear_nexus_port: POST
aic94xx: asd_clear_nexus_port: clear nexus posted, waiting...
aic94xx: asd_clear_nexus_timedout: here
sas: clear nexus ha
aic94xx: asd_clear_nexus_ha: PRE
aic94xx: asd_clear_nexus_ha: POST
aic94xx: asd_clear_nexus_ha: clear nexus posted, waiting...
aic94xx: asd_clear_nexus_timedout: here
sas: error from device 5000c50001b02139, LUN 0 couldn't be recovered in any way
sas: --- Exit sas_eh_handle_sas_errors -- clear_q
sas: --- Exit sas_scsi_recover_host
sas: command 0xf57d5edc, task 0xf527bea8, timed out: EH_NOT_HANDLED
sas: Enter sas_scsi_recover_host
sas: trying to find task 0xf527bea8
sas: sas_scsi_find_task: aborting task 0xf527bea8
aic94xx: tmf timed out
aic94xx: tmf came back
aic94xx: task not done, clearing nexus
aic94xx: asd_clear_nexus_index: PRE
aic94xx: asd_clear_nexus_index: POST
aic94xx: asd_clear_nexus_index: clear nexus posted, waiting...
aic94xx: asd_clear_nexus_timedout: here
aic94xx: came back from clear nexus
aic94xx: task not done
Re: aic94xx driver woes
Darrick J. Wong wrote:
Douglas Gilbert wrote:

So that is almost 12 months that I have been reporting this driver as broken. Is it just me or my hardware?

I seem to recall you saying that the LSI Fusion card was plugged into the same expander as the 48300? If so, does unplugging the Fusion card from the expander make it work?

Darrick,
There is a LSI Fusion card in the adjacent PCI-X slot but it wasn't connected to anything so it should not have been interfering. I have another Fusion card in a second machine which was off. I'll turn the second machine on now to show the topology of my SAS domain.

Topology (seen from the second machine's MPT Fusion phy which is both an initiator and a target):

# smp_discover -mb
Device 500605b033ef, expander (only connected phys shown):
  phy 3:S:attached:[500605b6f260:00 i(SSP+STP+SMP) t(SSP)] 3 Gbps
  phy 5:T:attached:[500605b00af0:02 exp t(SMP)] 3 Gbps
  phy 6:T:attached:[5d10002dc000:00 i(SSP+STP+SMP)] 3 Gbps
  phy 9:T:attached:[5000c55208ee:01 t(SSP)] 3 Gbps
  phy 11:T:attached:[5000c50001b0213a:01 t(SSP)] 3 Gbps

# smp_discover -mb -s 0x500605b00af0
Device 500605b00af0, expander (only connected phys shown):
  phy 2:S:attached:[500605b033ef:05 exp t(SMP)] 3 Gbps
  phy 10:T:attached:[5000c50001b02139:00 t(SSP)] 3 Gbps
  phy 11:T:attached:[5000c55208ed:00 t(SSP)] 3 Gbps

James, note the SAS address of the first expander.

So with the second machine off, the expander entry on 0x500605b033ef phy_id 3 is not there. [The mainline aic94xx driver fails the same way with the second machine off or on.]

aic94xx: Found sequencer Firmware version 1.1 (V17/10c6)

Have you tried the V30 sequencer?

No. But I note that Luben's driver is still using V17/10c6 successfully (in lk 2.6.21-rc4).

How would I know that the official driver needs firmware, where to get it and what was the recommended version with a Kconfig entry like this:

config SCSI_AIC94XX
	tristate "Adaptec AIC94xx SAS/SATA support"
	depends on PCI
	select SCSI_SAS_LIBSAS
	select FW_LOADER
	help
	  This driver supports Adaptec's SAS/SATA 3Gb/s 64 bit PCI-X
	  AIC94xx chip based host adapters.

config AIC94XX_DEBUG
	bool "Compile in debug mode"
	default y
	depends on SCSI_AIC94XX
	help
	  Compiles the aic94xx driver in debug mode. In debug mode,
	  the driver prints some messages to the console.

?? Is there some useful documentation somewhere else? If so perhaps a link to it could be placed in the Kconfig entry.

Doug Gilbert
Re: [PATCH 1/1] scsi: Add EH Start Unit retry
Brian King wrote:

Currently, the scsi error handler will issue a START_UNIT command if the drive indicates it needs its motor started and the allow_restart flag is set in the scsi_device. If, after the scsi error handler invokes a host adapter reset due to error recovery, a device is in a unit attention state AND also needs a START_UNIT, that device will be placed offline. The disk array devices on an ipr RAID adapter will do exactly this when in a dual initiator configuration. This patch adds a single retry to the EH initiated START_UNIT.

I have no objection to this patch.

Just seems a pity that SCSI devices go to the trouble of sending unit attentions while OSes just throw them away. Perhaps the scsi_device sysfs directory could have entries like:

    last_ua_asc
    last_ua_ascq
    last_ua_timestamp

where code could place the asc/ascq codes and a timestamp then continue doing a retry. Could we get a log entry, hotplug event?

Logical units may queue unit attentions (sam4r10.pdf section 5.8.7) so it is possible that one retry may not be enough. With my suggestion above, only the last one would persist for a reasonable time.

Doug Gilbert
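To illustrate the point about queued unit attentions, here is a user-space simulation, not kernel code: every type and function name below is hypothetical, and the "device" is a stub that simply holds a count of pending unit attentions. It retries only while the failure is a unit attention and records the last asc/ascq seen, along the lines of the last_ua_* entries suggested above.

```c
#define SK_UNIT_ATTENTION 0x6   /* SCSI sense key: UNIT ATTENTION */

/* Hypothetical record, mirroring the suggested last_ua_* entries. */
struct ua_record {
    unsigned char last_ua_asc;
    unsigned char last_ua_ascq;
    int ua_seen;                /* number of unit attentions consumed */
};

/* Hypothetical outcome of one START UNIT attempt. */
struct cmd_result {
    int ok;                     /* non-zero when the command succeeded */
    unsigned char sense_key, asc, ascq;
};

typedef struct cmd_result (*issue_fn)(void *ctx);

/*
 * Issue the command once, then retry up to max_retries more times,
 * but only while the failure is a unit attention.
 * Returns 0 on success, -1 otherwise.
 */
static int start_unit_with_retries(issue_fn issue, void *ctx,
                                   int max_retries, struct ua_record *rec)
{
    int attempt;

    for (attempt = 0; attempt <= max_retries; attempt++) {
        struct cmd_result r = issue(ctx);

        if (r.ok)
            return 0;
        if (r.sense_key != SK_UNIT_ATTENTION)
            return -1;          /* some other error: do not retry */
        rec->last_ua_asc = r.asc;
        rec->last_ua_ascq = r.ascq;
        rec->ua_seen++;
    }
    return -1;                  /* device is still reporting unit attentions */
}

/* Stub device: *ctx is the number of unit attentions still queued. */
static struct cmd_result fake_issue(void *ctx)
{
    int *queued = ctx;
    struct cmd_result r = { 1, 0, 0, 0 };

    if (*queued > 0) {
        (*queued)--;
        r.ok = 0;
        r.sense_key = SK_UNIT_ATTENTION;
        r.asc = 0x29;           /* power on, reset, or bus device reset occurred */
        r.ascq = 0x00;
    }
    return r;
}
```

With two unit attentions queued, a single retry (max_retries = 1) still fails, while max_retries = 2 succeeds; in both cases only the most recent asc/ascq pair survives in the record.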
Re: Disabling block layer
Mark Lobo wrote:

Hello! I had a question about disabling the block layer for SCSI devices. We have an embedded device, and it runs 2.4.30. We need to be able to support a lot of SCSI devices (in the thousands) for our device, and we talk to the devices via SG. We are facing a memory allocation problem after discovering a few thousand devices. For every device, there seems to be a lot of memory allocated in the block layer. This memory includes cache memory (which IIRC is reclaimable by the kernel memory subsystem when it needs it) and also pages that are used for the alloc_pages pool.

My questions were relating to disabling the block layer for the devices. We always talk direct passthrough to the storage (except the local hard disk), and do not need the block layer at all.

1. Is there a way to disable the block layer for specific devices?
2. If yes, how can that be done, and are there any gotchas associated with that?

Mark,
Tempting thought that: linux without a block layer. I think you have no hope in the lk 2.4 series and even less in the lk 2.6 series.

Now for some thoughts. If you don't need to mount any SCSI disks, you could build a kernel with sd as a module and remove/hide sd_mod.o . A more invasive method would be to modify the sd driver so that it was no longer interested in SCSI devices whose peripheral device type was zero (i.e. disks).

On the sg driver side, if lots of sg file descriptors are open to those thousands of SCSI devices, then reducing the per fd SG_DEF_RESERVED_SIZE from 32 KB may help. This could be reduced by editing include/scsi/sg.h .

Doug Gilbert
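As rough arithmetic for the last suggestion: the reserve buffers alone scale linearly with the number of open sg file descriptors. The helper below is only an illustration; its name and the figure of 4000 descriptors are assumptions, and it ignores all other per-device overhead.

```c
/* Hypothetical helper: total bytes held in sg reserve buffers. */
static unsigned long long reserve_total_bytes(unsigned long open_fds,
                                              unsigned long per_fd_bytes)
{
    return (unsigned long long)open_fds * per_fd_bytes;
}
```

At the 32 KB default, 4000 open descriptors hold 4000 * 32768 = 131072000 bytes (125 MiB) in reserve buffers; at 4 KB each the same descriptors hold 16384000 bytes (about 15.6 MiB).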
RFC: sg driver addition: SG_FLAG_SHARED_MMAP_IO
I mentioned this idea a few weeks ago on this list: namely to allow a sg pass-through request to use the mmap-ed reserve buffer associated with another sg file descriptor. In my experience mmap-ed IO using sg's reserve buffer mapped into the user space is faster than direct IO schemes. However one shortcoming is that if you try to copy between two devices using this technique then you end up with two separate mmap-ed buffers in the user space program. Then the user space program needs to copy between the two buffers which would defeat much of the advantage of the mmap-ed IO. You could (and sgm_dd in sg3_utils does) use mmap-ed IO on the read side and direct IO on the write side (or vice versa). I used the sg driver as found in lk 2.6.21-rc4 as a baseline (and I don't think sg has changed since 2.6.19). A gzipped diff is attached. There is also some test code (a modified sgm_dd) in the sg3_utils-1.24 beta on the www.torque.net/sg site. Here is an example of a disk to disk copy: sgm_dd if=/dev/sg0 of=/dev/sg1 oflag=smmap bs=512 The new flag is 'oflag=smmap' which instructs the write SG_IO on /dev/sg1 to set SG_FLAG_SHARED_MMAP_IO and it passes the mmap-ed buffer used for /dev/sg0 in dxferp. [Add a 'verbose=1' option and it will indicate how many times shared mmap IO was requested and how many times it was actually done.] Features: - allow both side of a copy like operation to dma into and out of the same user space buffer - minimal per command overhead (i.e. 
building of scatter gather lists and pinning pages) - could copy a single source to multiple destinations efficiently - if shared reserve buffer unavailable (or not big enough) then fall back to indirect IO transparently - new info bit SG_INFO_SHARED_MMAP_IO indicates whether shared mmap-ed IO was done Restrictions (enforced by the sg driver): - confined to file descriptors in the same process - there can be only one user of a reserve buffer at a time - low_dma is honoured Complexity - it does have a few more corner cases than usual. For example in above sgm_dd invocation: closing /dev/sg0 while /dev/sg1 is sharing its mmap-ed reserve buffer ... Here are some timings copying between two ramdisks. It is assumed the 'bs=8k' given to dd is equivalent to 'bs=512 bpt=16' given to sgm_dd. # lsscsi -g [4:0:0:0]diskLinuxscsi_debug 1.82 /dev/sda /dev/sg0 [5:0:0:0]diskLinuxscsi_ses 1.06 /dev/sdb /dev/sg1 # ./dd_tsts.sh Usage: dd_tsts.sh ifile ofile times bs # ./dd_tsts.sh /dev/sda /dev/sdb 50 8k Indirect IO with dd dd if=/dev/sda of=/dev/sdb bs=8k real0m7.448s user0m0.080s sys 0m7.046s Direct IO with dd dd if=/dev/sda iflag=direct of=/dev/sdb oflag=direct bs=8k real0m4.529s user0m0.114s sys 0m3.799s # ./sg_dd_tsts.sh /dev/sg0 /dev/sg1 50 16 Indirect IO with sg_dd sg_dd if=/dev/sg0 of=/dev/sg1 bs=512 bpt=16 real0m6.304s user0m0.171s sys 0m5.268s Direct IO with sg_dd sg_dd if=/dev/sg0 iflag=dio of=/dev/sg1 oflag=dio bs=512 bpt=16 real0m4.246s user0m0.135s sys 0m3.395s Mmap read, indirect IO write with sgm_dd sgm_dd if=/dev/sg0 of=/dev/sg1 bs=512 bpt=16 real0m4.023s user0m0.127s sys 0m3.259s Mmap read, direct IO write with sgm_dd sgm_dd if=/dev/sg0 of=/dev/sg1 oflag=dio bs=512 bpt=16 real0m4.057s user0m0.164s sys 0m3.264s Mmap read, shared mmap write with sgm_dd sgm_dd if=/dev/sg0 of=/dev/sg1 oflag=smmap bs=512 bpt=16 real0m3.871s user0m0.131s sys 0m3.111s Don't expect drastic improvements in real IO unless it is in the gigabyte per second range. 
Doug Gilbert

Attachment: sg2621rc4smm2.diff.gz (GNU Zip compressed data)
Re: [PATCH 2/3] sd: implement START/STOP management
Tejun Heo wrote: Implement SBC START/STOP management. sdev-mange_start_stop is added. When it's set to one, sd STOPs the device on suspend and shutdown and STARTs it on resume. sdev-manage_start_stop defaults is in sdev instead of scsi_disk cdev to allow -slave_config() override the default configuration but is exported under scsi_disk sysfs node as sdev-allow_restart is. When manage_start_stop is zero (the default value), this patch doesn't introduce any behavior change. Signed-off-by: Tejun Heo [EMAIL PROTECTED] --- drivers/scsi/scsi_sysfs.c | 31 +++-- drivers/scsi/sd.c | 102 + include/scsi/scsi_device.h |1 3 files changed, 130 insertions(+), 4 deletions(-) Index: work/drivers/scsi/sd.c === --- work.orig/drivers/scsi/sd.c +++ work/drivers/scsi/sd.c @@ -142,6 +142,8 @@ static void sd_rw_intr(struct scsi_cmnd static int sd_probe(struct device *); static int sd_remove(struct device *); static void sd_shutdown(struct device *dev); +static int sd_suspend(struct device *dev, pm_message_t state); +static int sd_resume(struct device *dev); static void sd_rescan(struct device *); static int sd_init_command(struct scsi_cmnd *); static int sd_issue_flush(struct device *, sector_t *); @@ -206,6 +208,20 @@ static ssize_t sd_store_cache_type(struc return count; } +static ssize_t sd_store_manage_start_stop(struct class_device *cdev, + const char *buf, size_t count) +{ + struct scsi_disk *sdkp = to_scsi_disk(cdev); + struct scsi_device *sdp = sdkp-device; + + if (!capable(CAP_SYS_ADMIN)) + return -EACCES; + + sdp-manage_start_stop = simple_strtoul(buf, NULL, 10); + + return count; +} + static ssize_t sd_store_allow_restart(struct class_device *cdev, const char *buf, size_t count) { @@ -238,6 +254,14 @@ static ssize_t sd_show_fua(struct class_ return snprintf(buf, 20, %u\n, sdkp-DPOFUA); } +static ssize_t sd_show_manage_start_stop(struct class_device *cdev, char *buf) +{ + struct scsi_disk *sdkp = to_scsi_disk(cdev); + struct scsi_device *sdp = sdkp-device; + + return 
snprintf(buf, 20, %u\n, sdp-manage_start_stop); +} + static ssize_t sd_show_allow_restart(struct class_device *cdev, char *buf) { struct scsi_disk *sdkp = to_scsi_disk(cdev); @@ -251,6 +275,8 @@ static struct class_device_attribute sd_ __ATTR(FUA, S_IRUGO, sd_show_fua, NULL), __ATTR(allow_restart, S_IRUGO|S_IWUSR, sd_show_allow_restart, sd_store_allow_restart), + __ATTR(manage_start_stop, S_IRUGO|S_IWUSR, sd_show_manage_start_stop, +sd_store_manage_start_stop), __ATTR_NULL, }; @@ -267,6 +293,8 @@ static struct scsi_driver sd_template = .name = sd, .probe = sd_probe, .remove = sd_remove, + .suspend= sd_suspend, + .resume = sd_resume, .shutdown = sd_shutdown, }, .rescan = sd_rescan, @@ -1776,6 +1804,32 @@ static void scsi_disk_release(struct cla kfree(sdkp); } +static int sd_start_stop_device(struct scsi_device *sdp, int start) +{ + unsigned char cmd[6] = { START_STOP }; /* START_VALID */ + struct scsi_sense_hdr sshdr; + int res; + + if (start) + cmd[4] |= 1;/* START */ + + if (!scsi_device_online(sdp)) + return -ENODEV; + + res = scsi_execute_req(sdp, cmd, DMA_NONE, NULL, 0, sshdr, +SD_TIMEOUT, SD_MAX_RETRIES); Tejun, I note at this point that the IMMED bit in the START STOP UNIT cdb is clear. [The code might note that as well.] All SCSI disks that I have seen, implement the IMMED bit and according to the SAT standard, so should SAT layers like the one in libata. With the IMMED bit clear: - on spin up, it will wait until disk is ready. Okay unless there are a lot of disks, in which case we could ask Matthew Wilcox for help - on spin down, will wait until media is stopped. That could be 20 seconds, and if there were multiple disks I guess the question is do we need to wait until a disk is spun down before dropping power to it and suspending. Doug Gilbert - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
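To make the IMMED remark in the review above concrete, here is a minimal sketch of how the 6 byte START STOP UNIT cdb could be filled in with and without that bit. It assumes the usual SBC layout (operation code 0x1b, IMMED in bit 0 of byte 1, START in bit 0 of byte 4); the function name sketch_build_start_stop() is hypothetical and is not part of the patch under review.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_START_STOP_OPCODE 0x1b   /* START STOP UNIT operation code */

/*
 * Fill a 6 byte START STOP UNIT cdb (hypothetical helper).
 *   start: non-zero sets the START bit (byte 4, bit 0): spin up
 *   immed: non-zero sets the IMMED bit (byte 1, bit 0): the device may
 *          return status before the spin up/down has completed
 */
static void sketch_build_start_stop(uint8_t cdb[6], int start, int immed)
{
    assert(cdb != NULL);
    memset(cdb, 0, 6);
    cdb[0] = SKETCH_START_STOP_OPCODE;
    if (immed)
        cdb[1] |= 0x01;
    if (start)
        cdb[4] |= 0x01;
}
```

With immed left at 0 (as in the quoted sd_start_stop_device()) the command only completes once the medium has started or stopped, which is the waiting behaviour the review asks about.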
Re: [PATCH] bsg: iovec support
FUJITA Tomonori wrote: From: Pete Wyckoff [EMAIL PROTECTED] Subject: [PATCH] bsg: iovec support Date: Thu, 1 Mar 2007 17:29:08 -0500 Support vectored IO as in SGv3. The iovec structure uses explicit sizes to avoid the need for compat conversion. Signed-off-by: Pete Wyckoff [EMAIL PROTECTED] --- My application definitely can take advantage of scatter/gather IO, which is supported in sgv3 but not in the bsg implementation of sgv4. I understand Tomo's concerns about code bloat and the need for 32/64 compat translations, but this will make things much easier on users of bsg who read or write out of multiple buffers in a single SCSI operation. (snip) + * Vector of address/length pairs, used when dout_iovec_count (or din_) + * is non-zero. In that case, dout_xferp is a list of struct sg_io_v4_vec + * and dout_iovec_count is the number of entries in that list. dout_xfer_len + * is the total length of the list. Note the use of u64 instead of a + * native pointer to avoid compat issues, and padding to avoid structure + * alignment problems. + */ +struct sg_io_v4_vec { +__u64 iov_base; +__u32 iov_len; +__u32 __pad1; +}; I don't think that it's a good idea to add a new scatter/gather structure and export it to user space. User space scatter gather is not a new feature. It is defined and works in sg v3. It was also partially defined in sg v4 and dropped out in the bsg implementation. I agree with Pete that it should be put back. Pete is also suggesting (shown above) a revised sg_io_vec structure that uses a uint64_t for the pointer to simplify 32, 64 bit thunking. bsg can support scatter/gather IO with ioctl (SG_IO) easily (I mean, without adding ugly compat code to bsg.c). I guess that SG_IO doesn't work for you because it works synchronously. However, all system calls might work asynchronously in the future. User space scatter gather is completely decoupled from in-kernel scatter gather lists built for HBA DMA engines. Same technique but at different levels. 
Someone might user space scatter gather to efficiently fetch several OSD objects implemented in a block device as adjacent blocks Doug Gilbert - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
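As an illustration of the fixed-size vector element quoted above, the following sketch mirrors the layout of struct sg_io_v4_vec with uint64_t/uint32_t standing in for __u64/__u32. The names sketch_io_v4_vec and sketch_fill_vec() are hypothetical; only the field layout is taken from the quoted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the struct sg_io_v4_vec in the quoted patch. */
struct sketch_io_v4_vec {
    uint64_t iov_base;  /* user space address carried as a 64 bit integer */
    uint32_t iov_len;   /* length of this segment in bytes */
    uint32_t pad1;      /* explicit padding (__pad1 in the patch) */
};

/*
 * Fill 'vec' from 'n' user buffers and return the summed length, which
 * the patch comment says is what belongs in dout_xfer_len.
 */
static uint64_t sketch_fill_vec(struct sketch_io_v4_vec *vec,
                                void *const bufs[], const uint32_t lens[],
                                size_t n)
{
    uint64_t total = 0;
    size_t k;

    assert(vec != NULL || n == 0);
    for (k = 0; k < n; ++k) {
        /* go through uintptr_t so the cast is valid for 32 and 64 bit apps */
        vec[k].iov_base = (uint64_t)(uintptr_t)bufs[k];
        vec[k].iov_len = lens[k];
        vec[k].pad1 = 0;
        total += lens[k];
    }
    return total;
}
```

Because every member has an explicit width and the padding is spelled out, the element is 16 bytes for both a 32 bit and a 64 bit application, which is the property that avoids a compat translation.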
Re: [PATCH] bsg: iovec support
FUJITA Tomonori wrote: From: Douglas Gilbert [EMAIL PROTECTED] Subject: Re: [PATCH] bsg: iovec support Date: Mon, 19 Mar 2007 08:56:39 -0400 FUJITA Tomonori wrote: From: Pete Wyckoff [EMAIL PROTECTED] Subject: [PATCH] bsg: iovec support Date: Thu, 1 Mar 2007 17:29:08 -0500 Support vectored IO as in SGv3. The iovec structure uses explicit sizes to avoid the need for compat conversion. Signed-off-by: Pete Wyckoff [EMAIL PROTECTED] --- My application definitely can take advantage of scatter/gather IO, which is supported in sgv3 but not in the bsg implementation of sgv4. I understand Tomo's concerns about code bloat and the need for 32/64 compat translations, but this will make things much easier on users of bsg who read or write out of multiple buffers in a single SCSI operation. (snip) + * Vector of address/length pairs, used when dout_iovec_count (or din_) + * is non-zero. In that case, dout_xferp is a list of struct sg_io_v4_vec + * and dout_iovec_count is the number of entries in that list. dout_xfer_len + * is the total length of the list. Note the use of u64 instead of a + * native pointer to avoid compat issues, and padding to avoid structure + * alignment problems. + */ +struct sg_io_v4_vec { + __u64 iov_base; + __u32 iov_len; + __u32 __pad1; +}; I don't think that it's a good idea to add a new scatter/gather structure and export it to user space. User space scatter gather is not a new feature. It is defined and works in sg v3. It was also partially defined in sg v4 and dropped out in the bsg implementation. I agree with Pete that it should be put back. I'm fine with supporting iovec (though I don't like it). Tomo, You don't need to support it if you don't want to. So if din_iovec_count or dout_iovec_count are other than zero, bsg can return an error. By dropping those fields, other implementations are precluded from supporting that feature. 
Pete is also suggesting (shown above) a revised sg_io_vec structure that uses a uint64_t for the pointer to simplify 32, 64 bit thunking. All I said is that it would be better to use the existing compat infrastructure (sg_build_iovec, sg_ioctl_trans, etc in fs/compat_ioctl.c) instead of adding another compat code. Won't sg v4 make this even a bigger mess, at least initially anyway? Doug Gilbert - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] fusion - honour return value of pci_enable_device() in mpt_resume()
Randy Dunlap wrote:
> On Fri, 16 Mar 2007 11:14:51 -0500 James Bottomley wrote:
>> On Fri, 2007-03-16 at 08:06 -0700, Randy Dunlap wrote:
>>> On Fri, 16 Mar 2007 09:27:26 -0500 James Bottomley wrote:
>>>> On Fri, 2007-03-16 at 16:05 +0900, Horms wrote:
>>>>> + err = pci_enable_device(pdev);
>>>>> + if (err < 0)
>>>>> + return err;
>>>>
>>>> Traditionally, this should be
>>>>
>>>>     if (err)
>>>>         return err;
>>>>
>>>> The reason is that < 0 is a signed comparison which can be slightly more expensive on some architectures and it's unnecessary if zero is the only successful return.
>>>
>>> Tradition vs. Linus, eh? Linus wrote (2007-Mar-06, on lkml, Message-ID: [EMAIL PROTECTED]):
>>
>> Sure ... we can all maintain our own traditions .. what was the subject of this email?
>
> The subject was coding style and return/error codes. The Subject: line was: Re: [5/6] 2.6.21-rc2: known regressions

Randy,
While on the subject of traditions, how about the C90 and C99 ones? C identifiers starting with __ are reserved!

Reference: ISO/IEC 9899:1999 (C99) section 7.1.3
"All identifiers that start with an underscore and either an upper case letter or another underscore are always reserved for any use."

It was the same in C90. Now we might start getting rid of __u32 and its friends first :-)

Doug Gilbert
SCSI Generic version 4 interface, release 1.2
After reviewing this post by Pete Wyckoff: http://marc.theaimsgroup.com/?l=linux-scsim=117278879816029w=2 I decided to update my sg v4 interface document originally posted 20061106 which I will now call release 1.1 : http://lwn.net/Articles/208082/ Pete was proposing to put back din_iovec_count and dout_iovec_count that had been dropped out of bsg but had been in release 1.1 . Hmm. Some other items have been picked up from the bsg implementation plus the suggestion from LSF'07 to add dout_resid. See the attachment, comments welcome. Doug Gilbert SCSI Generic version 4 interface structure == Release 1.2 Goals: - handle both generalized request/response and data_out/data_in independently in same invocation (i.e. synchronous usage). - alternatively the request and data_out could be instigated in one invocation with pointers given for the incoming response and data_in. Then a second invocation (as a result of polling or asynchronous notification) reports the response and/or data_in is done, plus provides error/resid/timing information. This is asynchronous usage. This allows for the most complicated SCSI commands: tagged, variable length cdbs with bidirectional data transfers. - support multiple protocols. If they are generalized request-response protocols then they can choose either the request/response part of the interface or the data_out/data_in part. - layered error/condition reporting: (OS) driver, transport and device (logical unit). Method used to present this struct to OS (e.g. ioctl()) may also report error (e.g. EPERM). - allow for auxiliary information to be passed back for the application client to consider - same structure can be used for a synchronous (e.g. interruptible ioctl) or asynchronous (e.g. ioctl()/read() ) pass through. - leave device (lu) or target addressing issues to some other mechanism (what SCSI standards call the I_T_L or the I_T nexus respectively) as they are transport dependent. 
However do include the tag level (the _Q part of a I_T_L_Q nexus). - stay close enough to struct sg_io_hdr (sg version 3 interface) to use with existing SG_IO ioctls, current implementations expect 'S' in 'guard' Comments: - unsigned 64 bit integers used as pointer carriers to ease 32/64 bit code interworking (e.g. 32 bit app on 64 bit kernel) - should there be more (or less) spare fields? - the write() usage in the sg driver's asynchronous interface has caused problems when mistakenly applied to a block device node rather than a sg device node. Using an ioctl(flag_async) followed by a read() for asynchronous work offers similar functionality and is safer. Using ioctl(flag_async_start) and ioctl(flag_async_finish) is another possibility. - rather than have a separate ATA pass through mechanism, the SAT defined ATA PASS THROUGH SCSI commands could be used with the driver implementation routing the ATA commands to their subsystem. This could be flagged so it didn't preclude a SAT layer in a SCSI transport (e.g. MPT SAS HBA firmware). - if SAM/SPC does not define an enumeration for lesser used input fields, then use the value 0 for inert/off/don't_care . - the SCSI command tag field as currently defined in SAM-4 can be up to 64 bits (with a proposal to increase that to 96 bits for FCP) Should we let the transport layer/LLD worry about that? ChangeLog for release 1.2 [20070314] - add dout_resid - re-arrange uint64_t types (i.e. 
pointer carriers) to be on a 8 byte boundary - reinstate dout_iovec_count and din_iovec_count (they were in release 1.1 but bsg dropped them) - change name: response_len_wr to response_len - pick up some descriptions from bsg ChangeLog for release 1.1 [20061106] - was called sg version 4 interface, version 1.1 so change the second version to release --- #include stdint.h struct sg_io_v4 { int32_t guard; /* [i] 'Q' to differentiate from v3 */ uint32_t protocol; /* [i] 0 - SCSI , */ uint32_t subprotocol; /* [i] 0 - SCSI command, 1 - SCSI task management function, */ uint32_t request_len; /* [i] in bytes {SCSI: cdb length} */ uint64_t request; /* [i], [*i] {SCSI: cdb} */ uint32_t request_attr; /* [i] {SCSI: task attribute} */ uint32_t request_tag; /* [i] {SCSI: task tag (only if flagged)} */ uint32_t request_priority; /* [i] {SCSI: task priority} */ uint32_t max_response_len; /* [i] in bytes */ uint64_t response; /* [i], [*o] {SCSI: (auto)sense data} */ /* dout_: data out (to device); din_: data in (from device) */
Re: How to send inquiry command to thorugh sd path (i.e. /dev/sda) by using SG_IO ioctl
MasthanUsha wrote:
> Hi All,
> Does any of you have any idea about the SCSI INQUIRY command? I want to send an INQUIRY command to a SCSI device through the sd path (i.e. /dev/sda or /dev/sdb) by using the SG_IO ioctl. Please explain...

If you look at http://www.torque.net/sg/sg3_utils.html and fetch a tarball (e.g. sg3_utils-1.23.tgz) then have a look at the examples/sg_simple1.c file.

Doug Gilbert
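For readers without the tarball to hand, here is a minimal sketch of a standard INQUIRY sent through the SG_IO ioctl with the sg version 3 header from <scsi/sg.h>. It is not a copy of examples/sg_simple1.c; the helper names, the 96 byte reply size and the 20 second timeout are assumptions chosen for illustration. The file descriptor could come from something like open("/dev/sda", O_RDONLY | O_NONBLOCK) on a lk 2.6 series kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

#define SKETCH_INQ_REPLY_LEN 96   /* assumed allocation length */
#define SKETCH_SENSE_LEN 32       /* assumed sense buffer size */

/* Fill in the cdb and the sg v3 header for a standard INQUIRY. */
static void sketch_prep_inquiry(struct sg_io_hdr *hdr, unsigned char cdb[6],
                                unsigned char *reply, unsigned char *sense)
{
    assert(hdr && cdb && reply && sense);
    memset(cdb, 0, 6);
    cdb[0] = 0x12;                      /* INQUIRY operation code */
    cdb[4] = SKETCH_INQ_REPLY_LEN;      /* allocation length */

    memset(hdr, 0, sizeof(*hdr));
    hdr->interface_id = 'S';            /* sg version 3 guard */
    hdr->cmd_len = 6;
    hdr->cmdp = cdb;
    hdr->dxfer_direction = SG_DXFER_FROM_DEV;
    hdr->dxfer_len = SKETCH_INQ_REPLY_LEN;
    hdr->dxferp = reply;
    hdr->mx_sb_len = SKETCH_SENSE_LEN;
    hdr->sbp = sense;
    hdr->timeout = 20000;               /* milliseconds */
}

/* Issue the INQUIRY on an open fd; returns 0 on success, -1 otherwise. */
static int sketch_do_inquiry(int fd, unsigned char *reply,
                             unsigned char *sense)
{
    struct sg_io_hdr hdr;
    unsigned char cdb[6];

    sketch_prep_inquiry(&hdr, cdb, reply, sense);
    if (ioctl(fd, SG_IO, &hdr) < 0)
        return -1;                      /* ioctl itself failed, see errno */
    /* SG_INFO_OK means no error was flagged at any level */
    return ((hdr.info & SG_INFO_OK_MASK) == SG_INFO_OK) ? 0 : -1;
}
```

On success, bytes 8 to 15 of the reply hold the vendor identification and bytes 16 to 31 the product identification.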
Re: How to send inquiry command to thorugh sd path (i.e. /dev/sda) by using SG_IO ioctl
dudekula mastan wrote:
> Hi Gilbert,
> Thanks for the quick reply. The example program (sg_simple1, and indeed all the examples) takes a /dev/sg path as input but I want to use a /dev/sd path as input.

In the lk 2.6 series, it will also work for sd devices (and hd devices if they happen to be cd/dvd drives).

> Please explain with an example which takes a /dev/sd path as input.

You have one already. Actually you have lots of examples there.

> Does SG_IO support the sd driver?

Yes, in the lk 2.6 series.

> I am not sure. I think it will work only for the sg driver. Am I correct?

That was correct several years ago, not now.

Doug Gilbert
Re: impact of 4k sector size on the IO FS stack
Bryan Henderson wrote:
>> DOS partitions start partitions on odd-numbered sectors
>
> I don't get this. If you mean partitions defined by the classic DOS partition table format, then AFAICS, such a partition can start in any sector.

Bryan,
Typically the first partition on a DOS partitioned disk starts at the next available sector after the mbr which, for some bizarre reason, is 63 sectors long. Hence:

    # fdisk -lu /dev/hda

    Disk /dev/hda: 80.0 GB, 80026361856 bytes
    255 heads, 63 sectors/track, 9729 cylinders, total 156301488 sectors
    Units = sectors of 1 * 512 = 512 bytes

       Device Boot      Start         End      Blocks   Id  System
    /dev/hda1   *          63    18314099     9157018+   c  W95 FAT32 (LBA)
    /dev/hda2        18314100    19551104      618502+  82  Linux swap / Solaris
    /dev/hda4        19551105   156296384    68372640   83  Linux

>> so presuming you have odd-aligned disks, life is good.
>
> What is an odd-aligned disk? s/disk/partition/ ?

Perhaps hda1 and hda4 above are examples.

Doug Gilbert
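The arithmetic behind the thread subject can be sketched as follows: with 4k physical sectors there are 8 logical 512 byte sectors per physical sector, so a partition is aligned only if its start sector is a multiple of 8. None of the three start sectors in the listing above (63, 18314100, 19551105) is. The function name sketch_is_aligned() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 if a partition starting at 512 byte sector 'start_lba' begins
 * on a physical sector boundary of 'phys_bytes' bytes, else 0.
 */
static int sketch_is_aligned(uint64_t start_lba, uint32_t phys_bytes)
{
    uint32_t per_phys;

    assert(phys_bytes >= 512 && phys_bytes % 512 == 0);
    per_phys = phys_bytes / 512;    /* logical sectors per physical sector */
    return (start_lba % per_phys) == 0;
}
```

For example sketch_is_aligned(63, 4096) yields 0 while sketch_is_aligned(64, 4096) yields 1, so moving the first partition by one 512 byte sector would be enough to align it.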
Re: Transport ID in Persistent Reservation
renuka apte wrote:
> The 'Specify Initiator Ports' support in persistent reservation allows the application client to send a bunch of transport IDs which identify initiator ports. I would like to know the suggested format for these transport IDs.

http://www.t10.org/ftp/t10/drafts/spc4/spc4r09.pdf
section 7.5.4 TransportID identifiers

> I am assuming that it must have something to do with the WWN of the initiator ports.

and that is transport dependent.

> I tried using sg_utils to issue a persistent reservation IN command with READ FULL STATUS to a virtual SCSI disk which is returning the response in the format specified in SPC-4. However the sg_utils seems to be decoding the transport ID in some way that I can't find in the standard.

Have another look :-)

Doug Gilbert
Re: bug or typo in scsi_debug.c
Mark Harvey wrote:
> Looking thru this driver, I saw what looks to be a bug/typo if an error occurs when calling driver_register() from scsi_debug_init()
>
> Cheers
> Mark
>
> --- scsi_debug-orig.c 2007-03-03 19:38:23.0 +1100
> +++ scsi_debug.c 2007-03-03 19:39:51.0 +1100
> @@ -2841,7 +2841,7 @@
>   if (ret < 0) {
>       printk(KERN_WARNING "scsi_debug: driver_register error: %d\n", ret);
> -     goto bus_unreg;
> +     goto driver_unreg;
>   }
>   ret = do_create_driverfs_files();
>   if (ret < 0) {
> @@ -2873,6 +2873,7 @@
>  del_files:
>   do_remove_driverfs_files();
> +driver_unreg:
>   driver_unregister(&sdebug_driverfs_driver);
>  bus_unreg:
>   bus_unregister(&pseudo_lld_bus);

Mark,
Um, I know my name is on that driver (with Eric's) but I didn't write the code in that function. I don't understand why your patch wants to call driver_unregister() after driver_register() has failed.

Doug Gilbert
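The objection above follows from the usual goto unwinding idiom: a failed step jumps to the label that undoes only the steps that already succeeded. A minimal user space sketch, with hypothetical stub functions standing in for bus_register(), driver_register() and a later step such as do_create_driverfs_files():

```c
#include <assert.h>

/* Hypothetical stand-ins that only track what is currently registered. */
static int sketch_registered_bus, sketch_registered_driver;

static int sketch_bus_register(void)
{
    sketch_registered_bus = 1;
    return 0;
}

static void sketch_bus_unregister(void)
{
    sketch_registered_bus = 0;
}

static int sketch_driver_register(int fail)
{
    if (fail)
        return -1;
    sketch_registered_driver = 1;
    return 0;
}

static void sketch_driver_unregister(void)
{
    /* undoing a registration that never happened would be a bug */
    assert(sketch_registered_driver);
    sketch_registered_driver = 0;
}

static int sketch_init(int fail_driver, int fail_later)
{
    int ret;

    ret = sketch_bus_register();
    if (ret < 0)
        return ret;
    ret = sketch_driver_register(fail_driver);
    if (ret < 0)
        goto bus_unreg;     /* driver never registered: skip its undo */
    if (fail_later) {       /* stands in for a later init step failing */
        ret = -1;
        goto driver_unreg;
    }
    return 0;

driver_unreg:
    sketch_driver_unregister();
bus_unreg:
    sketch_bus_unregister();
    return ret;
}
```

In this sketch, jumping to driver_unreg when sketch_driver_register() itself failed would trip the assert, which is the same concern raised about the proposed change.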
Re: improve sg_luns output for iscsi
Olaf Hering wrote: Upcoming IBM pSeries firmware can boot from iscsi. To configure the openfirmware boot-device string, we need to construct a correct devicepath. This path includes the lun. Its currently not 100% clear how exactly this lun value has to look like. sg_luns may be the tool to get the value. But its current output is not parseable by scripts. It even gives the same output for two different scsi devices: girgendwas:~ # lsscsi [0:0:0:0]diskDGC RAID 5 0219 /dev/sda [0:0:0:1]diskDGC RAID 5 0219 /dev/sdb [0:0:0:2]diskDGC RAID 5 0219 /dev/sdc [0:0:0:3]diskDGC RAID 5 0219 /dev/sdd girgendwas:~ # sg_luns -V sg_luns: version: 1.05 20060127 girgendwas:~ # sg_luns /dev/sdd Lun list length = 32 which imples 4 lun entries Report luns [select_report=0]: 0001 0002 0003 girgendwas:~ # sg_luns /dev/sdc Lun list length = 32 which imples 4 lun entries Report luns [select_report=0]: 0001 0002 0003 Is it possible to print the lun only for the requested scsi device? Olaf, sg_luns is an application client driving the SCSI REPORT LUNS command. It is a trick SCSI command since even though it addresses a logical unit, it is really the target that replies (as it is the target that knows about the sibling logical units) ***. The REPORT LUNS response gives no indication which (if any) 64 bit lun was addressed. Now I would not want to break the link between sg_luns and the SCSI REPORT LUNS command. Adding an extra parameter to try and find the lun associated with the file descriptor has a few problems (from my point of view): - it would be OS specific (sg_luns isn't currently) - within Linux there are different mechanisms in the 2.4 and 2.6 series kernels. In your example above a combination of lsscsi and sg_luns gives the answer (0003) but lsscsi is linux 2.6 series specific. sg_scan would probably work as a replacement for lsscsi (and sg_scan also works in the lk 2.4 series (and Windows)). 
To address the parsability of sg_luns output, I recently added a '--quiet' option to suppress the extraneous output. In summary sg_luns is probably not what you want! What about the lu _name_? For iSCSI the lu should yield a world wide unique SCSI name designator in the device identification VPD page (see SPC-4 and SAM-4 Annex A; the iSCSI standard woffles in this area). *** a better way to get a target to report its active luns is to use the REPORT LUNS well known logical unit but hardly anyone implements that. Doug Gilbert - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
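For scripts that would rather decode the raw REPORT LUNS response themselves, the parsing can be sketched as below. It assumes the SPC response layout (a 4 byte big endian LUN list length, 4 reserved bytes, then 8 byte LUN entries), which matches the "length = 32 which implies 4 lun entries" output quoted above. The function names are hypothetical and this is not code from sg_luns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Number of complete LUN entries in a REPORT LUNS response of which
 * 'resp_len' bytes are valid.
 */
static uint32_t sketch_report_luns_count(const uint8_t *resp, size_t resp_len)
{
    uint32_t list_len, n;
    size_t in_buf;

    assert(resp != NULL);
    if (resp_len < 8)
        return 0;
    list_len = ((uint32_t)resp[0] << 24) | ((uint32_t)resp[1] << 16) |
               ((uint32_t)resp[2] << 8) | (uint32_t)resp[3];
    n = list_len / 8;               /* entries the target says it has */
    in_buf = (resp_len - 8) / 8;    /* entries that fit in our buffer */
    return (n > in_buf) ? (uint32_t)in_buf : n;
}

/* Entry 'k' (0 based) as a 64 bit value, most significant byte first. */
static uint64_t sketch_report_luns_entry(const uint8_t *resp, uint32_t k)
{
    const uint8_t *p;
    uint64_t lun = 0;
    int j;

    assert(resp != NULL);
    p = resp + 8 + (size_t)k * 8;
    for (j = 0; j < 8; ++j)
        lun = (lun << 8) | p[j];
    return lun;
}
```

As the reply points out, nothing in this response says which of the listed luns the command was sent to, so the mapping to a device node still has to come from elsewhere (e.g. lsscsi or sg_scan).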
Re: [PATCH 3/3] tgt: fix scsi command leak
FUJITA Tomonori wrote: From: Douglas Gilbert [EMAIL PROTECTED] Subject: Re: [PATCH 3/3] tgt: fix scsi command leak Date: Sat, 03 Mar 2007 11:58:19 -0500 FUJITA Tomonori wrote: The failure to map user-space pages leads to scsi command leak. It can happens mostly because of user-space daemon bugs (or OOM). This patch makes tgt just notify a LLD of the failure with sense when blk_rq_map_user() fails. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] Signed-off-by: Mike Christie [EMAIL PROTECTED] --- drivers/scsi/scsi_tgt_lib.c | 23 --- 1 files changed, 20 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/scsi_tgt_lib.c b/drivers/scsi/scsi_tgt_lib.c index dc8781a..c05dff9 100644 --- a/drivers/scsi/scsi_tgt_lib.c +++ b/drivers/scsi/scsi_tgt_lib.c @@ -459,6 +459,16 @@ static struct request *tgt_cmd_hash_look return rq; } +static void scsi_tgt_build_sense(unsigned char *sense_buffer, unsigned char key, +unsigned char asc, unsigned char asq) +{ + sense_buffer[0] = 0x70; + sense_buffer[2] = key; + sense_buffer[7] = 0xa; + sense_buffer[12] = asc; + sense_buffer[13] = asq; +} + Tomo, Perhaps you could add a memset(sense_buffer, 0, 18) before those assignments and state that this is fixed sense buffer format. I think that it isn't necessary because when a target mode driver allocates scsi_cmnd, scsi_host_get_command() does that. What about an option for descriptor sense format? With SAT now a standard, we now have one more reason to support descriptor format when required. The ATA PASS-THROUGH SCSI commands in SAT use descriptor sense format to return ATA registers. tgt's kernel-space code doesn't know anything about SCSI devices that initiators talks to. So it's difficult to send proper sense buffer. Nomally, we don't have this problem because tgt user-space code builds sense buffer. The bug that we are trying to fix is that the scsi command leak due to the user-space's bugs. So we can't rely on the user-space for this. 
Not that, like open-iscsi, the user-space bugs are pretty critical for tgt as the kernel-space bugs. We don't think target mode drivers can continue to work. However, tgt should tell target mode drivers that unrecoverable problems happen and we should cleanly unload the kernel modules. Tomo, If I understand correctly, there is a target SCSI command interpreter in a user space daemon (plus lu support) and the target transport end point in kernel space (roughly speaking). So if there is some problem in the kernel module, or the user space daemon goes away (or won't respond) then what you have is a transport error at the target end. The error should be lower level than SCSI commands (i.e. transport level). The kernel module doesn't know the state of target SCSI command interpreter (by design). For example the application client may have set the D_SENSE bit in the control mode page prior to the failure that your code is addressing. So the application client won't be expecting fixed sense data format thereafter. So what the code is doing is definitely better than nothing, but IMO it isn't quite right either. Doug Gilbert - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
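The fixed versus descriptor sense format point can be sketched as follows, assuming the SPC layouts: fixed format uses response code 0x70 with the sense key in byte 2 and asc/ascq in bytes 12 and 13 (as in the quoted scsi_tgt_build_sense()), while descriptor format uses response code 0x72 with key, asc and ascq in bytes 1, 2 and 3. The function name and the 'desc' flag are hypothetical and not part of the tgt patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_SENSE_BUFLEN 18

/*
 * Build current sense data in 'sb' (at least SKETCH_SENSE_BUFLEN bytes).
 * 'desc' non-zero selects descriptor format, which is what an initiator
 * would expect after D_SENSE has been set in the control mode page.
 */
static void sketch_build_sense(uint8_t *sb, int desc, uint8_t key,
                               uint8_t asc, uint8_t ascq)
{
    assert(sb != NULL);
    memset(sb, 0, SKETCH_SENSE_BUFLEN);  /* the memset suggested above */
    if (desc) {
        sb[0] = 0x72;           /* current errors, descriptor format */
        sb[1] = key & 0x0f;
        sb[2] = asc;
        sb[3] = ascq;
        /* sb[7], additional sense length, stays 0: no descriptors */
    } else {
        sb[0] = 0x70;           /* current errors, fixed format */
        sb[2] = key & 0x0f;
        sb[7] = 0x0a;           /* additional sense length: 18 - 8 */
        sb[12] = asc;
        sb[13] = ascq;
    }
}
```

The kernel side of tgt has no way to know which value of 'desc' the initiator expects, which is the gap the reply describes.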
Re: convert sg to block layer helpers - v5
[EMAIL PROTECTED] wrote: There is no big changes between v4 and v5. I was able to fix things in scsi tgt, so I could remove the weird arguements the block helpers were taking for it. I also tried to break up the patchset for easier viewing. The final patch also takes care of the access_ok regression. These patches were made against linus's tree since Tomo needed me to break part of it out for his scsi tgt bug fix patches. 0001-rm-bio-hacks-in-scsi-tgt.txt - Drop scsi tgt's bio_map_user usage and convert it to blk_rq_map_user. Tomo is also sending this patch in his patchset since he needs it for his bug fixes. 0002-rm-block-device-arg-from-bio-map-user.txt - The block_device argument is never used in the bio map user functions, so this patch drops it. 0003-Support-large-sg-io-segments.txt - Modify the bio functions to allocate multiple pages at once instead of a single page. 0004-Add-reserve-buffer-for-sg-io.txt - Add reserve buffer support to the block layer for sg and st indirect IO use. 0005-Add-sg-io-mmap-helper.txt - Add some block layer helpers for sg mmap support. 0006-Convert-sg-to-block-layer-helpers.txt - Convert sg to block layer helpers. 0007-mv-user-buffer-copy-access_ok-test-to-block-helper.txt - Move user data buffer access_ok tests to block layer helpers. The goal of this patchset is to remove scsi_execute_async and reduce code duplication. People want to discuss further merging sg and bsg/scsi_ioctl functionality, but I did not handle and any of that in this patchset since people still disagree on what should supported with future interfaces. My only TODO is maybe make the bio reserve buffer mempoolable (make it work as mempool alloc and free functions). Since sg only supported one reserve buffer per fd I have not worked on it and it did not seem worth it if there are no users. *** Mike, I see you are removing the scatter_elem_sz parameter. What decides the scatter gather element size? Can it be greater than PAGE_SIZE? 
*** Generalizing the idea of a mmap-ed reserve buffer to something the user had more control over could be very powerful. For example allowing two file descriptors (to different devices) in the same process to share the same mmap-ed area. This would allow a device to device copy to DMA into and out of the same memory, potentially with large per command transfers and with no per command scatter gather build and tear down. Basically a zero copy copy with minimal CPU overhead. Doug Gilbert - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: convert sg to block layer helpers - v5
Mike Christie wrote: Douglas Gilbert wrote: Mike, I see you are removing the scatter_elem_sz parameter. What decides the scatter gather element size? Can it be greater than PAGE_SIZE? Oh yeah, sorry I should have documented that. I just made the code try to allocate as large a element as possible. So the code looks at q-max_segment_size and tries to allocate segments that large initially. If that is too large then it will drop down by half like what sg.c used to do when it could not allocate large segments. I will add the param back if you want. I had thought it was a workaound due to the segment size of a device not being exported. *** Generalizing the idea of a mmap-ed reserve buffer to something the user had more control over could be very powerful. For example allowing two file descriptors (to different devices) in the same process to share the same mmap-ed area. This would allow a device to device copy to DMA into and out of the same memory, potentially with large per command transfers and with no per command scatter gather build and tear down. Basically a zero copy copy with minimal CPU overhead. I was thinking of something similar but not based on mmap. I have been trying to figure out a way to do sg io splice. I do not care what interface or method is used, I think it would be useful. I know we talked about the mmap approach a little, but I do not remember if we talked about how to tell both fds that they are going to use the same buffer. Would we need a modification to the sg header or would we need to add in a new IOCTL which would tell sg.c to share the buffer between two fds? Mike, Currently there is a flag in sgv3: #define SG_FLAG_MMAP_IO 4 and when it is active the dxferp field is ignored as it is assumed the user previously did a mmap() call to get the reserved buffer. We could add a: #define SG_FLAG_MMAP_IO_SHARED 8 and then the pointer in dxferp could taken as the already mmap-ed buffer from another device. 
Having more than one mmap-ed IO buffer per file descriptor would be nice but opening multiple file descriptors to the same device can give the same effect (with perhaps a POSIX thread per file descriptor). Doug Gilbert - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/4] SCSI: Printing cleanups
Martin K. Petersen wrote:
> This patch series is the first batch of cleanups in an attempt to make the SCSI printing more consistent and suitable for human consumption.
>
> Previously a typical error looked like this:
>
>   sd 0:0:0:0: SCSI error: return code = 0x0802
>   sda: Current: sense key: Aborted Command
>   Additional sense: Logical block reference tag check failed
>
> You had to have the magic return value decoder ring handy to figure out what had really happened. And you had to do the mapping between sd 0:0:0:0 and sda yourself.
>
> The following patches clean up various bits so that the same information can be presented in a more readable form:
>
>   sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
>   sd 0:0:0:0: [sda] Sense Key : Aborted Command [current]
>   sd 0:0:0:0: [sda] Add. Sense: Logical block reference tag check failed
>
> All printk's from sd.c now have the same prefix. If logging is turned on, for instance, we also get:
>
>   sd 0:0:0:0: [sda] Send: 0x0fb89180
>   sd 0:0:0:0: [sda] CDB: Read(16): 88 20 00 00 00 00 00 00 00 20 00 00 00 08 00 00
>   sd 0:0:0:0: [sda] Done: 0x0fb89180 SUCCESS
>
> The patches need to be applied in order.

Martin,
Looks good.

If you need to revise anything, perhaps you could add a comment with this url near the list of additional sense codes:

    http://www.t10.org/lists/asc-num.txt

That is the official list of SCSI additional sense codes. Based on the date of my last additional sense code update only this one is missing:

    2Fh/02h DTLPWROMAEBKVF COMMANDS CLEARED BY DEVICE SERVER

Doug Gilbert
Re: end to end error recovery musings
H. Peter Anvin wrote: Ric Wheeler wrote:

We still have the following challenges: (1) read-ahead often means that we will retry every bad sector at least twice from the file system level. The first time, the fs read-ahead request triggers a speculative read that includes the bad sector (triggering the error handling mechanisms) right before the real application read does the same thing. Not sure what the answer is here since read-ahead is obviously a huge win in the normal case.

Probably the only sane thing to do is to remember the bad sectors and avoid attempting to read them; that would mean marking automatic versus explicitly requested requests to determine whether or not to filter them against a list of discovered bad blocks.

Some disks are doing their own read-ahead in the form of a background media scan. Scans are done on request or periodically (e.g. once per day or once per week) and we have tools that can fetch the scan results from a disk (e.g. a list of unreadable sectors). What we don't have is any way to feed such information to a file system that may be impacted.

Doug Gilbert
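The filtering idea above (skip speculative reads that touch a known bad block, but let explicit reads through) can be sketched roughly as follows. This is only an illustration under stated assumptions: the list type, the function names and the idea of a sorted LBA array are hypothetical, not an existing kernel interface.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bad-block list, kept sorted in ascending LBA order. */
struct bad_block_list {
    const uint64_t *lba;
    size_t count;
};

/* True if any known bad block lies in [start, start + nblocks). */
static bool range_has_bad_block(const struct bad_block_list *bl,
                                uint64_t start, uint32_t nblocks)
{
    size_t lo = 0, hi = bl->count;

    /* Binary search for the first entry that is >= start. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (bl->lba[mid] < start)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo < bl->count && bl->lba[lo] - start < nblocks;
}

/* Only speculative (read-ahead) requests are filtered; an explicit
 * read is always issued so the application sees the real error. */
static bool should_skip_read(const struct bad_block_list *bl,
                             uint64_t start, uint32_t nblocks,
                             bool speculative)
{
    return speculative && range_has_bad_block(bl, start, nblocks);
}
```

Results from a background media scan could, in principle, be used to populate such a list, which is the missing link the last paragraph points out.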
Re: [PATCH] bsg: return SAM device status code
Pete Wyckoff wrote:

Use the status codes from the standard, not the shifted-by-one codes that are marked deprecated in scsi.h. This makes bsg v4 status report the same value as sg v3 status too.

Pete, Good pick up. We certainly don't want to re-introduce the SCSI status byte shift from the old days. Doug Gilbert

Signed-off-by: Pete Wyckoff [EMAIL PROTECTED]
---
 block/bsg.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/block/bsg.c b/block/bsg.c
index c85d961..e39a321 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -438,7 +438,7 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr,
 	/*
 	 * fill in all the output members
 	 */
-	hdr->device_status = status_byte(rq->errors);
+	hdr->device_status = rq->errors & 0xff;
 	hdr->transport_status = host_byte(rq->errors);
 	hdr->driver_status = driver_byte(rq->errors);
 	hdr->info = 0;
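The difference the patch makes can be shown with a small user-space sketch. It assumes that status_byte() is the usual (((result) >> 1) & 0x7f) macro from scsi.h; the two function names below are invented for the comparison.

```c
#include <stdint.h>

/* Deprecated style: the status is shifted right by one bit, which is
 * what the status_byte() macro is assumed to do. */
static uint8_t status_shifted(uint32_t result)
{
    return (uint8_t)((result >> 1) & 0x7f);
}

/* SAM style: the low byte of the result is reported unchanged. */
static uint8_t status_sam(uint32_t result)
{
    return (uint8_t)(result & 0xff);
}
```

For a CHECK CONDITION (0x02 in the standard) the shifted form reports 0x01, while the masked form reports 0x02, the same value sg v3 puts in its status field.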
Re: program inquiry is using a deprecated scsi_ioctl , please convert it to SG_IO
James Bottomley wrote: On Thu, 2007-02-22 at 11:59 +0530, MASTHAN DUDEKULA wrote:

Hi JAMES, The following code is SG_IO equivalent of scsi ioctls SCSI_TEST_UNIT_READY

    unsigned char sense_b[32];
    unsigned char turCmbBlk[] = {0x00, 0, 0, 0, 0, 0};
    struct sg_io_hdr io_hdr;

    memset(&io_hdr, 0, sizeof(struct sg_io_hdr));
    io_hdr.interface_id = 'S';
    io_hdr.cmd_len = sizeof(turCmbBlk);
    io_hdr.mx_sb_len = sizeof(sense_b);
    io_hdr.dxfer_direction = SG_DXFER_NONE;
    io_hdr.cmdp = turCmbBlk;
    io_hdr.sbp = sense_b;
    io_hdr.timeout = DEF_TIMEOUT;
    if (ioctl(fd, SG_IO, &io_hdr) < 0) {

Like this, what is the SG_IO equivalent for SCSI_IOCTL_SCSI_COMMAND ?

Judging from the above you have found some sg3_utils code. In a recent version, if you go to the examples subdirectory, you will find the scsi_inquiry.c and sg_simple1.c files. The former shows the usage of the older, deprecated SCSI_IOCTL_SCSI_COMMAND ioctl while the latter does something very similar but uses the SG_IO ioctl interface. The equivalence is that both programs send a SCSI INQUIRY cdb to a device and print out the response. Doug Gilbert

I don't understand your question ... SCSI_IOCTL_SEND_COMMAND sends a SCSI command to the device. Your example of test unit ready above does just that ... it sends a Test Unit Ready command to the device using SG_IO ... exactly what do you not understand about using SG_IO to send commands to the device? James
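For reference, the fragment quoted in this thread can be completed into a small helper along the following lines. This is a sketch, not the sg3_utils code: the function names and the 20 second timeout are assumptions made for the example, and only the sg_io_hdr fields already used above (plus the returned status fields) are touched.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

#define TUR_TIMEOUT_MS 20000 /* assumed value; sg timeouts are in milliseconds */

/* Fill in an sg_io_hdr for TEST UNIT READY (opcode 0x00, no data phase). */
static void prepare_tur(struct sg_io_hdr *hdr, unsigned char cdb[6],
                        unsigned char *sense, unsigned char sense_len)
{
    memset(cdb, 0, 6);
    memset(hdr, 0, sizeof(*hdr));
    hdr->interface_id = 'S';
    hdr->cmd_len = 6;
    hdr->mx_sb_len = sense_len;
    hdr->dxfer_direction = SG_DXFER_NONE;
    hdr->cmdp = cdb;
    hdr->sbp = sense;
    hdr->timeout = TUR_TIMEOUT_MS;
}

/* Returns 0 if the device reported GOOD status, 1 if the command was
 * delivered but some status field is non-zero, -1 if the ioctl failed. */
static int send_tur(int fd)
{
    unsigned char cdb[6];
    unsigned char sense[32];
    struct sg_io_hdr hdr;

    prepare_tur(&hdr, cdb, sense, sizeof(sense));
    if (ioctl(fd, SG_IO, &hdr) < 0)
        return -1;
    if (hdr.status != 0 || hdr.host_status != 0 || hdr.driver_status != 0)
        return 1;
    return 0;
}
```

A caller would open the device node (for example an sg device) and pass the file descriptor to send_tur(). Sending a different command is a matter of changing the cdb and, where there is a data phase, the dxfer_* fields.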
Re: [Bug 7994] New: sleeping function called from invalid context at mm/slab.c:3034
Andrew Morton wrote: On Fri, 16 Feb 2007 22:59:31 -0500 Douglas Gilbert [EMAIL PROTECTED] wrote: The patch that I sent, shown at the end of this post, is incomplete as it doesn't check the return value from kzalloc(..., GFP_ATOMIC). The diff which is in mainline now looks to be OK? Yes. Doug Gilbert --- linux-2.6.20/drivers/scsi/scsi_debug.c2006-11-29 19:14:18.0 -0800 +++ devel/drivers/scsi/scsi_debug.c 2007-02-16 21:21:08.0 -0800 @@ -28,7 +28,6 @@ #include linux/module.h #include linux/kernel.h -#include linux/sched.h #include linux/errno.h #include linux/timer.h #include linux/types.h @@ -51,10 +50,10 @@ #include scsi_logging.h #include scsi_debug.h -#define SCSI_DEBUG_VERSION 1.80 -static const char * scsi_debug_version_date = 20061018; +#define SCSI_DEBUG_VERSION 1.81 +static const char * scsi_debug_version_date = 20070104; -/* Additional Sense Code (ASC) used */ +/* Additional Sense Code (ASC) */ #define NO_ADDITIONAL_SENSE 0x0 #define LOGICAL_UNIT_NOT_READY 0x4 #define UNRECOVERED_READ_ERR 0x11 @@ -65,9 +64,13 @@ static const char * scsi_debug_version_d #define INVALID_FIELD_IN_PARAM_LIST 0x26 #define POWERON_RESET 0x29 #define SAVING_PARAMS_UNSUP 0x39 +#define TRANSPORT_PROBLEM 0x4b #define THRESHOLD_EXCEEDED 0x5d #define LOW_POWER_COND_ON 0x5e +/* Additional Sense Code Qualifier (ASCQ) */ +#define ACK_NAK_TO 0x3 + #define SDEBUG_TAGGED_QUEUING 0 /* 0 | MSG_SIMPLE_TAG | MSG_ORDERED_TAG */ /* Default values for driver parameters */ @@ -95,15 +98,20 @@ static const char * scsi_debug_version_d #define SCSI_DEBUG_OPT_MEDIUM_ERR 2 #define SCSI_DEBUG_OPT_TIMEOUT 4 #define SCSI_DEBUG_OPT_RECOVERED_ERR 8 +#define SCSI_DEBUG_OPT_TRANSPORT_ERR 16 /* When every_nth 0 then modulo every_nth commands: * - a no response is simulated if SCSI_DEBUG_OPT_TIMEOUT is set * - a RECOVERED_ERROR is simulated on successful read and write * commands if SCSI_DEBUG_OPT_RECOVERED_ERR is set. 
+ * - a TRANSPORT_ERROR is simulated on successful read and write + * commands if SCSI_DEBUG_OPT_TRANSPORT_ERR is set. * * When every_nth 0 then after - every_nth commands: * - a no response is simulated if SCSI_DEBUG_OPT_TIMEOUT is set * - a RECOVERED_ERROR is simulated on successful read and write * commands if SCSI_DEBUG_OPT_RECOVERED_ERR is set. + * - a TRANSPORT_ERROR is simulated on successful read and write + * commands if SCSI_DEBUG_OPT_TRANSPORT_ERR is set. * This will continue until some other action occurs (e.g. the user * writing a new value (other than -1 or 1) to every_nth via sysfs). */ @@ -315,6 +323,7 @@ int scsi_debug_queuecommand(struct scsi_ int target = SCpnt-device-id; struct sdebug_dev_info * devip = NULL; int inj_recovered = 0; + int inj_transport = 0; int delay_override = 0; if (done == NULL) @@ -352,6 +361,8 @@ int scsi_debug_queuecommand(struct scsi_ return 0; /* ignore command causing timeout */ else if (SCSI_DEBUG_OPT_RECOVERED_ERR scsi_debug_opts) inj_recovered = 1; /* to reads and writes below */ + else if (SCSI_DEBUG_OPT_TRANSPORT_ERR scsi_debug_opts) + inj_transport = 1; /* to reads and writes below */ } if (devip-wlun) { @@ -468,7 +479,11 @@ int scsi_debug_queuecommand(struct scsi_ mk_sense_buffer(devip, RECOVERED_ERROR, THRESHOLD_EXCEEDED, 0); errsts = check_condition_result; - } + } else if (inj_transport (0 == errsts)) { +mk_sense_buffer(devip, ABORTED_COMMAND, +TRANSPORT_PROBLEM, ACK_NAK_TO); +errsts = check_condition_result; +} break; case REPORT_LUNS: /* mandatory, ignore unit attention */ delay_override = 1; @@ -531,6 +546,9 @@ int scsi_debug_queuecommand(struct scsi_ delay_override = 1; errsts = check_readiness(SCpnt, 0, devip); break; + case WRITE_BUFFER: + errsts = check_readiness(SCpnt, 1, devip); + break; default: if (SCSI_DEBUG_OPT_NOISE scsi_debug_opts) printk(KERN_INFO scsi_debug: Opcode: 0x%x not @@ -954,7 +972,9 @@ static int resp_inquiry(struct scsi_cmnd int alloc_len, n, ret; alloc_len = (cmd[3] 8) + cmd[4]; - 
arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_KERNEL); + arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_ATOMIC); + if (! arr) + return DID_REQUEUE 16; if (devip-wlun) pq_pdt = 0x1e; /* present, wlun */ else if (scsi_debug_no_lun_0 (0 == devip-lun)) @@ -1217,7 +1237,9 @@ static int
Re: [RFC] How to implement linux_block commands in scsi midlayer
Elias, If you want to define a SCSI operation code for internal use within the kernel, please make sure that the byte isn't in the range 0 to 255 (inclusive). Those ones are either t10 defined, reserved or vendor specific for logical_unit or target use. IOW don't do it! Better would be to flag the request for internal use. If you really want to tweak SCSI cdb's, try the last byte (a.k.a. the control byte). Also consider that we a broadening the application of the pass-through code and other packet based protocols could be present. Doug Gilbert Elias Oltmanns wrote: Hi there, in 2.6.19 the request type REQ_TYPE_LINUX_BLOCK has been introduced. This is meant for generic block layer commands to the lower level drivers. I'd like to use this mechanism for a generic queue freezing and disk parking facility. The idea is to issue a command like REQ_LB_OP_PROTECT to the device driver associated to the queue so it can do about it what ever it sees fit. On command completion, the block layer then stops the queue until the unfreeze command is passed in. The IDLE IMMEDIATE command in recent ATA specs provides an unload disk heads feature which I'd like to use when the generic block layer command is issued to an ATA device. Since ATA is implemented as a subsystem of the scsi subsystem, I thought it would be best to add an scsi_cmnd opcode LINUX_BLOCK_CMD to include/scsi/scsi.h and deal with commands of this type very much like block_pc commands. The difference between these two types is that when LINUX_BLOCK_CMD commands are taken off the queue, it is dealt with by a special function of the midlayer to see if there is something to be done about it regardless of the lld associated with the device in question, and then the very same command is passed on to the low level driver to give it a chance to do the more specific stuff. 
In my particular case of a generic disk protect command, the midlayer would be responsible for setting sdev_state to SDEV_BLOCK and the ATA subsystem would issue the actual park command. The patch attached is a first attempt of a generic implementation of LINUX_BLOCK commands into the scsi midlayer. It probably doesn't apply cleanly to 2.6.19 as I've just extracted it from my disk parking branch, so it mainly serves as an example to comment on. Please let me know what you think about this approach and whether I should post a seperate patch for official integration into main line or whether it would be sufficient to leave it a part of the disk parking patch to be submitted later on. Regards, Elias --- drivers/ata/libata-scsi.c | 39 ++- drivers/scsi/scsi_lib.c | 50 + include/linux/blkdev.h|1 + include/scsi/scsi.h |1 + 4 files changed, 86 insertions(+), 5 deletions(-) drivers/scsi/scsi.c |3 ++- diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index a8acf71..6f1c351 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -2558,6 +2558,41 @@ static struct ata_device * ata_find_dev( return NULL; } +/** + * ata_scsi_linux_block - handling of generic block layer commands + * @dev: ATA device to which the command is addressed + * @cmd: SCSI command to execute + * @done: SCSI command completion function + * + * This function checks to see if we recognise the generic block layer + * command and should do anything about it. If we don't know the command, + * we indicate this in a sense response. However, we should fail + * gracefully since the midlayer might handle this command appropriately + * anyway, even without low level intervention. + * + * LOCKING: + * spin_lock_irqsave(host lock) + * + * RETURNS: + * Zero on success, non-zero on failure. 
+ */ + +static int ata_scsi_linux_block(struct ata_device *dev, struct scsi_cmnd *cmd, + void (*done)(struct scsi_cmnd *)) +{ + struct request *req = cmd-request; + int ret = 0; + + switch (req-cmd[0]) { + default: + ata_scsi_set_sense(cmd, ILLEGAL_REQUEST, 0x20, 0x0); + /* Invalid command operation code */ + done(cmd); + break; + } + return ret; +} + static struct ata_device * __ata_scsi_find_dev(struct ata_port *ap, const struct scsi_device *scsidev) { @@ -2856,7 +2891,9 @@ static inline int __ata_scsi_queuecmd(st { int rc = 0; - if (dev-class == ATA_DEV_ATA) { + if (cmd-cmnd[0] == LINUX_BLOCK_CMD) + rc = ata_scsi_linux_block(dev, cmd, done); + else if (dev-class == ATA_DEV_ATA) { ata_xlat_func_t xlat_func = ata_get_xlat_func(dev, cmd-cmnd[0]); diff --git a/drivers/scsi/scsi_lib.c
Re: [Bug 7994] New: sleeping function called from invalid context at mm/slab.c:3034
Andrew, The patch that I sent, shown at the end of this post, is incomplete as it doesn't check the return value from kzalloc(..., GFP_ATOMIC). As I suspected this bug has been exposed before: Jens reported this problem in early January. A more complete patch, with some other changes, was posted 6 weeks ago: http://marc.theaimsgroup.com/?l=linux-scsim=116797354920256w=2 I'm not sure if this patch is in the works or not. Doug Gilbert Douglas Gilbert wrote: James Bottomley wrote: On Mon, 2007-02-12 at 20:06 -0800, Andrew Morton wrote: This is fixed in mainline and I expect that the fix is also lined up for 2.6.20.1. (?) It's definitely in mainline. I've cc'd Doug Gilbert, the scsi_debug maintainer to assess what should be done for 2.6.20.1 James, I thought this had been addressed but I can't find a trail on my laptop. A minimal patch is attached. ChangeLog: - Use GFP_ATOMIC for allocations that can be called from the queuecommand() entry point Signed-off-by: Douglas Gilbert [EMAIL PROTECTED] Doug Gilbert --- linux/drivers/scsi/scsi_debug.c 2006-11-30 07:00:01.0 -0800 +++ linux/drivers/scsi/scsi_debug.c2620atom 2007-02-13 06:43:28.0 -0800 @@ -954,7 +954,7 @@ int alloc_len, n, ret; alloc_len = (cmd[3] 8) + cmd[4]; - arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_KERNEL); + arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_ATOMIC); if (devip-wlun) pq_pdt = 0x1e; /* present, wlun */ else if (scsi_debug_no_lun_0 (0 == devip-lun)) @@ -1217,7 +1217,7 @@ alen = ((cmd[6] 24) + (cmd[7] 16) + (cmd[8] 8) + cmd[9]); - arr = kzalloc(SDEBUG_MAX_TGTPGS_ARR_SZ, GFP_KERNEL); + arr = kzalloc(SDEBUG_MAX_TGTPGS_ARR_SZ, GFP_ATOMIC); /* * EVPD page 0x88 states we have two ports, one * real and a fake port with no device connected. 
@@ -2044,7 +2044,7 @@ } } if (NULL == open_devip) { /* try and make a new one */ - open_devip = kzalloc(sizeof(*open_devip),GFP_KERNEL); + open_devip = kzalloc(sizeof(*open_devip),GFP_ATOMIC); if (NULL == open_devip) { printk(KERN_ERR %s: out of memory at line %d\n, __FUNCTION__, __LINE__); - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: sg version 4 tools
FUJITA Tomonori wrote: I created a git tree for makeshift sg version 4 tools: http://www.kernel.org/git/?p=linux/kernel/git/tomo/sgv4-tools.git;a=summary # not synchronized yet. The interface has changed continuously (and will do). After mainline inclusion, Doug's sg tools support sg v4, I think. Until then, I put tools that I use for sg v4 development. Currently, there is only one tool, sgv4_dd, which can read/write from/to a device via the bsg interface (both ioctl and the read/write interfaces are supported). Here are some examples: # ./sgv4_dd -i /dev/sdb -o /dev/null --count 2 succeeded (read/write interface) # ./sgv4_dd -i /dev/sdb -o /dev/null --count 2 --sgio succeeded (SG_IO) # ./sgv4_dd -i /dev/zero -o /dev/sdb --count 3 --sgio succeeded (SG_IO) # ./sgv4_dd -i /dev/zero -o /dev/sdb --count 3 succeeded (read/write interface) Tomo, Just a few points. While the sgv4_dd command line interface (cli) looks sensible, it diverges from the dd command (which is non-unix like but reasonably fit for service for the function that dd performs). So even though the Unix dd command syntax takes a while to get used to, other testers will be most likely to be comfortable with existing dd syntax. Of the 41 utilities in (the main directory of) sg3_utils, 29 are ported to FreeBSD and Windows. This is done by putting a generic pass-through layer between those 29 utilities and the OS specific pass-throughs ***. The remaining 12 utilities are either: a) linux specific (e.g. sg_reset and sg_map26) b) or a bit too complicated due to other system calls (e.g. sg_dd) to convert c) both a) and b) (e.g. sgm_dd) The generic pass through layer is defined with bi-directional in mind. It also should be relatively easy to allow for two linux specific pass-throughs (i.e. sgv3 and sgv4) so that the common 29 utilities just work on either pass-through (by compile or run time switch). In summary, I don't think that there needs to be a sg4_utils. 
As you suggest, sgv4_dd can be incorporated into the existing sg3_utils at a convenient time. sg v4 represents an alternate interface for a linux pass-through and the bulk of sg3_utils already supports 4 pass-throughs via a common code base. [The four are linux (sg v3), FreeBSD, Tru64, Windows (from NT forward).]

*** smartmontools takes the same approach and it supports several pass-throughs for Windows.

Doug Gilbert
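The layering described in this message (utilities call a generic pass-through, which dispatches to an OS or interface specific one) might look roughly like the sketch below. All names here are hypothetical and the two backends are stubs; this is not the sg3_utils pass-through API, only an illustration of a run time switch between an sg v3 and an sg v4 backend.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical OS-neutral description of one SCSI command. */
struct pt_request {
    const unsigned char *cdb;
    int cdb_len;
    unsigned char *data_in;
    int data_in_len;
    int scsi_status; /* filled in by the backend */
};

/* One entry per specific pass-through. */
struct pt_backend {
    const char *name;
    int (*do_cmd)(int fd, struct pt_request *req);
};

/* Stub backends: a real one would issue the SG_IO ioctl with a v3 or
 * v4 header here. They only mark the request as completed. */
static int sgv3_do_cmd(int fd, struct pt_request *req)
{
    (void)fd;
    req->scsi_status = 0;
    return 0;
}

static int sgv4_do_cmd(int fd, struct pt_request *req)
{
    (void)fd;
    req->scsi_status = 0;
    return 0;
}

static const struct pt_backend pt_backends[] = {
    { "sgv3", sgv3_do_cmd },
    { "sgv4", sgv4_do_cmd },
};

/* Run time selection by name; returns NULL if the name is unknown. */
static const struct pt_backend *pt_select(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(pt_backends) / sizeof(pt_backends[0]); i++)
        if (strcmp(pt_backends[i].name, name) == 0)
            return &pt_backends[i];
    return NULL;
}
```

A utility written against struct pt_request would not need to change when a new backend is added, which is the argument made above against a separate sg4_utils package.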
Re: [PATCH] bind bsg to request_queue instead of gendisk
Jeff Garzik wrote: On Wed, Feb 14, 2007 at 02:53:31AM +0900, FUJITA Tomonori wrote: It seems that it would be better to bind bsg devices to request_queue instead of gendisk. This enables any objects to define own request_handler and create own bsg device (under sysfs). Possible enhancements: - I removed gendisk but it would be better for objects having gendisk to keep it for nice features like disk stats. - Objects that wants to use bsg need to setup a request_queue. Maybe wrapper functions to setup a request_queue for them would be useful. This patch was tested only with disk drivers. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- block/bsg.c| 37 + What is this patch against? scsi-misc? I certainly like the bsg solution, but block/bsg.c does not exist in my vanilla linux-2.6.git tree :) www.kernel.org/pub/scm/linux/kernel/git/axboe/linux-2.6-block.git branch: bsg - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bug 7994] New: sleeping function called from invalid context at mm/slab.c:3034
James Bottomley wrote: On Mon, 2007-02-12 at 20:06 -0800, Andrew Morton wrote: This is fixed in mainline and I expect that the fix is also lined up for 2.6.20.1. (?) It's definitely in mainline. I've cc'd Doug Gilbert, the scsi_debug maintainer to assess what should be done for 2.6.20.1 James, I thought this had been addressed but I can't find a trail on my laptop. A minimal patch is attached. ChangeLog: - Use GFP_ATOMIC for allocations that can be called from the queuecommand() entry point Signed-off-by: Douglas Gilbert [EMAIL PROTECTED] Doug Gilbert --- linux/drivers/scsi/scsi_debug.c 2006-11-30 07:00:01.0 -0800 +++ linux/drivers/scsi/scsi_debug.c2620atom 2007-02-13 06:43:28.0 -0800 @@ -954,7 +954,7 @@ int alloc_len, n, ret; alloc_len = (cmd[3] 8) + cmd[4]; - arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_KERNEL); + arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_ATOMIC); if (devip-wlun) pq_pdt = 0x1e; /* present, wlun */ else if (scsi_debug_no_lun_0 (0 == devip-lun)) @@ -1217,7 +1217,7 @@ alen = ((cmd[6] 24) + (cmd[7] 16) + (cmd[8] 8) + cmd[9]); - arr = kzalloc(SDEBUG_MAX_TGTPGS_ARR_SZ, GFP_KERNEL); + arr = kzalloc(SDEBUG_MAX_TGTPGS_ARR_SZ, GFP_ATOMIC); /* * EVPD page 0x88 states we have two ports, one * real and a fake port with no device connected. @@ -2044,7 +2044,7 @@ } } if (NULL == open_devip) { /* try and make a new one */ - open_devip = kzalloc(sizeof(*open_devip),GFP_KERNEL); + open_devip = kzalloc(sizeof(*open_devip),GFP_ATOMIC); if (NULL == open_devip) { printk(KERN_ERR %s: out of memory at line %d\n, __FUNCTION__, __LINE__);
Re: Random scsi disk disappearing
Randy Dunlap wrote: [lkml dropped] [old thread] On Fri, 18 Aug 2006 11:11:39 +0200 Andreas Herrmann wrote: On 18.08.2006 00:33 Stefan Richter [EMAIL PROTECTED] wrote: Andreas Herrmann wrote: Anyone interested in a script to conveniently interpret or change the SCSI logging level? Such a script (scsi_logging_level) exists in the s390-tools package (version 1.5.3). That would be very welcome. Hi Doug, Did you give any thought to adding this script (or a current version of it, from http://www-128.ibm.com/developerworks/linux/linux390/s390-tools-1.5.4.html) to sg3-utils? or would you give it some thought? I think that's a better solution than adding to the kernel tree (and better than getting it from developerworks :). Thanks. Randy, The recently released sg3_utils version 1.23 contains a scripts directory. The files in there are: README sas_disk_blink scsi_logging_level scsi_mandat scsi_readcap scsi_ready scsi_satl scsi_start scsi_stop scsi_temperature Does one look familiar? If there is a later version, I can put it in sg3_utils-1.24 . Doug Gilbert - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Writing performance problem with SAS1068
Bernardo Innocenti wrote: Hello, I've stumbled onto a strange performance problem on a new server: reading from disks is fast (70-80MB/s), but writing is extremely slow (13-15MB/s). I've measured it like this: dd if=/dev/zero of=/dev/sdd bs=4096 count=65536 conv=fdatasync 65536+0 records in 65536+0 records out 268435456 bytes (268 MB) copied, 17.7004 seconds, 15.2 MB/s # dd if=/dev/zero of=/dev/sdj bs=4096 count=65536 conv=fdatasync 65536+0 records in 65536+0 records out 268435456 bytes (268 MB) copied, 2.24953 seconds, 119 MB/s # dd if=/dev/zero of=/dev/sdd bs=4096 count=65536 conv=fdatasync 65536+0 records in 65536+0 records out 268435456 bytes (268 MB) copied, 2.3246 seconds, 115 MB/s Both /dev/sdj and /dev/sdd connect via an expander to the same SAS disk. /dev/sdj is via the LT aic94xx driver and a PCI-X HBA. /dev/sdd is via the mptsas driver and a SAS1068 (PCIe) based HBA. The kernel version is 2.6.20-rc5. Looks good to me. You may like to check that Write Cache Enable is on with: 'sdparm --get=WCE /dev/sdd'. Doug Gilbert *but*: if I rebuild the kernel and change CONFIG_FUSION_MAX_SGE from 40 (Fedora's default) to 128 (maximum value), it suddenly gets much faster: 31MB/s! Looks very much like an interrupt problem to me. Maybe increasing the scatter gather mitigates the problem of missing completion notifications. Evidence: Exhibit A: custom kernel config for 2.6.18-1.2257.fc5.bernie http://www.codewiz.org/helium_logs/config Exhibit B: dmesg output from said kernel http://www.codewiz.org/helium_logs/dmesg Exhibit C: misc proc files, and all that http://www.codewiz.org/helium_logs/ Exhibit D: motherboard and chipset specification http://www.supermicro.com/products/motherboard/Xeon3000/3010/PDSME+.cfm Circumstantial evidence: - Seems to affect just the LSI SAS1068 PCI-X controller. 
The on-board AHCI controller writes very fast (60MB/s) - I've seen a very similar writing bottleneck with a Promise TX4 SATA controller (not PCI-X) on a server with a similar motherboard (Supermicro with Mukilteo 3000). - Passing mpt_msi_enable=1 doesn't change anything - FreeBSD 6.2 is even slower: writes at 7MB/s - OpenSolaris is much, much slower... less than 1MB/s. - Windows Vista (rc something) writes at 90MB/s. Too fast to believe, maybe dd from Cygwin is misbehaving. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SG_IO weirdness
Cameron, Steve wrote: I noticed that when I do two SG_IO ioctls to a target device (say, tape drive, disk drive, whatever) in which the first request is well formed (e.g. an inquiry) and the second one has a malformed CDB, such that it gets check condition with sense key == 5 (ILLEGAL REQUEST), the data buffer returned for the second malformed SG_IO request is filled out with the same data as was returned for the first successful command (e.g. the same inquiry data again.) I'm using separate data buffers for the two commands, and memsetting them to zero before calling ioctl(). I don't think this data is coming from the device, as it happens with every device I've tried. Is that normal? Seems like for a malformed request, the data buffer should not be transferred at all, much less transferred with contents of a prior request's data buffer. Kernel is 2.6.18 from kernel.org. Steve, Even though the SCSI status is CHECK CONDITION, the data-in buffer may still be transferred. One obvious example is a READ command when the sense key is RECOVERED ERROR. The sg driver and I suspect the block layer SG_IO do not check the SCSI status to determine whether or not to transfer the data-in buffer (or where it would have been DMA-ed to if the command worked) back to user space. If it was _direct_ IO then the block layer SG_IO and the sg driver would have no control over the data-in transfer (apart from setting it up). Both the sg driver and the block layer SG_IO could check the resid field which a LLD should set after a DMA (especially inbound). However LLDs are not compelled to set resid properly. So a few questions: - block layer SG_IO, the sg driver or both? - indirect IO (i.e. O_DIRECT not set)? - did the offending process have superuser permissions? - did the resid field indicate a short data-in transfer? The two requests were done from the same process, I haven't tried two separate processes to see if one process could by this method access another process's data. 
I did try using two devices, so the first well formed command went to one device, and the 2nd, malformed command went to another device. In that case, I didn't get the same buffer back again, but garbage. (some recognizeable strings, en_US was in there...) Is this a problem, or is this a matter of just don't do that.? As long as the SCSI status and sense buffer are conveyed back properly _and_ this is only observed when the process has superuser permissions, then I wouldn't regard it as serious. Others may disagree. Doug Gilbert - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
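Since the data-in buffer may be copied back regardless of the SCSI status, a cautious caller can use resid to bound how much of the buffer it trusts. The helper below is a minimal sketch with an invented name. As noted above, LLDs are not compelled to set resid properly, so the result is at best an upper bound on the valid data.

```c
#include <scsi/sg.h>

/* Number of data-in bytes the caller may treat as transferred,
 * computed as dxfer_len - resid with defensive clamping. */
static unsigned int sg_valid_data_in(const struct sg_io_hdr *hdr)
{
    if (hdr->dxfer_direction != SG_DXFER_FROM_DEV)
        return 0;                 /* no data-in phase was requested */
    if (hdr->resid <= 0)
        return hdr->dxfer_len;    /* nothing reported as missing */
    if ((unsigned int)hdr->resid >= hdr->dxfer_len)
        return 0;                 /* nothing usable was transferred */
    return hdr->dxfer_len - (unsigned int)hdr->resid;
}
```

The caller would still check the status fields and the sense buffer first, and only then decide whether the bytes counted here are meaningful for the command that was sent.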
Re: SG_IO weirdness
Cameron, Steve wrote:

Steve, Even though the SCSI status is CHECK CONDITION, the data-in buffer may still be transferred. One obvious example is a READ command when the sense key is RECOVERED ERROR.

Yep, of course.

The sg driver and I suspect the block layer SG_IO do not check the SCSI status to determine whether or not to transfer the data-in buffer (or where it would have been DMA-ed to if the command worked) back to user space. If it was _direct_ IO then the block layer SG_IO and the sg driver would have no control over the data-in transfer (apart from setting it up). Both the sg driver and the block layer SG_IO could check the resid field which a LLD should set after a DMA (especially inbound). However LLDs are not compelled to set resid properly. So a few questions:

- block layer SG_IO, the sg driver or both?
  sg driver.
- indirect IO (i.e. O_DIRECT not set)?
  indirect IO, O_DIRECT was not set.
- did the offending process have superuser permissions?
  Yes.
- did the resid field indicate a short data-in transfer?
  resid == 64, the requested buffer was 1088 bytes. (If I interpret that right, it means that all but 64 bytes were transferred, that is, 1024 bytes were transferred? Odd, considering the CDB was nonsense.)

Steve, From memory, between SPC-2 and SPC-3 the INQUIRY allocation length field went from 8 bits to 16 bits. If you do the above calculation modulo 256 it comes out correct :-) The moral here is don't set INQUIRY lengths > 252 unless the target can handle it. There is no point anyway for a standard INQUIRY (EVPD=0, CmdDt is obsolete). With VPD pages you can do a double fetch, the first one 4 bytes long to pick up the page length field. But then again you said the cdb was nonsense.

Now it is still a bit fuzzy because there is the allocation length field in some cdbs and the dxfer_len given to sg_io_hdr. I would think that the LLD should concentrate on the latter and set resid accordingly. That makes me wonder about the LLD involved (the sg driver just passes resid through).

As long as the SCSI status and sense buffer are conveyed back properly _and_ this is only observed when the process has superuser permissions, then I wouldn't regard it as serious. Others may disagree.

I haven't tried it as non-superuser. (And couldn't, unless I chmod'ed /dev/sg* )

The sg driver zeros out the scatter gather elements for non-superusers. chmod'ing is not always needed, for example a non-superuser may well have permissions on a USB cd/dvd drive (including the sg device node) in some distributions.

Doug Gilbert
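The allocation length point and the VPD "double fetch" can be sketched as below. The helper names are invented for the example. The layout assumed is the SPC-3 one: INQUIRY opcode 0x12, EVPD in bit 0 of byte 1, page code in byte 2, a 16 bit allocation length in bytes 3 and 4, and a VPD page header whose bytes 2 and 3 hold the page length.

```c
#include <stdint.h>
#include <string.h>

#define INQUIRY_OPCODE 0x12

/* Build a 6 byte INQUIRY cdb. A target that only implements the older
 * 8 bit allocation length looks at byte 4 alone. */
static void build_inquiry_cdb(unsigned char cdb[6], int evpd,
                              uint8_t page, uint16_t alloc_len)
{
    memset(cdb, 0, 6);
    cdb[0] = INQUIRY_OPCODE;
    cdb[1] = evpd ? 0x01 : 0x00;
    cdb[2] = evpd ? page : 0;
    cdb[3] = (unsigned char)(alloc_len >> 8);
    cdb[4] = (unsigned char)(alloc_len & 0xff);
}

/* Total size of a VPD page, taken from its 4 byte header: the page
 * length field counts the bytes that follow the header. */
static unsigned int vpd_total_len(const unsigned char hdr[4])
{
    return ((((unsigned int)hdr[2]) << 8) | hdr[3]) + 4;
}
```

With an allocation length of 1088 (0x0440), byte 4 of the cdb is 0x40, so a target that reads only that byte sees 64; that is the modulo 256 arithmetic referred to above. For a VPD page, the first fetch uses an allocation length of 4, and vpd_total_len() on that header gives the allocation length for the second fetch.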
Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR
Alan wrote:

The interesting point of this question is about the typical pattern of IO errors. On a read, it is safe to assume that you will have issues with some bounded numbers of adjacent sectors.

Which in theory you can get by asking the drive for the real sector size from the ATA7 info. (We ought to dig this out more as it's relevant for partition layout too).

I really like the idea of being able to set this kind of policy on a per drive instance since what you want here will change depending on what your system requirements are, what the system is trying to do (i.e., when trying to recover a failing but not dead yet disk, IO errors should be as quick as possible and we should choose an IO scheduler that does not combine IO's).

That seems to be arguing for a bounded lifetime including retry run time for a command. That's also more intuitive for real time work and for end user setup. Either work or fail within n seconds.

Which is more or less the streaming feature set in recent ATA standards. [Alas, streaming and NCQ/TCQ can't be done with the same access.]

SCSI has its Read Write Error Recovery mode page which doesn't have timeouts but does have Read and Write Retry Counts amongst other fields that control the amount (and indirectly the time) of attempted error recovery.

Doug Gilbert
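As a rough illustration of the retry count fields mentioned above, the sketch below pulls them out of a Read Write Error Recovery mode page. The byte offsets (page code 0x01 in byte 0, read retry count in byte 3, write retry count in byte 8) are quoted from memory of the SBC page layout and should be checked against the standard; the struct and function names are invented.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the two fields of interest. */
struct rw_err_recovery {
    uint8_t read_retry_count;
    uint8_t write_retry_count;
};

/* 'page' points at the start of the mode page itself, i.e. after the
 * mode parameter header and any block descriptors.
 * Returns 0 on success, -1 if this is not a usable page 0x01. */
static int parse_rw_err_recovery(const unsigned char *page, size_t len,
                                 struct rw_err_recovery *out)
{
    if (len < 9 || (page[0] & 0x3f) != 0x01 || page[1] < 7)
        return -1;
    out->read_retry_count = page[3];
    out->write_retry_count = page[8];
    return 0;
}
```

The page would normally be obtained with a MODE SENSE command (or a tool such as sdparm) and changed with MODE SELECT, neither of which is shown here.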
[Announce] sg3_utils-1.23 available
sg3_utils is a package of command line utilities for sending SCSI (and some ATA) commands to devices. This package targets the linux kernel (lk) 2.6 and lk 2.4 series. In the lk 2.6 series these utilities (except sgp_dd) can be used with any devices that support the SG_IO ioctl. Ported to FreeBSD, Tru64 and Windows (cygwin and mingw).

This version adds sg_read_buffer and sg_write_buffer utilities. Cleans up command line interface of older utilities and all man pages have been reworked. Package synchronized with SPC-4 revision 8 and SBC-3 revision 8. Copy of ChangeLog below.

For an overview of sg3_utils and downloads see this page:
    http://www.torque.net/sg/sg3_utils.html
The sg_dd utility has its own page at:
    http://www.torque.net/sg/sg_dd.html
The SG_IO ioctl is discussed at:
    http://www.torque.net/sg/sg_io.html
A full changelog can be found at:
    http://www.torque.net/sg/p/sg3_utils.CHANGELOG

A release announcement has been sent to freshmeat.net .

Top of Changelog:

Changelog for sg3_utils-1.23 [20070131]
  - sg_read_buffer: new utility
  - sg_write_buffer: new utility
  - sg_opcodes, sg_senddiag, sg_logs, sg_modes, sg_start, sg_inq, sg_turs, sg_readcap, sg_rbuf: add getopt_long() based cli; old and new cli selectable, new getopt_long cli is default
  - scripts: new subdirectory containing some bash scripts
  - add scripts/README file
  - sg_reassign: add '--hex' option for grown and primary lists
  - sg_rtpg: add '--raw' option
  - sg_lib.h, sg_cmds_basic.h + sg_cmds_extra.h: add C++ 'extern "C"' wrappers
  - cleanup C code so it will compile as C++
  - sg_lib: sync with spc4r08
  - include inttypes.h, use PRId64 instead of %lld form
  - fix sg_get_sense_str() when empty sense buffer
  - win32 port: add Makefile.mingw + related support for MinGW
  - sg_cmds_extra: add sg_ll_read_buffer() and sg_ll_write_buffer()
  - sg_dd, sgp_dd, sgm_dd, sg_read: use lseek64() instead of llseek.c
  - sgm_dd: accept coe=n for interworking with sg_dd
  - sg_rdac: fix on non-linux ports
  - sg_ses: fix spurious warning in additional element status page
  - '-rr' option outputs a diagnostic page in binary to stdout
  - sg_opcodes: add command timeout descriptor support (spc4r08)
  - change linux specific pass through to generic pass through
  - sg_logs: add 'name=value' decoding for SAS specific lpage
  - examples+utils subdirectories: remove symlinks
  - synchronize man pages with usage messages
  - sg3_utils.spec: rework

Doug Gilbert
Re: [PATCH] RESEND: SCSI, libata: add support for ATA_16 commands to libata ATAPI devices
James Bottomley wrote: On Thu, 2007-02-01 at 04:54 -0500, Jeff Garzik wrote:

Agreed... but that doesn't make it the /right/ thing to do ;-) The logic behind the current code, which limits to the maximum size allowed by an attached device on the port, is mainly to leverage the SCSI layer as a filter for bad CDB lengths. IOW, it's called being lazy ;-)

But you're requesting code changes in the SCSI layer because of this incorrect usage. max_cdb is supposed to be the *host* limit. The mid layer finds out and respects device limits separately from this.

To be more pedantic:

    actual_max_cdb = min(MAX_COMMAND_SIZE, host_limit)

Since the host is a bridge, that could be a limit on the near side (i.e. PCI (unlikely)) or the outer side (i.e. transport initiator (port)). In modern HBAs the host_limit is likely to be greater than 16, to allow for advanced SBC and OSD commands. However currently MAX_COMMAND_SIZE (defined in scsi/scsi_cmnd.h) is 16.

It is the ATAPI _transport_ that has the 12 byte cdb limit *** (at least according to MMC-5 rev Annex A; is S-ATAPI any better?). Other MMC transports referred to in MMC-5 are SPI, SBP (IEEE 1394) and USB mass storage; and no mention is made of cdb length limits for them. Since ATAPI is the dominant transport for cd/dvd drives, MMC doesn't define any commands over 12 bytes in length, but both SPC (which MMC should honour) and SSC-3 (think tape drives, ATAPI connected) do.

My point is that the linux block layer and scsi mid level should get out of the business of putting hard limits in place. Why? Since kernel limits are at best necessary but not sufficient, the upper layers still need to be able to cope with errors associated with that limit. So why have the limit? Does the kernel do analysis to find out whether a USB connected DVD drive has a USB to ATAPI bridge externally? I don't think so.

There is a role to fetch information that may act as a guide when a ULD has a choice of commands to build (e.g. sd deciding between READ(10) and READ(16)). Let the cdb size bottleneck (or whatever) report an error and upper layers that are impacted, including user space programs, can act accordingly. If the kernel really wants to offload complexity to the user space, the kernel needs to get out of the business of trying to foresee errors. It needs to get better at coping with errors and if possible adapting its behaviour.

*** not the host nor the device

Doug Gilbert
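To make the min() relation in this message concrete, here is a small sketch in Python (not kernel code). MAX_COMMAND_SIZE = 16 is the value quoted above for that era; the host_limit values and the helper names are made up for illustration.

```python
# Sketch of the relation quoted above; not kernel code.
# MAX_COMMAND_SIZE = 16 is the value cited in the message for scsi/scsi_cmnd.h.
MAX_COMMAND_SIZE = 16


def actual_max_cdb(host_limit, max_command_size=MAX_COMMAND_SIZE):
    """Effective CDB size limit: min(MAX_COMMAND_SIZE, host_limit)."""
    return min(max_command_size, host_limit)


def cdb_accepted(cdb, host_limit):
    """True if a CDB of this length would pass the hard limit (hypothetical helper)."""
    return len(cdb) <= actual_max_cdb(host_limit)


# READ(16) is a 16 byte command (opcode 0x88); the payload bytes are zeroed here.
READ_16 = bytes([0x88] + [0] * 15)

if __name__ == "__main__":
    print(actual_max_cdb(32))         # a host allowing 32 is still capped by the kernel
    print(cdb_accepted(READ_16, 12))  # a 12 byte host limit rejects READ(16)
```

The point of the message stands either way: passing such a check does not guarantee the command gets through a 12 byte ATAPI bridge further down the path, so upper layers still have to cope with the error.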
Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR
Ric Wheeler wrote: Jeff Garzik wrote: Mark Lord wrote: Eric D. Mudama wrote: Actually, it's possibly worse, since each failure in libata will generate 3-4 retries. With existing ATA error recovery in the drives, that's about 3 seconds per retry on average, or 12 seconds per failure. Multiply that by the number of blocks past the error to complete the request.. It really beats the alternative of a forced reboot due to, say, superblock I/O failing because it happened to get merged with an unrelated I/O which then failed.. Etc.. FWIW -- speaking generally -- I think there are inevitable areas where libata error handling combined with SCSI error handling results in suboptimal error handling. Just creating a list of this behavior should be handled this way, but in reality is handled in this silly way would be very helpful. I agree - Tejun has done a great job at giving us a great base. Next step is to get clarity on what the types of errors are and how to differentiate between them (and maybe how that would change by class of device?). Error handling is tough to get right, because the code is exercised so infrequently. Tejun has actually done an above-average job here, by making device probe, hotplug and other exceptions go through the libata EH code, thereby exercising the EH code more than one might normally assume. Some errors in libata probably should not be retried more than once, when we have a definitive diagnosis. Suggestions for improvements are welcome. Jeff One thing that we find really useful is to inject real errors into devices. Mark has some patches that let us inject media errors, we also bring back failed drives and run them through testing and occasionally get to use analyzers, etc to inject odd ball errors. Ric, Both ATA (ATA8-ACS) and SCSI (SBC-3) have recently added command support to flag a block as uncorrectable. There is no need to send bad long data to it and suppress the disk's automatic re-allocation logic. 
In the case of ATA it is the WRITE UNCORRECTABLE command. In the case of SCSI it is the WR_UNCOR bit in the WRITE LONG command. It seems that due to SAT any useful capability in the ATA command set will soon appear in the corresponding SCSI command set, if it is not already there. Doug Gilbert
SAS illegal topologies [was Re: [PATCH 1/4 v2] libsas: Don't BUG when connecting two expanders via wide port]
Darrick J. Wong wrote: libsas: Don't BUG when connecting two expanders via wide port When a device is connected to an expander, the discovery process goes through sas_ex_discover_dev to figure out what's attached to the phy. If it is the case that the phy being discovered happens to be the second phy of a wide link to an expander, that discover_dev function will incorrectly call sas_ex_discover_expander, which creates another sas_port and tries to attach the other sas_phys to the new port, thus triggering a BUG. The correct thing to do is to check the other ex_phys of the expander to see if there's a sas_port for this sas_phy, and attach the sas_phy to the existing sas_port. This is easily triggered if one enables the phys of a wide port between expanders one by one. This second version of the patch fixes a small regression in the case where all the phys show up at once and we accidentally try to attach to a port that hasn't been created yet. Darrick, Okay. Now I'm wondering what the discovery algorithm in libsas does if it finds truly illegal connections between expanders. The spec defines what is illegal but says it is vendor specific what will be done. One approach is to use the SMP PHY CONTROL function to disable the phy (or the phys at both ends of the illegal link). The next trick is how to tell the user who just connected a cable between expanders that you can't do that!. Tools like my smp_discover could alert a user to a disabled phy but without turning it back on (and causing the libsas discovery algorithm another headache) my SMP utilities don't know what it is connected to. Another question is which link to disable. Imagine three expanders interconnected with 3 links which is illegal. Breaking any one link makes it legal, but which one to break? Last seen, or perhaps the link which has the largest SAS address sum ... 
Doug Gilbert
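One of the tie-breakers floated above ("the link which has the largest SAS address sum") can be sketched as follows. This is only an illustration of that idea, not anything libsas does; the function name and the SAS addresses are invented. Each link is normalized to (lower address, higher address) so that every observer picks the same link regardless of discovery order.

```python
def link_to_disable(links):
    """Pick the link whose two SAS addresses have the largest sum.

    links: iterable of (sas_addr_a, sas_addr_b) integer pairs.
    Each pair is sorted first so the choice does not depend on which
    end of the link was discovered first; equal sums fall back to
    comparing the sorted pair itself, which keeps the result deterministic.
    """
    normalized = [tuple(sorted(link)) for link in links]
    return max(normalized, key=lambda link: (link[0] + link[1], link))


if __name__ == "__main__":
    # Three expanders joined in a loop (hypothetical SAS addresses).
    a, b, c = 0x500605B000000010, 0x500605B000000020, 0x500605B000000030
    loop = [(a, b), (b, c), (c, a)]
    low, high = link_to_disable(loop)
    print(f"disable link {low:#x} <-> {high:#x}")
```

A "last seen" policy would need discovery timestamps instead, which this sketch does not model.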
Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR
Ric Wheeler wrote: Mark Lord wrote: Eric D. Mudama wrote: Actually, it's possibly worse, since each failure in libata will generate 3-4 retries. With existing ATA error recovery in the drives, that's about 3 seconds per retry on average, or 12 seconds per failure. Multiply that by the number of blocks past the error to complete the request.. It really beats the alternative of a forced reboot due to, say, superblock I/O failing because it happened to get merged with an unrelated I/O which then failed.. Etc.. Definitely an improvement. The number of retries is an entirely separate issue. If we really care about it, then we should fix SD_MAX_RETRIES. The current value of 5 is *way* too high. It should be zero or one. Cheers I think that drives retry enough, we should leave retry at zero for normal (non-removable) drives. Should this be a policy we can set like we do with NCQ queue depth via /sys ? The transport might also want a say. I see ABORTED COMMAND errors often enough with SAS (e.g. due to expander congestion) to warrant at least one retry (which works in my testing). SATA disks behind SAS infrastructure would also be susceptible to the same random failures. Transport Layer Retries (TLR) in SAS should remove this class of transport errors but only SAS tape drives support TLR as far as I know. Doug Gilbert We need to be able to layer things like MD on top of normal drive errors in a way that will produce a system that provides reasonable response time despite any possible IO error on a single component. Another case that we end up doing on a regular basis is drive recovery. Errors need to be limited in scope to just the impacted area and dispatched up to the application layer as quickly as we can so that you don't spend days watching a copy of huge drive (think 750GB or more) ;-) ric - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/7] Roll-up of libsas and aic94xx patches
Darrick J. Wong wrote:

Hi all, This is a roll-up of all of my uncommitted patches against libsas and aic94xx to date. The first patch features an important fix for an incorrect port deformation after a phy reset event. The next two patches in this set complete the reorganization of the sas_rphy_{delete,free} calls after errors during discovery. The next two patches amend the SAS error handler to be able to handle scsi_cmnds that have completed successfully but with a failure code, first by trying START UNIT if the disk is not spinning, second by trying to reset the device, and finally offlining the device if nothing works.

Darrick, The "reset the device" step is a bit vague. Would that be a LU reset (task management function) or a hard reset? If the latter then it will cause collateral damage if the target contains multiple logical units (i.e. it will reset all of them, not just the one failing to spin up).

Doug Gilbert
lsscsi-0.19 released
lsscsi is a utility that uses sysfs in linux 2.6 series kernels to list information about SCSI devices and SCSI hosts. Both a compact format (default) which is one line per device and a classic format (like the output of 'cat /proc/scsi/scsi') are supported.

Version 0.19 is available at http://www.torque.net/scsi/lsscsi.html More information can be found on that page including examples and a Download section for tarballs, rpm and deb packages. This version adds transport specific information.

ChangeLog:

Version 0.19 2007/1/25
  - add transport information (target + initiator)
  - start with FC, SAS, SPI, iSCSI and SBP
  - alter ISCSI for 2.6.20 changes
  - SAS fix for lk 2.6.20 (SYSFS_DEPRECATED=n)
  - enhance host name search when proc_name is NULL
  - implement filter option for '--hosts'
  - accept 'hostn' as first item in filter to mean host n
  - output more host attributes when '-Hll' given
  - add '--list' (or '-L') option output attribute=value entries, one per line

Doug Gilbert
Re: Linux Virtual SCSI HBAs and Virtual disks
Aboo Valappil wrote:

Hi Stefan Richter, Thanks everyone for their advice on this. As per your advice, I did the following: when the last user space target serving the scsi_host quits, the queue command does the following on new commands coming through:

    sc->result = DID_NO_CONNECT << 16;
    sc->resid = sc->request_bufflen;
    set_sensedata_commfailure(sc);   - This sets the sense buffer with Device Not Ready/Logical Unit Communication Failure.
    done(sc);

The scsi_host will remain in the kernel. Let the EH thread handle the queued commands (if any). If the user target wants to reconnect to the same scsi_host, it can do so (just re-run the user space target again with the same command line parameters). This connection from the newly started target will make the HBA healthy again and start serving IO.

I implemented a new IOCTL to remove this scsi_host if the user process really needs to. This removal will first finish all the SCSI commands (with the above status results) queued on the scsi_host (if any) and then remove the scsi_host. Also the module unload will delete all the scsi_hosts created after finishing all the commands queued with the above status and sense information.

I also implemented passing of sense code information from user space to sense_buffer. A little more work needs to be done on this. Also, I need to make sure that all the locking used inside is correctly implemented to prevent dead locks and improve efficiency. The new version is available http://vscsihba.aboo.org/vscsihbav204.gz

A few observations from testing this version:

# ./start_target.sh id=3 -files ../../zz_lun0 -v
# lsscsi
[0:0:0:0]    disk    Linux    scsi_debug       0004  /dev/sda
[1:0:0:0]    disk    VirtualH VHD              0     /dev/sdb

So id=3 doesn't look like the target identifier. If not, what is it?

Here is an attempt to fetch the Read Write Error Recovery mode page:

# sdparm -p rw -vv /dev/sg1
    inquiry cdb: 12 00 00 00 24 00
    /dev/sg1: VirtualH  VHD  0
    mode sense (10) cdb: 5a 00 01 00 00 00 00 00 08 00
mode sense (10): Probably uninitialized data.
  Try to view as SCSI-1 non-extended sense:
  AdValid=0  Error class=0  Error code=0
Read write error recovery mode page [0x1] failed

That implies a sense buffer full of zeroes. The debug output from start_target.sh associated with that attempt:

SCSI cmd Lun=00 id=2D CDB=12 00 00 00 24 00 00 00 08 00 00 00 00 00 00 00
SCSI cmd Lun=00 id=2D completed, status=0
SCSI cmd Lun=00 id=2E CDB=5A 00 01 00 00 00 00 00 08 00 00 00 00 00 00 00
SCSI cmd Lun=00 id=2E completed, status=2
SCSI cmd Lun=00 id=2F CDB=03 00 00 00 FC 00 00 00 08 00 00 00 00 00 00 00
SCSI cmd Lun=00 id=2F completed, status=0
SCSI cmd Lun=00 id=30 CDB=00 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00
SCSI cmd Lun=00 id=30 completed, status=0

So that is an INQUIRY [expected], MODE SENSE(10) [expected], REQUEST SENSE [what, no autosense??] and TEST UNIT READY [ah oh, error recovery??] sequence.

Perhaps you could examine the way scsi_debug (or most other LLDs) does autosense. This modern technique (used for about the last 12 years) relieves the scsi midlevel of having to send a follow up REQUEST SENSE.

It would be easier to read those SCSI commands in the debug output if they were trimmed to their actual lengths (e.g. the INQUIRY is 12 00 00 00 24 00).

Doug Gilbert
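The trimming suggested above could be done along these lines. This is a sketch in Python rather than in the target's own code, and the function names are invented. It relies on the usual convention that the top three bits of the opcode (the group code) give the CDB length: group 0 is 6 bytes, groups 1 and 2 are 10, group 4 is 16 and group 5 is 12. Group 3 (which includes variable length CDBs) and the vendor specific groups 6 and 7 are left untrimmed.

```python
# CDB length by opcode group code (opcode >> 5). Groups 3, 6 and 7 are
# deliberately absent: their length cannot be derived from the opcode alone.
CDB_LEN_BY_GROUP = {0: 6, 1: 10, 2: 10, 4: 16, 5: 12}


def cdb_length(opcode):
    """Return the CDB length implied by the opcode, or None if unknown."""
    return CDB_LEN_BY_GROUP.get(opcode >> 5)


def trim_cdb(hex_bytes):
    """Trim a space separated hex dump of a padded CDB to its actual length."""
    raw = bytes.fromhex(hex_bytes)
    length = cdb_length(raw[0])
    trimmed = raw if length is None else raw[:length]
    return " ".join(f"{b:02X}" for b in trimmed)


if __name__ == "__main__":
    # The four padded CDBs from the debug output above.
    for padded in (
        "12 00 00 00 24 00 00 00 08 00 00 00 00 00 00 00",
        "5A 00 01 00 00 00 00 00 08 00 00 00 00 00 00 00",
        "03 00 00 00 FC 00 00 00 08 00 00 00 00 00 00 00",
        "00 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00",
    ):
        print(trim_cdb(padded))
```

With the four commands from the log this gives a 6 byte INQUIRY, a 10 byte MODE SENSE(10), a 6 byte REQUEST SENSE and a 6 byte TEST UNIT READY.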
Re: [RFC 1/6] bidi support: request dma_data_direction
Benny Halevy wrote: Douglas Gilbert wrote: Boaz Harrosh wrote:

- Introduce a new enum dma_data_direction data_dir member in struct request, and remove the RW bit from request->cmd_flag
- Add new API to query request direction.
- Adjust existing API and implementation.
- Cleanup wrong use of DMA_BIDIRECTIONAL

Perhaps the right use of DMA_BIDIRECTIONAL needs to be defined. Could it be used with an XDWRITE(10) SCSI command defined in sbc3r07.pdf at http://www.t10.org ? I suspect using two scatter gather lists would be a better approach.

- Introduce new blk_rq_init_unqueued_req() and use it in places ad-hoc requests were used and bzero'ed.

With a bi-directional transfer is it always unambiguous which transfer occurs first (or could they occur at the same time)?

The bidi transfers can occur in any order and in parallel.

Then it is not sufficient for modern SCSI transports in which certain bidirectional commands (probably most) have a well defined order. So DMA_BIDIRECTIONAL looks PCI specific and it may have been a mistake to replace other subsystems' direction flags with it. RDMA might be an interesting case.

Doug Gilbert
Re: [RFC 1/6] bidi support: request dma_data_direction
Boaz Harrosh wrote:

- Introduce a new enum dma_data_direction data_dir member in struct request, and remove the RW bit from request->cmd_flag
- Add new API to query request direction.
- Adjust existing API and implementation.
- Cleanup wrong use of DMA_BIDIRECTIONAL
- Introduce new blk_rq_init_unqueued_req() and use it in places ad-hoc requests were used and bzero'ed.

With a bi-directional transfer is it always unambiguous which transfer occurs first (or could they occur at the same time)?

Doug Gilbert
Re: no utility / method to show association between HBA non-sg BLOCK (scsi) devices - register_blkdev()
Thayne Harmon wrote: On Thu, Jan 11, 2007 at 1:15 PM, in message [EMAIL PROTECTED], Douglas Gilbert [EMAIL PROTECTED] wrote: Thayne Harmon wrote: Gentlemen, hwinfo, lshal, sysfs do not show the relationship for non- sg BLOCK devices with there associated Host Bus Adapter. All devices (i.e. logical units) have a 4 element tuple associated with them and the first element is the host number. A HBA contains one or more hosts. Then you can datamine in /sys/class/scsi_host/hostn for whatever information you want. Do you know of a utility or method that can show this? May I suggest lsscsi. That won't help you in the lk 2.4 series and earlier though There are other methods by which the sg device corresponding to a non- sg block device (e.g. /dev/sdc) can be found. [context - Linux testserver 2.6.16.21-0.8-smp i586] There is no corresponding sg device. The device file is /dev/cciss/c0d1. Ok, I'm not familiar with the cciss driver. It looks like it lives outside the linux scsi subsystem but according to Documentation/cciss.txt it can subsequently engage the scsi subsystem?? If it is outside the scsi subsystem then it doesn't get corresponding sg devices. However as part of the block subsystem it might accept the SG_IO ioctl (if it accepts SCSI commands and it is implemented). I tried lsscsi, however it would not print out the non-sg block devices. I have attached the output of tree /sys and the output of lsscsi and uname. One can search for cciss to find the devices and the driver. I still cannot see a relationship. 
snip sysfs dump

[0:0:0:0]    storage COMPAQ   MSA1000          4.32  -
[0:0:0:3]    disk    COMPAQ   MSA1000 VOLUME   4.32  /dev/sda
[0:0:0:4]    disk    COMPAQ   MSA1000 VOLUME   4.32  /dev/sdb
[0:0:0:5]    disk    COMPAQ   MSA1000 VOLUME   4.32  /dev/sdc
[0:0:0:6]    disk    COMPAQ   MSA1000 VOLUME   4.32  /dev/sdd
[0:0:0:7]    disk    COMPAQ   MSA1000 VOLUME   4.32  /dev/sde
[1:0:0:0]    storage COMPAQ   MSA1000          4.32  -
[1:0:0:3]    disk    COMPAQ   MSA1000 VOLUME   4.32  /dev/sdf
[1:0:0:4]    disk    COMPAQ   MSA1000 VOLUME   4.32  /dev/sdg
[1:0:0:5]    disk    COMPAQ   MSA1000 VOLUME   4.32  /dev/sdh
[1:0:0:6]    disk    COMPAQ   MSA1000 VOLUME   4.32  /dev/sdi
[1:0:0:7]    disk    COMPAQ   MSA1000 VOLUME   4.32  /dev/sdj

Well this looks like output from lsscsi. And those devices look like they could be associated with cciss, especially the compaq storage devices. These devices should have corresponding sg device nodes. Try 'lsscsi -g'. Still a bit unclear as hosts 0 and 1 are Fibre Channel judging from the sysfs output for them.

Doug Gilbert
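Since the first element of the 4 element tuple is the host number, the host to device association can be read straight off lsscsi output like the above. A rough sketch in Python, assuming only that each line starts with [host:channel:target:lun] and ends with the device node (or '-'); the function name is invented and this is not part of lsscsi:

```python
import re
from collections import defaultdict

# Matches the leading [host:channel:target:lun] tuple of an lsscsi line.
HCTL = re.compile(r"^\[(\d+):(\d+):(\d+):(\d+)\]")


def devices_by_host(lsscsi_output):
    """Group device nodes (last column) by host number (first tuple element)."""
    hosts = defaultdict(list)
    for line in lsscsi_output.splitlines():
        line = line.strip()
        match = HCTL.match(line)
        if not match:
            continue  # skip blank lines and anything that is not a device line
        host = int(match.group(1))
        hosts[host].append(line.split()[-1])
    return dict(hosts)


if __name__ == "__main__":
    sample = """\
[0:0:0:0]    storage COMPAQ   MSA1000          4.32  -
[0:0:0:3]    disk    COMPAQ   MSA1000 VOLUME   4.32  /dev/sda
[1:0:0:3]    disk    COMPAQ   MSA1000 VOLUME   4.32  /dev/sdf
"""
    for host, nodes in sorted(devices_by_host(sample).items()):
        print(f"host{host}: {' '.join(nodes)}")
```

Devices that live outside the scsi subsystem (such as /dev/cciss/c0d1 in this thread) never appear in that output, so this does not help for them.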
Re: Linux Virtual SCSI HBAs and Virtual disks
Aboo Valappil wrote: Hi All, Thanks everyone to have a look at this. I think i modified to have the latest kernel support. Unfortunately I could not test it with 2.6.20 kernel due to some issues in my laptop and 2.6.20 kernel. But it should work with 2.6.20 with this modification. The modified version is available through http://vscsihba.aboo.org/vscsihbav202.tgz. 1. I fixed the kmem_cache issue for sure. 2. I think i got around with INIT_WORK ... Made the following modifications ... Perhaps you could get some of my scsi tools (e.g. sdparm and sg3_utils) and make sure that vscsihba can handle everything they can throw at it. If the user space doesn't support a SCSI command then your driver should fail gracefully (i.e. CHECK CONDITION, etc). Here is a worrying example: sdparm sends an INQUIRY and a couple of MODE SENSE(10) commands to a device. /dev/sda was created by your script: $ ./start_target.sh id=3 -files zz_lun0 $ sdparm /dev/sda /dev/sda: VirtualH VHD 0 long wait $ However dmesg showed this: vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device sd 0:0:0:0: SCSI error: return code = 0x0002 end_request: I/O error, dev sda, sector 10240 Buffer I/O error on device sda, logical block 10240 vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device sd 0:0:0:0: SCSI error: return code = 0x0002 end_request: I/O error, dev sda, sector 10240 Buffer I/O error on device sda, logical block 10240 vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device sd 0:0:0:0: SCSI error: return code = 0x0002 end_request: I/O error, dev sda, sector 10240 Buffer I/O error on device sda, logical block 10240 vscsihba:3: In Reset Device vscsihba:3: In Reset 
Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device sd 0:0:0:0: SCSI error: return code = 0x0002 end_request: I/O error, dev sda, sector 10240 Buffer I/O error on device sda, logical block 10240 vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device sd 0:0:0:0: SCSI error: return code = 0x0002 end_request: I/O error, dev sda, sector 10240 Buffer I/O error on device sda, logical block 10240 vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device vscsihba:3: In Reset Device sd 0:0:0:0: SCSI error: return code = 0x0002 end_request: I/O error, dev sda, sector 10240 Buffer I/O error on device sda, logical block 10240 BUG: at kernel/sched.c:3388 sub_preempt_count() [e1bf029c] scsitap_eh_abort+0x1c/0x90 [vscsihba] [c024fe22] scsi_error_handler+0x3e2/0xbe0 [c02d74f1] __sched_text_start+0x2f1/0x660 [c024fa40] scsi_error_handler+0x0/0xbe0 [c0131679] kthread+0xa9/0xe0 [c01315d0] kthread+0x0/0xe0 [c0103d0f] kernel_thread_helper+0x7/0x18 === vscsihba:3: Abortng command serial number : 94 BUG: scheduling while atomic: scsi_eh_0/0x0001/4749 [c02d7684] __sched_text_start+0x484/0x660 [c013183b] autoremove_wake_function+0x1b/0x50 [c01264a8] lock_timer_base+0x28/0x70 [c01265f2] __mod_timer+0x92/0xd0 [c02d826b] schedule_timeout+0x4b/0xd0 [c01269c0] process_timeout+0x0/0x10 [c02d7bbc] wait_for_completion_timeout+0x9c/0x130 [c0119ee0] default_wake_function+0x0/0x10 [c024f3c9] scsi_send_eh_cmnd+0x1b9/0x390 [c011df3e] vprintk+0x1fe/0x3a0 [c024f805] scsi_delete_timer+0x15/0x60 [c024f624] scsi_eh_tur+0x34/0xa0 [c024fe69] scsi_error_handler+0x429/0xbe0 [c02d74f1] __sched_text_start+0x2f1/0x660 [c024fa40] scsi_error_handler+0x0/0xbe0 [c0131679] kthread+0xa9/0xe0 [c01315d0] kthread+0x0/0xe0 [c0103d0f] kernel_thread_helper+0x7/0x18 === 
vscsihba:3: Abortng command serial number : 95 vscsihba:3: In Reset Device BUG: scheduling while atomic: scsi_eh_0/0x0001/4749 [c02d7684] __sched_text_start+0x484/0x660 [c011df3e] vprintk+0x1fe/0x3a0 [c01264a8] lock_timer_base+0x28/0x70 [c01265f2] __mod_timer+0x92/0xd0 [c02d826b] schedule_timeout+0x4b/0xd0 [c01269c0] process_timeout+0x0/0x10 [c02d7bbc] wait_for_completion_timeout+0x9c/0x130 [c0119ee0] default_wake_function+0x0/0x10 [c024f3c9] scsi_send_eh_cmnd+0x1b9/0x390 [c024f805] scsi_delete_timer+0x15/0x60 [c024f624] scsi_eh_tur+0x34/0xa0 [e1bf00cd] scsitap_eh_device_reset+0x1d/0x30 [vscsihba] [c02503a8] scsi_error_handler+0x968/0xbe0 [c02d74f1] __sched_text_start+0x2f1/0x660 [c024fa40] scsi_error_handler+0x0/0xbe0 [c0131679] kthread+0xa9/0xe0 [c01315d0] kthread+0x0/0xe0 [c0103d0f] kernel_thread_helper+0x7/0x18
Re: Linux Virtual SCSI HBAs and Virtual disks
Aboo Valappil wrote:

Hi All, I have tried this before but I guess I was unsuccessful in presenting it properly in the mailing list. I think it is really useful especially for prototyping and also for people who want to develop their own scsi targets and transports. A few people told me about the SCSI target and initiator implementation of XEN. But I do not think it is this simple and it might take a while to port it to the normal linux kernel. At the moment, there is nothing like this available in a simplest form. Please visit this site http://vscsihba.aboo.org. I put a complete description of the project and the source code. I would appreciate it if you could go through it and put your thoughts. This is my final attempt in this mailing list before I throw away the whole of my work.

Throwing it away sounds a bit drastic. It took me a while to find the tarball on your site. Perhaps you could put it in a table under a Downloads section. The table would be for different versions as it looks like you may need a new one for bleeding edge kernels.
I didn't get far trying to build the kernel module against lk 2.6.20-rc5:

# make
make -C /lib/modules/2.6.20-rc5/build M=/home/upgrades/apps/vscsihba1/vscsihba1/kernel modules
make[1]: Entering directory `/usr/src/linux-2.6.19'
  CC [M]  /home/upgrades/apps/vscsihba1/vscsihba1/kernel/hba.o
/home/upgrades/apps/vscsihba1/vscsihba1/kernel/hba.c:26: warning: ‘kmem_cache_t’ is deprecated
  CC [M]  /home/upgrades/apps/vscsihba1/vscsihba1/kernel/device.o
/home/upgrades/apps/vscsihba1/vscsihba1/kernel/device.c:263:51: error: macro INIT_WORK passed 3 arguments, but takes just 2
/home/upgrades/apps/vscsihba1/vscsihba1/kernel/device.c: In function ‘scsitap_ctl_ioctl’:
/home/upgrades/apps/vscsihba1/vscsihba1/kernel/device.c:263: error: ‘INIT_WORK’ undeclared (first use in this function)
/home/upgrades/apps/vscsihba1/vscsihba1/kernel/device.c:263: error: (Each undeclared identifier is reported only once
/home/upgrades/apps/vscsihba1/vscsihba1/kernel/device.c:263: error: for each function it appears in.)
make[2]: *** [/home/upgrades/apps/vscsihba1/vscsihba1/kernel/device.o] Error 1
make[1]: *** [_module_/home/upgrades/apps/vscsihba1/vscsihba1/kernel] Error 2
make[1]: Leaving directory `/usr/src/linux-2.6.19'
make: *** [modules] Error 2

Doug Gilbert
Re: [PATCH] ieee1394: sbp2: remove bogus emulated host flag
Kristian Høgsberg wrote: On 1/14/07, Stefan Richter [EMAIL PROTECTED] wrote:

There is no emulation going on here. ...
- .emulated = 1,

Not sure what this flag does, but I copied it over to fw-sbp2.c. If it's bogus, I guess we should drop it from fw-sbp2.c too.

Kristian, The 'emulated' flag dates from the original ide-scsi driver (lk 2.0 series or earlier) when some app wanted to know if there was a real SCSI cd drive attached or a fudged one (i.e. ATAPI) via the ide-scsi bridging driver. So it is unclear to me why the sbp driver (and USB mass storage) sets emulated.

Hopefully if any app cares these days there are much better ways to find out what the transport is. Also now we have the transport the LLD can see and the transport the device (i.e. logical unit) can see; and they aren't necessarily the same. In the case of a CD/DVD drive there is the GET CONFIGURATION command for finding out what the lu can see:

$ sg_get_config /dev/hdc
    HL-DT-ST  RW/DVD GCC-4242N  0201
    Peripheral device type: cd/dvd
No current profile
Features:
  Profile list feature version=0, persist=1, current=1 [0x0]
    available profiles [more recent typically higher in list]:
      profile: DVD-ROM , currentP=0
      profile: CD-ROM , currentP=0
      profile: CD-R , currentP=0
      profile: CD-RW , currentP=0
  Core feature version=0, persist=1, current=1 [0x1]
    Physical interface standard: ATAPI
..

So IMO 'emulated' is best retired.

Doug Gilbert
Re: Adaptect 9405w: What is the best solution?
Tarjei Huse wrote:
> Hi,
> I'm working on getting Linux to use my SATA drives on an IBM x306 running a HostRaid controller that uses the Adaptec 9405w chipset. I found this thread on the list:
> http://thread.gmane.org/gmane.linux.scsi/29040/focus=29040

Basically Luben Tuikov developed the original aic94xx SAS driver for Linux when he was working for Adaptec. Soon after he made it available, the Linux SCSI community decided to fork the development for various reasons. The Linux SCSI community version seems to have passed through various hands; currently it seems to be Darrick Wong's headache. In the meantime Mr Tuikov has continued developing his version, which I can report is now very stable and feature rich on my hardware (an Adaptec 48300 HBA with an aic-9410w chip in it).

Some people involved are still quite upset about what happened and it shows in the email exchange you referred to. One unfortunate aspect of the GPL and what the community did is that the source code still has the copyright notices of Adaptec and Luben Tuikov. That may lead an observer to think that either or both still have an interest in, or control over, that driver. As far as I can see that is not the case and a note should be added to the source code to that end.

So given the above, Linux distribution vendors have a problem when they try to certify hardware containing Adaptec SAS aic94xx series chips.

> What I'm wondering about is:
> a) Where can I get the patches that Mr Tuikov maintains for this chipset?

You need to contact Luben Tuikov [EMAIL PROTECTED] directly.

> b) Are they maintained with respect to different kernel versions, i.e. do they apply cleanly to a 2.6.20 or 2.6.16 kernel?

You need different driver versions for different kernels.

> c) In the thread Darrick Wong refers to another branch[1] that contains (according to him) a lot of fixes for this chipset. Is that branch confirmed to work with my chipset?

I haven't checked recently but Adaptec used to have their own Linux driver, named the adp94xx driver. It would seem that Adaptec has lost interest in Linux.

> My main goal is to get this box up and running using Debian or Ubuntu. I have managed to get it to run the Debian 2.6.9 kernel as outlined in [2] using the adp94xx driver, but I cannot get that driver to compile on newer kernels.

See my previous note.

> I have also tried to compile the latest rc of the 2.6.20 kernel and loaded up the aic94xx driver + firmware. I then ended up getting the same errors that started the above mentioned thread on this list.

Here are my thoughts on the 48300 HBA and the available drivers. I bought the device about 12 months ago so I assumed it at least had production firmware on it. It required two firmware upgrades before it was stable in POST+scan in non-trivial SAS topologies (by which I mean connected to SAS expanders). [Since I also had an LSI MPT Fusion SAS controller, it was relatively simple for me to determine the source of my problems.]

All drivers I tried showed the same problems (tmf timeouts), although I could never get the adp94xx driver working because it was too old. Then Luben Tuikov sent me a version of his driver with a use_msi related fix in it. My 48300 has been rock solid since. The last time I tried Darrick's driver (about a week ago) it failed in the fashion to which I have become accustomed.

I have encouraged people to talk amongst themselves about the use_msi patch, but I don't believe that I should be reverse engineering that patch. There are other issues. As you may understand from the above, I am walking a bit of a tightrope here.

> I can also report that the same hardware works fine in Windows 2000, Vista RC1 and that FreeBSD doesn't have an aic94xx driver yet. So what is the best solution here? Has anyone managed to get a newer kernel to run on this chipset?
>
> 1. http://www.kernel.org/git/?p=linux/kernel/git/jejb/aic94xx-sas-2.6.git;a=summary
> 2. http://www.jimmy.co.at/weblog/?p=71

This last thread may help explain IBM's interest in getting the aic94xx working reliably. Is there a best solution?

Doug Gilbert
Re: no utility / method to show association between host bus adapter and non-sg BLOCK devices
Thayne Harmon wrote:
> Gentlemen,
> hwinfo, lshal, sysfs do not show the relationship for non-sg BLOCK devices with their associated Host Bus Adapter.

All devices (i.e. logical units) have a 4 element tuple associated with them and the first element is the host number. An HBA contains one or more hosts. You can then datamine in /sys/class/scsi_host/hostn for whatever information you want.

> Do you know of a utility or method that can show this?

May I suggest lsscsi. That won't help you in the lk 2.4 series and earlier though. There are other methods by which the sg device corresponding to a non-sg block device (e.g. /dev/sdc) can be found.

> Example is the HP/Compaq CCISS block driver. The HBA and devices are listed, but no association is given or can be determined, other than by the user knowing which is which. The kernel certainly knows, so surely the above apps could be made to determine this, or some utility exists that will show this?

See http://www.torque.net/scsi/lsscsi.html

Doug Gilbert
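As a rough illustration of the point about the 4 element tuple, the sketch below splits a host:channel:target:lun string and derives the sysfs host directory from its first element. The names `ScsiAddress`, `parse_hctl` and `host_sysfs_dir` are made up for this example; only the tuple layout and the /sys/class/scsi_host/hostN path come from the message above.

```python
from typing import NamedTuple


class ScsiAddress(NamedTuple):
    """Hypothetical container for the host:channel:target:lun tuple."""
    host: int
    channel: int
    target: int
    lun: int


def parse_hctl(text: str) -> ScsiAddress:
    """Parse '4:0:1:0' (square brackets as printed by lsscsi are tolerated)."""
    parts = text.strip().strip("[]").split(":")
    if len(parts) != 4:
        raise ValueError(f"expected host:channel:target:lun, got {text!r}")
    return ScsiAddress(*(int(p) for p in parts))


def host_sysfs_dir(addr: ScsiAddress) -> str:
    """Directory to datamine for the host (first tuple element)."""
    return f"/sys/class/scsi_host/host{addr.host}"


if __name__ == "__main__":
    addr = parse_hctl("[4:0:1:0]")
    print(addr, host_sysfs_dir(addr))
```

This only builds the path; what attributes exist under that directory depends on the kernel version and the low level driver.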
Re: IO transfer limits
Stefan Richter wrote:
> Douglas Gilbert wrote:
>> john clyne wrote:
>>> What do the different hostX in /sys/class/scsi_host correspond to? There are seven hostX directories, 5 with sg_tablesize set to 128 and two set to 255.
>>
>> A Linux host is a SCSI initiator port (e.g. FC) or a SCSI initiator device (e.g. SAS). Another way of looking at a host is as a bridge between a computer bus (e.g. PCI) and a storage transport. There is usually one (low level) driver (LLD) controlling all hosts associated with a specific class of hardware.
>>
>> If you fetch the lsscsi utility and load it then you can try 'lsscsi --hosts' to list the active hosts on a system (numbered on the left) and see the names of the various LLDs associated with them. Here is an example:
>>
>>   # lsscsi --hosts
>>   [0]    sata_nv
>>   [1]    sata_nv
>>   [2]    sata_nv
>>   [3]    sata_nv
>>   [4]    mptsas
>>   [5]    aic94xx
>>   [6]    sbp2
>>
>> The first four are SATA ports (connectors) on the motherboard, all controlled by the same driver. Then there is an LSI SAS HBA (whose driver is mptsas), an Adaptec SAS HBA (48300) and finally an Adaptec IEEE 1394 controller.
>
> A side note: I don't think a Scsi_Host has a well-defined meaning beyond the kernel-internal resource which LLDs use to connect to the Linux SCSI mid layer. It may have further meaning for many LLDs, but not for all. Specifically, the host6 in your example is in the current implementation indeed nothing more than an internal resource. lsscsi is nevertheless able to determine the actual initiator port by means of knowledge of the implementation.

There are two sides to a host: a kernel side (e.g. a PCI device or virtual) and a storage transport side. A host can be seen as a bridge between the two sides.

SCSI command sets need the concept of an initiator. For queuing, mode page policy and reservations (i.e. in multi-initiator environments) those initiators (actually initiator ports) need domain unique identifiers, preferably world wide unique. The identifier is an attribute of the external side (i.e. the storage transport side) of a Linux host. So even if you consider the kernel side of a host to be a kludge, there is still the storage transport side to consider. In the case of sbp, the initiator (device and port) has an EUI-64 wwn.

SBP, USB mass storage, and iSCSI all set up SCSI hosts in Linux on a session basis (just-in-time if you like). As long as the initiator port identifiers are stable (predictable from one session to the next) it seems to me little different from SAS, SPI and FC, which maintain their hosts for as long as their HBAs are present.

There are cheap external boxes out there that have 1394, USB and (e)SATA interfaces. I wonder what would happen if one tried to use two interfaces connected to different machines at the same time :-)

Doug Gilbert
Re: IO transfer limits
john clyne wrote:
> Can anyone give me some guidance on where in the IO stack I might be running into a 512KB limit on IO transfer sizes to an external FC device? I've checked the IO scheduler parameters (/sys/block/dev/queue/{max_sectors_kb,max_hw_sectors_kb}). Both are set to 32767. I'm using Qlogic HBAs (qla2312), but I don't see any relevant parameters. I'm running RHEL 4.0 with a 2.6.9-34 kernel. Any pointers would be greatly appreciated.

John,
I discuss the subject in this page: http://www.torque.net/sg/sg_io.html in the section titled "Maximum transfer size per command".

Mike C. has given you the answer for the block device interface (e.g. via /dev/sda); you should be able to do about 8 times better via the scsi generic interface (e.g. /dev/sg0).

Doug Gilbert
Re: IO transfer limits
john clyne wrote:
> What do the different hostX in /sys/class/scsi_host correspond to? There are seven hostX directories, 5 with sg_tablesize set to 128 and two set to 255.

A Linux host is a SCSI initiator port (e.g. FC) or a SCSI initiator device (e.g. SAS). Another way of looking at a host is as a bridge between a computer bus (e.g. PCI) and a storage transport. There is usually one (low level) driver (LLD) controlling all hosts associated with a specific class of hardware.

If you fetch the lsscsi utility and load it then you can try 'lsscsi --hosts' to list the active hosts on a system (numbered on the left) and see the names of the various LLDs associated with them. Here is an example:

  # lsscsi --hosts
  [0]    sata_nv
  [1]    sata_nv
  [2]    sata_nv
  [3]    sata_nv
  [4]    mptsas
  [5]    aic94xx
  [6]    sbp2

The first four are SATA ports (connectors) on the motherboard, all controlled by the same driver. Then there is an LSI SAS HBA (whose driver is mptsas), an Adaptec SAS HBA (48300) and finally an Adaptec IEEE 1394 controller.

> Is the implication that the hard limit is 255 * page_size, or is page_size simply the default?

There are big pages (around 1 MB in size) but I'm unaware that anything in the SCSI subsystem uses them. Otherwise the kernel page size is typically 4 KB. When the scsi generic driver builds its scatter gather lists, it attempts to place 8 contiguous pages in each scatter gather element.

"Arm waving" was a term used when I tried to explain to several kernel people that there were users out there that needed larger IO transfer limits. So I suggest that you talk to the management and tell them why you need higher limits. Linux is behind in this area.

Doug Gilbert

Mike Christie wrote:
> john clyne wrote:
>> Can anyone give me some guidance on where in the IO stack I might be running into a 512KB limit on IO transfer sizes to an external FC device? I've checked the IO scheduler parameters (/sys/block/dev/queue/{max_sectors_kb,max_hw_sectors_kb}). Both are set to 32767. I'm using Qlogic HBAs (qla2312), but I don't see any relevant parameters. I'm running RHEL 4.0 with a 2.6.9-34 kernel. Any pointers would be greatly appreciated.
>
> There are also scatterlist limits. /sys/class/scsi_host/hostX/sg_tablesize is a limit for the number of scatter list entries. For qla2xxx it is 255. The scsi layer sets the queue's max_phys_segments to 128 by default. I thought there was a scsi compile time option to increase this, but maybe you have to just modify the SCSI_MAX_PHYS_SEGMENTS define by hand. So with the default value and with 4 K pages, if you end up getting pages that cannot be clustered you will end up with 4K * 128.
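The arithmetic behind the 512 KB figure can be spelled out in a few lines. This is only a back-of-the-envelope sketch: `max_transfer_bytes` is a made-up helper, and the last case simply assumes the same 128 element count with 8 pages per element, which is one way to read the "about 8 times better" remark for the sg interface.

```python
PAGE_SIZE = 4 * 1024  # typical kernel page size mentioned in the thread


def max_transfer_bytes(segments: int, pages_per_segment: int = 1,
                       page_size: int = PAGE_SIZE) -> int:
    """Upper bound on one transfer: elements x pages per element x page size."""
    return segments * pages_per_segment * page_size


# Default max_phys_segments (128) with pages that cannot be clustered.
block_default = max_transfer_bytes(128)
# The qla2xxx sg_tablesize (255), again with one page per element.
hba_table = max_transfer_bytes(255)
# Assumption: 128 elements, each holding 8 contiguous pages (sg driver case).
sg_clustered = max_transfer_bytes(128, pages_per_segment=8)

if __name__ == "__main__":
    for label, value in (("block default", block_default),
                         ("hba table", hba_table),
                         ("sg clustered", sg_clustered)):
        print(f"{label}: {value // 1024} KiB")
```

So 128 unclustered 4 KB pages give exactly the 512 KB limit john observed, and raising only the HBA table size to 255 would not quite double it.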
Re: lsscsi version 0.19 beta
Further to the announcement 1 month ago (shown below), there is another lsscsi-0.19 beta at: http://www.torque.net/sg [News section]. It provides target (and sometimes host) transport information for:
  - IEEE1394 (sbp)
  - FC
  - iSCSI
  - SAS (two representations)
  - SPI

It has been tested with lk 2.6.20-rc3. Unfortunately fetching information out of sysfs could become a maze of kernel version dependencies as various maintainers change representations. This beta was tested with the intriguingly named SYSFS_DEPRECATED config option deselected. Sysfs is not deprecated yet (sigh) but deselecting SYSFS_DEPRECATED removes various symlinks, which breaks earlier lsscsi betas.

Thanks to the maintainers of various SCSI transports for helping me find what information is available in sysfs and testing code for me. They are named in the CREDITS file.

Doug Gilbert

Douglas Gilbert wrote [2006/12/7]:
> The last announcement I made to this list about lsscsi was back in March and that was a beta for lsscsi version 0.18. The change proposed by James Bottomley that prompted the beta has not materialized. So I decided to release version 0.18 without fanfare a week ago and start adding transport information to lsscsi, dubbing it version 0.19 beta. See http://www.torque.net/scsi/lsscsi.html for downloads.
>
> The mushrooming of information (and different representations) in /sys has made it possible for lsscsi to provide a lot more information than it has previously. Ironically what storage device identification really needs is not available, namely the logical unit _name_ which, for SCSI devices, is obtained from the device identification VPD page (0x83). As a consolation there is lots of transport information.
>
> So this beta adds transport information, both target and initiator (host), for these transports:
>   - FC
>   - SAS
>
> I hope to add iSCSI if I can find a way through its maze. Perhaps USB and 1394 are candidates as well, even SPI. In the case of SAS, both the SAS transport layer and the SAS class (i.e. Luben Tuikov's design) representations are supported.
>
> The new options are '--transport' (or '-t') and '--list' (or '-L'). Here is an example where disk strings are insufficient:
>
>   # lsscsi
>   [4:0:0:0]    disk    ATA      ST3160812AS      D     /dev/sda
>   [4:0:1:0]    disk    SEAGATE  ST336754SS       0003  /dev/sdb
>   [4:0:2:0]    disk    SEAGATE  ST336754SS       0003  /dev/sdc
>   [4:0:3:0]    disk    ATA      ST380013AS       3.18  /dev/sdd
>   [4:0:4:0]    disk    SEAGATE  ST336754SS       0003  /dev/sde
>   [4:0:5:0]    disk    SEAGATE  ST336754SS       0003  /dev/sdf
>   [5:0:0:0]    disk    SEAGATE  ST336754SS       0003  /dev/sdg
>   [5:0:1:0]    disk    SEAGATE  ST336754SS       0003  /dev/sdh
>   [5:1:0:0]    disk    SEAGATE  ST336754SS       0003  /dev/sdi
>   [5:1:1:0]    disk    SEAGATE  ST336754SS       0003  /dev/sdj
>
> How many disks are there? Looking at the transport information:
>
>   # lsscsi -t
>   [4:0:0:0]    disk    sas:0x0b1d2c035c7e5d4c    /dev/sda
>   [4:0:1:0]    disk    sas:0x5000c55208ed        /dev/sdb
>   [4:0:2:0]    disk    sas:0x5000c5520a29        /dev/sdc
>   [4:0:3:0]    disk    sas:0x500605b033e1        /dev/sdd
>   [4:0:4:0]    disk    sas:0x5000c55208ee        /dev/sde
>   [4:0:5:0]    disk    sas:0x5000c5520a2a        /dev/sdf
>   [5:0:0:0]    disk    sas:5000c55208ed          /dev/sdg
>   [5:0:1:0]    disk    sas:5000c5520a29          /dev/sdh
>   [5:1:0:0]    disk    sas:5000c55208ee          /dev/sdi
>   [5:1:1:0]    disk    sas:5000c5520a2a          /dev/sdj
>
> So everything is SAS attached, including two SATA disks. Something strange is happening with 4:0:0:0, which is directly attached to host4. From the target SAS addresses it can be seen that /dev/sdc and /dev/sdh are the same port (and because the lun is 0 in both cases, it must be the same lu). There are three other pairs there, reducing what looks like 10 disks to six. The adjacent SAS addresses are dual ports on the same disk, so the actual number of disks is 4.
>
> Why are some SAS addresses prefixed with 0x and others not? lsscsi simply prints out what is in /sys!
>
> To fetch further information about the target that contains /dev/sdf, using a filter to reduce clutter:
>
>   # lsscsi --transport --list 4:0:5:0
>   [4:0:5:0]    disk    sas:0x5000c5520a2a    /dev/sdf
>     transport=sas
>     initiator_port_protocols=none
>     initiator_response_timeout=0
>     I_T_nexus_loss_timeout=1744
>     phy_identifier=11
>     ready_led_meaning=0
>     sas_address=0x5000c5520a2a
>     target_port_protocols=ssp
>
> A similar check on the target containing /dev/sdj:
>
>   # lsscsi -t -L 5:1:1
>   [5:1:1:0]    disk    sas:5000c5520a2a    /dev/sdj
>     transport=sas
>     sub_transport=sas_class
>     device_name=
>     dev_type=end device
>     iproto=
>     iresp_timeout=0x
>     linkrate=3,0 Gbps
>     max_linkrate=3,0 Gbps
>     max_pathways=1
>     min_linkrate=3,0 Gbps
>     pathways=1
>     ready_led_meaning=0
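The disk-counting argument in the quoted announcement (ten device nodes, six distinct target ports, four physical disks) can be replayed mechanically. The sketch below is only an illustration: the address table is copied from the `lsscsi -t` listing above, and treating numerically adjacent SAS addresses as two ports of one dual-ported disk is the heuristic the announcement uses, not a general rule.

```python
# Target port addresses as printed by 'lsscsi -t' above (some with 0x, some without).
LISTING = {
    "/dev/sda": "0x0b1d2c035c7e5d4c",
    "/dev/sdb": "0x5000c55208ed",
    "/dev/sdc": "0x5000c5520a29",
    "/dev/sdd": "0x500605b033e1",
    "/dev/sde": "0x5000c55208ee",
    "/dev/sdf": "0x5000c5520a2a",
    "/dev/sdg": "5000c55208ed",
    "/dev/sdh": "5000c5520a29",
    "/dev/sdi": "5000c55208ee",
    "/dev/sdj": "5000c5520a2a",
}


def unique_ports(listing: dict) -> set:
    """Normalise addresses; int(s, 16) accepts both '0x...' and bare hex."""
    return {int(addr, 16) for addr in listing.values()}


def count_disks(ports: set) -> int:
    """Heuristic: addresses N and N+1 are the two ports of one disk."""
    remaining = set(ports)
    disks = 0
    for addr in sorted(ports):
        if addr not in remaining:
            continue  # already consumed as the second port of a disk
        remaining.discard(addr)
        remaining.discard(addr + 1)
        disks += 1
    return disks


if __name__ == "__main__":
    ports = unique_ports(LISTING)
    print(len(LISTING), "nodes,", len(ports), "ports,", count_disks(ports), "disks")
```

A more robust approach would use the logical unit name from VPD page 0x83, which the announcement notes is not available from sysfs.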
Re: [PATCH] SCSI core: better initialization for sdev-scsi_level
Alan Stern wrote:
> Both scsi_device and scsi_target include a scsi_level field, and the SCSI core makes a half-hearted effort to keep the values equal. Ultimately this effort may be doomed, since as far as I know there is no reason why all LUNs in a target must report the same ANSI-approved version number. But for the most part it should work okay.
>
> This patch (as834) changes the SCSI core so that after the first LUN has been probed and the target's scsi_level is known, further LUNs default to the target's scsi_level and not to SCSI_2.

Alan,
Umm, scsi_level is a mangled value derived from the version field in an INQUIRY standard response. As such it is per logical unit ***. There is nothing to stop a single target (especially if it is a bridge that maps targets at the remote end to luns) having a wide variety of lus with different version values (and different peripheral device types).

IMO scsi_level should only be per lu, which means it should only exist in the scsi_device structure. If the scsi mid level was really advanced it could track the version value in the INQUIRY response to well known logical units (see spc4r08.pdf section 8) because these really are per target. I don't expect that to happen any time soon (and there wouldn't be much benefit).

So the existing code seems broken but I'm not sure your patch advances things.

*** this statement assumes the peripheral qualifier field is 0, otherwise there is no real lu at the given lun

Doug Gilbert

> Signed-off-by: Alan Stern [EMAIL PROTECTED]
> ---
> This patch will affect the CDB in INQUIRY commands sent to LUNs above 0 when LUN-0 reports a scsi_level of 0; the LUN bits will no longer be set in the second byte of the CDB. This is as it should be. Nevertheless, it's possible that some wacky device might be adversely affected. I doubt anyone will complain...
>
> Alan Stern
>
> Index: usb-2.6/drivers/scsi/scsi_scan.c
> ===
> --- usb-2.6.orig/drivers/scsi/scsi_scan.c
> +++ usb-2.6/drivers/scsi/scsi_scan.c
> @@ -382,6 +382,7 @@ static struct scsi_target *scsi_alloc_ta
>  	INIT_LIST_HEAD(&starget->siblings);
>  	INIT_LIST_HEAD(&starget->devices);
>  	starget->state = STARGET_RUNNING;
> +	starget->scsi_level = SCSI_2;
>   retry:
>  	spin_lock_irqsave(shost->host_lock, flags);
>
> Index: usb-2.6/drivers/scsi/scsi_sysfs.c
> ===
> --- usb-2.6.orig/drivers/scsi/scsi_sysfs.c
> +++ usb-2.6/drivers/scsi/scsi_sysfs.c
> @@ -922,7 +922,7 @@ void scsi_sysfs_device_initialize(struct
>  	snprintf(sdev->sdev_classdev.class_id, BUS_ID_SIZE,
>  		 "%d:%d:%d:%d", sdev->host->host_no,
>  		 sdev->channel, sdev->id, sdev->lun);
> -	sdev->scsi_level = SCSI_2;
> +	sdev->scsi_level = starget->scsi_level;
>  	transport_setup_device(&sdev->sdev_gendev);
>  	spin_lock_irqsave(shost->host_lock, flags);
>  	list_add_tail(&sdev->same_target_siblings, &starget->devices);
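To make the "mangled value" remark concrete, here is a small sketch that pulls the peripheral qualifier, peripheral device type and version out of the first bytes of a standard INQUIRY response and then derives a Linux-style scsi_level. The field positions are from the SPC standard INQUIRY data layout; the mangling rule in `linux_scsi_level` is written from memory of the mid level's scan code of that period and should be treated as an assumption, not a quotation of the kernel source.

```python
def decode_inquiry(resp: bytes) -> dict:
    """Extract the per-logical-unit fields discussed in the reply above."""
    if len(resp) < 4:
        raise ValueError("need at least 4 bytes of standard INQUIRY data")
    return {
        "peripheral_qualifier": (resp[0] >> 5) & 0x07,
        "peripheral_device_type": resp[0] & 0x1F,
        "version": resp[2],
        "response_data_format": resp[3] & 0x0F,
    }


def linux_scsi_level(resp: bytes) -> int:
    """Assumed mangling: low 3 bits of the version byte, bumped by one for
    SCSI-2 and later (and for SCSI-1 with response data format 1)."""
    level = resp[2] & 0x07
    if level >= 2 or (level == 1 and (resp[3] & 0x0F) == 1):
        level += 1
    return level


if __name__ == "__main__":
    # A disk (device type 0) reporting version 5 with response data format 2.
    disk = bytes([0x00, 0x00, 0x05, 0x02])
    print(decode_inquiry(disk), linux_scsi_level(disk))
    # Qualifier 3 with device type 0x1f: no logical unit at this lun.
    absent = bytes([0x7F, 0x00, 0x00, 0x00])
    print(decode_inquiry(absent))
```

The second case is the one the footnote is about: when the peripheral qualifier is not 0, the version byte says nothing useful about a logical unit.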
Re: [GIT PATCH] scsi bug fixes for 2.6.20-rc4
James Bottomley wrote:
> On Sun, 2007-01-07 at 11:16 -0700, Matthew Wilcox wrote:
>> On Sun, Jan 07, 2007 at 10:04:03AM -0600, James Bottomley wrote:
>>> This is mainly bug fixes, although there are a few harmless updates (like email addresses and driver PCI IDs). The patch is available here:
>>
>> Could I ask that you add http://marc.theaimsgroup.com/?l=linux-scsi&m=116793460427798&w=2
>
> OK ... how about a title, changelog and signoff for it?

Titles and sign-offs don't necessarily help. For example: http://marc.theaimsgroup.com/?l=linux-scsi&m=116797255528029&w=2

Followed by this misdirected ** post from Eric Moore which you and I also received:

> On Thursday, January 04, 2007 9:39 PM, Douglas Gilbert wrote:
>> This patch makes the mptctl pass through available if the mptsas driver is selected. Without this patch, if mptsas is the only fusion driver chosen, then mptctl is not presented as an option. smp_utils uses the mptctl driver to pass SAS SMP functions through an MPT SAS HBA.
>>
>> I have discussed this patch with Eric and it may be in one of his coming patchsets (but I didn't see it in today's patches). So this one is for the record.
>>
>> The patch is against lk 2.6.20-rc3.
>>
>> Signed-off-by: Douglas Gilbert [EMAIL PROTECTED]
>
> Sorry I overlooked this, but please apply.
> ACK

** [EMAIL PROTECTED] strikes again.

Doug Gilbert
[PATCH] scsi_debug error processing
After discussions in the thread titled "[PATCH] scsi_debug: illegal blocking memory allocation", here is a patch containing the discussed fix and some other fixes and additions. The patch is against lk 2.6.20-rc3. The version is bumped to 1.81.

ChangeLog:
  - Change several GFP_KERNEL allocations to GFP_ATOMIC as they can be called from queuecommand() context
  - check above allocation returns and if out of memory report DID_REQUEUE in two cases, DID_NO_CONNECT in another, and fail slave configure() in another
  - add support for WRITE BUFFER command
  - add aborted_command error injection support (opts mask 0x10), similar mechanism to recovered_error injection.

Signed-off-by: Douglas Gilbert [EMAIL PROTECTED]

Doug Gilbert

--- linux/drivers/scsi/scsi_debug.c 2006-11-30 10:00:01.0 -0500
+++ linux/drivers/scsi/scsi_debug.c2620rc3abo1 2007-01-04 21:49:33.0 -0500
@@ -51,10 +51,10 @@
 #include "scsi_logging.h"
 #include "scsi_debug.h"
 
-#define SCSI_DEBUG_VERSION "1.80"
-static const char * scsi_debug_version_date = "20061018";
+#define SCSI_DEBUG_VERSION "1.81"
+static const char * scsi_debug_version_date = "20070104";
 
-/* Additional Sense Code (ASC) used */
+/* Additional Sense Code (ASC) */
 #define NO_ADDITIONAL_SENSE 0x0
 #define LOGICAL_UNIT_NOT_READY 0x4
 #define UNRECOVERED_READ_ERR 0x11
@@ -65,9 +65,13 @@
 #define INVALID_FIELD_IN_PARAM_LIST 0x26
 #define POWERON_RESET 0x29
 #define SAVING_PARAMS_UNSUP 0x39
+#define TRANSPORT_PROBLEM 0x4b
 #define THRESHOLD_EXCEEDED 0x5d
 #define LOW_POWER_COND_ON 0x5e
 
+/* Additional Sense Code Qualifier (ASCQ) */
+#define ACK_NAK_TO 0x3
+
 #define SDEBUG_TAGGED_QUEUING 0 /* 0 | MSG_SIMPLE_TAG | MSG_ORDERED_TAG */
 
 /* Default values for driver parameters */
@@ -95,15 +99,20 @@
 #define SCSI_DEBUG_OPT_MEDIUM_ERR   2
 #define SCSI_DEBUG_OPT_TIMEOUT   4
 #define SCSI_DEBUG_OPT_RECOVERED_ERR   8
+#define SCSI_DEBUG_OPT_TRANSPORT_ERR   16
 /* When every_nth > 0 then modulo every_nth commands:
  *   - a no response is simulated if SCSI_DEBUG_OPT_TIMEOUT is set
  *   - a RECOVERED_ERROR is simulated on successful read and write
  *     commands if SCSI_DEBUG_OPT_RECOVERED_ERR is set.
+ *   - a TRANSPORT_ERROR is simulated on successful read and write
+ *     commands if SCSI_DEBUG_OPT_TRANSPORT_ERR is set.
  *
  * When every_nth < 0 then after "- every_nth" commands:
  *   - a no response is simulated if SCSI_DEBUG_OPT_TIMEOUT is set
  *   - a RECOVERED_ERROR is simulated on successful read and write
  *     commands if SCSI_DEBUG_OPT_RECOVERED_ERR is set.
+ *   - a TRANSPORT_ERROR is simulated on successful read and write
+ *     commands if SCSI_DEBUG_OPT_TRANSPORT_ERR is set.
  * This will continue until some other action occurs (e.g. the user
  * writing a new value (other than -1 or 1) to every_nth via sysfs).
  */
@@ -315,6 +324,7 @@
 	int target = SCpnt->device->id;
 	struct sdebug_dev_info * devip = NULL;
 	int inj_recovered = 0;
+	int inj_transport = 0;
 	int delay_override = 0;
 
 	if (done == NULL)
@@ -352,6 +362,8 @@
 			return 0; /* ignore command causing timeout */
 		else if (SCSI_DEBUG_OPT_RECOVERED_ERR & scsi_debug_opts)
 			inj_recovered = 1; /* to reads and writes below */
+		else if (SCSI_DEBUG_OPT_TRANSPORT_ERR & scsi_debug_opts)
+			inj_transport = 1; /* to reads and writes below */
 	}
 
 	if (devip->wlun) {
@@ -468,7 +480,11 @@
 			mk_sense_buffer(devip, RECOVERED_ERROR,
 					THRESHOLD_EXCEEDED, 0);
 			errsts = check_condition_result;
-		}
+		} else if (inj_transport && (0 == errsts)) {
+			mk_sense_buffer(devip, ABORTED_COMMAND,
+					TRANSPORT_PROBLEM, ACK_NAK_TO);
+			errsts = check_condition_result;
+		}
 		break;
 	case REPORT_LUNS:	/* mandatory, ignore unit attention */
 		delay_override = 1;
@@ -531,6 +547,9 @@
 		delay_override = 1;
 		errsts = check_readiness(SCpnt, 0, devip);
 		break;
+	case WRITE_BUFFER:
+		errsts = check_readiness(SCpnt, 1, devip);
+		break;
 	default:
 		if (SCSI_DEBUG_OPT_NOISE & scsi_debug_opts)
 			printk(KERN_INFO "scsi_debug: Opcode: 0x%x not "
@@ -954,7 +973,9 @@
 	int alloc_len, n, ret;
 
 	alloc_len = (cmd[3] << 8) + cmd[4];
-	arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_KERNEL);
+	arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_ATOMIC);
+	if (! arr)
+		return DID_REQUEUE << 16;
 	if (devip->wlun)
 		pq_pdt = 0x1e; /* present, wlun */
 	else if (scsi_debug_no_lun_0 && (0 == devip->lun))
@@ -1217,7 +1238,9 @@
 	alen = ((cmd[6] << 24) + (cmd[7] << 16) + (cmd[8] << 8
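The option bits in the patch above form a simple mask, and the else-if chain means only one kind of error is injected per qualifying command. The sketch below models just that selection logic in userland; the bit values are the `SCSI_DEBUG_OPT_*` defines visible in the diff, while the function names are invented here and the assumption that the timeout check heads the chain is inferred from the "ignore command causing timeout" context line.

```python
# Values taken from the #defines shown in the patch.
OPT_MEDIUM_ERR = 2
OPT_TIMEOUT = 4
OPT_RECOVERED_ERR = 8
OPT_TRANSPORT_ERR = 16  # the new bit, i.e. opts mask 0x10

OPT_NAMES = {
    OPT_MEDIUM_ERR: "medium_err",
    OPT_TIMEOUT: "timeout",
    OPT_RECOVERED_ERR: "recovered_err",
    OPT_TRANSPORT_ERR: "transport_err",
}


def decode_opts(opts: int) -> list:
    """List the names of the option bits that are set in the mask."""
    return [name for bit, name in sorted(OPT_NAMES.items()) if opts & bit]


def injected_error(opts: int):
    """Which injection wins for an every_nth command: timeout first, then
    recovered error, then transport error (mirrors the else-if order)."""
    if opts & OPT_TIMEOUT:
        return "timeout"
    if opts & OPT_RECOVERED_ERR:
        return "recovered_err"
    if opts & OPT_TRANSPORT_ERR:
        return "transport_err"
    return None


if __name__ == "__main__":
    for mask in (0x10, 0x18, 0x1C, 0x0):
        print(hex(mask), decode_opts(mask), injected_error(mask))
```

One consequence worth noting: setting both 0x8 and 0x10 never produces an aborted command, because the recovered error branch is taken first.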
Re: [PATCH] scsi_debug: illegal blocking memory allocation
Jens Axboe wrote:
> On Thu, Jan 04 2007, James Bottomley wrote:
>> On Thu, 2007-01-04 at 12:21 +0100, Jens Axboe wrote:
>>> I guess it's fully up to you how you want to solve it. The scheme seems a little elaborate, but these error conditions are unlikely to ever be seen in the wild, so no objections from me.
>>
>> Actually, there's already a DID_ code that does what you want. Instead of DID_ERROR, which will retry immediately, there's DID_REQUEUE which will halt the device queue and wait for a returning command to retry.
>
> As long as it keeps firing the queue at some intervals even without any commands pending at all, then that'll work just fine. I like that approach a lot better than coding the error into some sense value that is (at best) some vague approximation of what has happened (calling memory shortage a transport error is a bit of a stretch).

True, but both happen. The scsi_debug driver is a virtual host, virtual target and a lu (ram disk). The failure that you pointed out stopped a response being built. In the real world that would be in the target or lu.

The reason that I mentioned the aborted_command sense key is that it is also an out of resources (bandwidth) error and it broke sg_dd. I would bet money that it would also break the block layer/sd. The block layer should leave it alone as it is simply a matter of sd retrying a few times. However the st driver could have a real problem (e.g. did that state changing command work, fail or partially work??).

Anyway, I have submitted a patch that reports DID_REQUEUE for allocation failures and adds the ability to inject aborted_command errors.

Doug Gilbert
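For readers unfamiliar with the `DID_xxx << 16` idiom used in both patches, the sketch below shows how the host byte sits in the 32 bit result word alongside the SCSI status byte. The numeric DID_ values are quoted from memory of include/scsi/scsi.h and should be read as assumptions; the helper names are invented for the example.

```python
# Host byte codes (assumed values, as remembered from include/scsi/scsi.h).
DID_OK = 0x00
DID_NO_CONNECT = 0x01
DID_ERROR = 0x07
DID_REQUEUE = 0x0D

SAM_STAT_CHECK_CONDITION = 0x02  # SCSI status byte for CHECK CONDITION


def make_result(host_byte: int, status: int = 0) -> int:
    """Build a result word: host byte in bits 16-23, status in bits 0-7."""
    return ((host_byte & 0xFF) << 16) | (status & 0xFF)


def host_byte(result: int) -> int:
    """Recover the host byte from a result word."""
    return (result >> 16) & 0xFF


def status_byte(result: int) -> int:
    """Recover the SCSI status byte from a result word."""
    return result & 0xFF


if __name__ == "__main__":
    requeue = make_result(DID_REQUEUE)           # allocation failure path
    check = make_result(DID_OK, SAM_STAT_CHECK_CONDITION)  # injected sense data
    print(hex(requeue), hex(check))
```

The two outcomes discussed in the thread therefore travel in different bytes: a requeue request is a host-level code, while an injected aborted command is a normal CHECK CONDITION status with sense data.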
Re: [PATCH] scsi_debug: illegal blocking memory allocation
Jens Axboe wrote:
> Hi Doug,
>
> resp_inquiry() does a GFP_KERNEL memory allocation, but it's not allowed to from the queuecommand context. There's no good way to return this error, so I used DID_ERROR which is used from similar paths. That doesn't seem quite right though, it would be better to return an error indicating a later retry would be more appropriate.
>
> Signed-off-by: Jens Axboe [EMAIL PROTECTED]
>
> diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
> index 30ee3d7..0c80ed3 100644
> --- a/drivers/scsi/scsi_debug.c
> +++ b/drivers/scsi/scsi_debug.c
> @@ -954,7 +954,9 @@ static int resp_inquiry(struct scsi_cmnd * scp, int target,
>  	int alloc_len, n, ret;
>
>  	alloc_len = (cmd[3] << 8) + cmd[4];
> -	arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_KERNEL);
> +	arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_ATOMIC);
> +	if (!arr)
> +		return DID_ERROR << 16;
>  	if (devip->wlun)
>  		pq_pdt = 0x1e; /* present, wlun */
>  	else if (scsi_debug_no_lun_0 && (0 == devip->lun))

Jens,
I had to read that twice. I'm always happy to convert a GFP_KERNEL to a GFP_ATOMIC (as I'm sure it started as a GFP_ATOMIC). There are a couple more that may be in queuecommand context.

Taking up your point about retries and seeing that the scsi_debug driver has a SAS flavour, I'm inclined towards an "aborted command, initiator response timeout" [Bh,4Bh,6] CHECK CONDITION. There is a group of transport injected error messages in SAS (see sas2r07.pdf section 10.2.3) that pop up from time to time. Due to congestion in connection-switched SAS expanders these error messages should be interpreted as "try again", depending on the topology.

The patch below adds an aborted_command bit in opts that will cause every nth command to be aborted (with an ack/nak timeout).

Note that SAS has an optional transport layer retries state machine to lessen the incidence of aborted commands. Evidently SAS tape drives use the facility.
Doug Gilbert

--- linux/drivers/scsi/scsi_debug.c 2006-11-30 10:00:01.0 -0500
+++ linux/drivers/scsi/scsi_debug.c2620rc3abo 2007-01-04 00:19:49.0 -0500
@@ -51,10 +51,10 @@
 #include "scsi_logging.h"
 #include "scsi_debug.h"
 
-#define SCSI_DEBUG_VERSION "1.80"
-static const char * scsi_debug_version_date = "20061018";
+#define SCSI_DEBUG_VERSION "1.81"
+static const char * scsi_debug_version_date = "20070104";
 
-/* Additional Sense Code (ASC) used */
+/* Additional Sense Code (ASC) */
 #define NO_ADDITIONAL_SENSE 0x0
 #define LOGICAL_UNIT_NOT_READY 0x4
 #define UNRECOVERED_READ_ERR 0x11
@@ -65,9 +65,14 @@
 #define INVALID_FIELD_IN_PARAM_LIST 0x26
 #define POWERON_RESET 0x29
 #define SAVING_PARAMS_UNSUP 0x39
+#define TRANSPORT_PROBLEM 0x4b
 #define THRESHOLD_EXCEEDED 0x5d
 #define LOW_POWER_COND_ON 0x5e
 
+/* Additional Sense Code Qualifier (ASCQ) */
+#define ACK_NAK_TO 0x3
+#define INIT_RESPONSE_TO 0x6
+
 #define SDEBUG_TAGGED_QUEUING 0 /* 0 | MSG_SIMPLE_TAG | MSG_ORDERED_TAG */
 
 /* Default values for driver parameters */
@@ -95,15 +100,20 @@
 #define SCSI_DEBUG_OPT_MEDIUM_ERR   2
 #define SCSI_DEBUG_OPT_TIMEOUT   4
 #define SCSI_DEBUG_OPT_RECOVERED_ERR   8
+#define SCSI_DEBUG_OPT_TRANSPORT_ERR   16
 /* When every_nth > 0 then modulo every_nth commands:
  *   - a no response is simulated if SCSI_DEBUG_OPT_TIMEOUT is set
  *   - a RECOVERED_ERROR is simulated on successful read and write
  *     commands if SCSI_DEBUG_OPT_RECOVERED_ERR is set.
+ *   - a TRANSPORT_ERROR is simulated on successful read and write
+ *     commands if SCSI_DEBUG_OPT_TRANSPORT_ERR is set.
  *
  * When every_nth < 0 then after "- every_nth" commands:
  *   - a no response is simulated if SCSI_DEBUG_OPT_TIMEOUT is set
  *   - a RECOVERED_ERROR is simulated on successful read and write
  *     commands if SCSI_DEBUG_OPT_RECOVERED_ERR is set.
+ *   - a TRANSPORT_ERROR is simulated on successful read and write
+ *     commands if SCSI_DEBUG_OPT_TRANSPORT_ERR is set.
  * This will continue until some other action occurs (e.g. the user
  * writing a new value (other than -1 or 1) to every_nth via sysfs).
  */
@@ -315,6 +325,7 @@
 	int target = SCpnt->device->id;
 	struct sdebug_dev_info * devip = NULL;
 	int inj_recovered = 0;
+	int inj_transport = 0;
 	int delay_override = 0;
 
 	if (done == NULL)
@@ -352,6 +363,8 @@
 			return 0; /* ignore command causing timeout */
 		else if (SCSI_DEBUG_OPT_RECOVERED_ERR & scsi_debug_opts)
 			inj_recovered = 1; /* to reads and writes below */
+		else if (SCSI_DEBUG_OPT_TRANSPORT_ERR & scsi_debug_opts)
+			inj_transport = 1; /* to reads and writes below */
 	}
 
 	if (devip->wlun) {
@@ -468,7 +481,11 @@
 			mk_sense_buffer(devip, RECOVERED_ERROR,
 					THRESHOLD_EXCEEDED, 0);
 			errsts = check_condition_result;
-		}
+		} else if
Re: [Patch] scsi: megaraid_{mm,mbox}: init fix for kdump
Randy Dunlap wrote:
> On Fri, 29 Dec 2006 08:02:17 -0800 Sumant Patro wrote:
>
> See Documentation/SubmittingPatches: Please include output of "diffstat -p1 -w70" so that we can easily see the scope of the changes.
>
> and see Documentation/CodingStyle for comments below:
>
>> diff -uprN linux-2.6.orig/drivers/scsi/megaraid/megaraid_mbox.c linux-2.6.new/drivers/scsi/megaraid/megaraid_mbox.c
>> --- linux-2.6.orig/drivers/scsi/megaraid/megaraid_mbox.c 2006-12-28 09:56:04.0 -0800
>> +++ linux-2.6.new/drivers/scsi/megaraid/megaraid_mbox.c 2006-12-29 05:31:48.0 -0800
>> @@ -779,6 +780,22 @@ megaraid_init_mbox(adapter_t *adapter)
>>  		goto out_release_regions;
>>  	}
>>
>> +	// initialize the mutual exclusion lock for the mailbox
>> +	spin_lock_init(&raid_dev->mailbox_lock);
>
> Linux uses /*...*/ C89-style comments, not // C99 comments.
>
> Randy

It is about time this absurd stipulation was dropped. Are there any C compilers that can compile the Linux kernel and that don't accept both _standard_ comment styles?

Doug Gilbert
Re: sas_device/end_device-*/phy_identifier flipped
Luben Tuikov wrote:
> --- Douglas Gilbert [EMAIL PROTECTED] wrote:
>> In lk 2.6.20-rc2 (and probably earlier) the phy_identifier attribute in the /sys/class/sas_device/end_device-* directory is showing the wrong end of the point to point link. Phy identifiers on (dual ported) SAS disks are typically 0 and 1. For SATA disks the phy identifier should be 0.
>>
>>   # lsscsi
>>   [4:0:0:0]    disk    ATA      ST3160812AS      D     /dev/sda
>>   [4:0:1:0]    disk    SEAGATE  ST336754SS       0003  /dev/sdb
>>
>>   # lsscsi -t
>>   [4:0:0:0]    disk    sas:0x500605b033e6    /dev/sda
>>   [4:0:1:0]    disk    sas:0x5000c55208ee    /dev/sdb
>>
>>   # lsscsi -tL 4:0:1:0
>>   [4:0:1:0]    disk    sas:0x5000c55208ee    /dev/sdb
>>     transport=sas
>>     initiator_port_protocols=none
>>     initiator_response_timeout=1
>>     I_T_nexus_loss_timeout=1744
>>     phy_identifier=7
>>     ready_led_meaning=1
>>     sas_address=0x5000c55208ee
>>     target_port_protocols=ssp
>>
>>   # smp_discover -mb
>>   Device 500605b033ef, expander (only connected phys shown):
>>     phy   5:T:attached:[500605b6f260:03  i(SSP+STP+SMP)]  3 Gbps
>>     phy   6:T:attached:[500605b033e6:00  t(SATA)]  1.5 Gbps
>>     phy   7:T:attached:[5000c55208ee:01  t(SSP)]  3 Gbps
>>
>> The SATA and SAS disks are connected via an expander, which lets me look at sysfs for 4:0:1:0 and the expander configuration with smp_discover. The port in use on the SAS disk has the address 5000c55208ee. The expander says that cable is attached to phy 1, which agrees with what I can see. However sysfs reports phy_identifier=7 which is wrong (and happens to be the attached phy_id seen from the SAS disk).
>>
>> Both aic94xx and mptsas drivers do the same thing so it looks like a SAS transport problem.
>
> Have you tested this with the SAS Stack as I distribute it?

Luben,
Yes, but it is boring because it just works ***. With your driver for a different port on the same SAS disk, lsscsi outputs:

  # lsscsi -tL 6:0:0:0
  [6:0:0:0]    disk    sas:5000c55208ed    /dev/sdd
    transport=sas
    sub_transport=sas_class
    device_name=
    dev_type=end device
    iproto=
    iresp_timeout=0x2710
    linkrate=3,0 Gbps
    max_linkrate=3,0 Gbps
    max_pathways=1
    min_linkrate=3,0 Gbps
    pathways=1
    ready_led_meaning=1
    rl_wlun=0
    sas_addr=5000c55208ed
    tproto=SSP
    transport_layer_retries=0

lsscsi is data mining this directory: /sys/class/scsi_device/6:0:0:0/device/sas_device which contains:

  # ls
  device_name    itnl_timeout   max_pathways       rl_wlun
  dev_type       linkrate       min_linkrate       sas_addr
  iproto         LUNS           pathways           tproto
  iresp_timeout  max_linkrate   ready_led_meaning  transport_layer_retries

Interestingly there is no phy_id entry (and a single entry wouldn't be sufficient if the target was a wide port). I can live without the phy_id there (as it can be found other ways: SMP and the protocol specific (SAS) log page).

So the bottom line is that the phy_id(s) doesn't need to be there but if it is it should be correct.

*** I plan to write another mail on the aic94xx driver mess.

Doug Gilbert
Re: [PATCH] ieee1394: sbp2: pass REQUEST_SENSE through to the target
Stefan Richter wrote:
> Delete some incorrect code, left over from the initial driver
> submission in March 2001. SBP-2 targets should provide sense data
> via the SBP-2 status block (autosense). We have to pass the
> REQUEST_SENSE command through to targets which don't implement
> autosense, if there are any.

Umm, REQUEST SENSE has several other useful capabilities. It can convey information about low power conditions, a progress indicator (e.g. during FORMAT with IMMED=1) and informational exception warnings. It is also defined to work any time INQUIRY works (e.g. on lun=0 when there is no lu there but there is a lun0).

smartmontools sets MRIE to 6 in the informational exceptions control mode page so it can periodically poll a disk with REQUEST SENSE to see if it has tripped a threshold. It could use other techniques but they would most likely interfere with normal error processing on the host OS (and linux is one of about 8).

So, this patch is a step in the right direction. Hopefully not too many other LLDs are playing games with REQUEST SENSE.

Doug Gilbert
KERN_NOTICE very big device. try to use READ CAPACITY(16)
This message is generated by sd when a disk is larger than 2 TB. Does it need to be at KERN_NOTICE level? Could it be a plain logging message instead?

It is also badly worded, as the imperative "try" suggests that the user needs to find a utility that sends a READ CAPACITY(16). I was recently contacted by a user who read it that way.

Doug Gilbert
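For readers wondering where the 2 TB figure comes from: READ CAPACITY(10) reports the last logical block address in a 32-bit field, and a device whose capacity does not fit returns 0xffffffff there, which is the cue to issue READ CAPACITY(16). The following is only a rough sketch of that arithmetic, not the sd driver's code; the function names are invented for the example and 512-byte blocks are assumed.

```python
MAX_LBA_10 = 0xFFFFFFFF  # largest value the 32-bit "last LBA" field can hold


def needs_read_capacity_16(last_lba_from_rc10: int) -> bool:
    """True when READ CAPACITY(10) returned the all-ones sentinel."""
    return last_lba_from_rc10 == MAX_LBA_10


def capacity_bytes(last_lba: int, block_size: int = 512) -> int:
    """Capacity implied by a last-LBA value (block_size is an assumption)."""
    return (last_lba + 1) * block_size


# Largest capacity READ CAPACITY(10) can describe with 512-byte blocks:
# the last LBA may be at most 0xfffffffe, since 0xffffffff is the sentinel.
limit = capacity_bytes(MAX_LBA_10 - 1)
print(limit, "bytes, just under 2 TiB")
```

So the message is really the driver noting that it is switching to the 16-byte command itself, not a request for the user to do anything.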
sas_device/end_device-*/phy_identifier flipped
In lk 2.6.20-rc2 (and probably earlier) the phy_identifier attribute in the /sys/class/sas_device/end_device-* directory is showing the wrong end of the point to point link. Phy identifiers on (dual ported) SAS disks are typically 0 and 1. For SATA disks the phy identifier should be 0.

# lsscsi
[4:0:0:0]    disk    ATA      ST3160812AS      D     /dev/sda
[4:0:1:0]    disk    SEAGATE  ST336754SS       0003  /dev/sdb

# lsscsi -t
[4:0:0:0]    disk    sas:0x500605b033e6    /dev/sda
[4:0:1:0]    disk    sas:0x5000c55208ee    /dev/sdb

# lsscsi -tL 4:0:1:0
[4:0:1:0]    disk    sas:0x5000c55208ee    /dev/sdb
  transport=sas
  initiator_port_protocols=none
  initiator_response_timeout=1
  I_T_nexus_loss_timeout=1744
  phy_identifier=7
  ready_led_meaning=1
  sas_address=0x5000c55208ee
  target_port_protocols=ssp

# smp_discover -mb
Device 500605b033ef, expander (only connected phys shown):
  phy 5:T:attached:[500605b6f260:03 i(SSP+STP+SMP)] 3 Gbps
  phy 6:T:attached:[500605b033e6:00 t(SATA)] 1.5 Gbps
  phy 7:T:attached:[5000c55208ee:01 t(SSP)] 3 Gbps

The SATA and SAS disks are connected via an expander which lets me look at sysfs for 4:0:1:0 and the expander configuration with smp_discover. The port in use on the SAS disk has the address: 5000c55208ee . The expander says that cable is attached to phy 1 which agrees with what I can see. However sysfs reports phy_identifier=7 which is wrong (and happens to be the attached phy_id seen from the SAS disk).

Both aic94xx and mptsas drivers do the same thing so it looks like a SAS transport problem.

Doug Gilbert
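To make the "wrong end of the link" point concrete, here is a small sketch that pulls both phy numbers out of the smp_discover lines quoted in this report. It is illustrative only: the regular expression is my guess at the line layout based on the three lines shown, and `phy_ids_for` is a made-up helper, not part of smp_utils or lsscsi.

```python
import re

# Lines copied from the smp_discover output in the report.
DISCOVER_LINES = [
    "phy 5:T:attached:[500605b6f260:03 i(SSP+STP+SMP)] 3 Gbps",
    "phy 6:T:attached:[500605b033e6:00 t(SATA)] 1.5 Gbps",
    "phy 7:T:attached:[5000c55208ee:01 t(SSP)] 3 Gbps",
]

# expander phy, routing letter, attached SAS address, attached phy id
LINE_RE = re.compile(r"phy\s+(\d+):\w:attached:\[([0-9a-f]+):(\d+)\s")


def phy_ids_for(sas_address, lines):
    """Return (expander_phy, attached_phy) for an attached SAS address, or None."""
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group(2) == sas_address:
            return int(match.group(1)), int(match.group(3))
    return None


expander_phy, disk_phy = phy_ids_for("5000c55208ee", DISCOVER_LINES)
print("expander phy:", expander_phy, "disk phy:", disk_phy)
```

For the SAS disk the expander-side phy is 7 and the disk-side phy is 1; the end_device attribute in sysfs is showing the former where the latter is expected.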
[Announce] smp_utils-0.92 available
smp_utils is a package of command line utilities for invoking SMP functions to monitor and manage SAS expanders. SMP is the Serial Attached SCSI (SAS) Management Protocol. A SAS Host Bus Adapter (HBA) includes an SMP initiator (along with an SSP and STP initiator). A SAS expander contains an SMP target. Several SAS HBAs have an SMP pass through interface that can be used to send SMP requests and receive the responses.

This package targets the linux kernel (lk) 2.6 and lk 2.4 series. Two interfaces are available: the mptctl pass through used by MPT Fusion SAS HBAs and the smp_portal sysfs attribute pass through used by at least one aic94xx based Linux driver.

For an overview and examples of smp_utils see:
http://www.torque.net/sg/smp_utils.html
A tarball, rpm and deb can be found in table 2 on that page.

CHANGELOG (since version 0.91):
  - all: suggest using '-v' if smp_send_req() fails
  - smp_lib: sync function names and results with sas2r07
  - smp_rep_general: sync with sas2r07
  - smp_rep_route_info: add '--multiple' and '--num=' options.
    Underlying SMP function may be called multiple times
    to show one line per phy's route table.
  - smp_lib.h: add C++ 'extern "C"' wrapper
  - smp_discover+smp_discover_list: sync with sas2r07
  - smp_conf_general: add new SMP function
  - smp_utils.8: suggestions for finding expander SAS
    addresses and mptsas ioc_num
  - Makefile.freebsd: builds utilities on FreeBSD
  - man pages: cleanup

Doug Gilbert
Re: lsscsi version 0.19 beta
James Bottomley wrote:
> On Thu, 2006-12-07 at 01:10 -0500, Douglas Gilbert wrote:
>> The change proposed by James Bottomley that prompted the beta
>> has not materialized.
>
> You'll have to remind me: which change was this?

James,
Yes, I'm fuzzy on the details as well. Here is part of the lsscsi ChangeLog. Do the last two entries ring a bell?

Version 0.19 2006/12/06
  - add transport identifiers (target+initiator, port+node)
  - enhance host name search when proc_name is NULL
  - implement filter option for '--hosts'
  - accept 'hostn' as first item in filter to mean host n
  - output more host attributes when '-Hll' given
  - add '--list' (or '-L') option to output attribute=value
    entries, one per line

Version 0.18 2006/3/24
  - cope with dropping of 'generic' symlink post lk 2.6.16
  - anticipate the future removal of 'tape' symlink

Doug Gilbert
Re: [PATCH v2] libata: Simulate REPORT LUNS for ATAPI devices
Jeff Garzik wrote:
> Darrick J. Wong wrote:
>> The Quantum GoVault SATAPI removable disk device returns ATA_ERR in
>> response to a REPORT LUNS packet. If this happens to an ATAPI device
>> that is attached to a SAS controller (this is the case with sas_ata),
>> the device does not load because SCSI won't touch a SCSI device that
>> won't report its LUNs. Since most ATAPI devices don't support
>> multiple LUNs anyway, we might as well fake a response like we do for
>> ATA devices.
>>
>> Signed-off-by: Darrick J. Wong [EMAIL PROTECTED]
>
> Seems sane to me, but I would like additional comment/testing/etc.
> before applying...

A SCSI target contains zero or more logical units. As in this case, those logical units may use a different transport. In such cases a SCSI target needs to emulate responses to some SCSI commands (and modify the responses to others). Here is a list that is probably not comprehensive:
  - INQUIRY (peripheral qualifier in standard response)
  - INQUIRY, device identification VPD page (0x83)
      - obviously for the device name+identifier and
        port name+identifier
      - may need to concatenate those with the lu's
        name+identifier
  - INQUIRY, SCSI ports VPD page
  - INQUIRY, ATA Information VPD page (for SAT)
  - REPORT LUNS [mandatory in SPC-3 hence mandatory in SAT]
  - protocol specific port mode page (0x19)
  - protocol specific lu mode page (0x18) [could simulate]
  - PATA control mode page (0xa,0xf1) (for SAT)
  - protocol specific port _log_ page (0x18)

And for SAT you could add the ATA PASS-THROUGH commands to that list.

Those who are really ambitious could implement well known logical units (wluns), which are essentially a clean way to talk directly to the target rather than a logical unit.

About the multi-lun ATAPI devices comment: how would libata represent multiple S-ATAPI devices connected to a SATA port multiplier?

Doug Gilbert
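As a rough illustration of what "faking" REPORT LUNS for a single-lun device amounts to, the sketch below builds the parameter data for a target with only LUN 0, following the SPC-3 layout as I understand it (a 4-byte big-endian LUN list length, 4 reserved bytes, then one 8-byte entry per LUN). This is not the libata patch; `fake_report_luns_lun0` is a name invented for the example.

```python
import struct


def fake_report_luns_lun0() -> bytes:
    """Build REPORT LUNS parameter data describing a single LUN 0."""
    lun_list = bytes(8)  # one 8-byte LUN entry; all zeros encodes LUN 0
    # LUN list length in bytes (big-endian) followed by 4 reserved bytes
    header = struct.pack(">I4x", len(lun_list))
    return header + lun_list


response = fake_report_luns_lun0()
print(response.hex(" "))
```

A real emulation would also have to truncate the data to the allocation length given in the CDB, which this sketch leaves out.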
Re: [PATCH v2] libata: Simulate REPORT LUNS for ATAPI devices
James Bottomley wrote:
> On Mon, 2006-12-04 at 15:32 -0800, Darrick J. Wong wrote:
>> The Quantum GoVault SATAPI removable disk device returns ATA_ERR in
>> response to a REPORT LUNS packet. If this happens to an ATAPI device
>> that is attached to a SAS controller (this is the case with sas_ata),
>> the device does not load because SCSI won't touch a SCSI device that
>> won't report its LUNs. Since most ATAPI devices don't support
>> multiple LUNs anyway, we might as well fake a response like we do for
>> ATA devices.
>
> Actually, there may be a standards conflict here. SPC says that all
> devices reporting compliance with this standard (as the inquiry data
> for this device claims) shall support REPORT LUNS. On the other hand,
> MMC doesn't list REPORT LUNS in its table of mandatory commands.

MMC-5 rev 4 section 7.1:
"Some commands that may be implemented by MM drives are not described in this standard, but are found in other SCSI standards. For a complete list of these commands refer to [SPC-3]."

Hmm, "may be implemented", yet REPORT LUNS is mandatory in SPC-3 (and SPC-3 is a normative reference for MMC-5). I guess there is wriggle room there.

In practice, MMC diverges from SPC a lot more than other SCSI device type command sets (e.g. SBC and SSC).

> I'm starting to think that even if they report a SCSI compliance level
> of 3 or greater, we still shouldn't send REPORT LUNS to devices that
> return MMC type unless we have a white list override.

There is also SAT compliance. For the ATA command set (i.e. disks) sat-r09 lists REPORT LUNS and refers to SPC-3. For ATAPI sat-r09 is far less clear. It does recommend, for example, that the ATA Information VPD page is implemented in the SATL for ATAPI devices.

Doug Gilbert
Re: [Bug 7026] CD/DVD burning with USB writer doesn't work
James Bottomley wrote: On Wed, 2006-12-06 at 11:32 -0500, Alan Stern wrote:

But how did he get the file descriptor? He opened a device name, which could have been used to get the sysfs file.

The device name was probably something like /dev/sg0. This doesn't easily permit one to find the corresponding sysfs filename for the real underlying device, although it can be done with difficulty. (That's why I used the excessively-ornate sysfs pathname in the Bugzilla entry.) It certainly wouldn't be as easy as using an ioctl. It wouldn't be as uniform either. The search through sysfs would have to be different depending on whether the device name was /dev/sr0 or /dev/sg0.

Realistically, no-one makes SCSI CDs or DVDs any more ... I know, I've tried to get some for some of my older boxes. Most of them nowadays are IDE attachments, which don't have a /dev/sg node. So /dev/sg is really the legacy mode for burning. The correct way to do it today is to use the actual device name ... then you don't have to worry about what the transport is any more.

All CD and DVD drives these days use SCSI. That is SCSI command sets: MMC and SPC. Very few use the SCSI Parallel Interface (SPI). An increasing number will be using S-ATAPI and they could be seen by the OS via SCSI transports: FC and SAS (+ SATA).

Is the patch below acceptable?

Really, no. The parameter you're fishing for is a block parameter, not a SCSI parameter ... it should really be a block ioctl if we have to have an ioctl at all.

I could easily enough rewrite the patch to put the ioctl somewhere else (although I'm not quite sure exactly where would be best). But do non-block devices have request queues? What about the points that Doug raised:

All CD/DVD burners are block devices, which is the problem set under discussion.

CD/DVD burners are block devices for read operations only. When they are burning they are not block devices in the normal sense.
So if this was classic Unix a block device node would be used for reading and a raw device node for writing. Just like I'm wasting keystrokes.

On Tue, 5 Dec 2006, Douglas Gilbert wrote:

Apart from sensibly yielding the max size in bytes, your patch has the added benefit of allowing non-block devices (e.g. tape, processor and enclosure services) to find out what limit the OS/host has placed on each command's maximum transfer size.

They all possess block queues, yes, so we should really allow access to the block ioctls for them.

If you manage to get that ioctl in, then ungrateful people will ask for the corresponding set operation as well.

To illustrate the /sys mess look at naming of the sysfs approach to this problem. For example:
/sys/block/sde/queue/max_sectors_kb
  - it is not only a block property
  - sde is an end device and suggests information from that device's Block Limits VPD page, actually it is a limit imposed by the OS and the host used to access that device
  - what has queue got to do with it?
  - max_sectors_kb should have units of bytes

In addition to all of these points, there remains the peculiar location of the SG_ ioctls. They are implemented in two places in the kernel: block/scsi_ioctl.c and drivers/scsi/sg.c. And the two implementations of e.g. SG_SET_RESERVED_SIZE don't even do the same thing!

I have no idea why the block layer even implements SG_SET_RESERVED_SIZE ... I suspect it was some legacy application compatibility problem, so it probably can be eliminated.

It was put there to trick cdrecord!

Doug Gilbert
Re: [Bug 7026] CD/DVD burning with USB writer doesn't work
James Bottomley wrote:
> On Wed, 2006-12-06 at 18:49 +0100, Joerg Schilling wrote:
>> Please keep in mind: all CD/DVD burners are SCSI devices.
>
> This is probably semantics, but nowadays, SCSI means SPI (or parallel
> SCSI). I think you're trying to say that they're all devices that obey
> the MMC standard? Which is true, but not really relevant.

James,
SPI is dead. Get used to it. SCSI has not meant SPI for years. We should be in the business of disabusing people of that idea, not reinforcing it.

If you went to www.t10.org and looked at draft documents and the reflector you would be lucky to find any documents or posts about SPI in the last two years.

Doug Gilbert
Re: [Bug 7026] CD/DVD burning with USB writer doesn't work
James Bottomley wrote:
> On Wed, 2006-12-06 at 13:38 -0500, Douglas Gilbert wrote:
>> SPI is dead. Get used to it. SCSI has not meant SPI for
>> years. We should be in the business of disabusing people
>> of that idea, not reinforcing it.
>
> I don't believe I said anything in favour of or against SPI.

James,
My objection, and I believe Joerg's objection, is how people would interpret this statement by you:
  "This is probably semantics, but nowadays, SCSI means SPI (or parallel SCSI)."

One could deduce from that statement, falsely, that the linux SCSI subsystem was the linux SPI subsystem. Hence we should mark it as legacy (and stop libata and the new ATA subsystem from using it).

> I think you'll find the whole point of SAM is separating the command set
> from the transport and interconnect. Saying a device speaks SCSI has no
> real meaning in that context anymore. It's commonly taken to mean
> SCSI-2 where the whole thing was lumped together and SPI centric.

SCSI is a storage architecture, a group of command sets and a group of transports. The original SCSI transport, now considered legacy (a horribly non-technical word), is SPI.

> In the SAM context, a modern IDE CD is MMC over an ATAPI or SATAPI
> transport. An old SCSI CD is MMC over SPI. The thing Alan's having
> trouble with is MMC over a USB transport.

Agreed. And USB mass storage would probably be the most used SCSI transport nowadays. Folks can and have written their own subsystems for handling USB mass storage but sooner or later they are going to be looking at read capacity, sense buffers and mode pages. That is why the SCSI subsystem continues to be relevant.

Doug Gilbert
lsscsi version 0.19 beta
The last announcement I made to this list about lsscsi was back in March and that was a beta for lsscsi version 0.18 . The change proposed by James Bottomley that prompted the beta has not materialized. So I decided to release version 0.18 without fanfare a week ago and start adding transport information to lsscsi, dubbing it version 0.19 beta. See http://www.torque.net/scsi/lsscsi.html for downloads.

The mushrooming of information (and different representations) in /sys has made it possible for lsscsi to provide a lot more information than it has previously. Ironically what storage device identification really needs is not available, namely the logical unit _name_ which, for SCSI devices, is obtained from the device identification VPD page (0x83). As a consolation there is lots of transport information.

So this beta adds transport information, both target and initiator (host) for these transports:
  - FC
  - SAS

I hope to add iSCSI if I can find a way through its maze. Perhaps USB and 1394 are candidates as well, even SPI. In the case of SAS, both the SAS transport layer and the SAS class (i.e. Luben Tuikov's design) representations are supported.

The new options are '--transport' (or '-t') and '--list' (or '-L').

Here is an example where disk strings are insufficient:

# lsscsi
[4:0:0:0]    disk    ATA      ST3160812AS      D     /dev/sda
[4:0:1:0]    disk    SEAGATE  ST336754SS       0003  /dev/sdb
[4:0:2:0]    disk    SEAGATE  ST336754SS       0003  /dev/sdc
[4:0:3:0]    disk    ATA      ST380013AS       3.18  /dev/sdd
[4:0:4:0]    disk    SEAGATE  ST336754SS       0003  /dev/sde
[4:0:5:0]    disk    SEAGATE  ST336754SS       0003  /dev/sdf
[5:0:0:0]    disk    SEAGATE  ST336754SS       0003  /dev/sdg
[5:0:1:0]    disk    SEAGATE  ST336754SS       0003  /dev/sdh
[5:1:0:0]    disk    SEAGATE  ST336754SS       0003  /dev/sdi
[5:1:1:0]    disk    SEAGATE  ST336754SS       0003  /dev/sdj

How many disks are there?
Looking at the transport information:

# lsscsi -t
[4:0:0:0]    disk    sas:0x0b1d2c035c7e5d4c    /dev/sda
[4:0:1:0]    disk    sas:0x5000c55208ed        /dev/sdb
[4:0:2:0]    disk    sas:0x5000c5520a29        /dev/sdc
[4:0:3:0]    disk    sas:0x500605b033e1        /dev/sdd
[4:0:4:0]    disk    sas:0x5000c55208ee        /dev/sde
[4:0:5:0]    disk    sas:0x5000c5520a2a        /dev/sdf
[5:0:0:0]    disk    sas:5000c55208ed          /dev/sdg
[5:0:1:0]    disk    sas:5000c5520a29          /dev/sdh
[5:1:0:0]    disk    sas:5000c55208ee          /dev/sdi
[5:1:1:0]    disk    sas:5000c5520a2a          /dev/sdj

So everything is SAS attached, including two SATA disks. Something strange is happening with 4:0:0:0 which is directly attached to host4. From the target SAS addresses it can be seen that /dev/sdc and /dev/sdh are the same port (and because the lun is 0 in both cases, it must be the same lu). There are three other pairs there, reducing what looks like 10 disks to six. The adjacent SAS addresses are dual ports on the same disk, so the actual number of disks is 4.

Why are some SAS addresses prefixed with 0x and others not? lsscsi simply prints out what is in /sys !
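The counting argument above can be written down as a short sketch. It takes the ten rows from the '-t' listing, treats the SAS address as the port identity (ignoring the inconsistent 0x prefix), and then applies the observation that adjacent addresses are the two ports of one dual ported disk. That adjacency rule is a heuristic taken from this example, not something lsscsi does, and the helper names are invented here.

```python
# (SCSI address, SAS address as printed by "lsscsi -t", device node)
ROWS = [
    ("4:0:0:0", "0x0b1d2c035c7e5d4c", "/dev/sda"),
    ("4:0:1:0", "0x5000c55208ed", "/dev/sdb"),
    ("4:0:2:0", "0x5000c5520a29", "/dev/sdc"),
    ("4:0:3:0", "0x500605b033e1", "/dev/sdd"),
    ("4:0:4:0", "0x5000c55208ee", "/dev/sde"),
    ("4:0:5:0", "0x5000c5520a2a", "/dev/sdf"),
    ("5:0:0:0", "5000c55208ed", "/dev/sdg"),
    ("5:0:1:0", "5000c5520a29", "/dev/sdh"),
    ("5:1:0:0", "5000c55208ee", "/dev/sdi"),
    ("5:1:1:0", "5000c5520a2a", "/dev/sdj"),
]


def normalise(addr: str) -> int:
    """Parse a SAS address with or without the 0x prefix."""
    return int(addr, 16)


def count_ports_and_disks(rows):
    """Return (unique target ports, estimated physical disks)."""
    ports = sorted({normalise(addr) for _, addr, _ in rows})
    disks = 0
    previous = None
    for port in ports:
        # Heuristic: an address one above the previous port is the second
        # port of the same dual ported disk, so it is not counted again.
        if previous is not None and port - previous == 1:
            previous = None
            continue
        disks += 1
        previous = port
    return len(ports), disks


ports, disks = count_ports_and_disks(ROWS)
print(len(ROWS), "device nodes,", ports, "target ports,", disks, "disks")
```

With these rows it arrives at the same answer as the text: 10 device nodes, 6 target ports, 4 disks.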
To fetch further information about the target that contains /dev/sdf using a filter to reduce clutter:

# lsscsi --transport --list 4:0:5:0
[4:0:5:0]    disk    sas:0x5000c5520a2a    /dev/sdf
  transport=sas
  initiator_port_protocols=none
  initiator_response_timeout=0
  I_T_nexus_loss_timeout=1744
  phy_identifier=11
  ready_led_meaning=0
  sas_address=0x5000c5520a2a
  target_port_protocols=ssp

A similar check on the target containing /dev/sdj

# lsscsi -t -L 5:1:1
[5:1:1:0]    disk    sas:5000c5520a2a    /dev/sdj
  transport=sas
  sub_transport=sas_class
  device_name=
  dev_type=end device
  iproto=
  iresp_timeout=0x
  linkrate=3,0 Gbps
  max_linkrate=3,0 Gbps
  max_pathways=1
  min_linkrate=3,0 Gbps
  pathways=1
  ready_led_meaning=0
  rl_wlun=0
  sas_addr=5000c5520a2a
  tproto=SSP
  transport_layer_retries=0

Finally here is a listing of hosts, then a listing of hosts with their initiator identifier (if known) and finally a closer look at host4 (with and without transport specific information):

# lsscsi --hosts
[0]    sata_nv
[1]    sata_nv
[2]    sata_nv
[3]    sata_nv
[4]    mptsas
[5]    aic94xx

# lsscsi --hosts --transport
[0]    sata_nv
[1]    sata_nv
[2]    sata_nv
[3]    sata_nv
[4]    mptsas     sas:0x500605b6f260
[5]    aic94xx    sas:5d10002dc000

# lsscsi -H -t --list 4
[4]    mptsas     sas:0x500605b6f260
  transport=sas
  device_type=end device
  initiator_port_protocols=smp, stp, ssp
  invalid_dword_count=0
  loss_of_dword_sync_count=0
  maximum_linkrate=3.0 Gbit
  maximum_linkrate_hw=3.0 Gbit
  minimum_linkrate=1.5 Gbit
  minimum_linkrate_hw=1.5 Gbit
  negotiated_linkrate=Unknown
  phy_identifier=0
  phy_reset_problem_count=0
  running_disparity_error_count=0
Re: [Bug 7026] CD/DVD burning with USB writer doesn't work
Alan Stern wrote:
> I decided to do this by email instead of bugzilla so that it would be
> visible to everyone on the linux-scsi mailing list.
>
> Re: http://bugzilla.kernel.org/show_bug.cgi?id=7026
>
> To recap: Joerg Schilling needs to be able to retrieve the max_sectors
> value for a SCSI device's request queue. Doing it via sysfs is rather
> clumsy, especially when only a file descriptor is available and not the
> device name. He has asked for an ioctl interface to provide the
> information.
>
> Is the patch below acceptable?

Alan,
I just spent an hour thinking about how to data mine through that dreadful mess that /sys has become as I try to add transport information to lsscsi. And then this post made my day. Fancy that, adding a new ioctl!! I hope the ioctl police aren't watching :-)

Apart from sensibly yielding the max size in bytes, your patch has the added benefit of allowing non-block devices (e.g. tape, processor and enclosure services) to find out what limit the OS/host has placed on each command's maximum transfer size.

If you manage to get that ioctl in, then ungrateful people will ask for the corresponding set operation as well.

To illustrate the /sys mess look at naming of the sysfs approach to this problem. For example:
/sys/block/sde/queue/max_sectors_kb
  - it is not only a block property
  - sde is an end device and suggests information from that device's Block Limits VPD page, actually it is a limit imposed by the OS and the host used to access that device
  - what has queue got to do with it?
  - max_sectors_kb should have units of bytes

And /sys has the horrible side effect of enshrining a badly conceived design in a user interface (and SAS comes to mind).

Best of luck
Doug Gilbert

BTW Joerg: SG_SET_RESERVED_SIZE simply makes it extremely unlikely that the sg driver will not be able to fetch enough memory from the kernel to move data associated with a SCSI command. The block layer SG_IO just fudges that.
While a major concern in lk 2.0, memory starvation is typically not a major concern in lk 2.6 assuming modern hardware. The sg driver's reserved buffer has other uses as FUJITA Tomonori pointed out yesterday on the linux-scsi list.
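For comparison with the proposed ioctl, this is roughly what the sysfs route looks like when the block device name is already known. It is a sketch only: `max_transfer_bytes` and its `sysfs_root` parameter are made up for the example (the latter just to make it testable), and it does not solve the harder problem raised in this thread of getting from an open file descriptor or a /dev/sg name to the right sysfs directory.

```python
from pathlib import Path


def max_transfer_bytes(block_name: str, sysfs_root: str = "/sys") -> int:
    """Read queue/max_sectors_kb for a block device and return bytes.

    Despite the attribute name, the value is in units of 1024 bytes.
    """
    attr = Path(sysfs_root) / "block" / block_name / "queue" / "max_sectors_kb"
    kib = int(attr.read_text().strip())
    return kib * 1024


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python max_xfer.py sde
    if len(sys.argv) == 2:
        print(max_transfer_bytes(sys.argv[1]))
```

The unit conversion in the last line of the function is exactly the kind of detail the naming complaint above is about.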
Re: problems to expect with 2TB volumes
Bernd Schubert wrote:
> Hi,
>
> we have not bought the device yet, but presently in the process to do so.
> Before we buy it, I want to know about problems in advance...

None that I'm aware of from the point of view of the Linux SCSI subsystem (starting at about half way through the lk 2.4 series or 4 years ago).

> I'm somewhat worried about this problem report
>
> http://lists.freebsd.org/pipermail/aic7xxx/2006-January/thread.html#4280
>
> Especially as I don't see a final solution...
>
> We also want to buy the very same raid device and also connect it to an
> already existing aic79xx controller.

On reviewing that thread, the original poster was jumping to premature conclusions. Justin Gibbs told him there was no such problem (and Justin is well placed to know). Then the final post shows a trace with READ(10) commands failing. They are 32 bit lba read operations that have been the default for about 10 years in the SCSI subsystem. If those fail on that transport it is probably a termination problem. When Justin saw that he probably didn't bother responding again.

Doug Gilbert
Re: aic94xx panic on module load
Mark Haverkamp wrote:
> I got this panic when loading the aic94xx module. The adapter is
> connected to an HP MSA50 SAS enclosure with 3 72GB SAS disks.
> Kernel 2.6.19-rc6-scsi-misc on an x86_64
>
> ---
>
> aic94xx: Adaptec aic94xx SAS/SATA driver version 1.0.2 loaded
> aic94xx: found Adaptec AIC-9410W SAS/SATA Host Adapter, device :08:01.0
> aic94xx: BIOS present (1,2), 1673
> aic94xx: ue num:3, ue size:88
> aic94xx: manuf sect SAS_ADDR 5d100045af00
>
> snip
>
> sas: phy1 matched wide port0
> sas: phy1 added to port0, phy_mask:0x3
> sas: phy2 matched wide port0
> sas: phy2 added to port0, phy_mask:0x7
> sas: phy3 matched wide port0
> sas: phy3 added to port0, phy_mask:0xf
> sas: DOING DISCOVERY on port 0, pid:3524
> sas: ex 500508b300a27a2f phy00:T attached: 500508b300a27a3f
> sas: ex 500508b300a27a2f phy01:T attached: 500508b300a27a3f
> sas: ex 500508b300a27a2f phy02:T attached:
> sas: ex 500508b300a27a2f phy03:T attached:
> sas: ex 500508b300a27a2f phy04:S attached: 5d100045af00
> sas: ex 500508b300a27a2f phy05:S attached: 5d100045af00
> sas: ex 500508b300a27a2f phy06:S attached: 5d100045af00
> sas: ex 500508b300a27a2f phy07:S attached: 5d100045af00
> sas: ex 500508b300a27a2f phy08:T attached:
> sas: ex 500508b300a27a2f phy09:T attached:
> sas: ex 500508b300a27a2f phy10:T attached:
> sas: ex 500508b300a27a2f phy11:T attached:
> sas: ex 500508b300a27a2f phy12:D attached: 500508b300a27a2c
> sas: ex 500508b300a27a3f phy00:D attached: 5000c595f8b5
> sas: ex 500508b300a27a3f phy01:D attached:
> sas: ex 500508b300a27a3f phy02:D attached: 5000c595d3b5
> sas: ex 500508b300a27a3f phy03:D attached:
> sas: ex 500508b300a27a3f phy04:D attached: 5000c595c0b9
> sas: ex 500508b300a27a3f phy05:D attached:
> sas: ex 500508b300a27a3f phy06:D attached:
> sas: ex 500508b300a27a3f phy07:D attached:
> sas: ex 500508b300a27a3f phy08:D attached:
> sas: ex 500508b300a27a3f phy09:D attached:
> sas: ex 500508b300a27a3f phy10:S attached: 500508b300a27a2f
> sas: ex 500508b300a27a3f phy11:S attached: 500508b300a27a2f
> sas: task finished with resp:0x0, stat:0x89
> sas: sas_discover_sata() for device 500508b300a27a2c at 500508b300a27a2f:0xc returned 0xff06
> kobject_add failed for port-2:0:12 with -EEXIST, don't try to register things with the same name in the same directory.

So this is an interesting expander setup within the enclosure. There are two expanders (500508b300a27a2f + 500508b300a27a3f) interconnected via a two wide link (0,1 - 10,11 (T-S)) with a four wide link back to the 94xx HBA (4,5,6,7 - 0,1,2,3). My guess is that 500508b300a27a2f:12 is virtual and contains a SES target. That leaves SAS disks on 500508b300a27a3f:0, 500508b300a27a3f:2 and 500508b300a27a3f:4

The pain starts immediately after the sas transport layer tries to process those expander SMP DISCOVER responses. The trace seems to suggest the device at 500508b300a27a2f:12 is SATA: extremely unlikely.

Mark, do you have an LSI MPT Fusion SAS HBA handy? If so you might connect the enclosure to it, get smp_utils and do something like:
  # modprobe mptctl
  # smp_discover -p 12 -s 0x500508b300a27a2f /dev/mptctl
and post the output.

BTW Darrick, SATA disks connected to an expander usually get SAS addresses like expander_sas_address + n where n is small. The device attached to 500508b300a27a2f:12 is in that region: 500508b300a27a2c

Doug Gilbert
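The "expander_sas_address + n" remark can be checked with a line of arithmetic on the addresses as they are printed in the trace. The sketch below is only that; the `window` of 16 is an arbitrary assumption of mine for what counts as "small", and the helper names are invented. Being close to the expander's own address does not by itself say whether the device is a SATA disk or something inside the expander.

```python
def offset_from_expander(expander_addr: str, device_addr: str) -> int:
    """Signed distance of a device SAS address from the expander's address."""
    return int(device_addr, 16) - int(expander_addr, 16)


def near_expander(expander_addr: str, device_addr: str, window: int = 16) -> bool:
    """True if the device address lies within `window` of the expander's."""
    return abs(offset_from_expander(expander_addr, device_addr)) <= window


EXPANDER = "500508b300a27a2f"

# Device reported on phy 12 of that expander in the trace
print(offset_from_expander(EXPANDER, "500508b300a27a2c"))
print(near_expander(EXPANDER, "500508b300a27a2c"))

# One of the SAS disks, address as printed in the trace
print(near_expander("500508b300a27a3f", "5000c595f8b5"))
```

The phy 12 device sits three below the expander's address, while the disk addresses are nowhere near either expander.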
Re: Disable SCSI-Reservation at the driver level ?
James Bottomley wrote:
> On Sun, 2006-11-26 at 17:31 +0100, roland wrote:
>> VMWare ESX refuses to create VMFS Filesystem on SATA disk, attached to a
>> onBoard SAS controller (lsi1068). When i raid1 two SATA disks, it works, if
>> i use a single SATA disk, the controller seems to expose the disk
>> differently to the operating system and creation of a VMFS fails due to
>> missing ability to issue SCSI reservation command.
>
> There's no SCSI fix for this ... the SAT has no translation for the SCSI
> reservation commands, largely because there's no corresponding ATA
> equivalent and even for SCSI devices they may fail anyway. The
> application should cope with such a failure, so in this case it's the
> application that needs fixing.

SAT originally did have persistent reservations; they were dropped and are back on the agenda for SAT-2.

A SAT layer (such as the one found in libata) can do more than just translate commands, it may also emulate SCSI commands. And PERSISTENT RESERVE IN and OUT (and maybe the older RESERVE and RELEASE) would be very good candidates for emulation. To do this however libata would need to be a lot more transport aware than it is now.

To do such an emulation a SAT layer needs to know:
  a) whether it has full control over the SATA device (i.e. there is no other path to it) and failing that, it has some other mechanism such as affiliations in SAS with SMP available to control them
  b) the identity of the initiator (port) asking for the reservation.

If libata could do this it would add a lot of value over and above simple command translation.

Doug Gilbert
Re: aic94xx panic on module load
Mark Haverkamp wrote:
> On Tue, 2006-11-28 at 13:46 -0800, Mark Haverkamp wrote:
>> On Tue, 2006-11-28 at 13:44 -0500, Douglas Gilbert wrote:
> [ ... ]
>
> I don't know if this helps, but I found the verbose option. Here is a
> little debug output.
>
> ./smp_discover -v -p 12 -s 0x500508b300a27a2f /dev/mptctl
>     Discover request:
>       40 10 00 02 00 00 00 00 00 0c 00 00 00 00 00 00
>     send_req_mpt: subvalue=0
>       SAS address=0x500508b300a27a2f
>     mptctl two scatter gather list interface
>     IOCStatus=0x1
>     IOCStatus=0x1 IOCLogInfo=0xA27A2F SASStatus=0x0
> smp_send_req failed, res=-1

Mark,
The iocnum may be greater than 0 (especially if you have other MPT Fusion HBAs (any kind) in that computer). Have a look in the log around where the mptsas driver is registered and look for the string "ioc". The number following "ioc" is what you need. If you find "ioc3" then try:
  ./smp_discover -p 12 -s 0x500508b300a27a2f /dev/mptctl,3

To verify that expander SAS address, try this:
  find /sys -name 'sas_device:expander*'
cd to any directory found and try "cat sas_address".

BTW there is a smp_utils version 0.92 beta at http://www.torque.net/sg ; the error messages are somewhat clearer.

Doug Gilbert
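For anyone puzzling over the hex in the verbose output, the sketch below rebuilds the 16 bytes shown for phy 12. The first two bytes are the SMP request frame type (0x40) and the DISCOVER function code (0x10), and byte 9 carries the phy identifier. Byte 3 is simply copied from the dump; my assumption is that it is the request length field from the SAS-2 drafts. This is an illustration of the frame layout, not code from smp_utils, and `smp_discover_request` is a name invented for the example.

```python
def smp_discover_request(phy_id: int) -> bytes:
    """Rebuild the 16-byte DISCOVER request seen in the verbose dump."""
    if not 0 <= phy_id <= 0xFF:
        raise ValueError("phy identifier must fit in one byte")
    req = bytearray(16)
    req[0] = 0x40    # SMP frame type: request
    req[1] = 0x10    # SMP function: DISCOVER
    req[3] = 0x02    # copied from the dump (assumed: request length field)
    req[9] = phy_id  # phy identifier being queried
    return bytes(req)


frame = smp_discover_request(12)
print(" ".join(f"{byte:02x}" for byte in frame))
```

The request itself looks well formed, which fits the suggestion above that the failure is in reaching the right ioc rather than in the frame.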