Re: [PATCH] RESEND: SCSI, libata: add support for ATA_16 commands to libata ATAPI devices

2007-02-02 Thread Christoph Hellwig
On Thu, Feb 01, 2007 at 03:21:25PM -0500, Douglas Gilbert wrote: My point is that the linux block layer and scsi mid level should get out of the business of putting hard limits in place. Why? Both of them never have been in the business of putting hard limits in place. We currently have a hard

[PATCH 2.6.19.2] SCSI sd: udev accessing an uninitialized scsi_disk results in a crash

2007-02-02 Thread Nagendra Singh Tomar
Hi, sd_probe() calls class_device_add() even before initializing the sdkp->device variable. class_device_add() eventually results in the user mode udev program to be called. udev program can read the allow_restart attribute of the newly created scsi device. This is resulting in a
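
The ordering problem described in this report can be illustrated with a small user-space sketch. It is not the sd.c code or the submitted patch: the fake_disk and fake_device types, publish(), and both probe functions are hypothetical stand-ins, and the NULL check in the reader exists only so the sketch reports the problem instead of crashing the way the kernel would.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real sd.c types. */
struct fake_device { int allow_restart; };
struct fake_disk   { struct fake_device *device; };

/* Plays the role of the attribute read that udev triggers once the
 * object is visible. Returns -1 where the real kernel would oops. */
static int read_allow_restart(const struct fake_disk *d)
{
    if (d->device == NULL)
        return -1;
    return d->device->allow_restart;
}

/* Plays the role of class_device_add(): the object becomes visible and
 * a reader may run immediately. */
static int publish(const struct fake_disk *d)
{
    return read_allow_restart(d);
}

/* Ordering described in the report: publish first, initialize afterwards. */
int probe_publish_first(struct fake_device *dev)
{
    struct fake_disk d = { NULL };
    int seen = publish(&d);   /* reader runs before the pointer is set */
    d.device = dev;
    return seen;
}

/* Safe ordering: fully initialize the object, then publish it. */
int probe_init_first(struct fake_device *dev)
{
    struct fake_disk d = { NULL };
    d.device = dev;
    return publish(&d);
}
```

In the sketch the reader always runs inside publish(), so the bad ordering fails every time. In the kernel the reader is a separate user-space process, which is why the window is narrow and the crash is intermittent.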

Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135

2007-02-02 Thread Andrew Vasquez
On Thu, 01 Feb 2007, Andrew Morton wrote: On Mon, 22 Jan 2007 10:35:10 -0800 Andrew Vasquez [EMAIL PROTECTED] wrote: Basically what is happening from the FC side is the initiator executes a simple dt test: dt of=/dev/raw/raw1 procs=8 oncerr=abort bs=16k disable=stats limit=2m

[PATCH 0/2] : definion, code, and use of new SCSI ML host status DID_COND_REQUEUE

2007-02-02 Thread Edward Goggin
Patch Set Summary: 1 Define new SCSI ML host status DID_COND_REQUEUE and add its handling code to scsi_decide_disposition. Scsi_decide_disposition returns ADD_TO_MLQUEUE IFF not REQ_FAILFAST. 2 Return DID_COND_REQUEUE instead of DID_BUS_BUSY host status
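
The rule stated in item 1 of this summary can be sketched as a small decision function. This is only an illustration of the stated rule, not the patch itself: the SKETCH_ names and their values are placeholders, the failfast flag stands in for REQ_FAILFAST, and returning the "success" disposition in the fail-fast case is an assumption, since the summary only says when ADD_TO_MLQUEUE is returned.

```c
#include <stdbool.h>

/* Placeholder names and values; the real constants live in the kernel headers. */
enum sketch_disposition { SKETCH_SUCCESS, SKETCH_ADD_TO_MLQUEUE };
enum sketch_host_status { SKETCH_DID_OK, SKETCH_DID_COND_REQUEUE };

/* Requeue a conditionally-requeueable command only when the request is
 * not marked fail-fast. Every other case is handed back to the caller
 * (assumed here to be the "success" disposition). */
enum sketch_disposition sketch_decide(enum sketch_host_status hs, bool failfast)
{
    if (hs == SKETCH_DID_COND_REQUEUE && !failfast)
        return SKETCH_ADD_TO_MLQUEUE;
    return SKETCH_SUCCESS;
}
```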

[PATCH 1/2] scsi: add DID_COND_REQUEUE SCSI ML host status

2007-02-02 Thread Edward Goggin
From: Ed Goggin [EMAIL PROTECTED] Add new SCSI ML host status DID_COND_REQUEUE for ADD_TO_MLQUEUE IFF not REQ_FAILFAST. Signed-off-by: Ed Goggin [EMAIL PROTECTED] diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 2dce06a..d8e884b 100644 --- a/drivers/scsi/scsi_error.c +++

Re: [PATCH 0/2] : definion, code, and use of new SCSI ML host status DID_COND_REQUEUE

2007-02-02 Thread James Bottomley
On Fri, 2007-02-02 at 17:04 -0500, Edward Goggin wrote: Patch Set Summary: 1 Define new SCSI ML host status DID_COND_REQUEUE and add its handling code to scsi_decide_disposition. Scsi_decide_disposition returns ADD_TO_MLQUEUE IFF not REQ_FAILFAST. 2 Return

Re: [PATCH 0/2] : definion, code, and use of new SCSI ML host status DID_COND_REQUEUE

2007-02-02 Thread Edward Goggin
On Fri, 2007-02-02 at 16:54 -0600, James Bottomley wrote: On Fri, 2007-02-02 at 17:04 -0500, Edward Goggin wrote: Patch Set Summary: 1 Define new SCSI ML host status DID_COND_REQUEUE and add its handling code to scsi_decide_disposition. Scsi_decide_disposition returns

Re: [PATCH 0/2] : definion, code, and use of new SCSI ML host status DID_COND_REQUEUE

2007-02-02 Thread James Bottomley
On Fri, 2007-02-02 at 18:11 -0500, Edward Goggin wrote: That solution doesn't work for the RDAC/MPP driver as the BUSY status handler retries indefinitely. We need a solution which works for both a bare metal host running RDAC/MPP which for this use case, wants to get control over the failed

Re: [PATCH 0/2] : definion, code, and use of new SCSI ML host status DID_COND_REQUEUE

2007-02-02 Thread Edward Goggin
On Fri, 2007-02-02 at 17:18 -0600, James Bottomley wrote: On Fri, 2007-02-02 at 18:11 -0500, Edward Goggin wrote: That solution doesn't work for the RDAC/MPP driver as the BUSY status handler retries indefinitely. We need a solution which works for both a bare metal host running RDAC/MPP

RE: [PATCH 0/2] : definion, code, and use of new SCSI ML hoststatus DID_COND_REQUEUE

2007-02-02 Thread Moore, Eric
On Friday, February 02, 2007 4:34 PM, Edward Goggin wrote: On Fri, 2007-02-02 at 17:18 -0600, James Bottomley wrote: On Fri, 2007-02-02 at 18:11 -0500, Edward Goggin wrote: That solution doesn't work for the RDAC/MPP driver as the BUSY status handler retries indefinitely. We need a

RE: [PATCH 0/2] : definion, code, and use of new SCSI ML hoststatus DID_COND_REQUEUE

2007-02-02 Thread Qi, Yanling
-----Original Message----- From: [EMAIL PROTECTED] [mailto:linux-scsi- [EMAIL PROTECTED] On Behalf Of Edward Goggin Sent: Friday, February 02, 2007 5:34 PM To: James Bottomley Cc: linux-scsi@vger.kernel.org; Moore, Eric I think I see your argument ... retries for BUSY and all other

RE: [PATCH 0/2] : definion, code, and use of new SCSI ML hoststatus DID_COND_REQUEUE

2007-02-02 Thread James Bottomley
On Fri, 2007-02-02 at 17:31 -0700, Qi, Yanling wrote: [Qi, Yanling] The following code in the scsi_lib.c will be enough for RDAC/MPP. BTW, why do we do wait_for = (cmd->allowed + 1) * cmd->timeout_per_command. With a sd request, the wait_for will be 180 seconds. (SD_MAX_RETRIES=5 and
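
The 180-second figure quoted here follows directly from the formula. A minimal sketch of the arithmetic, assuming a per-command timeout of 30 seconds (the snippet is cut off before naming it; 30 is simply the value that yields 180 with five allowed retries) and working in seconds rather than whatever time unit the kernel uses internally. The function name is hypothetical.

```c
/* wait_for = (allowed + 1) * timeout_per_command, as quoted in the message.
 * Units are whatever timeout_per_command is expressed in (seconds here). */
unsigned int sketch_wait_for(unsigned int allowed, unsigned int timeout_per_command)
{
    return (allowed + 1) * timeout_per_command;
}
```

With allowed = 5 and a 30-second timeout this gives (5 + 1) * 30 = 180 seconds.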

RE: [PATCH 0/2] : definion, code, and use of new SCSI ML hoststatus DID_COND_REQUEUE

2007-02-02 Thread Qi, Yanling
I agree with Eric. RDAC/MPP will survive with the straight SAM BUSY status. --yanling -----Original Message----- From: Moore, Eric Sent: Friday, February 02, 2007 5:59 PM To: Edward Goggin; James Bottomley; Qi, Yanling Cc: linux-scsi@vger.kernel.org Subject: RE: [PATCH 0/2] : definion,

Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135

2007-02-02 Thread Andrew Morton
On Fri, 2 Feb 2007 12:56:30 -0800 Andrew Vasquez [EMAIL PROTECTED] wrote: On Thu, 01 Feb 2007, Andrew Morton wrote: On Mon, 22 Jan 2007 10:35:10 -0800 Andrew Vasquez [EMAIL PROTECTED] wrote: Basically what is happening from the FC side is the initiator executes a simple dt test:

Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135

2007-02-02 Thread Andrew Morton
On Fri, 2 Feb 2007 12:56:30 -0800 Andrew Vasquez [EMAIL PROTECTED] wrote: dt of=/dev/raw/raw1 procs=8 oncerr=abort bs=16k disable=stats limit=2m passes=100 pattern=iot dlimit=2048 What is this mysterious dt command, btw?

Re: [PATCH 2.6.19.2] SCSI sd: udev accessing an uninitialized scsi_disk results in a crash

2007-02-02 Thread Andrew Morton
On Fri, 2 Feb 2007 17:34:56 +0530 Nagendra Singh Tomar [EMAIL PROTECTED] wrote: Hi, sd_probe() calls class_device_add() even before initializing the sdkp->device variable. class_device_add() eventually results in the user mode udev program to be called. udev program can read the

Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135

2007-02-02 Thread Randy Dunlap
On Fri, 2 Feb 2007 16:25:41 -0800 Andrew Morton wrote: On Fri, 2 Feb 2007 12:56:30 -0800 Andrew Vasquez [EMAIL PROTECTED] wrote: dt of=/dev/raw/raw1 procs=8 oncerr=abort bs=16k disable=stats limit=2m passes=100 pattern=iot dlimit=2048 What is this mysterious dt

Re: [PATCH 2.6.19.2] SCSI sd: udev accessing an uninitialized scsi_disk results in a crash

2007-02-02 Thread Greg KH
On Fri, Feb 02, 2007 at 05:19:24PM -0800, Andrew Morton wrote: On Fri, 2 Feb 2007 17:34:56 +0530 Nagendra Singh Tomar [EMAIL PROTECTED] wrote: Hi, sd_probe() calls class_device_add() even before initializing the sdkp->device variable. class_device_add() eventually results in the user

Re: [PATCH 2.6.19.2] SCSI sd: udev accessing an uninitialized scsi_disk results in a crash

2007-02-02 Thread James Bottomley
On Fri, 2007-02-02 at 17:56 -0800, Greg KH wrote: Thanks - I'll queue this up for 2.6.20 also. No objection from me, as long as James says this is ok. I wonder why we haven't noticed this in the past? Because the race is so small ... I'll queue it in the rc-fixes tree .. I have three

Re: [BUG] Unable to handle kernel NULL pointer dereference...as_move_to_dispatch+0x11/0x135

2007-02-02 Thread Andrew Vasquez
On Fri, 02 Feb 2007, Randy Dunlap wrote: On Fri, 2 Feb 2007 16:25:41 -0800 Andrew Morton wrote: On Fri, 2 Feb 2007 12:56:30 -0800 Andrew Vasquez [EMAIL PROTECTED] wrote: dt of=/dev/raw/raw1 procs=8 oncerr=abort bs=16k disable=stats limit=2m passes=100 pattern=iot

Re: AIC7xxx on 2.6.18

2007-02-02 Thread Mark Rustad
On Feb 2, 2007, at 6:42 PM, Wakko Warner wrote: Andrew Morton wrote: Yes, getting the oops traces will help, thanks. And confirmation on a more recent kernel would be good. Here's what I get. I used netconsole so whatever was logged prior to it starting was lost. The PC is a

Re: AIC7xxx on 2.6.18

2007-02-02 Thread Sean Bruno
On Fri, 2007-02-02 at 23:12 -0600, Mark Rustad wrote: On Feb 2, 2007, at 6:42 PM, Wakko Warner wrote: Andrew Morton wrote: Yes, getting the oops traces will help, thanks. And confirmation on a more recent kernel would be good. Here's what I get. I used netconsole so whatever was

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Ric Wheeler
James Bottomley wrote: On Thu, 2007-02-01 at 15:02 -0500, Mark Lord wrote: I believe you made the first change in response to my prodding at the time, when libata was not returning valid sense data (no LBA) for media errors. The SCSI EH handling of that was rather poor at the time, and so

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Alan
The interesting point of this question is about the typical pattern of IO errors. On a read, it is safe to assume that you will have issues with some bounded numbers of adjacent sectors. Which in theory you can get by asking the drive for the real sector size from the ATA7 info. (We ought

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Alan
your system requirements are, what the system is trying to do (i.e., when trying to recover a failing but not dead yet disk, IO errors should be as quick as possible and we should choose an IO scheduler that does not combine IO's). If this is the right strategy for disk recovery for a

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Mark Lord
Alan wrote: If this is the right strategy for disk recovery for a given type of device then this ought to be an automatic strategy. Most end users will not have the knowledge to frob about in sysfs, and if the bad sector hits at the wrong moment a sensible automatic recovery strategy is going

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Ric Wheeler
James Bottomley wrote: On Fri, 2007-02-02 at 14:42 +0000, Alan wrote: The interesting point of this question is about the typical pattern of IO errors. On a read, it is safe to assume that you will have issues with some bounded numbers of adjacent sectors. Which in theory you

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Matt Mackall
On Fri, Feb 02, 2007 at 11:06:19AM -0500, Mark Lord wrote: Alan wrote: If this is the right strategy for disk recovery for a given type of device then this ought to be an automatic strategy. Most end users will not have the knowledge to frob about in sysfs, and if the bad sector hits at the

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Douglas Gilbert
Alan wrote: The interesting point of this question is about the typical pattern of IO errors. On a read, it is safe to assume that you will have issues with some bounded numbers of adjacent sectors. Which in theory you can get by asking the drive for the real sector size from the ATA7

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Mark Lord
Matt Mackall wrote: .. Also worth considering is that spending minutes trying to reread damaged sectors is likely to accelerate your death spiral. More data may be recoverable if you give up quickly in a first pass, then go back and manually retry damaged bits with smaller I/Os. All good
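
The two-pass approach suggested in the quoted text (give up quickly on a first pass, then go back over the damaged regions with smaller I/Os) can be sketched against a simulated disk. Everything here is hypothetical: sim_read() stands in for a real read that fails if any sector in the range is bad, the bad-sector map is supplied by the caller, and chunk must be greater than zero. It is a sketch of the idea, not of any existing recovery tool.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simulated read: succeeds only if no sector in [start, start + len) is bad. */
static bool sim_read(const bool *bad, size_t start, size_t len)
{
    for (size_t i = start; i < start + len; i++)
        if (bad[i])
            return false;
    return true;
}

/* Returns the number of sectors recovered and sets ok[i] for each one. */
size_t two_pass_recover(const bool *bad, bool *ok, size_t nsect, size_t chunk)
{
    size_t recovered = 0;

    /* Pass 1: large reads with no retries; just record what succeeded. */
    for (size_t s = 0; s < nsect; s += chunk) {
        size_t len = (nsect - s < chunk) ? nsect - s : chunk;
        bool good = sim_read(bad, s, len);
        for (size_t i = s; i < s + len; i++) {
            ok[i] = good;
            if (good)
                recovered++;
        }
    }

    /* Pass 2: revisit only the failed regions, one sector at a time. */
    for (size_t i = 0; i < nsect; i++) {
        if (!ok[i] && sim_read(bad, i, 1)) {
            ok[i] = true;
            recovered++;
        }
    }
    return recovered;
}
```

With eight sectors, one bad sector at index 5, and a chunk size of 4, the first pass recovers sectors 0 to 3 and gives up on the second chunk; the second pass then picks up sectors 4, 6 and 7, leaving only the genuinely bad sector unread.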

Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

2007-02-02 Thread Matt Mackall
On Fri, Feb 02, 2007 at 05:58:04PM -0500, Mark Lord wrote: Matt Mackall wrote: .. Also worth considering is that spending minutes trying to reread damaged sectors is likely to accelerate your death spiral. More data may be recoverable if you give up quickly in a first pass, then go back and