On Mon, 2 September 2013 02:31:29 -0700, Christoph Hellwig wrote:
I would defintively love enabling it globally.
Agreed. I carry a private patch based on an older version of Hannes'
patchset doing just that. Istr there was some problem exactly once on
a VM with a virtualized hba. We hit
On Tue, 27 August 2013 20:34:43 -0400, Steven Rostedt wrote:
On Tue, 27 Aug 2013 18:41:47 -0400
Jörn Engel jo...@logfs.org wrote:
without knowing in advance which device it will be
I have to trace all devices, because I don't know the interesting one
ahead of time. And after
On Tue, 27 August 2013 22:09:19 -0700, Nicholas A. Bellinger wrote:
On Tue, 2013-08-27 at 18:17 -0400, Jörn Engel wrote:
Here is a fun patch in an early state. Essentially I want to trace
scsi commands, which has already been done long ago. The problem I
have is that I care about all
On Wed, 28 August 2013 14:45:14 +0200, Bart Van Assche wrote:
Isn't this facility something that overlaps with what Wireshark can
already do today ? Wireshark can already capture SCSI traces for
several transport types (iSCSI, USB, ...). For other transports,
e.g. FC, I'd prefer to see
On Tue, 27 August 2013 19:54:22 -0400, Steven Rostedt wrote:
On Tue, 27 Aug 2013 18:03:25 -0400
Jörn Engel jo...@logfs.org wrote:
Here is a fun patch in an early state. Essentially I want to trace
scsi commands, which has already been done long ago. The problem I
have is that I care
On Wed, 24 July 2013 23:42:22 +0300, Baruch Even wrote:
On Sat, Jul 20, 2013 at 1:11 AM, Jörn Engel jo...@logfs.org wrote:
On Fri, 19 July 2013 18:06:59 -0400, Jörn Engel wrote:
The handling of broadcast primitives involves
_scsih_block_io_all_device(), which does what the name implies
On Sat, 20 July 2013 15:53:16 +0800, Xitao Cao wrote:
Thanks for your comment. Do I need to update it and resend?
Yes, please.
Jörn
--
When people work hard for you for a pat on the back, you've got
to give them that pat.
-- Robert Heinlein
--
To unsubscribe from this list: send the line
On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote:
There is a race when open sg with O_EXCL flag. Also a race may happen between
sg_open and sg_remove.
Changes from v4:
* [3/4] use ERR_PTR series instead of adding another parameter in sg_add_sfp
* [4/4] fix conflict for cherry-pick
On Wed, 17 July 2013 23:34:03 +0800, Vaughan Cao wrote:
Date: Wed, 17 Jul 2013 23:34:03 +0800
From: Vaughan Cao vaughan@oracle.com
To: jo...@logfs.org
Cc: dgilb...@interlog.com, jbottom...@parallels.com,
linux-scsi@vger.kernel.org, linux-ker...@vger.kernel.org,
On Wed, 17 July 2013 23:34:06 +0800, Vaughan Cao wrote:
Date: Wed, 17 Jul 2013 23:34:06 +0800
From: Vaughan Cao vaughan@oracle.com
To: jo...@logfs.org
Cc: dgilb...@interlog.com, jbottom...@parallels.com,
linux-scsi@vger.kernel.org, linux-ker...@vger.kernel.org,
On Wed, 17 July 2013 23:34:04 +0800, Vaughan Cao wrote:
Date: Wed, 17 Jul 2013 23:34:04 +0800
From: Vaughan Cao vaughan@oracle.com
To: jo...@logfs.org
Cc: dgilb...@interlog.com, jbottom...@parallels.com,
linux-scsi@vger.kernel.org, linux-ker...@vger.kernel.org,
On Wed, 17 July 2013 23:34:05 +0800, Vaughan Cao wrote:
-static Sg_fd *sg_add_sfp(Sg_device * sdp, int dev);
+static Sg_fd *sg_add_sfp(Sg_device * sdp, int dev, int * reason);
You can use ERR_PTR and friends instead of adding another parameter.
static void sg_remove_sfp(struct kref *);
The handling of broadcast primitives involves
_scsih_block_io_all_device(), which does what the name implies. I have
observed cases with 60s of blocking io on all devices, caused by a
single bad device. The downsides of this code are obvious, while the
upsides are more elusive.
Signed-off-by:
On Fri, 19 July 2013 18:06:59 -0400, Jörn Engel wrote:
The handling of broadcast primitives involves
_scsih_block_io_all_device(), which does what the name implies. I have
observed cases with 60s of blocking io on all devices, caused by a
single bad device. The downsides of this code
On Sun, 7 July 2013 01:24:44 +0800, vaughan wrote:
Do you agree that I use a per device spin_lock 'sfd_lock' to protect
sfds list and leave sg_index_lock
only protect the global sg device lookup? I think it's reasonable
for concurrency.
That bit looks fine to me. I don't think it matters
On Mon, 8 July 2013 03:53:49 +0800, vaughan wrote:
Use rwsem to aid opens. Exclusive open has to get write lock and
non-exclusive open should get read lock.
Replace global sg_open_exclusive_lock with a per device lock - sfd_lock.
Since sfds list is now protected by the lock owned by the
Sorry about replying so late.
On Mon, 17 June 2013 21:10:53 +0800, vaughan wrote:
Rewrite the last patch.
Add a new field 'toopen' in sg_device to count ongoing sg_open's. By checking
both 'toopen' and 'exclude' marks when do exclusive open, old race conditions
can be avoided.
Replace
On Tue, 2 July 2013 06:37:05 +, James Bottomley wrote:
I don't understand what you're getting at. In a dual HBA situation,
whether the second HBA is implicated or not depends on configuration and
what the first HBA is doing. If it's just passively lost device state,
then the second HBA
On Tue, 2 July 2013 16:33:40 +, James Bottomley wrote:
On Tue, 2013-07-02 at 10:58 -0400, Jörn Engel wrote:
On Tue, 2 July 2013 06:37:05 +, James Bottomley wrote:
I don't understand what you're getting at. In a dual HBA situation,
whether the second HBA is implicated
On Mon, 1 July 2013 08:50:48 +0200, Hannes Reinecke wrote:
This patchset implements a new 'eh_deadline' attribute to the
SCSI host. It will limit the overall SCSI EH runtime by a given
timeout. If the timeout is reached all intermediate EH steps
will be skipped and host reset will be
On Mon, 1 July 2013 19:23:25 +, James Bottomley wrote:
On Mon, 2013-07-01 at 13:44 -0400, Jörn Engel wrote:
If a single device is bad, don't ever do a host
reset.
This isn't a tenable position. Sometimes a device looks bad because the
host state for it has gone insane. At that point
On Tue, 11 June 2013 08:18:51 +0200, Hannes Reinecke wrote:
On 06/11/2013 01:24 AM, Jörn Engel wrote:
On Mon, 10 June 2013 11:19:16 -0400, Jörn Engel wrote:
I don't care too much whether we use per-command work items or a
single system-global thread.
Actually, I do care. We have
On Mon, 10 June 2013 11:00:49 +0200, Hannes Reinecke wrote:
On 06/10/2013 10:20 AM, Christoph Hellwig wrote:
Why can't we use a work item per command? Linking things into a list
just to queue it up to workqueues missed half of the point of the
workqueue infrastructure.
Hmm. I felt
On Mon, 10 June 2013 09:40:57 +0200, Hannes Reinecke wrote:
Signed-off-by: Hannes Reinecke h...@suse.de
---
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 11 +++
1 file changed, 11 insertions(+)
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
On Mon, 10 June 2013 09:40:52 +0200, Hannes Reinecke wrote:
+
+ spin_lock_irqsave(sdev-list_lock, flags);
+ if (list_empty(sdev-eh_abort_list))
+ kick_worker = 1;
+ list_add(scmd-eh_entry, sdev-eh_abort_list);
+ spin_unlock_irqrestore(sdev-list_lock, flags);
+
On Fri, 7 June 2013 14:54:16 +0800, Ren Mingxin wrote:
On 06/06/2013 05:43 PM, Hannes Reinecke wrote:
This approach may not work for some LLDDs as you said, but I wonder
whether SAS is applicable(whether there will be later patches for
SAS).
We have tested a derived patch on SAS and it did
On Thu, 6 June 2013 11:43:54 +0200, Hannes Reinecke wrote:
+ spin_lock_irqsave(sdev-list_lock, flags);
+ list_for_each_entry_safe(scmd, tmp, sdev-eh_abort_list, eh_entry) {
+ list_del_init(scmd-eh_entry);
+ spin_unlock_irqrestore(sdev-list_lock, flags);
I
On Fri, 7 June 2013 21:34:16 +, Nicholas A. Bellinger wrote:
-static int transport_cmd_check_stop(struct se_cmd *cmd, bool
remove_from_lists)
+static int transport_cmd_check_stop(struct se_cmd *cmd, bool
remove_from_lists,
+ bool write_pending)
...
-
On Thu, 6 June 2013 11:43:52 +0200, Hannes Reinecke wrote:
The 'eh_entry' list might be used even before scsi_softirq_done()
is called. Hence we should rather initialize it together with
the other eh-related variables.
Signed-off-by: Hannes Reinecke h...@suse.de
Tested-by: Joern Engel
On Thu, 6 June 2013 11:43:53 +0200, Hannes Reinecke wrote:
Add a 'BLK_EH_SCHEDULED' return code for blk-timeout to indicate
that a delayed error recovery has been initiated.
Signed-off-by: Hannes Reinecke h...@suse.de
Since the patch I tested diverged from your patches 2-4, this is only
On Thu, 6 June 2013 11:43:54 +0200, Hannes Reinecke wrote:
When a command runs into a timeout we need to send an 'ABORT TASK'
TMF. This is typically done by the 'eh_abort_handler' LLDD callback.
Conceptually, however, this function is a normal SCSI command, so
there is no need to enter the
On Thu, 6 June 2013 22:39:14 +0200, Hannes Reinecke wrote:
+ spin_unlock_irqrestore(sdev-list_lock, flags);
+ SCSI_LOG_ERROR_RECOVERY(3,
+ scmd_printk(KERN_INFO, scmd,
+ aborting command %p\n, scmd));
+ rtn =
On Wed, 5 June 2013 17:18:33 +0800, vaughan wrote:
Check and set sdp-exclude should be atomic when set in sg_open().
The patch is line-wrapped. More importantly, it doesn't seem to do
what your description indicates it should do. And lastly, does this
fix a bug, possibly even one you have a
On Thu, 6 June 2013 00:16:45 +0800, vaughan wrote:
于 2013年06月05日 21:27, Jörn Engel 写道:
On Wed, 5 June 2013 17:18:33 +0800, vaughan wrote:
Check and set sdp-exclude should be atomic when set in sg_open().
The patch is line-wrapped. More importantly, it doesn't seem to do
It's shorter than
On Sat, 25 May 2013 11:55:58 +0200, Hannes Reinecke wrote:
It should be possible to move the code into scsi_lib and just have
small hooks for the individual transports to use this.
We have done something a bit tasteless and unconditionally enabled
your new error handling for all scsi drivers.
On Fri, 24 May 2013 11:50:47 +0200, Hannes Reinecke wrote:
The 'eh_entry' list might be used even before scsi_softirq_done()
is called. Hence we should rather initialize it together with
the other eh-related variables.
Signed-off-by: Hannes Reinecke h...@suse.de
---
On Thu, 24 January 2013 08:51:20 +0100, Bjørn Mork wrote:
How about the copy of this code in drivers/scsi/mpt3sas/mpt3sas_scsih.c?
Is that safe, or does it need fixing as well?
Well spotted, that appears to suffer from the same ailment. Will cook
up a second patch for that.
Are there any
Changes since v1:
- Also fixes mpt3sas, thanks to Bjørn Mork.
- Adds a missing put_sas_device to _scsih_probe_boot_devices
- Added a newline
- Moved a kref_get outside the spinlock
I noticed this one when list_del was called with poisoned list
pointers, but the real problem is a double-free (and
Changes since v1:
- Adds another missing put_sas_device to _scsih_probe_boot_devices,
proving the need for more coffee before sending out patches.
I noticed this one when list_del was called with poisoned list
pointers, but the real problem is a double-free (and a use-after-free
just before
I noticed this one when list_del was called with poisoned list
pointers, but the real problem is a double-free (and a use-after-free
just before that).
Both _scsih_probe_boot_devices() and _scsih_sas_device_add() put the
sas_device onto a list, thereby giving up control. Next they call
On Tue, 24 July 2012 09:01:41 +0400, James Bottomley wrote:
On Mon, 2012-07-23 at 15:24 -0400, Jörn Engel wrote:
On Mon, 23 July 2012 23:45:55 +0400, James Bottomley wrote:
Have you checked this with the patches in scsi-misc? There's a series
of patches in there that alters the way
At least two slightly different paths can lead to a NULL pointer
dereference in scsi_prep_state_check and have been hit in practice.
1.
Call Trace:
[812f8c4b] scsi_setup_blk_pc_cmnd+0x2b/0x170
[81305108] sd_prep_fn+0x568/0xdd0
[8121c594] blk_peek_request+0xb4/0x240
On Mon, 23 July 2012 23:45:55 +0400, James Bottomley wrote:
On Mon, 2012-07-23 at 13:32 -0400, Jörn Engel wrote:
At least two slightly different paths can lead to a NULL pointer
dereference in scsi_prep_state_check and have been hit in practice.
Have you checked this with the patches
43 matches
Mail list logo