Re: [systemd-devel] [dm-devel] RFC: one more time: SCSI device identification

2021-04-30 Thread Ewan D. Milne
On Wed, 2021-04-28 at 10:09 +1000, Erwin van Londen wrote:
> 
> On Tue, 2021-04-27 at 16:41 -0400, Ewan D. Milne wrote:
> > On Tue, 2021-04-27 at 20:33 +, Martin Wilck wrote:
> > > On Tue, 2021-04-27 at 16:14 -0400, Ewan D. Milne wrote:
> > > > There's no way to do that, in principle.  Because there could be
> > > > other I/Os in flight.  You might (somehow) avoid retrying an I/O
> > > > that got a UA until you figured out if something changed, but other
> > > > I/Os can already have been sent to the target, or issued before you
> > > > get to look at the status.
> 
> If something happens on a storage side where a lun gets its
> attributes changed (any, doesn't matter which one) a UA should be
> sent. Also all outstanding IO's on that lun should be returning an
> Abort as it can no longer warrant the validity of any IO due to these
> changes. Especially when parameters are involved like reservations
> (PR's) etc. If that does not happen from an array side all bets are
> off as the only way to be able to get back in business is to start
> from scratch.

Perhaps an array might abort I/Os it has received in the Device Server
when something changes.  I have no idea if most or any arrays actually
do that.

But, what about I/O that has already been queued from the host to
the host bus adapter?  I don't see how we can abort those I/Os
properly.  Most high-performance HBAs have a queue of commands and a
queue of responses, there could be lots of commands queued before
we manage to notice an interesting status.  And AFAIK there is no
conditional mechanism that could hold them off (and, they could be
in-flight on the wire anyway).

I get what you are saying about what SAM describes, I just don't see
how we can guarantee we don't send any further commands after the
status with the UA is sent back, before we can understand what happened.
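
To make the queueing argument concrete, here is a toy model of a host
that keeps an HBA command queue full. It is only a sketch: the function
name, the in-order completion and the numbers are assumptions made up
for the illustration, not how any real HBA or the SCSI midlayer works.
It returns the commands that were already posted at the moment the
completion carrying the UA is looked at.

```python
from collections import deque


def in_flight_when_ua_seen(total_cmds, queue_depth, ua_cmd):
    """Return the commands already posted to the (hypothetical) HBA
    queue when the host processes the completion of command `ua_cmd`,
    the one that came back with a unit attention.

    Simplifications: commands complete in submission order and the
    host refills the queue after every completion.
    """
    posted = deque()      # commands handed to the HBA, not yet completed
    next_cmd = 0
    while True:
        # A busy initiator keeps the queue as full as it can.
        while next_cmd < total_cmds and len(posted) < queue_depth:
            posted.append(next_cmd)
            next_cmd += 1
        if not posted:
            return []     # the UA command was never issued
        done = posted.popleft()
        if done == ua_cmd:
            # Everything still in `posted` was sent before the host
            # had any chance to react to the UA.
            return list(posted)


exposed = in_flight_when_ua_seen(total_cmds=100, queue_depth=32, ua_cmd=10)
print(len(exposed), "commands were already queued behind the UA")
```

In this model only a queue depth of 1 leaves nothing exposed; any
deeper queue has commands outstanding that the host cannot hold back.
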
-Ewan
> > > 
> > > Right. But in practice, a WWID change will hardly happen under full
> > > IO load. The storage side will probably have to block IO while this
> > > happens, at least for a short time period. So blocking and quiescing
> > > the queue upon an UA might still work, most of the time. Even if we
> > > were too late already, the sooner we stop the queue, the better.
> 
> I think in most cases when something happens on an array side you
> will see IO's being aborted. That might be a good time to start doing
> TUR's and if these come back OK do a new inquiry. From a host side
> there is only so much you can do.
> 
> > > The current algorithm in multipath-tools needs to detect a path going
> > > down and being reinstated. The time interval during which a WWID
> > > change will go unnoticed is one or more path checker intervals,
> > > typically on the order of 5-30 seconds. If we could decrease this
> > > interval to a sub-second or even millisecond range by blocking the
> > > queue in the kernel quickly, we'd have made a big step forward.
> > 
> > Yes, and in many situations this may help.  But in the general case
> > we can't protect against a storage array misconfiguration,
> > where something like this can happen.  So I worry about people
> > believing the host software will protect them against a mistake,
> > when we can't really do that.
> 
> My thought exactly. 
> 
> > All it takes is one I/O (a discard) to make a thorough mess of the
> > LUN.
> > 
> > -Ewan
> > 
> > > Regards
> > > Martin
> > > 
> > 
> > --
> > dm-devel mailing list
> > dm-de...@redhat.com
> > https://listman.redhat.com/mailman/listinfo/dm-devel
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] [dm-devel] RFC: one more time: SCSI device identification

2021-04-28 Thread Martin Wilck
On Wed, 2021-04-28 at 11:01 +1000, Erwin van Londen wrote:
> 
> The way out of this is to chuck the array in the bin. As I mentioned
> in one of my other emails, when a scenario happens as you described
> above and the array does not inform the initiator, it goes against the
> SAM-5 standard.
> 
> That standard shows:
> 5.14 Unit attention conditions
> 5.14.1 Unit attention conditions that are not coalesced
> Each logical unit shall establish a unit attention condition whenever
> one of the following events occurs:
>   a) a power on (see 6.3.1), hard reset (see 6.3.2), logical unit
>      reset (see 6.3.3), I_T nexus loss (see 6.3.4), or power loss
>      expected (see 6.3.5) occurs;
>   b) commands received on this I_T nexus have been cleared by a
>      command or a task management function associated with another
>      I_T nexus and the TAS bit was set to zero in the Control mode
>      page associated with this I_T nexus (see 5.6);
>   c) the portion of the logical unit inventory that consists of
>      administrative logical units and hierarchical logical units has
>      been changed (see 4.6.18.1); or
>   d) any other event requiring the attention of the SCSI initiator
>      device.
> 
> Especially the I_T nexus loss under a) is an important trigger.
> 
> ---
> 6.3.4 I_T nexus loss
> An I_T nexus loss is a SCSI device condition resulting from:
> 
>  a) a hard reset condition (see 6.3.2);
>  b) an I_T nexus loss event (e.g., logout) indicated by a Nexus Loss
> event notification (see 6.4);
>  c) indication that an I_T NEXUS RESET task management request (see
> 7.6) has been processed; or
>  d) an indication that a REMOVE I_T NEXUS command (see SPC-4) has
> been processed.
> An I_T nexus loss event is an indication from the SCSI transport
> protocol to the SAL that an I_T nexus no
> longer exists. SCSI transport protocols may define I_T nexus loss
> events.
> 
> Each SCSI transport protocol standard that defines I_T nexus loss
> events should specify when those events
> result in the delivery of a Nexus Loss event notification to the SAL.
> 
> The I_T nexus loss condition applies to both SCSI initiator devices
> and SCSI target devices.
> 
> If a SCSI target port detects an I_T nexus loss, then a Nexus Loss
> event notification shall be delivered to
> each logical unit to which the I_T nexus has access.
> 
> In response to an I_T nexus loss condition a logical unit shall take
> the following actions:
> a) abort all commands received on the I_T nexus as described in 5.6;
> b) abort all background third-party copy operations (see SPC-4) that
> are using the I_T nexus;
> c) terminate all task management functions received on the I_T nexus;
> d) clear all ACA conditions (see 5.9.5) associated with the I_T
> nexus;
> e) establish a unit attention condition for the SCSI initiator port
> associated with the I_T nexus (see 5.14
> and 6.2); and
> f) perform any additional functions required by the applicable
> command standards.
> ---
> 
> This does also mean that any underlying transport protocol issues,
> like on FC or TCP for iSCSI, will very often trigger aborted commands
> or UA's as well, which will be picked up by the kernel/respective
> drivers.

Thanks a lot. I'm not quite certain which of these paragraphs would
apply to the situation I had in mind (administrator remapping an
existing LUN on a storage array to a different volume). That scenario
wouldn't necessarily involve transport-level errors, or an I_T nexus
loss. 5.14.1 c) or d) might apply, is that what you meant?

Regards
Martin

-- 
Dr. Martin Wilck , Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer




Re: [systemd-devel] [dm-devel] RFC: one more time: SCSI device identification

2021-04-27 Thread Martin Wilck
On Tue, 2021-04-27 at 13:48 +1000, Erwin van Londen wrote:
> > 
> > Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
> > afaics.
> > 
> In my view the WWID should never change. 

In an ideal world, perhaps not. But in the dm-multipath realm, we know
that WWID changes can happen with certain storage arrays. See 
https://listman.redhat.com/archives/dm-devel/2021-February/msg00116.html 
and follow-ups, for example.

Regards,
Martin

-- 
Dr. Martin Wilck , Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer




Re: [systemd-devel] [dm-devel] RFC: one more time: SCSI device identification

2021-04-27 Thread Hannes Reinecke
On 4/27/21 10:10 AM, Martin Wilck wrote:
> On Tue, 2021-04-27 at 13:48 +1000, Erwin van Londen wrote:
>>>
>>> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
>>> afaics.
>>>
>> In my view the WWID should never change. 
> 
> In an ideal world, perhaps not. But in the dm-multipath realm, we know
> that WWID changes can happen with certain storage arrays. See 
> https://listman.redhat.com/archives/dm-devel/2021-February/msg00116.html 
> and follow-ups, for example.
> 
And it's actually something which might happen quite easily.
The storage array can unmap a LUN, delete it, create a new one, and map
that one into the same LUN number as the old one.
If we didn't do I/O during that interval, upon the next I/O we will be
getting the dreaded 'Power-On/Reset' sense code.
_And nothing else_, due to the arcane rules for sense code generation in
SAM.
But we end up with a completely different device.
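
For reference, a minimal sketch of how that sense code can be
recognised and told apart from the more specific unit attentions. The
byte offsets are those of the fixed and descriptor sense data formats
in SPC, and the ASC/ASCQ pairs are the standard ones (29h/00h,
3Fh/03h, 3Fh/0Eh). The function names and the returned labels are
made up for the example; this is not the kernel's sense handling.

```python
UNIT_ATTENTION = 0x6


def decode_sense(sense):
    """Return (sense_key, asc, ascq) from SCSI sense data."""
    if not sense:
        raise ValueError("empty sense buffer")
    response_code = sense[0] & 0x7F
    if response_code in (0x70, 0x71):      # fixed format
        if len(sense) < 14:
            raise ValueError("fixed-format sense data too short")
        return sense[2] & 0x0F, sense[12], sense[13]
    if response_code in (0x72, 0x73):      # descriptor format
        if len(sense) < 4:
            raise ValueError("descriptor-format sense data too short")
        return sense[1] & 0x0F, sense[2], sense[3]
    raise ValueError("unknown sense response code")


def classify_unit_attention(sense):
    """Map a sense buffer to a hypothetical label for illustration."""
    key, asc, ascq = decode_sense(sense)
    if key != UNIT_ATTENTION:
        return "not a unit attention"
    if (asc, ascq) == (0x29, 0x00):
        # POWER ON, RESET, OR BUS DEVICE RESET OCCURRED: says nothing
        # about whether the device behind the LUN is still the same.
        return "power-on/reset"
    if (asc, ascq) == (0x3F, 0x03):
        return "inquiry data changed"
    if (asc, ascq) == (0x3F, 0x0E):
        return "reported luns data changed"
    return "other unit attention"


# Example: a fixed-format sense buffer carrying UNIT ATTENTION, 29h/00h.
sense = bytearray(18)
sense[0] = 0x70
sense[2] = UNIT_ATTENTION
sense[7] = 0x0A
sense[12], sense[13] = 0x29, 0x00
print(classify_unit_attention(bytes(sense)))
```
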

The only way out of it is to do a rescan for every POR sense code, and
disable the device, e.g. via DID_NO_CONNECT, whenever we find that the
identification has changed. We already have a copy of the original VPD
page 0x83 at hand, so that should be reasonably easy.
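
As a rough userspace sketch of that comparison (not kernel code): parse
the designation descriptors of the cached and the freshly read VPD page
0x83 and compare the ones associated with the logical unit. The
descriptor layout follows SPC; the function names, the synthetic NAA
values and the returned strings are assumptions for the example, and
"DID_NO_CONNECT" appears only as a label.

```python
import struct


def parse_vpd83(page):
    """Return the set of (association, designator_type, designator)
    tuples found in a Device Identification VPD page (0x83)."""
    if len(page) < 4 or page[1] != 0x83:
        raise ValueError("not a Device Identification VPD page")
    (page_len,) = struct.unpack_from(">H", page, 2)
    end = min(4 + page_len, len(page))
    designators = set()
    off = 4
    while off + 4 <= end:
        association = (page[off + 1] >> 4) & 0x3
        dtype = page[off + 1] & 0xF
        dlen = page[off + 3]
        if off + 4 + dlen > end:
            break                      # truncated descriptor, stop parsing
        designators.add((association, dtype,
                         bytes(page[off + 4:off + 4 + dlen])))
        off += 4 + dlen
    return designators


def lun_identity(page):
    """Keep only designators associated with the logical unit (0)."""
    return {d for d in parse_vpd83(page) if d[0] == 0}


def identity_changed(cached_page, fresh_page):
    return lun_identity(cached_page) != lun_identity(fresh_page)


def on_power_on_reset(cached_page, fresh_page):
    """Hypothetical policy hook: what to do after a POR unit attention."""
    if identity_changed(cached_page, fresh_page):
        return "DID_NO_CONNECT"
    return "continue"


def make_page(naa):
    """Build a synthetic page 0x83 with one binary NAA designator."""
    desc = bytes([0x01, 0x03, 0x00, len(naa)]) + naa
    return bytes([0x00, 0x83]) + struct.pack(">H", len(desc)) + desc


old = make_page(bytes.fromhex("60000000000000000000000000000001"))
new = make_page(bytes.fromhex("60000000000000000000000000000002"))
print(on_power_on_reset(old, new))
```

Comparing sets rather than raw page bytes keeps the check independent
of the order in which the target returns its descriptors.
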

I had a rather lengthy discussion with Fred Knight @ NetApp about
Power-On/Reset handling, what with him complaining that we don't handle
it correctly. So this really is something we should be looking into,
even independently of multipathing.

But actually I like the idea from Martin Petersen to expose the parsed
VPD identifiers to sysfs; that would allow us to drop sg_inq completely
from the udev rules.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke Kernel Storage Architect
h...@suse.de   +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)