Re: [systemd-devel] RFC: one more time: SCSI device identification

2021-04-27 Thread Martin Wilck
On Tue, 2021-04-27 at 16:14 -0400, Ewan D. Milne wrote:
> On Mon, 2021-04-26 at 13:16 +, Martin Wilck wrote:
> > On Mon, 2021-04-26 at 13:14 +0200, Ulrich Windl wrote:
> > > > > 
> > > > 
> > > > While we're at it, I'd like to mention another issue: WWID
> > > > changes.
> > > > 
> > > > This is a big problem for multipathd. The gist is that the
> > > > device
> > > > identification attributes in sysfs only change after rescanning
> > > > the
> > > > device. Thus if a user changes LUN assignments on a storage
> > > > system,
> > > > it can happen that a direct INQUIRY returns a different WWID as
> > > > in
> > > > sysfs, which is fatal. If we plan to rely more on sysfs for
> > > > device
> > > > identification in the future, the problem gets worse. 
> > > 
> > > I think many devices rely on the fact that they are identified by
> > > Vendor/model/serial_nr, because in most professional SAN storage
> > > systems you
> > > can pre-set the serial number to a custom value; so if you want a
> > > new
> > > disk
> > > (maybe a snapshot) to be compatible with the old one, just assign
> > > the
> > > same
> > > serial number. I guess that's the idea behind.
> > 
> > What you are saying sounds dangerous to me. If a snapshot has the
> > same
> > WWID as the device it's a snapshot of, it must not be exposed to
> > any
> > host(s) at the same time with its origin, otherwise the host may
> > happily combine it with the origin into one multipath map, and data
> > corruption will almost certainly result. 
> > 
> > My argument is about how the host is supposed to deal with a WWID
> > change if it happens. Here, "WWID change" means that a given
> > H:C:T:L
> > suddenly exposes different device designators than it used to,
> > while
> > this device is in use by a host. Here, too, data corruption is
> > imminent, and can happen in a blink of an eye. To avoid this,
> > several
> > things are needed:
> > 
> >  1) the host needs to get notified about the change (likely by an
> > UA
> > of
> > some sort)
> >  2) the kernel needs to react to the notification immediately, e.g.
> > by
> > blocking IO to the device,
> 
> There's no way to do that, in principle.  Because there could be
> other I/Os in flight.  You might (somehow) avoid retrying an I/O
> that got a UA until you figured out if something changed, but other
> I/Os can already have been sent to the target, or issued before you
> get to look at the status.

Right. But in practice, a WWID change will hardly happen under full IO
load. The storage side will probably have to block IO while this
happens, at least for a short time period. So blocking and quiescing
the queue upon an UA might still work, most of the time. Even if we
were too late already, the sooner we stop the queue, the better.

The current algorithm in multipath-tools needs to detect a path going
down and being reinstated. The time interval during which a WWID change
will go unnoticed is one or more path checker intervals, typically on
the order of 5-30 seconds. If we could decrease this interval to a sub-
second or even millisecond range by blocking the queue in the kernel
quickly, we'd have made a big step forward.

Regards
Martin

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] RFC: one more time: SCSI device identification

2021-04-27 Thread Ewan D. Milne
On Tue, 2021-04-27 at 20:33 +, Martin Wilck wrote:
> On Tue, 2021-04-27 at 16:14 -0400, Ewan D. Milne wrote:
> > 
> > There's no way to do that, in principle.  Because there could be
> > other I/Os in flight.  You might (somehow) avoid retrying an I/O
> > that got a UA until you figured out if something changed, but other
> > I/Os can already have been sent to the target, or issued before you
> > get to look at the status.
> 
> Right. But in practice, a WWID change will hardly happen under full
> IO
> load. The storage side will probably have to block IO while this
> happens, at least for a short time period. So blocking and quiescing
> the queue upon an UA might still work, most of the time. Even if we
> were too late already, the sooner we stop the queue, the better.
> 
> The current algorithm in multipath-tools needs to detect a path going
> down and being reinstated. The time interval during which a WWID
> change
> will go unnoticed is one or more path checker intervals, typically on
> the order of 5-30 seconds. If we could decrease this interval to a
> sub-
> second or even millisecond range by blocking the queue in the kernel
> quickly, we'd have made a big step forward.

Yes, and in many situations this may help.  But in the general case
we can't protect against a storage array misconfiguration,
where something like this can happen.  So I worry about people
believing the host software will protect them against a mistake,
when we can't really do that.

All it takes is one I/O (a discard) to make a thorough mess of the LUN.

-Ewan

> 
> Regards
> Martin
> 



Re: [systemd-devel] RFC: one more time: SCSI device identification

2021-04-27 Thread Ewan D. Milne
On Mon, 2021-04-26 at 13:16 +, Martin Wilck wrote:
> On Mon, 2021-04-26 at 13:14 +0200, Ulrich Windl wrote:
> > > > 
> > > 
> > > While we're at it, I'd like to mention another issue: WWID
> > > changes.
> > > 
> > > This is a big problem for multipathd. The gist is that the device
> > > identification attributes in sysfs only change after rescanning
> > > the
> > > device. Thus if a user changes LUN assignments on a storage
> > > system,
> > > it can happen that a direct INQUIRY returns a different WWID as
> > > in
> > > sysfs, which is fatal. If we plan to rely more on sysfs for
> > > device
> > > identification in the future, the problem gets worse. 
> > 
> > I think many devices rely on the fact that they are identified by
> > Vendor/model/serial_nr, because in most professional SAN storage
> > systems you
> > can pre-set the serial number to a custom value; so if you want a
> > new
> > disk
> > (maybe a snapshot) to be compatible with the old one, just assign
> > the
> > same
> > serial number. I guess that's the idea behind.
> 
> What you are saying sounds dangerous to me. If a snapshot has the
> same
> WWID as the device it's a snapshot of, it must not be exposed to any
> host(s) at the same time with its origin, otherwise the host may
> happily combine it with the origin into one multipath map, and data
> corruption will almost certainly result. 
> 
> My argument is about how the host is supposed to deal with a WWID
> change if it happens. Here, "WWID change" means that a given H:C:T:L
> suddenly exposes different device designators than it used to, while
> this device is in use by a host. Here, too, data corruption is
> imminent, and can happen in a blink of an eye. To avoid this, several
> things are needed:
> 
>  1) the host needs to get notified about the change (likely by an UA
> of
> some sort)
>  2) the kernel needs to react to the notification immediately, e.g.
> by
> blocking IO to the device,

There's no way to do that, in principle.  Because there could be
other I/Os in flight.  You might (somehow) avoid retrying an I/O
that got a UA until you figured out if something changed, but other
I/Os can already have been sent to the target, or issued before you
get to look at the status.

-Ewan

>  3) userspace tooling such as udev or multipathd need to figure out
> how
> to deal with the situation cleanly, and eventually unblock
> it.
> 
> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
> afaics.
> 
> Martin
> 



Re: [systemd-devel] Antw: [EXT] Re: [dm-devel] RFC: one more time: SCSI device identification

2021-04-27 Thread Ewan D. Milne
On Tue, 2021-04-27 at 12:52 +0200, Ulrich Windl wrote:
> > > > Hannes Reinecke wrote on 27.04.2021 at 10:21
> > > > in message
> 
> <2a6903e4-ff2b-67d5-e772-6971db844...@suse.de>:
> > On 4/27/21 10:10 AM, Martin Wilck wrote:
> > > On Tue, 2021-04-27 at 13:48 +1000, Erwin van Londen wrote:
> > > > > 
> > > > > Wrt 1), we can only hope that it's the case. But 2) and 3)
> > > > > need work,
> > > > > afaics.
> > > > > 
> > > > 
> > > > In my view the WWID should never change. 
> > > 
> > > In an ideal world, perhaps not. But in the dm-multipath realm, we
> > > know
> > > that WWID changes can happen with certain storage arrays. See 
> > > 
> > > https://listman.redhat.com/archives/dm-devel/2021-February/msg00116.html
> > >  
> > > and follow-ups, for example.
> > > 
> > 
> > And it's actually something which might happen quite easily.
> > The storage array can unmap a LUN, delete it, create a new one, and
> > map
> > that one to the same LUN number as the old one.
> > If we didn't do I/O during that interval, upon the next I/O we will
> > be
> > getting the dreaded 'Power-On/Reset' sense code.
> > _And nothing else_, due to the arcane rules for sense code
> > generation in
> > SAM.
> > But we end up with a completely different device.
> > 
> > The only way out of it is to do a rescan for every POR sense code,
> > and
> > disable the device eg via DID_NO_CONNECT whenever we find that the
> > identification has changed. We already have a copy of the original
> > VPD
> > page 0x83 at hand, so that should be reasonably easy.
> 
> I don't know the depth of the SCSI or FC protocol, but storage
> systems
> typically signal such events, maybe either via some unit attention or
> some FC
> event. Older kernels logged that there was a change, but a manual
> SCSI bus scan
> is needed, while newer kernels find new devices "automagically" for
> some
> products. The HP EVA 6000 series worked that way, a 3PAR StoreServ
> 8000 series
> also seems to work that way, but not Pure Storage X70 R3. For the
> latter you
> need something like a FC LIP to make the kernel detect the new
> devices (LUNs).
> I'm unsure where the problem is, but in principle the kernel can be
> notified...

There has to be some command on which the Unit Attention status
can be returned.  (In a multipath configuration, the path checker
commands may do this).  In absence of a command, there is no
asynchronous mechanism in SCSI to report the status.

On FC, events related to discovering a remote port will trigger a rescan.
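
To make the point about Unit Attention concrete: the host only learns
about the condition from the sense data returned with some command. A
minimal sketch of the check follows, assuming raw sense bytes are
already at hand (for example from a path checker's TEST UNIT READY).
The names parse_sense and needs_reidentification, and the choice of
which additional sense codes count as "go and re-check the identity",
are assumptions made for this illustration, not existing kernel or
multipath-tools code.

```python
UNIT_ATTENTION = 0x06  # SCSI sense key

# Additional (ASC, ASCQ) pairs treated here as "identity may have changed".
# This selection is an assumption for illustration.
OTHER_REIDENTIFY_CODES = {
    (0x3F, 0x03),  # INQUIRY DATA HAS CHANGED
    (0x3F, 0x0E),  # REPORTED LUNS DATA HAS CHANGED
}


def parse_sense(sense: bytes):
    """Return (sense_key, asc, ascq) from fixed or descriptor format sense data."""
    if not sense:
        raise ValueError("empty sense buffer")
    response_code = sense[0] & 0x7F
    if response_code in (0x70, 0x71):  # fixed format
        if len(sense) < 14:
            raise ValueError("fixed format sense data too short")
        return sense[2] & 0x0F, sense[12], sense[13]
    if response_code in (0x72, 0x73):  # descriptor format
        if len(sense) < 4:
            raise ValueError("descriptor format sense data too short")
        return sense[1] & 0x0F, sense[2], sense[3]
    raise ValueError("unknown sense response code 0x%02x" % response_code)


def needs_reidentification(sense: bytes) -> bool:
    """True for a Unit Attention that suggests the LUN should be re-identified."""
    key, asc, ascq = parse_sense(sense)
    if key != UNIT_ATTENTION:
        return False
    # ASC 0x29 covers the power-on/reset family of conditions.
    return asc == 0x29 or (asc, ascq) in OTHER_REIDENTIFY_CODES
```

Note that this only classifies a status that has already come back; it
does nothing about other I/Os that are in flight at that moment.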

-Ewan

> 
> > 
> > I had a rather lengthy discussion with Fred Knight @ NetApp about
> > Power-On/Reset handling, what with him complaining that we don't
> > handle
> > it correctly. So this really is something we should be looking
> > into,
> > even independently of multipathing.
> > 
> > But actually I like the idea from Martin Petersen to expose the
> > parsed
> > VPD identifiers to sysfs; that would allow us to drop sg_inq
> > completely
> > from the udev rules.
> 
> Talking of VPDs: Somewhere in the last 12 years (within SLES 11) there
> was a
> kernel change regarding trailing blanks in VPD data. That change blew
> up
> several configurations being unable to re-recognize the devices. In
> one case
> the software even had bound a license to a specific device with
> serial number,
> and that software found "new" devices while missing the "old" ones...
> 
> Regards,
> Ulrich
> 
> > 
> > Cheers,
> > 
> > Hannes
> > -- 
> > Dr. Hannes Reinecke Kernel Storage Architect
> > h...@suse.de   +49 911 74053
> > 688
> > SUSE Software Solutions Germany GmbH, 90409 Nürnberg
> > GF: F. Imendörffer, HRB 36809 (AG Nürnberg)
> 
> 
> 



Re: [systemd-devel] A start job is running for /dev/gpt-auto-root

2021-04-27 Thread Arian van Putten
Hi all,

In case anybody is interested: after some heavy debugging I was able to
bisect the issue to the following pull request (which was backported to v247):
https://github.com/systemd/systemd/pull/18802#issuecomment-827707662
I left comments there on what broke.

Thanks for the help either way!


On Sat, Apr 24, 2021 at 5:03 PM Arian van Putten 
wrote:

> Dear list,
>
> I've been working on trying to integrate systemd-gpt-auto and
> systemd-cryptsetup into NixOS's stage-1 init.
>
> Everything was working great on 246; but  when I updated from  kernel 5.4
> to 5.10 and from systemd 246 to 247, the setup stopped working.
>
> After entering my LUKS password, the boot keeps hanging with:
>
> A start job is running for /dev/gpt-auto-root
>
> and eventually fails with:
>
> [ TIME ] Timed out waiting for device /dev/gpt-auto-root.
>
> Interestingly, both /dev/gpt-auto-root and /dev/gpt-auto-root-luks
> exist in the /dev tree so the udev rules are all fired correctly; but for
> some reason they are not propagating to the device unit.
> Also the btrfs kernel module didn't seem to get automatically loaded for
> some reason.
>
> For the failed boot on 247; this seems to be the interesting bit of the
> log:
> Apr 24 12:18:46 localhost systemd-udevd[209]: dm-0: Adding watch on
> '/dev/dm-0'
> Apr 24 12:18:46 localhost systemd-udevd[209]: dm-0: sd-device: Created db
> file '/run/udev/data/b254:0' for '/devices/virtual/block/dm-0'
> Apr 24 12:18:46 localhost systemd-udevd[209]: dm-0: Device (SEQNUM=1720,
> ACTION=change) processed
> Apr 24 12:18:46 localhost systemd-udevd[209]: dm-0: sd-device-monitor:
> Passed 947 byte to netlink monitor
> Apr 24 12:18:46 localhost systemd[1]: systemd-journald.service: Received
> EPOLLHUP on stored fd 46 (stored), closing.
> Apr 24 12:18:46 localhost systemd[1]: Received SIGCHLD from PID 192
> (systemd-cryptse).
> Apr 24 12:18:46 localhost systemd[1]: Child 192 (systemd-cryptse) died
> (code=exited, status=0/SUCCESS)
> Apr 24 12:18:46 localhost systemd[1]: systemd-cryptsetup@root.service:
> Child 192 belongs to systemd-cryptsetup@root.service.
> Apr 24 12:18:46 localhost systemd[1]: systemd-cryptsetup@root.service:
> Main process exited, code=exited, status=0/SUCCESS
> Apr 24 12:18:46 localhost systemd[1]: systemd-cryptsetup@root.service:
> Changed start -> exited
> Apr 24 12:18:46 localhost systemd[1]: systemd-cryptsetup@root.service:
> Job 59 systemd-cryptsetup@root.service/start finished, result=done
> Apr 24 12:18:46 localhost systemd[1]: Finished Cryptography Setup for root.
> Apr 24 12:18:46 localhost audit[1]: SERVICE_START pid=1 uid=0
> auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-cryptsetup@root
> comm="systemd"
> exe="/nix/store/ixpwj1cxl4rp2qbyn0s2h4k5zj731q7c-systemd-247.6/lib/systemd/systemd"
> hostname=? addr=? terminal=? res=success'
> Apr 24 12:18:46 localhost systemd[1]: systemd-cryptsetup@root.service:
> Control group is empty.
> Apr 24 12:18:46 localhost systemd[1]: blockdev@dev-mapper-root.target
> changed dead -> active
> Apr 24 12:18:46 localhost systemd[1]: blockdev@dev-mapper-root.target:
> Job 72 blockdev@dev-mapper-root.target/start finished, result=done
> Apr 24 12:18:46 localhost systemd[1]: Reached target Block Device
> Preparation for /dev/mapper/root.
> Apr 24 12:18:46 localhost kernel: kauditd_printk_skb: 4 callbacks
> suppressed
> Apr 24 12:18:46 localhost kernel: audit: type=1130
> audit(1619266726.241:15): pid=1 uid=0 auid=4294967295 ses=4294967295
> subj=kernel msg='unit=systemd-cryptsetup@root comm="systemd"
> exe="/nix/store/ixpwj1cxl4rp2qbyn0s2h4k5zj731q7c-systemd-247.6/lib/systemd/systemd"
> hostname=? addr=? terminal=? res=success'
> Apr 24 12:18:50 localhost systemd-udevd[148]: Cleanup idle workers
> Apr 24 12:18:50 localhost systemd-udevd[207]: Unload module index
> Apr 24 12:18:50 localhost systemd-udevd[209]: Unload module index
> Apr 24 12:18:50 localhost systemd-udevd[205]: Unload module index
> Apr 24 12:18:50 localhost systemd-udevd[206]: Unload module index
> Apr 24 12:18:50 localhost systemd-udevd[207]: Unloaded link configuration
> context.
> Apr 24 12:18:50 localhost systemd-udevd[209]: Unloaded link configuration
> context.
> Apr 24 12:18:50 localhost systemd-udevd[205]: Unloaded link configuration
> context.
> Apr 24 12:18:50 localhost systemd-udevd[206]: Unloaded link configuration
> context.
> Apr 24 12:18:50 localhost systemd-udevd[148]: Worker [205] exited
> Apr 24 12:18:50 localhost systemd-udevd[148]: Worker [206] exited
> Apr 24 12:18:50 localhost systemd-udevd[148]: Worker [207] exited
> Apr 24 12:18:50 localhost systemd-udevd[148]: Worker [209] exited
> Apr 24 12:19:48 localhost systemd[1]: dev-gpt\x2dauto\x2droot.device: Job
> dev-gpt\x2dauto\x2droot.device/start timed out.
> Apr 24 12:19:48 localhost systemd[1]: dev-gpt\x2dauto\x2droot.device: Job
> 56 dev-gpt\x2dauto\x2droot.device/start finished, result=timeout
> Apr 24 12:19:48 localhost systemd[1]: Timed out waiting for device
> 

Re: [systemd-devel] [dm-devel] RFC: one more time: SCSI device identification

2021-04-27 Thread Martin Wilck
On Tue, 2021-04-27 at 13:48 +1000, Erwin van Londen wrote:
> > 
> > Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
> > afaics.
> > 
> In my view the WWID should never change. 

In an ideal world, perhaps not. But in the dm-multipath realm, we know
that WWID changes can happen with certain storage arrays. See 
https://listman.redhat.com/archives/dm-devel/2021-February/msg00116.html 
and follow-ups, for example.

Regards,
Martin

-- 
Dr. Martin Wilck , Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer




[systemd-devel] Antw: [EXT] Re: [dm-devel] RFC: one more time: SCSI device identification

2021-04-27 Thread Ulrich Windl
>>> Hannes Reinecke wrote on 27.04.2021 at 10:21 in message
<2a6903e4-ff2b-67d5-e772-6971db844...@suse.de>:
> On 4/27/21 10:10 AM, Martin Wilck wrote:
>> On Tue, 2021-04-27 at 13:48 +1000, Erwin van Londen wrote:
>>>>
>>>> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
>>>> afaics.
>>>>
>>> In my view the WWID should never change. 
>> 
>> In an ideal world, perhaps not. But in the dm-multipath realm, we know
>> that WWID changes can happen with certain storage arrays. See 
>> https://listman.redhat.com/archives/dm-devel/2021-February/msg00116.html 
>> and follow-ups, for example.
>> 
> And it's actually something which might happen quite easily.
> The storage array can unmap a LUN, delete it, create a new one, and map
> that one to the same LUN number as the old one.
> If we didn't do I/O during that interval, upon the next I/O we will be
> getting the dreaded 'Power-On/Reset' sense code.
> _And nothing else_, due to the arcane rules for sense code generation in
> SAM.
> But we end up with a completely different device.
> 
> The only way out of it is to do a rescan for every POR sense code, and
> disable the device eg via DID_NO_CONNECT whenever we find that the
> identification has changed. We already have a copy of the original VPD
> page 0x83 at hand, so that should be reasonably easy.

I don't know the SCSI or FC protocols in depth, but storage systems
typically signal such events, maybe either via some unit attention or some FC
event. Older kernels logged that there was a change, but a manual SCSI bus scan
was needed, while newer kernels find new devices "automagically" for some
products. The HP EVA 6000 series worked that way, a 3PAR StoreServ 8000 series
also seems to work that way, but not Pure Storage X70 R3. For the latter you
need something like an FC LIP to make the kernel detect the new devices (LUNs).
I'm unsure where the problem is, but in principle the kernel can be
notified...

> 
> I had a rather lengthy discussion with Fred Knight @ NetApp about
> Power-On/Reset handling, what with him complaining that we don't handle
> it correctly. So this really is something we should be looking into,
> even independently of multipathing.
> 
> But actually I like the idea from Martin Petersen to expose the parsed
> VPD identifiers to sysfs; that would allow us to drop sg_inq completely
> from the udev rules.

Talking of VPDs: Sometime in the last 12 years (within SLES 11) there was a
kernel change regarding trailing blanks in VPD data. That change broke
several configurations, which could no longer recognize their devices. In one
case the software had even bound a license to a specific device by serial
number, and that software found "new" devices while missing the "old" ones...

Regards,
Ulrich

> 
> Cheers,
> 
> Hannes
> -- 
> Dr. Hannes Reinecke   Kernel Storage Architect
> h...@suse.de +49 911 74053 688
> SUSE Software Solutions Germany GmbH, 90409 Nürnberg
> GF: F. Imendörffer, HRB 36809 (AG Nürnberg)





Re: [systemd-devel] [dm-devel] RFC: one more time: SCSI device identification

2021-04-27 Thread Hannes Reinecke
On 4/27/21 10:10 AM, Martin Wilck wrote:
> On Tue, 2021-04-27 at 13:48 +1000, Erwin van Londen wrote:
>>>
>>> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
>>> afaics.
>>>
>> In my view the WWID should never change. 
> 
> In an ideal world, perhaps not. But in the dm-multipath realm, we know
> that WWID changes can happen with certain storage arrays. See 
> https://listman.redhat.com/archives/dm-devel/2021-February/msg00116.html 
> and follow-ups, for example.
> 
And it's actually something which might happen quite easily.
The storage array can unmap a LUN, delete it, create a new one, and map
that one to the same LUN number as the old one.
If we didn't do I/O during that interval, upon the next I/O we will be
getting the dreaded 'Power-On/Reset' sense code.
_And nothing else_, due to the arcane rules for sense code generation in
SAM.
But we end up with a completely different device.

The only way out of it is to do a rescan for every POR sense code, and
disable the device, e.g. via DID_NO_CONNECT, whenever we find that the
identification has changed. We already have a copy of the original VPD
page 0x83 at hand, so that should be reasonably easy.
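
The comparison step could look roughly like the sketch below. It is a
userspace illustration only, not kernel code: it parses two raw copies
of VPD page 0x83 (the cached one and a freshly read one) into
designators and reports a change if the designators associated with
the addressed logical unit differ. The names Designator, parse_vpd83,
lu_identity and identity_changed are made up for this example, and how
the fresh copy is obtained after the rescan is left out.

```python
from typing import FrozenSet, List, NamedTuple


class Designator(NamedTuple):
    code_set: int     # low nibble of designator byte 0
    association: int  # bits 5-4 of byte 1 (0 = addressed logical unit)
    dtype: int        # low nibble of byte 1 (e.g. 1 = T10 vendor ID, 3 = NAA)
    value: bytes


def parse_vpd83(page: bytes) -> List[Designator]:
    """Split a raw Device Identification VPD page into its designators."""
    if len(page) < 4 or page[1] != 0x83:
        raise ValueError("not a Device Identification VPD page")
    # Bytes 2-3 hold the page length, not counting the 4-byte header.
    end = min(len(page), 4 + int.from_bytes(page[2:4], "big"))
    designators = []
    off = 4
    while off + 4 <= end:
        length = page[off + 3]
        if off + 4 + length > end:
            raise ValueError("truncated designator")
        designators.append(Designator(
            code_set=page[off] & 0x0F,
            association=(page[off + 1] >> 4) & 0x03,
            dtype=page[off + 1] & 0x0F,
            value=bytes(page[off + 4:off + 4 + length]),
        ))
        off += 4 + length
    return designators


def lu_identity(page: bytes) -> FrozenSet[Designator]:
    """Designators that identify the addressed logical unit itself."""
    return frozenset(d for d in parse_vpd83(page) if d.association == 0)


def identity_changed(cached_page: bytes, fresh_page: bytes) -> bool:
    """True if the logical unit designators differ between the two pages."""
    return lu_identity(cached_page) != lu_identity(fresh_page)
```

Comparing only the logical unit designators, rather than the whole
page byte for byte, is a design choice of this sketch: target port
designators legitimately differ between paths to the same LUN.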

I had a rather lengthy discussion with Fred Knight @ NetApp about
Power-On/Reset handling, what with him complaining that we don't handle
it correctly. So this really is something we should be looking into,
even independently of multipathing.

But actually I like the idea from Martin Petersen to expose the parsed
VPD identifiers to sysfs; that would allow us to drop sg_inq completely
from the udev rules.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke Kernel Storage Architect
h...@suse.de   +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)


[systemd-devel] Antw: [EXT] Re: [dm-devel] RFC: one more time: SCSI device identification

2021-04-27 Thread Ulrich Windl
>>> Erwin van Londen wrote on 27.04.2021 at 05:48 in message:

> 
> On Mon, 2021-04-26 at 13:16 +, Martin Wilck wrote:
>> On Mon, 2021-04-26 at 13:14 +0200, Ulrich Windl wrote:
>> > > > 
>> > > 
>> > > While we're at it, I'd like to mention another issue: WWID
>> > > changes.
>> > > 
>> > > This is a big problem for multipathd. The gist is that the device
>> > > identification attributes in sysfs only change after rescanning
>> > > the
>> > > device. Thus if a user changes LUN assignments on a storage
>> > > system,
>> > > it can happen that a direct INQUIRY returns a different WWID as
>> > > in
>> > > sysfs, which is fatal. If we plan to rely more on sysfs for
>> > > device
>> > > identification in the future, the problem gets worse. 
>> > 
>> > I think many devices rely on the fact that they are identified by
>> > Vendor/model/serial_nr, because in most professional SAN storage
>> > systems you
>> > can pre-set the serial number to a custom value; so if you want a
>> > new
>> > disk
>> > (maybe a snapshot) to be compatible with the old one, just assign
>> > the
>> > same
>> > serial number. I guess that's the idea behind.
>> 
>> What you are saying sounds dangerous to me. If a snapshot has the
>> same
>> WWID as the device it's a snapshot of, it must not be exposed to any
>> host(s) at the same time with its origin, otherwise the host may
>> happily combine it with the origin into one multipath map, and data
>> corruption will almost certainly result. 
>> 
>> My argument is about how the host is supposed to deal with a WWID
>> change if it happens. Here, "WWID change" means that a given H:C:T:L
>> suddenly exposes different device designators than it used to, while
>> this device is in use by a host. Here, too, data corruption is
>> imminent, and can happen in a blink of an eye. To avoid this, several
>> things are needed:
>> 
>>  1) the host needs to get notified about the change (likely by an UA
>> of
>> some sort)
>>  2) the kernel needs to react to the notification immediately, e.g.
>> by
>> blocking IO to the device,
>>  3) userspace tooling such as udev or multipathd need to figure out
>> how
>> to deal with the situation cleanly, and eventually unblock
>> it.
>> 
>> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
>> afaics.
>> 
> In my view the WWID should never change. If a snapshot is created it
> should obtain a new WWID. An example out of a Hitachi array is
> 
> Device Identification VPD page:
> Addressed logical unit:
> designator type: T10 vendor identification, code set: ASCII
> vendor id: HITACHI 
> vendor specific: 50403B050709
> designator type: NAA, code set: Binary
> 0x60060e80123b050050403b050709
> 
> The majority of the naa wwid is tied to the storage subsystem and
> identifies the vendor oui, model, serial etc. The last 4 in this
> example indicate the LDEV ID (Sorry mainframe heritage here..). When a
> snapshot is taken these 4 will change as a new LDEV ID is assigned to
> the snapshot. This sort of behaviour should be consistent across all
> storage vendors imho.
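
As a rough illustration of the layout described above, the sketch
below splits the NAA designator from the example, exactly as it is
printed in the quoted output, into the NAA type nibble, the 24-bit
vendor OUI that follows it, and the trailing four hex digits. The
function name split_naa and the "tail4" key are made up for this
illustration, and reading the last four digits as the LDEV ID follows
the description above for this particular array; it is not a general
rule for NAA identifiers.

```python
def split_naa(designator_hex: str) -> dict:
    """Split an NAA type 5 or 6 designator, given as a hex string, into fields."""
    digits = designator_hex.lower()
    if digits.startswith("0x"):
        digits = digits[2:]
    int(digits, 16)  # raises ValueError if the string is not valid hex
    naa = int(digits[0], 16)
    if naa not in (5, 6):
        raise ValueError("only NAA types 5 and 6 carry an OUI in this position")
    return {
        "naa": naa,            # first nibble: NAA type
        "oui": digits[1:7],    # next 24 bits: IEEE company ID of the vendor
        "vendor_specific": digits[7:],
        "tail4": digits[-4:],  # LDEV ID in the quoted example (array-specific)
    }


fields = split_naa("0x60060e80123b050050403b050709")
print(fields["naa"], fields["oui"], fields["tail4"])
```

With this split, two LUNs from the same array would share the leading
fields and differ only in the vendor-specific part, which is the
behaviour described above for a snapshot.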

It's getting off-topic, but in automatic disaster recovery scenarios one might
want the "new disk" (maybe a snapshot of the original disk before it got
corrupted) to look like the "old disk", so that the OS can boot without needing
any adjustments.

Regards,
Ulrich

> 
>> Martin
>> 



