Re: [systemd-devel] [PATCH 2/2] Manage: Inform udev about device removal when stopping
On 02/16/2016 09:46 PM, NeilBrown wrote:
> On Wed, Feb 17 2016, Jes Sorensen wrote:
>
>> Hannes Reinecke writes:
>>> On 02/16/2016 07:03 PM, Sebastian Parschauer wrote:
>>>> On 16.02.2016 18:41, Jes Sorensen wrote:
>>>>> Sebastian Parschauer writes:
>>>>>> When stopping an MD device, then its device node /dev/mdX may still
>>>>>> exist afterwards or it is recreated by udev. The next open() call
>>>>>> can lead to creation of an inoperable MD device. The reason for
>>>>>> this is that a change event (KOBJ_CHANGE) is announced to udev.
>>>>>> So announce a removal event (KOBJ_REMOVE) to udev instead.
>>>>>>
>>>>>> This also overrides the change event sent by the kernel.
>>>>>>
>>>>>> Signed-off-by: Sebastian Parschauer
>>>>>> ---
>>>>>>  Manage.c | 6 +++---
>>>>>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/Manage.c b/Manage.c
>>>>>> index 7e1b94b..bc89764 100644
>>>>>> --- a/Manage.c
>>>>>> +++ b/Manage.c
>>>>>> @@ -494,13 +494,13 @@ done:
>>>>>>  		goto out;
>>>>>>  	}
>>>>>>  	/* prior to 2.6.28, KOBJ_CHANGE was not sent when an md array
>>>>>> -	 * was stopped, so We'll do it here just to be sure. Drop any
>>>>>> -	 * partitions as well...
>>>>>> +	 * was stopped, it should be KOBJ_REMOVE instead, so we set the
>>>>>> +	 * remove event here just to be sure. Drop any partitions as well...
>>>>>>  	 */
>>>>>>  	if (fd >= 0)
>>>>>>  		ioctl(fd, BLKRRPART, 0);
>>>>>>  	if (mdi)
>>>>>> -		sysfs_uevent(mdi, "change");
>>>>>> +		sysfs_uevent(mdi, "remove");
>>>>>
>>>>> I am a little concerned about this change. You assume the kernel and
>>>>> mdadm will be updated in sync, which is unlikely to happen. I believe
>>>>> you need to match the kernel version and send the corresponding event
>>>>> for this to work correctly?
>>>>
>>>> The worst thing that can happen is that the kernel sends the change
>>>> event after the remove event. Then it is the current situation again.
>>>> From my tests mdadm does enough other stuff in between. Udev is able to
>>>> handle receiving two remove events from my testing. Multiple mdadm
>>>> instances can't run in parallel anyway. So userspace around it needs
>>>> some serialization for it anyway. So also stopping an MD device and
>>>> assembling a new one with the same minor number shouldn't race.
>>>>
>>>> I still prefer this solution here. But if you decide to drop the udev
>>>> event sending in mdadm, then I'm also fine with that.
>>>>
>>> I strongly prefer removing the udev event generation altogether.
>>> As this appears to be a carry-over from older kernels, it looks to me
>>> as being an incomplete conversion:
>>> At one point udev introduced 'ONLINE' and 'OFFLINE' events, which were
>>> supposed to be used for this kind of scenario.
>>> (ONLINE being a companion to 'ADD', and 'OFFLINE' being the companion
>>> to 'DELETE'). However, later the 'ONLINE' got modified to 'CHANGE',
>>> and the 'OFFLINE' got dropped completely.
>>> Or that was the plan.
>>> So it looks as if the conversion to 'CHANGE' got applied to the
>>> 'OFFLINE' event, too.
>>> Hence I strongly recommend dropping it completely, and letting the kernel
>>> or the MD module decide if and when a uevent should be sent.
>>
>> I am totally fine with this, however we should make mdadm fail if run
>> against a pre-2.6.28 kernel then.
>>
>> Cheers,
>> Jes
>
> I would suggest protecting the
>
> 	if (fd >= 0)
> 		ioctl(fd, BLKRRPART, 0);
> 	if (mdi)
> 		sysfs_uevent(mdi, "change");
>
> code with
>
> 	if (get_linux_version() < 2006028)
>
> That should be completely safe - 2.6.28 and later do this (if needed).
>
+1.
Yes, this is the best solution.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Teamlead Storage & Networking
h...@suse.de                                   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-dev
Re: [systemd-devel] [PATCH 2/2] Manage: Inform udev about device removal when stopping
On Wed, Feb 17 2016, Jes Sorensen wrote:

> Hannes Reinecke writes:
>> On 02/16/2016 07:03 PM, Sebastian Parschauer wrote:
>>> On 16.02.2016 18:41, Jes Sorensen wrote:
>>>> Sebastian Parschauer writes:
>>>>> When stopping an MD device, then its device node /dev/mdX may still
>>>>> exist afterwards or it is recreated by udev. The next open() call
>>>>> can lead to creation of an inoperable MD device. The reason for
>>>>> this is that a change event (KOBJ_CHANGE) is announced to udev.
>>>>> So announce a removal event (KOBJ_REMOVE) to udev instead.
>>>>>
>>>>> This also overrides the change event sent by the kernel.
>>>>>
>>>>> Signed-off-by: Sebastian Parschauer
>>>>> ---
>>>>>  Manage.c | 6 +++---
>>>>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/Manage.c b/Manage.c
>>>>> index 7e1b94b..bc89764 100644
>>>>> --- a/Manage.c
>>>>> +++ b/Manage.c
>>>>> @@ -494,13 +494,13 @@ done:
>>>>>  		goto out;
>>>>>  	}
>>>>>  	/* prior to 2.6.28, KOBJ_CHANGE was not sent when an md array
>>>>> -	 * was stopped, so We'll do it here just to be sure. Drop any
>>>>> -	 * partitions as well...
>>>>> +	 * was stopped, it should be KOBJ_REMOVE instead, so we set the
>>>>> +	 * remove event here just to be sure. Drop any partitions as well...
>>>>>  	 */
>>>>>  	if (fd >= 0)
>>>>>  		ioctl(fd, BLKRRPART, 0);
>>>>>  	if (mdi)
>>>>> -		sysfs_uevent(mdi, "change");
>>>>> +		sysfs_uevent(mdi, "remove");
>>>>
>>>> I am a little concerned about this change. You assume the kernel and
>>>> mdadm will be updated in sync, which is unlikely to happen. I believe
>>>> you need to match the kernel version and send the corresponding event
>>>> for this to work correctly?
>>>
>>> The worst thing that can happen is that the kernel sends the change
>>> event after the remove event. Then it is the current situation again.
>>> From my tests mdadm does enough other stuff in between. Udev is able to
>>> handle receiving two remove events from my testing. Multiple mdadm
>>> instances can't run in parallel anyway. So userspace around it needs
>>> some serialization for it anyway. So also stopping an MD device and
>>> assembling a new one with the same minor number shouldn't race.
>>>
>>> I still prefer this solution here. But if you decide to drop the udev
>>> event sending in mdadm, then I'm also fine with that.
>>>
>> I strongly prefer removing the udev event generation altogether.
>> As this appears to be a carry-over from older kernels, it looks to me
>> as being an incomplete conversion:
>> At one point udev introduced 'ONLINE' and 'OFFLINE' events, which were
>> supposed to be used for this kind of scenario.
>> (ONLINE being a companion to 'ADD', and 'OFFLINE' being the companion
>> to 'DELETE'). However, later the 'ONLINE' got modified to 'CHANGE',
>> and the 'OFFLINE' got dropped completely.
>> Or that was the plan.
>> So it looks as if the conversion to 'CHANGE' got applied to the
>> 'OFFLINE' event, too.
>> Hence I strongly recommend dropping it completely, and letting the kernel
>> or the MD module decide if and when a uevent should be sent.
>
> I am totally fine with this, however we should make mdadm fail if run
> against a pre-2.6.28 kernel then.
>
> Cheers,
> Jes

I would suggest protecting the

	if (fd >= 0)
		ioctl(fd, BLKRRPART, 0);
	if (mdi)
		sysfs_uevent(mdi, "change");

code with

	if (get_linux_version() < 2006028)

That should be completely safe - 2.6.28 and later do this (if needed).

NeilBrown
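Editor's sketch of the version guard suggested in this reply. The constant 2006028 implies that the kernel release is encoded as major*1000000 + minor*1000 + patch; the helper names below (encode_kernel_release, running_kernel_version, needs_manual_uevent) are hypothetical and are not taken from mdadm, so treat this as an illustration of the idea rather than the actual get_linux_version() implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical helper: encode "A.B.C..." as A*1000000 + B*1000 + C,
 * the scheme implied by the constant 2006028 for kernel 2.6.28.
 * Missing minor/patch fields count as 0. Returns -1 if the string
 * does not start with a number. */
int encode_kernel_release(const char *release)
{
	int major = 0, minor = 0, patch = 0;

	assert(release != NULL);
	if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) < 1)
		return -1;
	return major * 1000000 + minor * 1000 + patch;
}

/* Hypothetical helper: encoded version of the running kernel,
 * taken from uname(2). Returns -1 on failure. */
int running_kernel_version(void)
{
	struct utsname name;

	if (uname(&name) < 0)
		return -1;
	return encode_kernel_release(name.release);
}

/* Nonzero when userspace should still drop partitions and send the
 * uevent itself, i.e. only on kernels older than 2.6.28. An unknown
 * version (-1) is treated as "leave it to the kernel". */
int needs_manual_uevent(int version)
{
	return version >= 0 && version < 2006028;
}
```

With helpers like these, the guarded block would read `if (needs_manual_uevent(running_kernel_version())) { ... }`, wrapping the BLKRRPART ioctl and the sysfs_uevent() call quoted above. Treating an unparsable version as "do nothing" is a design choice of this sketch, not something the thread specifies.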
Re: [systemd-devel] [PATCH 1/2] md: Inform udev about device removal when stopping
On Wed, Feb 17 2016, Shaohua Li wrote:

> On Tue, Feb 16, 2016 at 03:44:36PM +0100, Sebastian Parschauer wrote:
>> When stopping an MD device, then its device node /dev/mdX may still
>> exist afterwards or it is recreated by udev. The next open() call
>> can lead to creation of an inoperable MD device. The reason for
>> this is that a change event (KOBJ_CHANGE) is announced to udev.
>> So announce a removal event (KOBJ_REMOVE) to udev instead.
>>
>> A change is likely also required in mdadm because of the support
>> for kernels prior to 2.6.28.
>
> I didn't follow why we need the change. Shouldn't the KOBJ_REMOVE event be
> sent automatically when gendisk is deleted?
> mddev_put()->mddev_delayed_delete()->md_free()->del_gendisk().
>
> Thanks,
> Shaohua

For a bit of context: this KOBJ_CHANGE event was added in Oct 2008

  Commit: 934d9c23b4c7 ("md: destroy partitions and notify udev when md array is stopped.")

At the time, md devices weren't getting removed at all. Now they are (I
figured out the locking), though they can still come back.

There are still two stages. The array is stopped, and then the block
device is destroyed. It is theoretically possible to stop the array
without destroying the block device, though I don't think that happens
in practice.

So this KOBJ_CHANGE is, I think, technically correct (change from
"active" to "inactive") but probably isn't needed any more - not to the
extent it was at the time.

There are some annoying races caused by udev responding (belatedly) to
events by running programs that open /dev/mdXX and so automatically
re-create the md device.

The real problem here is not the event or the delays in udev. It is the
fact that opening /dev/mdXX transparently creates a device.

The only way (I know of) to really avoid these races is to use named
arrays. Put
   CREATE names=yes
in mdadm.conf. Then md arrays will be created by writing a name to a
magic file in /sys. The arrays have a minor number >=512 and are not
auto-re-created if the device node is re-opened before udev unlinks it.

So: the patch might be safe, and might solve a particular problem, but
it is really just a bandaid. The best fix is "CREATE names=yes" (and use
names like "md_home", not "md4").

NeilBrown
Re: [systemd-devel] [PATCH 1/2] md: Inform udev about device removal when stopping
On Tue, Feb 16, 2016 at 03:44:36PM +0100, Sebastian Parschauer wrote:
> When stopping an MD device, then its device node /dev/mdX may still
> exist afterwards or it is recreated by udev. The next open() call
> can lead to creation of an inoperable MD device. The reason for
> this is that a change event (KOBJ_CHANGE) is announced to udev.
> So announce a removal event (KOBJ_REMOVE) to udev instead.
>
> A change is likely also required in mdadm because of the support
> for kernels prior to 2.6.28.

I didn't follow why we need the change. Shouldn't the KOBJ_REMOVE event be
sent automatically when gendisk is deleted?
mddev_put()->mddev_delayed_delete()->md_free()->del_gendisk().

Thanks,
Shaohua
Re: [systemd-devel] nss-mymachines: slow name resolution
On Tue, 16 Feb 2016 19:39:26 +0100, Kai Krakow wrote:

> On Tue, 16 Feb 2016 15:35:24 +0100, Lennart Poettering wrote:
>
> > On Mon, 15.02.16 21:32, Kai Krakow (hurikha...@gmail.com) wrote:
> >
> > > On Mon, 15 Feb 2016 14:28:19 +0100, Lennart Poettering wrote:
> > >
> > > > On Sun, 14.02.16 13:49, Kai Krakow (hurikha...@gmail.com) wrote:
> > > >
> > > > > Hello!
> > > > >
> > > > > I've followed the man page guide to set up mymachines name
> > > > > resolution in nsswitch.conf. It works. But it takes around 4-5
> > > > > seconds to resolve a name. This is unexpected and cannot be
> > > > > used in production.
> > > >
> > > > This sounds like the LLMNR timeout. I figure we should fix
> > > > the docs to suggest that "mymachines" appears before "resolve"
> > > > in nsswitch.conf. That should fix your issue...
> > >
> > > Apparently it doesn't fix it - although I will leave it in this
> > > order according to your recommendation.
> > >
> > > Is there a way to globally disable LLMNR altogether to nail it
> > > down? I tried setting LLMNR=false in *.network - didn't help.
> >
> > Use the LLMNR= setting in /etc/systemd/resolved.conf
>
> Yeah! *thumbsup* You da man, Lennart!
>
> Setting LLMNR to "resolve" or to "no" globally solves the problem,
> which confirms your first suspicion.

BTW: Enabling and starting avahi also fixed the problem (at least it
looks that way; I did a few other steps as well), although I don't see
it listening on port 5353.

> Now, how can I figure out which interface is the problematic one? Do I
> actually need LLMNR in a simple home network?
>
> The long-term goal is to use this in a container-based hosting
> environment. I'm pretty sure I actually don't need LLMNR there. So I'm
> just curious how to "optimize" my home setup.

-- 
Regards,
Kai

Replies to list-only preferred.
Re: [systemd-devel] [PATCH 2/2] Manage: Inform udev about device removal when stopping
On 02/16/2016 07:03 PM, Sebastian Parschauer wrote:
> On 16.02.2016 18:41, Jes Sorensen wrote:
>> Sebastian Parschauer writes:
>>> When stopping an MD device, then its device node /dev/mdX may still
>>> exist afterwards or it is recreated by udev. The next open() call
>>> can lead to creation of an inoperable MD device. The reason for
>>> this is that a change event (KOBJ_CHANGE) is announced to udev.
>>> So announce a removal event (KOBJ_REMOVE) to udev instead.
>>>
>>> This also overrides the change event sent by the kernel.
>>>
>>> Signed-off-by: Sebastian Parschauer
>>> ---
>>>  Manage.c | 6 +++---
>>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/Manage.c b/Manage.c
>>> index 7e1b94b..bc89764 100644
>>> --- a/Manage.c
>>> +++ b/Manage.c
>>> @@ -494,13 +494,13 @@ done:
>>>  		goto out;
>>>  	}
>>>  	/* prior to 2.6.28, KOBJ_CHANGE was not sent when an md array
>>> -	 * was stopped, so We'll do it here just to be sure. Drop any
>>> -	 * partitions as well...
>>> +	 * was stopped, it should be KOBJ_REMOVE instead, so we set the
>>> +	 * remove event here just to be sure. Drop any partitions as well...
>>>  	 */
>>>  	if (fd >= 0)
>>>  		ioctl(fd, BLKRRPART, 0);
>>>  	if (mdi)
>>> -		sysfs_uevent(mdi, "change");
>>> +		sysfs_uevent(mdi, "remove");
>>
>> I am a little concerned about this change. You assume the kernel and
>> mdadm will be updated in sync, which is unlikely to happen. I believe
>> you need to match the kernel version and send the corresponding event
>> for this to work correctly?
>
> The worst thing that can happen is that the kernel sends the change
> event after the remove event. Then it is the current situation again.
> From my tests mdadm does enough other stuff in between. Udev is able to
> handle receiving two remove events from my testing. Multiple mdadm
> instances can't run in parallel anyway. So userspace around it needs
> some serialization for it anyway. So also stopping an MD device and
> assembling a new one with the same minor number shouldn't race.
>
> I still prefer this solution here. But if you decide to drop the udev
> event sending in mdadm, then I'm also fine with that.
>
I strongly prefer removing the udev event generation altogether.
As this appears to be a carry-over from older kernels, it looks to me
as being an incomplete conversion:
At one point udev introduced 'ONLINE' and 'OFFLINE' events, which were
supposed to be used for this kind of scenario.
(ONLINE being a companion to 'ADD', and 'OFFLINE' being the companion
to 'DELETE'). However, later the 'ONLINE' got modified to 'CHANGE',
and the 'OFFLINE' got dropped completely.
Or that was the plan.
So it looks as if the conversion to 'CHANGE' got applied to the
'OFFLINE' event, too.
Hence I strongly recommend dropping it completely, and letting the kernel
or the MD module decide if and when a uevent should be sent.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                      zSeries & Storage
h...@suse.de                             +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
Re: [systemd-devel] nss-mymachines: slow name resolution
On Tue, 16 Feb 2016 15:35:24 +0100, Lennart Poettering wrote:

> On Mon, 15.02.16 21:32, Kai Krakow (hurikha...@gmail.com) wrote:
>
> > On Mon, 15 Feb 2016 14:28:19 +0100, Lennart Poettering wrote:
> >
> > > On Sun, 14.02.16 13:49, Kai Krakow (hurikha...@gmail.com) wrote:
> > >
> > > > Hello!
> > > >
> > > > I've followed the man page guide to set up mymachines name
> > > > resolution in nsswitch.conf. It works. But it takes around 4-5
> > > > seconds to resolve a name. This is unexpected and cannot be
> > > > used in production.
> > >
> > > This sounds like the LLMNR timeout. I figure we should fix
> > > the docs to suggest that "mymachines" appears before "resolve" in
> > > nsswitch.conf. That should fix your issue...
> >
> > Apparently it doesn't fix it - although I will leave it in this
> > order according to your recommendation.
> >
> > Is there a way to globally disable LLMNR altogether to nail it
> > down? I tried setting LLMNR=false in *.network - didn't help.
>
> Use the LLMNR= setting in /etc/systemd/resolved.conf

Yeah! *thumbsup* You da man, Lennart!

Setting LLMNR to "resolve" or to "no" globally solves the problem,
which confirms your first suspicion.

Now, how can I figure out which interface is the problematic one? Do I
actually need LLMNR in a simple home network?

The long-term goal is to use this in a container-based hosting
environment. I'm pretty sure I actually don't need LLMNR there. So I'm
just curious how to "optimize" my home setup.

-- 
Regards,
Kai

Replies to list-only preferred.
Re: [systemd-devel] [PATCH 2/2] Manage: Inform udev about device removal when stopping
On 16.02.2016 18:41, Jes Sorensen wrote:
> Sebastian Parschauer writes:
>> When stopping an MD device, then its device node /dev/mdX may still
>> exist afterwards or it is recreated by udev. The next open() call
>> can lead to creation of an inoperable MD device. The reason for
>> this is that a change event (KOBJ_CHANGE) is announced to udev.
>> So announce a removal event (KOBJ_REMOVE) to udev instead.
>>
>> This also overrides the change event sent by the kernel.
>>
>> Signed-off-by: Sebastian Parschauer
>> ---
>>  Manage.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/Manage.c b/Manage.c
>> index 7e1b94b..bc89764 100644
>> --- a/Manage.c
>> +++ b/Manage.c
>> @@ -494,13 +494,13 @@ done:
>>  		goto out;
>>  	}
>>  	/* prior to 2.6.28, KOBJ_CHANGE was not sent when an md array
>> -	 * was stopped, so We'll do it here just to be sure. Drop any
>> -	 * partitions as well...
>> +	 * was stopped, it should be KOBJ_REMOVE instead, so we set the
>> +	 * remove event here just to be sure. Drop any partitions as well...
>>  	 */
>>  	if (fd >= 0)
>>  		ioctl(fd, BLKRRPART, 0);
>>  	if (mdi)
>> -		sysfs_uevent(mdi, "change");
>> +		sysfs_uevent(mdi, "remove");
>
> I am a little concerned about this change. You assume the kernel and
> mdadm will be updated in sync, which is unlikely to happen. I believe
> you need to match the kernel version and send the corresponding event
> for this to work correctly?

The worst thing that can happen is that the kernel sends the change
event after the remove event. Then we are back to the current situation.
In my tests, mdadm does enough other work in between, and udev is able
to handle receiving two remove events. Multiple mdadm instances can't
run in parallel anyway, so the userspace around it needs some
serialization in any case. Stopping an MD device and assembling a new
one with the same minor number therefore shouldn't race either.

I still prefer this solution here. But if you decide to drop the udev
event sending in mdadm, then I'm also fine with that.

Cheers,
Sebastian
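Editor's sketch of what sending a "change" or "remove" event from userspace amounts to. It assumes, as the kernel's sysfs interface allows, that writing an action name to a device's uevent file (for example /sys/block/md0/uevent) makes the kernel emit a synthetic uevent for that device. The function name write_uevent and its signature are hypothetical; this is not the sysfs_uevent() code from mdadm.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write an action string such as "change" or
 * "remove" to a sysfs uevent file, e.g. /sys/block/md0/uevent.
 * Returns 0 if the whole string was written, -1 otherwise. */
int write_uevent(const char *uevent_path, const char *action)
{
	size_t len;
	ssize_t n;
	int fd;

	assert(uevent_path != NULL && action != NULL);
	len = strlen(action);

	fd = open(uevent_path, O_WRONLY);
	if (fd < 0)
		return -1;
	n = write(fd, action, len);
	close(fd);
	return (n == (ssize_t)len) ? 0 : -1;
}
```

In these terms, the patch under discussion changes the string handed to such a write from "change" to "remove", while the alternative favoured later in the thread is to skip the write entirely on kernels that already send the event themselves.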
[systemd-devel] md/mdadm: Inform udev about device removal when stopping
Hi,

I wasn't subscribed yet when I sent my patch set.

http://marc.info/?l=linux-raid&m=145563393627991&w=2

Please reply to the linux-raid mailing list if you'd like to comment on
my patches.

Thanks,
Sebastian
Re: [systemd-devel] nss-mymachines: slow name resolution
On Mon, 15.02.16 21:32, Kai Krakow (hurikha...@gmail.com) wrote:

> On Mon, 15 Feb 2016 14:28:19 +0100, Lennart Poettering wrote:
>
> > On Sun, 14.02.16 13:49, Kai Krakow (hurikha...@gmail.com) wrote:
> >
> > > Hello!
> > >
> > > I've followed the man page guide to set up mymachines name
> > > resolution in nsswitch.conf. It works. But it takes around 4-5
> > > seconds to resolve a name. This is unexpected and cannot be used in
> > > production.
> >
> > This sounds like the LLMNR timeout. I figure we should fix the
> > docs to suggest that "mymachines" appears before "resolve" in
> > nsswitch.conf. That should fix your issue...
>
> Apparently it doesn't fix it - although I will leave it in this order
> according to your recommendation.
>
> Is there a way to globally disable LLMNR altogether to nail it down? I
> tried setting LLMNR=false in *.network - didn't help.

Use the LLMNR= setting in /etc/systemd/resolved.conf

Lennart

-- 
Lennart Poettering, Red Hat
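Editor's sketch of the setting referred to in this reply, as a configuration fragment. It assumes the option lives in the [Resolve] section of /etc/systemd/resolved.conf and that systemd-resolved has to be restarted for the change to take effect; check resolved.conf(5) for the version in use.

```ini
# /etc/systemd/resolved.conf
[Resolve]
# "no" disables LLMNR globally. "resolve" keeps LLMNR lookups
# but stops this host from responding to LLMNR queries.
LLMNR=no
```

Both values are the ones reported elsewhere in this thread as removing the 4-5 second delay.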