Re: [systemd-devel] Antw: [EXT] Re: [dm-devel] RFC: one more time: SCSI device identification

2021-05-04 Thread Hannes Reinecke

On 4/27/21 12:52 PM, Ulrich Windl wrote:

Hannes Reinecke wrote on 27.04.2021 at 10:21 in message
<2a6903e4-ff2b-67d5-e772-6971db844...@suse.de>:

On 4/27/21 10:10 AM, Martin Wilck wrote:

On Tue, 2021‑04‑27 at 13:48 +1000, Erwin van Londen wrote:


Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
afaics.


In my view the WWID should never change.


In an ideal world, perhaps not. But in the dm‑multipath realm, we know
that WWID changes can happen with certain storage arrays. See
https://listman.redhat.com/archives/dm‑devel/2021‑February/msg00116.html
and follow‑ups, for example.


And it's actually something which might happen quite easily.
The storage array can unmap a LUN, delete it, create a new one, and map
that one into the same LUN number as the old one.
If we didn't do I/O during that interval, upon the next I/O we will be
getting the dreaded 'Power-On/Reset' sense code.
_And nothing else_, due to the arcane rules for sense code generation in
SAM.
But we end up with a completely different device.

The only way out of it is to do a rescan for every POR sense code, and
disable the device eg via DID_NO_CONNECT whenever we find that the
identification has changed. We already have a copy of the original VPD
page 0x83 at hand, so that should be reasonably easy.


I don't know the depth of the SCSI or FC protocol, but storage systems
typically signal such events, maybe either via some unit attention or some FC
event. Older kernels logged that there was a change, but a manual SCSI bus scan
is needed, while newer kernels find new devices "automagically" for some
products. The HP EVA 6000 series worked that way, a 3PAR StoreServ 8000 series
also seems to work that way, but not Pure Storage X70 R3. For the latter you
need something like a FC LIP to make the kernel detect the new devices (LUNs).
I'm unsure where the problem is, but in principle the kernel can be
notified...

My point was that while there _is_ a unit attention with the sense code 
'INQUIRY DATA CHANGED' (and that indeed will generate a kernel message), 
it might be obscured by a subsequent unit attention with the sense code 
'Power-On/Reset', as per SCSI spec the latter might cause the previous 
ones to _not_ be sent.
So from that reasoning we will need to rescan the device upon 
'Power-on/Reset'.
But 'Power-On/Reset' is a sense code which we also get during initial 
device scan, so the problem is that we will be triggering a rescan while 
_doing_ a rescan, and as such it would need some really careful testing.


As for the PureStorage behaviour: The problem with changing the LUN 
mapping on the array is that we might not _have_ a device to send 
unit attentions to.
If the array already exports LUNs to some other hosts, it doesn't need 
to re-initialize the FC port when starting to export LUNs to _this_ 
host. And as _this_ host doesn't have a LUN on which unit attentions can 
be sent, _and_ the FC port is already registered, there are no events 
whatsoever which would cause the host to initiate a rescan.
To resolve that the array would need to induce eg an RSCN, but that will 
only be triggered if a FC port is (re-)registered.
Which is what HPe arrays do; initiate a link-bounce on the attached 
ports, which will cause the attached hosts to initiate a rescan.
Of course, _all_ hosts will need to rescan (and thereby causing an 
interruption even on unrelated hosts), which is why this is not done by 
all vendors.




I had a rather lengthy discussion with Fred Knight @ NetApp about
Power-On/Reset handling, what with him complaining that we don't handle
it correctly. So this really is something we should be looking into,
even independently of multipathing.

But actually I like the idea from Martin Petersen to expose the parsed
VPD identifiers to sysfs; that would allow us to drop sg_inq completely
from the udev rules.


Talking of VPDs: Somewhere in the last 12 years (within SLES 11) there was a
kernel change regarding trailing blanks in VPD data. That change blew up
several configurations being unable to re-recognize the devices. In one case
the software even had bound a license to a specific device with serial number,
and that software found "new" devices while missing the "old" ones...

That's probably just for VPD page 0x80. Page 0x83 has pretty strict 
rules on how the entries are formatted, so chopping off trailing blanks 
is not easily done there.
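
A minimal sketch of how a consumer of page 0x80 serial numbers could be made robust against such a change, assuming it compares serials with trailing ASCII blanks ignored. The helper names `trimmed_len` and `serial_equal` are made up for illustration and are not existing kernel or udev interfaces:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Length of buf[0..len) with trailing ASCII blanks ignored. */
static size_t trimmed_len(const char *buf, size_t len)
{
    assert(buf != NULL);
    while (len > 0 && buf[len - 1] == ' ')
        len--;
    return len;
}

/* Compare two unit serial numbers, ignoring trailing blank padding. */
static bool serial_equal(const char *a, size_t alen,
                         const char *b, size_t blen)
{
    size_t ta = trimmed_len(a, alen);
    size_t tb = trimmed_len(b, blen);

    return ta == tb && memcmp(a, b, ta) == 0;
}
```

With such a comparison, "ABC123" and "ABC123  " identify the same device, whichever form the kernel reports.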


Cheers,

Hannes
--
Dr. Hannes Reinecke                Kernel Storage Architect
h...@suse.de  +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] [dm-devel] RFC: one more time: SCSI device identification

2021-04-27 Thread Hannes Reinecke
On 4/27/21 10:10 AM, Martin Wilck wrote:
> On Tue, 2021-04-27 at 13:48 +1000, Erwin van Londen wrote:
>>>
>>> Wrt 1), we can only hope that it's the case. But 2) and 3) need work,
>>> afaics.
>>>
>> In my view the WWID should never change. 
> 
> In an ideal world, perhaps not. But in the dm-multipath realm, we know
> that WWID changes can happen with certain storage arrays. See 
> https://listman.redhat.com/archives/dm-devel/2021-February/msg00116.html 
> and follow-ups, for example.
> 
And it's actually something which might happen quite easily.
The storage array can unmap a LUN, delete it, create a new one, and map
that one into the same LUN number as the old one.
If we didn't do I/O during that interval, upon the next I/O we will be
getting the dreaded 'Power-On/Reset' sense code.
_And nothing else_, due to the arcane rules for sense code generation in
SAM.
But we end up with a completely different device.

The only way out of it is to do a rescan for every POR sense code, and
disable the device eg via DID_NO_CONNECT whenever we find that the
identification has changed. We already have a copy of the original VPD
page 0x83 at hand, so that should be reasonably easy.
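
A minimal sketch of the comparison step, assuming the cached and the freshly read page 0x83 are available as plain byte buffers. `struct vpd_page` and `vpd83_changed` are hypothetical names for illustration only, not existing kernel structures or functions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the raw bytes of a VPD page. */
struct vpd_page {
    const unsigned char *data;
    size_t len;
};

/*
 * Return true if the freshly read page 0x83 differs from the cached copy,
 * i.e. the LUN behind this address is no longer the same device.
 */
static bool vpd83_changed(const struct vpd_page *cached,
                          const struct vpd_page *fresh)
{
    assert(cached != NULL && fresh != NULL);
    if (cached->len != fresh->len)
        return true;
    if (cached->len == 0)
        return false;
    return memcmp(cached->data, fresh->data, cached->len) != 0;
}
```

A caller would run this after the rescan triggered by the POR sense code and fail the device (eg with DID_NO_CONNECT, as suggested above) when it returns true.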

I had a rather lengthy discussion with Fred Knight @ NetApp about
Power-On/Reset handling, what with him complaining that we don't handle
it correctly. So this really is something we should be looking into,
even independently of multipathing.

But actually I like the idea from Martin Petersen to expose the parsed
VPD identifiers to sysfs; that would allow us to drop sg_inq completely
from the udev rules.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke Kernel Storage Architect
h...@suse.de   +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)


Re: [systemd-devel] [PATCH] md: Drop sending a change uevent when stopping

2016-02-17 Thread Hannes Reinecke

On 02/17/2016 10:29 PM, NeilBrown wrote:
> On Thu, Feb 18 2016, Shaohua Li wrote:
> 
>> On Wed, Feb 17, 2016 at 05:25:00PM +0100, Sebastian
>> Parschauer wrote:
>>> When stopping an MD device, then its device node /dev/mdX
>>> may still exist afterwards or it is recreated by udev. The
>>> next open() call can lead to creation of an inoperable MD
>>> device. The reason for this is that a change event
>>> (KOBJ_CHANGE) is sent to udev which races against the
>>> remove event (KOBJ_REMOVE) from md_free(). So drop sending
>>> the change event.
>>> 
>>> A change is likely also required in mdadm as many versions
>>> send the change event to udev as well.
>> 
>> Makes sense, it's unlikely we need the CHANGE event.
>> Applied.
>> 
>> Thanks, Shaohua
> 
> It would be worth checking, but I think that with this change,
> you can write "inactive" to /sys/block/mdXXX/md/array_state and
> the array will become inactive, but no uevent will be
> generated, which isn't good. Maybe send the uevent that was
> just removed from the 'inactive' case of array_state_store()
> instead.
> 
> (But I still think this is just a bandaid and doesn't provide
> any guarantees that there will be no races with udev)
> 
Thing is, _none_ of the other subsystems will ever send a uevent
when it becomes inactive.
(Would be pretty pointless, too, as what exactly is one supposed
to do here?)
The current usage has it that CHANGE events are only ever sent if
a device becomes active.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                   Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [systemd-devel] [PATCH 2/2] Manage: Inform udev about device removal when stopping

2016-02-16 Thread Hannes Reinecke

On 02/16/2016 09:46 PM, NeilBrown wrote:
> On Wed, Feb 17 2016, Jes Sorensen wrote:
> 
>> Hannes Reinecke <h...@suse.de> writes:
>>> On 02/16/2016 07:03 PM, Sebastian Parschauer wrote:
>>>> On 16.02.2016 18:41, Jes Sorensen wrote:
>>>>> Sebastian Parschauer
>>>>> <sebastian.rie...@profitbricks.com> writes:
>>>>>> When stopping an MD device, then its device node
>>>>>> /dev/mdX may still exist afterwards or it is
>>>>>> recreated by udev. The next open() call can lead to
>>>>>> creation of an inoperable MD device. The reason for 
>>>>>> this is that a change event (KOBJ_CHANGE) is
>>>>>> announced to udev. So announce a removal event
>>>>>> (KOBJ_REMOVE) to udev instead.
>>>>>> 
>>>>>> This also overrides the change event sent by the
>>>>>> kernel.
>>>>>> 
>>>>>> Signed-off-by: Sebastian Parschauer <sebastian.rie...@profitbricks.com>
>>>>>> ---
>>>>>>  Manage.c | 6 +++---
>>>>>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>> 
>>>>>> diff --git a/Manage.c b/Manage.c
>>>>>> index 7e1b94b..bc89764 100644
>>>>>> --- a/Manage.c
>>>>>> +++ b/Manage.c
>>>>>> @@ -494,13 +494,13 @@ done:
>>>>>>                  goto out;
>>>>>>          }
>>>>>>          /* prior to 2.6.28, KOBJ_CHANGE was not sent when an md array
>>>>>> -         * was stopped, so We'll do it here just to be sure.  Drop any
>>>>>> -         * partitions as well...
>>>>>> +         * was stopped, it should be KOBJ_REMOVE instead, so we set the
>>>>>> +         * remove event here just to be sure. Drop any partitions as well...
>>>>>>           */
>>>>>>          if (fd >= 0)
>>>>>>                  ioctl(fd, BLKRRPART, 0);
>>>>>>          if (mdi)
>>>>>> -                sysfs_uevent(mdi, "change");
>>>>>> +                sysfs_uevent(mdi, "remove");
>>>>> 
>>>>> I am a little concerned about this change. You assume
>>>>> the kernel and mdadm will be updated in sync, which is
>>>>> unlikely to happen. I believe you need to match the
>>>>> kernel version and send the corresponding event
>>>>> for this to work correctly?
>>>> 
>>>> The worst thing that can happen is that the kernel sends
>>>> the change event after the remove event. Then it is the
>>>> current situation again. From my tests mdadm does enough
>>>> other stuff in between. Udev is able to handle receiving
>>>> two remove events from my testing. Multiple mdadm
>>>> instances can't run in parallel anyway. So userspace
>>>> around it needs some serialization for it anyway. So
>>>> also stopping an MD device and assembling a new one with
>>>> the same minor number shouldn't race.
>>>> 
>>>> I still prefer this solution here. But if you decide to
>>>> drop the udev event sending in mdadm, then I'm also fine
>>>> with that.
>>>> 
>>> I strongly prefer removing the udev event generation
>>> altogether. As this appears to be a carry-over from older
>>> kernels, it looks to me as being an incomplete conversion: 
>>> At one point udev introduced 'ONLINE' and 'OFFLINE' events,
>>> which were supposed to be used for this kind of scenario. 
>>> (ONLINE being a companion to 'ADD', and 'OFFLINE' being the
>>> companion to 'DELETE'). However, later the 'ONLINE' got
>>> modified to 'CHANGE', and the 'OFFLINE' got dropped
>>> completely. Or that was the plan. So it looks as if the
>>> conversion to 'CHANGE' got applied to the 'OFFLINE' event,
>>> too. Hence I strongly recommend to drop it completely, and
>>> let the kernel or the MD module decide if and when a uevent
>>> should be sent.
>> 
>> I am totally fine with this, however we should make mdadm
>> fail if run against a pre-2.6.28 kernel then.
>> 
>> Cheers, Jes
> 
> I would suggest protecting the
> 
>     if (fd >= 0)
>         ioctl(fd, BLKRRPART, 0);
>     if (mdi)
>         sysfs_uevent(mdi, "change");
> 
> code with
> 
> if (get_linux_version() < 2006028)
> 
> That should be completely safe - 2.6.28 and later do this (if
> needed).
> 
+1.

Yes, this is the best solution.
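
A minimal sketch of the version encoding implied by the constant 2006028 in the check above, assuming the scheme major * 1000000 + minor * 1000 + patch. `encode_linux_version` is a name made up for this illustration; it is not claimed to be how mdadm's get_linux_version() is implemented:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Encode a kernel release string such as "2.6.28" or "2.6.27.5-default"
 * as major * 1000000 + minor * 1000 + patch.
 * Returns -1 if not even "major.minor" can be parsed.
 */
static int encode_linux_version(const char *release)
{
    int major = 0, minor = 0, patch = 0;

    assert(release != NULL);
    if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) < 2)
        return -1;
    return major * 1000000 + minor * 1000 + patch;
}
```

Under that assumption 2.6.27 encodes to 2006027, which is below the threshold, so the legacy uevent would still be sent there; 2.6.28 and anything later would skip it.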

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                   Teamlead Storage & Networking
h...@suse.de

Re: [systemd-devel] [PATCH 2/2] Manage: Inform udev about device removal when stopping

2016-02-16 Thread Hannes Reinecke

On 02/16/2016 07:03 PM, Sebastian Parschauer wrote:

On 16.02.2016 18:41, Jes Sorensen wrote:

Sebastian Parschauer <sebastian.rie...@profitbricks.com> writes:

When stopping an MD device, then its device node /dev/mdX may still
exist afterwards or it is recreated by udev. The next open() call
can lead to creation of an inoperable MD device. The reason for
this is that a change event (KOBJ_CHANGE) is announced to udev.
So announce a removal event (KOBJ_REMOVE) to udev instead.

This also overrides the change event sent by the kernel.

Signed-off-by: Sebastian Parschauer <sebastian.rie...@profitbricks.com>
---
  Manage.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Manage.c b/Manage.c
index 7e1b94b..bc89764 100644
--- a/Manage.c
+++ b/Manage.c
@@ -494,13 +494,13 @@ done:
                 goto out;
         }
         /* prior to 2.6.28, KOBJ_CHANGE was not sent when an md array
-         * was stopped, so We'll do it here just to be sure.  Drop any
-         * partitions as well...
+         * was stopped, it should be KOBJ_REMOVE instead, so we set the
+         * remove event here just to be sure. Drop any partitions as well...
          */
         if (fd >= 0)
                 ioctl(fd, BLKRRPART, 0);
         if (mdi)
-                sysfs_uevent(mdi, "change");
+                sysfs_uevent(mdi, "remove");


I am a little concerned about this change. You assume the kernel and
mdadm will be updated in sync, which is unlikely to happen. I believe
you need to match the kernel version and send the corresponding event
for this to work correctly?


The worst thing that can happen is that the kernel sends the change
event after the remove event. Then it is the current situation again.
From my tests mdadm does enough other stuff in between. Udev is able to
handle receiving two remove events from my testing. Multiple mdadm
instances can't run in parallel anyway. So userspace around it needs
some serialization for it anyway. So also stopping an MD device and
assembling a new one with the same minor number shouldn't race.

I still prefer this solution here. But if you decide to drop the udev
event sending in mdadm, then I'm also fine with that.


I strongly prefer removing the udev event generation altogether.
As this appears to be a carry-over from older kernels, it looks to me as 
being an incomplete conversion:
At one point udev introduced 'ONLINE' and 'OFFLINE' events, which were 
supposed to be used for this kind of scenario.
(ONLINE being a companion to 'ADD', and 'OFFLINE' being the companion to 
'DELETE'). However, later the 'ONLINE' got modified to 'CHANGE', and the 
'OFFLINE' got dropped completely.

Or that was the plan.
So it looks as if the conversion to 'CHANGE' got applied to the 
'OFFLINE' event, too.
Hence I strongly recommend dropping it completely, and letting the kernel or 
the MD module decide if and when a uevent should be sent.


Cheers,

Hannes
--
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


[systemd-devel] [PATCH 0/2] Scalability fixes for large machines

2015-03-04 Thread Hannes Reinecke
On large machines we hit the limit on 512 concurrent dbus connections
pretty easily, so we're greeted with tons of messages

Too many concurrent connections, refusing

Raising this limit, however, runs into another cap on the number of
accepted epoll events, which is particularly nasty as it will
just silently drop events on the floor.

AFAICS both caps are totally arbitrary, so I've removed the cap on
the number of epoll events and raised the number of concurrent
dbus connections to 4096.

Hannes Reinecke (2):
  Remove the cap on epoll events
  Allow up to 4096 simultaneous connections

 src/core/dbus.c                    | 2 +-
 src/libsystemd/sd-event/sd-event.c | 3 +--
 2 files changed, 2 insertions(+), 3 deletions(-)

-- 
1.8.4.5



[systemd-devel] [PATCH 1/2] Remove the cap on epoll events

2015-03-04 Thread Hannes Reinecke
Currently the code will silently blank out events
if there are more than 512 epoll events, causing them
never to be handled at all.
This patch removes the cap on the number of events
for epoll_wait, thereby avoiding this issue.

Signed-off-by: Hannes Reinecke <h...@suse.de>
---
 src/libsystemd/sd-event/sd-event.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 0c4e517..8eb154c 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -36,7 +36,6 @@
 
 #include "sd-event.h"
 
-#define EPOLL_QUEUE_MAX 512U
 #define DEFAULT_ACCURACY_USEC (250 * USEC_PER_MSEC)
 
 typedef enum EventSourceType {
@@ -2366,7 +2365,7 @@ _public_ int sd_event_wait(sd_event *e, uint64_t timeout) {
                 return 1;
         }
 
-        ev_queue_max = CLAMP(e->n_sources, 1U, EPOLL_QUEUE_MAX);
+        ev_queue_max = e->n_sources > 0 ? e->n_sources : 1;
         ev_queue = newa(struct epoll_event, ev_queue_max);
 
         m = epoll_wait(e->epoll_fd, ev_queue, ev_queue_max,
-- 
1.8.4.5
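
For illustration, a stand-alone sketch of the sizing rule the patch switches to: one epoll_event slot per registered source, and at least one. The names `queue_size` and `wait_all_sources` are invented for this example and are not part of sd-event; the sketch also allocates the queue on the heap, which is a deliberate deviation from the stack-based newa() in the patch:

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/epoll.h>

/* Number of epoll_event slots to request: one per source, at least 1. */
static unsigned queue_size(unsigned n_sources)
{
    return n_sources > 0 ? n_sources : 1;
}

/*
 * Wait on epoll_fd with a queue large enough for every source.
 * Returns the epoll_wait() result, or -1 if the allocation fails.
 */
static int wait_all_sources(int epoll_fd, unsigned n_sources, int timeout_ms)
{
    unsigned max = queue_size(n_sources);
    struct epoll_event *queue;
    int n;

    assert(epoll_fd >= 0);
    queue = calloc(max, sizeof(*queue));
    if (!queue)
        return -1;
    n = epoll_wait(epoll_fd, queue, (int)max, timeout_ms);
    /* a real event loop would dispatch queue[0..n) here */
    free(queue);
    return n;
}
```

Heap allocation is used in the sketch because an uncapped source count makes the size of a stack allocation unbounded as well; whether that matters for sd-event in practice is not something this thread settles.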



[systemd-devel] [PATCH 1/2] Remove the cap on epoll events

2015-03-04 Thread Hannes Reinecke
Currently the code will silently blank out events
if there are more than 512 epoll events, causing them
never to be handled at all.
This patch removes the cap on the number of events
for epoll_wait, thereby avoiding this issue.

Signed-off-by: Hannes Reinecke <h...@suse.de>
---
 src/libsystemd/sd-event/sd-event.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 0c4e517..dae81c5 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -36,7 +36,6 @@
 
 #include "sd-event.h"
 
-#define EPOLL_QUEUE_MAX 512U
 #define DEFAULT_ACCURACY_USEC (250 * USEC_PER_MSEC)
 
 typedef enum EventSourceType {
@@ -2366,7 +2365,7 @@ _public_ int sd_event_wait(sd_event *e, uint64_t timeout) {
                 return 1;
         }
 
-        ev_queue_max = CLAMP(e->n_sources, 1U, EPOLL_QUEUE_MAX);
+        ev_queue_max = e->n_sources > 0 ? e->n_source : 1;
         ev_queue = newa(struct epoll_event, ev_queue_max);
 
         m = epoll_wait(e->epoll_fd, ev_queue, ev_queue_max,
-- 
1.8.4.5



[systemd-devel] [PATCH 2/2] Allow up to 4096 simultaneous connections

2015-03-04 Thread Hannes Reinecke
On large systems we hit the limit of 512 simultaneous dbus
connections, resulting in tons of annoying messages:

Too many concurrent connections, refusing

This patch raises the limit to 4096.

Signed-off-by: Hannes Reinecke <h...@suse.de>
---
 src/core/dbus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/core/dbus.c b/src/core/dbus.c
index 5dcb0d1..80f7589 100644
--- a/src/core/dbus.c
+++ b/src/core/dbus.c
@@ -43,7 +43,7 @@
 #include "bus-internal.h"
 #include "selinux-access.h"
 
-#define CONNECTIONS_MAX 512
+#define CONNECTIONS_MAX 4096
 
 static void destroy_bus(Manager *m, sd_bus **bus);
 
-- 
1.8.4.5



[systemd-devel] [PATCHv2 0/2] Scalability fixes for large machines

2015-03-04 Thread Hannes Reinecke
On large machines we hit the limit on 512 concurrent dbus connections
pretty easily, so we're greeted with tons of messages

Too many concurrent connections, refusing

Raising this limit, however, runs into another cap on the number of
accepted epoll events, which is particularly nasty as it will
just silently drop events on the floor.

AFAICS both caps are totally arbitrary, so I've removed the cap on
the number of epoll events and raised the number of concurrent
dbus connections to 4096.

Changes to v1:
- Fix typo in the first patch

Hannes Reinecke (2):
  Remove the cap on epoll events
  Allow up to 4096 simultaneous connections

 src/core/dbus.c                    | 2 +-
 src/libsystemd/sd-event/sd-event.c | 3 +--
 2 files changed, 2 insertions(+), 3 deletions(-)

-- 
1.8.4.5



Re: [systemd-devel] [dm-devel] multipath breaks with recent udev/systemd

2014-12-18 Thread Hannes Reinecke
On 12/18/2014 11:04 PM, Benjamin Marzinski wrote:
> On Wed, Dec 17, 2014 at 01:04:54PM +0100, Hannes Reinecke wrote:
>> On 12/16/2014 11:18 PM, Benjamin Marzinski wrote:
>>> On Tue, Dec 16, 2014 at 04:10:44PM -0600, Benjamin Marzinski wrote:
>>>> On Mon, Dec 15, 2014 at 10:31:44AM +0100, Hannes Reinecke wrote:
>> [ .. ]
>>>>> So during bootup it's anyone's guess who's first, multipath or udev.
>>>>> And depending on the timing either multipath will fail to setup
>>>>> the device-mapper device or udev will simply ignore the device.
>>>>> Neither of those is good, but the first is an absolute killer for
>>>>> modern systems which _rely_ on udev to configure devices.
>>>>>
>>>>> So how is this supposed to work?
>>>>> Why does udev ignore the entire event if it can't get the lock?
>>>>> Shouldn't it rather be retried?
>>>>> What is the supposed recovery here?
>>>>
>>>> Hannes, are you against the idea that Alexander mentioned in his first
>>>> email, of just locking a file in /var/lock?  Multipathd doesn't create
>>>> devices in parallel. Multipath doesn't create files in parallel.  We are
>>>> explicitly trying to avoid multipath and multipathd creating files at
>>>> the same time. So, we should only need a single file to lock, and
>>>> /run/lock should always be there.
>>>
>>> O.k. So if we want to keep our current nonblocking behavior, we'll need
>>> more lockfiles, either one per path or one per wwid.  This still seems
>>> like a reasonable idea, if there is a good reason for systemd doing what
>>> it's doing.
>>
>> The problem is as follows:
>>
>> When multipathd is running we simply _cannot_ guarantee that no udev
>> events are currently running. This currently hits us especially bad
>> during system startup when device probing is still running during
>> multipathd startup.
>> Multipathd will then enumerate all block devices to setup the
>> initial topology.
>> But in doing so it might trip over devices which are still processed
>> by udev (or, worse still, _not yet_ processed by udev).
>> (Yes, I know, libudev_enumerate should protect against this.
>>  But it doesn't. )
>
> But we start waiting for events before the initial multipath device
> configuration, and don't process them until after that configuration
> is complete, so if there is ever a case where the initial configuration
> is accessing the device too early, aren't we guaranteed to get an event
> afterwards, assuming that udev doesn't drop it?
>
That was the initial idea. Only it doesn't do it currently :-)


>> So it's anyone's guess what'll happen now; either multipath trips over
>> the lock from udev when calling 'lock_multipath' (and consequently
>> failing to setup the multipath device), or udev
>> tripping over the lock from multipath and ignoring the event,
>> leaving us with a non-functioning device.
>
> But my point above is that if we use a lockfile instead of locking the
> path device itself, there won't be any lock contention, and udev won't
> drop the events.
 
The underlying issue here is:

Why does multipath lock the devices _at all_?
If it were to protect against device disappearing while doing the
ioctl that's already proven not to work.
And for protecting against mounts a simple open(O_EXCL) would be
sufficient. So whom are we fooling here?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                   zSeries & Storage
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [systemd-devel] [dm-devel] multipath breaks with recent udev/systemd

2014-12-17 Thread Hannes Reinecke
On 12/16/2014 11:18 PM, Benjamin Marzinski wrote:
> On Tue, Dec 16, 2014 at 04:10:44PM -0600, Benjamin Marzinski wrote:
>> On Mon, Dec 15, 2014 at 10:31:44AM +0100, Hannes Reinecke wrote:
[ .. ]
>>> So during bootup it's anyone's guess who's first, multipath or udev.
>>> And depending on the timing either multipath will fail to setup
>>> the device-mapper device or udev will simply ignore the device.
>>> Neither of those is good, but the first is an absolute killer for
>>> modern systems which _rely_ on udev to configure devices.
>>>
>>> So how is this supposed to work?
>>> Why does udev ignore the entire event if it can't get the lock?
>>> Shouldn't it rather be retried?
>>> What is the supposed recovery here?
>>
>> Hannes, are you against the idea that Alexander mentioned in his first
>> email, of just locking a file in /var/lock?  Multipathd doesn't create
>> devices in parallel. Multipath doesn't create files in parallel.  We are
>> explicitly trying to avoid multipath and multipathd creating files at
>> the same time. So, we should only need a single file to lock, and
>> /run/lock should always be there.
>
> O.k. So if we want to keep our current nonblocking behavior, we'll need
> more lockfiles, either one per path or one per wwid.  This still seems
> like a reasonable idea, if there is a good reason for systemd doing what
> it's doing.
 
The problem is as follows:

When multipathd is running we simply _cannot_ guarantee that no udev
events are currently running. This currently hits us especially bad
during system startup when device probing is still running during
multipathd startup.
Multipathd will then enumerate all block devices to setup the
initial topology.
But in doing so it might trip over devices which are still processed
by udev (or, worse still, _not yet_ processed by udev).
(Yes, I know, libudev_enumerate should protect against this.
 But it doesn't. )

So it's anyone's guess what'll happen now; either multipath trips over
the lock from udev when calling 'lock_multipath' (and consequently
failing to setup the multipath device), or udev
tripping over the lock from multipath and ignoring the event,
leaving us with a non-functioning device.

We can fixup the startup sequence (which we need to do anyway, given
the libudev enumerate bug) to just re-trigger all block device
events, but this still doesn't fix the actual issue.
Point is, there might be _several_ events for the same device being
queued (think of a flaky path with several PATH_FAILED /
PATH_REINSTATED events in a row), and so multipathd might be
processing one event for the device while udev is processing the
next event for the same device.

For this to work we need some synchronization with udev; _if_ there
would be a libudev callout for 'is there an event for this device
running' we can easily fail the 'lock_multipath' operation, knowing
that we'll be getting another event shortly for the same device.

Alternatively we can call flock(LOCK_EX) on that device, but that
will only work if udev would _not_ abort event handling for that
device, but rather issues a retry.
After all, there _is_ a device timeout in udev. It should be
relatively easy to retry the event and let it run into a timeout if
the lock won't be released.
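
A minimal sketch of the "retry until a timeout" behaviour suggested above, in place of skipping the event on the first EWOULDBLOCK. The helper name `flock_retry`, the 10 ms polling interval and the millisecond timeout parameter are assumptions made for this example; this is not udev code:

```c
#include <assert.h>
#include <errno.h>
#include <sys/file.h>
#include <time.h>

/*
 * Try to take a flock() on fd, polling every 10 ms until timeout_ms
 * has elapsed. Returns 0 on success, -ETIMEDOUT if the lock is still
 * held by someone else, or -errno for any other failure.
 */
static int flock_retry(int fd, int operation, int timeout_ms)
{
    const struct timespec pause = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };
    int waited_ms = 0;

    assert(fd >= 0);
    for (;;) {
        if (flock(fd, operation | LOCK_NB) == 0)
            return 0;
        if (errno != EWOULDBLOCK)
            return -errno;
        if (waited_ms >= timeout_ms)
            return -ETIMEDOUT;
        nanosleep(&pause, NULL);
        waited_ms += 10;
    }
}
```

With something along these lines the event handler would only give up once its own timeout has expired, so a short-lived lock held by multipathd delays the event instead of dropping it.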

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                   zSeries & Storage
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


[systemd-devel] multipath breaks with recent udev/systemd

2014-12-15 Thread Hannes Reinecke
Hi all,

in commit 3ebdb81ef088afd3b4c72b516beb5610f8c93a0d
(udev: serialize/synchronize block device event handling with file
locks) udev started using flock() on the device node, supposedly to
synchronize with an ominous 'block event handling'.

The code looks like this:

        if (d) {
                fd_lock = open(udev_device_get_devnode(d),
                               O_RDONLY|O_CLOEXEC|O_NOFOLLOW|O_NONBLOCK);
                if (fd_lock >= 0 && flock(fd_lock, LOCK_SH|LOCK_NB) < 0) {
                        log_debug("Unable to flock(%s), skipping event handling: %m",
                                  udev_device_get_devnode(d));
                        err = -EWOULDBLOCK;
                        goto skip;
                }
        }

However, multipath has for several years been using a similar construct
to lock all devices belonging to a multipath device table before
creating a multipath dm-device:

        vector_foreach_slot (mpp->pg, pgp, i) {
                if (!pgp->paths)
                        continue;
                vector_foreach_slot(pgp->paths, pp, j) {
                        if (lock && flock(pp->fd, LOCK_SH | LOCK_NB) &&
                            errno == EWOULDBLOCK)
                                goto fail;
                        else if (!lock)
                                flock(pp->fd, LOCK_UN);
                }
        }

So during bootup it's anyone's guess who's first, multipath or udev.
And depending on the timing either multipath will fail to setup
the device-mapper device or udev will simply ignore the device.
Neither of those is good, but the first is an absolute killer for
modern systems which _rely_ on udev to configure devices.

So how is this supposed to work?
Why does udev ignore the entire event if it can't get the lock?
Shouldn't it rather be retried?
What is the supposed recovery here?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [systemd-devel] [PATCH] udevd: add --event-timeout commandline option

2014-07-30 Thread Hannes Reinecke

On 07/29/2014 04:13 PM, Kay Sievers wrote:
> On Tue, Jul 29, 2014 at 9:06 AM, Hannes Reinecke <h...@suse.de> wrote:
>> On large configurations some events take longer than the
>> default 30 seconds. Killing those events will leave the
>> machine halfway configured.
>>
>> So add a commandline option '--event-timeout' to handle these cases.
>
> Applied. But with a follow-up commit, I changed the timeout logic. We
> do not need or want several independent timeouts for the same thing.
> Please check.


Hmm. You sure this is correct?


@@ -1462,14 +1454,6 @@ static int add_rule(struct udev_rules *rules, char *line,
                                 rule_add_key(&rule_tmp, TK_A_DEVLINK_PRIO, op, NULL, &prio);
                         }
 
-                        pos = strstr(value, "event_timeout=");
-                        if (pos != NULL) {
-                                int tout = atoi(&pos[strlen("event_timeout=")]);
-
-                                rule_add_key(&rule_tmp, TK_M_EVENT_TIMEOUT, op, NULL, &tout);
-                        }
-
-                        pos = strstr(value, "string_escape=");
                         if (pos != NULL) {
                                 pos = &pos[strlen("string_escape=")];
                                 if (startswith(pos, "none"))

Looks like the line 'pos = strstr(value, "string_escape=");' should
not have been deleted ...
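
To see why that one line matters, here is a minimal stand-alone sketch
of the lookup pattern (not the udev code itself; the helper name
option_value() and the sample strings are made up for illustration).
Each option needs its own strstr() call; without it, the following
'if (pos != NULL)' test just reuses whatever the preceding lookup left
in 'pos'.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the text following 'key'
 * inside 'value', or NULL if 'key' does not occur in 'value'. */
static const char *option_value(const char *value, const char *key) {
        const char *pos = strstr(value, key);

        if (pos == NULL)
                return NULL;
        return pos + strlen(key);
}
```

With value set to "link_priority=10,string_escape=none", looking up
"string_escape=" yields a pointer to "none", while looking up a key
that is absent yields NULL.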


Cheers,

Hannes
--
Dr. Hannes Reinecke   zSeries  Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


[systemd-devel] [PATCH] Fixup commit dd5eddd28a74a49607a8fffcaf960040dba98479

2014-07-30 Thread Hannes Reinecke
Commit dd5eddd28a74a49607a8fffcaf960040dba98479 accidentally
removed one line too many.

Signed-off-by: Hannes Reinecke <h...@suse.de>
---
 src/udev/udev-rules.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/udev/udev-rules.c b/src/udev/udev-rules.c
index 59bc124..cc56215 100644
--- a/src/udev/udev-rules.c
+++ b/src/udev/udev-rules.c
@@ -1436,6 +1436,7 @@ static int add_rule(struct udev_rules *rules, char *line,
                                 rule_add_key(&rule_tmp, TK_A_DEVLINK_PRIO, op, NULL, &prio);
                         }
 
+                        pos = strstr(value, "string_escape=");
                         if (pos != NULL) {
                                 pos = &pos[strlen("string_escape=")];
                                 if (startswith(pos, "none"))
-- 
1.8.4.5



[systemd-devel] [PATCHv3] use systemd.debug on the kernel command line, not debug

2014-04-03 Thread Hannes Reinecke
From: Greg KH <gre...@linuxfoundation.org>

If the kernel is started with "debug", that's for the kernel to switch
into debug mode.  We should rely on a namespace for our options, like
everything else (with the exception of "quiet").  Some people want to
only debug the kernel, not systemd, and the opposite as well so make
everyone happy.

Signed-off-by: Greg KH <gre...@linuxfoundation.org>
Signed-off-by: Hannes Reinecke <h...@suse.de>
---
 man/kernel-command-line.xml |  4 ++--
 src/core/main.c             | 19 ++++++++++---------
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/man/kernel-command-line.xml b/man/kernel-command-line.xml
index dbfec61..19da7a3 100644
--- a/man/kernel-command-line.xml
+++ b/man/kernel-command-line.xml
@@ -126,10 +126,10 @@
                         </varlistentry>
 
                         <varlistentry>
-                                <term><varname>debug</varname></term>
+                                <term><varname>systemd.debug</varname></term>
                                 <listitem>
                                         <para>Parameter understood by
-                                        both the kernel and the system
+                                        the system
                                         and service manager to control
                                         console log verbosity. For
                                         details, see
diff --git a/src/core/main.c b/src/core/main.c
index 41605ee..bbacfd1 100644
--- a/src/core/main.c
+++ b/src/core/main.c
@@ -374,6 +374,15 @@ static int parse_proc_cmdline_item(const char *key, const char *value) {
                 } else
                         log_warning("Environment variable name '%s' is not valid. Ignoring.", value);
 
+        } else if (streq(key, "systemd.debug") && !value) {
+
+                /* Log to kmsg, the journal socket will fill up before the
+                 * journal is started and tools running during that time
+                 * will block with every log message for for 60 seconds,
+                 * before they give up. */
+                log_set_max_level(LOG_DEBUG);
+                log_set_target(detect_container(NULL) > 0 ? LOG_TARGET_CONSOLE : LOG_TARGET_KMSG);
+
         } else if (!streq(key, "systemd.restore_state") &&
                    !streq(key, "systemd.gpt_auto") &&
                    (startswith(key, "systemd.") || startswith(key, "rd.systemd."))) {
@@ -409,6 +418,7 @@ static int parse_proc_cmdline_item(const char *key, const char *value) {
                                "                                         Set default log error output for services\n"
                                "systemd.setenv=ASSIGNMENT                Set an environment variable for all spawned processes\n"
                                "systemd.restore_state=0|1                Restore backlight/rfkill state at boot\n");
+                               "systemd.debug                            Enable debugging output\n");
                 }
 
         } else if (streq(key, "quiet") && !value) {
@@ -416,15 +426,6 @@ static int parse_proc_cmdline_item(const char *key, const char *value) {
                 if (arg_show_status == _SHOW_STATUS_UNSET)
                         arg_show_status = SHOW_STATUS_AUTO;
 
-        } else if (streq(key, "debug") && !value) {
-
-                /* Log to kmsg, the journal socket will fill up before the
-                 * journal is started and tools running during that time
-                 * will block with every log message for for 60 seconds,
-                 * before they give up. */
-                log_set_max_level(LOG_DEBUG);
-                log_set_target(detect_container(NULL) > 0 ? LOG_TARGET_CONSOLE : LOG_TARGET_KMSG);
-
         } else if (!in_initrd() && !value) {
                 unsigned i;
 
-- 
1.8.1.4



Re: [systemd-devel] [PATCHv3] use systemd.debug on the kernel command line, not debug

2014-04-03 Thread Hannes Reinecke
On 04/03/2014 01:02 PM, Colin Guthrie wrote:
> 'Twas brillig, and Hannes Reinecke at 03/04/14 07:52 did gyre and gimble:
>>                                "systemd.restore_state=0|1                Restore backlight/rfkill state at boot\n");
>> +                              "systemd.debug                            Enable debugging output\n");
>
> Line above the change line should have the ); removed...
>
Indeed. Serves me right for not having tested them.

Sorry for this.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries  Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


[systemd-devel] [PATCH v4] use systemd.debug on the kernel command line, not debug

2014-04-03 Thread Hannes Reinecke
From: Greg KH <gre...@linuxfoundation.org>

If the kernel is started with "debug", that's for the kernel to switch
into debug mode.  We should rely on a namespace for our options, like
everything else (with the exception of "quiet").  Some people want to
only debug the kernel, not systemd, and the opposite as well so make
everyone happy.

Signed-off-by: Greg KH <gre...@linuxfoundation.org>
Signed-off-by: Hannes Reinecke <h...@suse.de>
---
 man/kernel-command-line.xml |  4 ++--
 src/core/main.c             | 21 +++++++++++----------
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/man/kernel-command-line.xml b/man/kernel-command-line.xml
index dbfec61..19da7a3 100644
--- a/man/kernel-command-line.xml
+++ b/man/kernel-command-line.xml
@@ -126,10 +126,10 @@
                         </varlistentry>
 
                         <varlistentry>
-                                <term><varname>debug</varname></term>
+                                <term><varname>systemd.debug</varname></term>
                                 <listitem>
                                         <para>Parameter understood by
-                                        both the kernel and the system
+                                        the system
                                         and service manager to control
                                         console log verbosity. For
                                         details, see
diff --git a/src/core/main.c b/src/core/main.c
index 41605ee..2ca038c 100644
--- a/src/core/main.c
+++ b/src/core/main.c
@@ -374,6 +374,15 @@ static int parse_proc_cmdline_item(const char *key, const char *value) {
                 } else
                         log_warning("Environment variable name '%s' is not valid. Ignoring.", value);
 
+        } else if (streq(key, "systemd.debug") && !value) {
+
+                /* Log to kmsg, the journal socket will fill up before the
+                 * journal is started and tools running during that time
+                 * will block with every log message for for 60 seconds,
+                 * before they give up. */
+                log_set_max_level(LOG_DEBUG);
+                log_set_target(detect_container(NULL) > 0 ? LOG_TARGET_CONSOLE : LOG_TARGET_KMSG);
+
         } else if (!streq(key, "systemd.restore_state") &&
                    !streq(key, "systemd.gpt_auto") &&
                    (startswith(key, "systemd.") || startswith(key, "rd.systemd."))) {
@@ -408,7 +417,8 @@ static int parse_proc_cmdline_item(const char *key, const char *value) {
                                "systemd.default_standard_error=null|tty|syslog|syslog+console|kmsg|kmsg+console|journal|journal+console\n"
                                "                                         Set default log error output for services\n"
                                "systemd.setenv=ASSIGNMENT                Set an environment variable for all spawned processes\n"
-                               "systemd.restore_state=0|1                Restore backlight/rfkill state at boot\n");
+                               "systemd.restore_state=0|1                Restore backlight/rfkill state at boot\n"
+                               "systemd.debug                            Enable debugging output\n");
                 }
 
         } else if (streq(key, "quiet") && !value) {
@@ -416,15 +426,6 @@ static int parse_proc_cmdline_item(const char *key, const char *value) {
                 if (arg_show_status == _SHOW_STATUS_UNSET)
                         arg_show_status = SHOW_STATUS_AUTO;
 
-        } else if (streq(key, "debug") && !value) {
-
-                /* Log to kmsg, the journal socket will fill up before the
-                 * journal is started and tools running during that time
-                 * will block with every log message for for 60 seconds,
-                 * before they give up. */
-                log_set_max_level(LOG_DEBUG);
-                log_set_target(detect_container(NULL) > 0 ? LOG_TARGET_CONSOLE : LOG_TARGET_KMSG);
-
         } else if (!in_initrd() && !value) {
                 unsigned i;
 
-- 
1.8.1.4



Re: [systemd-devel] [PATCHv3] use systemd.debug on the kernel command line, not debug

2014-04-03 Thread Hannes Reinecke
On 04/03/2014 05:17 PM, Greg KH wrote:
> On Thu, Apr 03, 2014 at 08:52:06AM +0200, Hannes Reinecke wrote:
>> From: Greg KH <gre...@linuxfoundation.org>
>>
>> If the kernel is started with "debug", that's for the kernel to switch
>> into debug mode.  We should rely on a namespace for our options, like
>> everything else (with the exception of "quiet").  Some people want to
>> only debug the kernel, not systemd, and the opposite as well so make
>> everyone happy.
>>
>> Signed-off-by: Greg KH <gre...@linuxfoundation.org>
>
> NEVER add this line to a patch from someone who did not add it
> themselves, as it means something.
>
> Come on, you know better Hannes...
>
Sorry.
Won't happen again.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries  Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


[systemd-devel] [PATCH] systemd: powerd initctl support

2014-03-07 Thread Hannes Reinecke
Old versions of powerd will be using the initctl fifo to signal
state changes. To maintain backward compatibility systemd should
be interpreting these messages, too.

Signed-off-by: Hannes Reinecke <h...@suse.de>
---
 src/initctl/initctl.c | 71 ++-
 1 file changed, 70 insertions(+), 1 deletion(-)

diff --git a/src/initctl/initctl.c b/src/initctl/initctl.c
index 468df35..d4794a6 100644
--- a/src/initctl/initctl.c
+++ b/src/initctl/initctl.c
@@ -32,8 +32,11 @@
 #include <sys/un.h>
 #include <fcntl.h>
 #include <ctype.h>
+#include <sys/reboot.h>
+#include <linux/reboot.h>
 
 #include "sd-daemon.h"
+#include "sd-shutdown.h"
 #include "sd-bus.h"
 
 #include "util.h"
@@ -44,6 +47,7 @@
 #include "bus-util.h"
 #include "bus-error.h"
 #include "def.h"
+#include "socket-util.h"
 
 #define SERVER_FD_MAX 16
 #define TIMEOUT_MSEC ((int) (DEFAULT_EXIT_USEC/USEC_PER_MSEC))
@@ -141,7 +145,53 @@ static void change_runlevel(Server *s, int runlevel) {
         }
 }
 
+static int send_shutdownd(unsigned delay, char mode, const char *message) {
+        usec_t t = now(CLOCK_REALTIME) + delay * USEC_PER_MINUTE;
+        struct sd_shutdown_command c = {
+                .usec = t,
+                .mode = mode,
+                .dry_run = false,
+                .warn_wall = true,
+        };
+
+        union sockaddr_union sockaddr = {
+                .un.sun_family = AF_UNIX,
+                .un.sun_path = "/run/systemd/shutdownd",
+        };
+
+        struct iovec iovec[2] = {{
+                 .iov_base = (char*) &c,
+                 .iov_len = offsetof(struct sd_shutdown_command, wall_message),
+        }};
+
+        struct msghdr msghdr = {
+                .msg_name = &sockaddr,
+                .msg_namelen = offsetof(struct sockaddr_un, sun_path)
+                               + sizeof("/run/systemd/shutdownd") - 1,
+                .msg_iov = iovec,
+                .msg_iovlen = 1,
+        };
+
+        _cleanup_close_ int fd;
+
+        fd = socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0);
+        if (fd < 0)
+                return -errno;
+
+        if (!isempty(message)) {
+                iovec[1].iov_base = (char*) message;
+                iovec[1].iov_len = strlen(message);
+                msghdr.msg_iovlen++;
+        }
+
+        if (sendmsg(fd, &msghdr, MSG_NOSIGNAL) < 0)
+                return -errno;
+
+        return 0;
+}
+
 static void request_process(Server *s, const struct init_request *req) {
+        int r;
         assert(s);
         assert(req);
 
@@ -184,9 +234,28 @@ static void request_process(Server *s, const struct init_request *req) {
                 return;
 
         case INIT_CMD_POWERFAIL:
+                r = send_shutdownd(2, SD_SHUTDOWN_POWEROFF,
+                                   "THE POWER IS FAILED! SYSTEM GOING DOWN! PLEASE LOG OFF NOW!");
+                if (r < 0) {
+                        log_warning("Failed to talk to shutdownd, shutdown cancelled: %s", strerror(-r));
+                }
+                return;
         case INIT_CMD_POWERFAILNOW:
+                r = send_shutdownd(0, SD_SHUTDOWN_POWEROFF,
+                                   "THE POWER IS FAILED! LOW BATTERY - EMERGENCY SYSTEM SHUTDOWN!");
+                if (r < 0) {
+                        log_warning("Failed to talk to shutdownd, proceeding with immediate shutdown: %s", strerror(-r));
+                        reboot(RB_ENABLE_CAD);
+                        reboot(RB_POWER_OFF);
+                }
+                return;
+
         case INIT_CMD_POWEROK:
-                log_warning("Received UPS/power initctl request. This is not implemented in systemd. Upgrade your UPS daemon!");
+                r = send_shutdownd(0, SD_SHUTDOWN_NONE,
+                                   "THE POWER IS BACK");
+                if (r < 0) {
+                        log_warning("Failed to talk to shutdownd, proceeding with shutdown: %s", strerror(-r));
+                }
                 return;
 
         case INIT_CMD_CHANGECONS:
-- 
1.8.1.4



[systemd-devel] [PATCHv3] tty: Set correct tty name in 'active' sysfs attribute

2014-02-07 Thread Hannes Reinecke
The 'active' sysfs attribute should refer to the currently
active tty devices the console is running on, not the currently
active console.
The console structure doesn't refer to any device in sysfs,
only the tty the console is running on has.
So we need to print out the tty names in 'active', not
the console names.

Cc: Lennart Poettering <lenn...@poettering.net>
Cc: Kay Sievers <k...@vrfy.org>
Cc: Greg Kroah-Hartman <gre...@linuxfoundation.org>
Cc: Jiri Slaby <jsl...@suse.cz>
Cc: David Herrmann <dh.herrm...@gmail.com>
Signed-off-by: Werner Fink <wer...@suse.de>
Signed-off-by: Hannes Reinecke <h...@suse.de>
---
 Documentation/ABI/testing/sysfs-tty |  3 ++-
 drivers/tty/tty_io.c                | 25 ++++++++++++++++++-------
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-tty b/Documentation/ABI/testing/sysfs-tty
index ad22fb0..a2ccec3 100644
--- a/Documentation/ABI/testing/sysfs-tty
+++ b/Documentation/ABI/testing/sysfs-tty
@@ -3,7 +3,8 @@ Date:   Nov 2010
 Contact:	Kay Sievers <kay.siev...@vrfy.org>
 Description:
 		 Shows the list of currently configured
-		 console devices, like 'tty1 ttyS0'.
+		 tty devices used for the console,
+		 like 'tty1 ttyS0'.
 		 The last entry in the file is the active
 		 device connected to /dev/console.
 		 The file supports poll() to detect virtual
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index c74a00a..bd2715a 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -1267,16 +1267,17 @@ static void pty_line_name(struct tty_driver *driver, int index, char *p)
  *	@p: output buffer of at least 7 bytes
  *
  *	Generate a name from a driver reference and write it to the output
- *	buffer.
+ *	buffer. Return the number of bytes written.
  *
  *	Locking: None
  */
-static void tty_line_name(struct tty_driver *driver, int index, char *p)
+static ssize_t tty_line_name(struct tty_driver *driver, int index, char *p)
 {
 	if (driver->flags & TTY_DRIVER_UNNUMBERED_NODE)
-		strcpy(p, driver->name);
+		return sprintf(p, "%s", driver->name);
 	else
-		sprintf(p, "%s%d", driver->name, index + driver->name_base);
+		return sprintf(p, "%s%d", driver->name,
+			       index + driver->name_base);
 }
 
 /**
@@ -3545,9 +3546,19 @@ static ssize_t show_cons_active(struct device *dev,
 		if (i >= ARRAY_SIZE(cs))
 			break;
 	}
-	while (i--)
-		count += sprintf(buf + count, "%s%d%c",
-				 cs[i]->name, cs[i]->index, i ? ' ':'\n');
+	while (i--) {
+		struct tty_driver *driver;
+		const char *name = cs[i]->name;
+		int index = cs[i]->index;
+
+		driver = cs[i]->device(cs[i], &index);
+		if (driver) {
+			count += tty_line_name(driver, index, buf + count);
+			count += sprintf(buf + count, "%c", i ? ' ':'\n');
+		} else
+			count += sprintf(buf + count, "%s%d%c",
+					 name, index, i ? ' ':'\n');
+	}
 	console_unlock();
 
 	return count;
-- 
1.7.12.4



Re: [systemd-devel] [PATCH] tty: Set correct tty name in 'active' sysfs attribute

2014-02-06 Thread Hannes Reinecke
On 02/05/2014 01:53 PM, David Herrmann wrote:
> Hi
>
> On Wed, Feb 5, 2014 at 11:11 AM, Hannes Reinecke <h...@suse.de> wrote:
>> The 'active' sysfs attribute should refer to the currently
>> active tty devices the console is running on, not the currently
>> active console.
>> The console structure doesn't refer to any device in sysfs,
>> only the tty the console is running on has.
>> So we need to print out the tty names in 'active', not
>> the console names.
>>
>> Cc: Lennart Poettering <lenn...@poettering.net>
>> Cc: Kay Sievers <k...@vrfy.org>
>> Signed-off-by: Werner Fink <wer...@suse.de>
>> Signed-off-by: Hannes Reinecke <h...@suse.de>
>> ---
>>  drivers/tty/tty_io.c | 14 ++++++++++++--
>>  1 file changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
>> index c74a00a..17db8ca 100644
>> --- a/drivers/tty/tty_io.c
>> +++ b/drivers/tty/tty_io.c
>> @@ -3545,9 +3545,19 @@ static ssize_t show_cons_active(struct device *dev,
>>  		if (i >= ARRAY_SIZE(cs))
>>  			break;
>>  	}
>> -	while (i--)
>> +	while (i--) {
>> +		const struct tty_driver *driver;
>> +		const char *name = cs[i]->name;
>> +		int index = cs[i]->index;
>> +
>> +		driver = cs[i]->device(cs[i], &index);
>> +		if (driver) {
>> +			index += driver->name_base;
>> +			name = driver->name;
>> +		}
>>  		count += sprintf(buf + count, "%s%d%c",
>> -				 cs[i]->name, cs[i]->index, i ? ' ':'\n');
>> +				 name, index, i ? ' ':'\n');
>> +	}
>
> Nice catch and indeed, systemd already relies on these names to be
> identical to their char-dev name. Fortunately, VTs and most serial
> devices register the console with the same name as the TTY, so we're
> fine.
> Two minor nitpicks:
> 1) Could you use tty_line_name() instead of sprintf()? It's in the
> same file and avoids duplicating the name_base logic.
Ok. Not that it makes the patch nicer, but hey.

> 2) Does it make sense to print the console-name if ->device() returns
> NULL? Seems weird if we print console-names and tty-names in the same
> attribute. It's unlikely that it causes problems, but there might be
> conflicts.
>
This is basically a fallback; this is the old behaviour, which still
might be called for when coming across a tty which just has a stub
for the ->device callback.
It's not that the '->device' callback is used that frequently, so I
wouldn't be surprised here.

Meanwhile I've sent a new patch, reviews are welcome there.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries  Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


Re: [systemd-devel] [PATCHv2] tty: Set correct tty name in 'active' sysfs attribute

2014-02-06 Thread Hannes Reinecke
On 02/06/2014 04:29 PM, Greg Kroah-Hartman wrote:
> On Thu, Feb 06, 2014 at 03:27:43PM +0100, Hannes Reinecke wrote:
>> The 'active' sysfs attribute should refer to the currently
>> active tty devices the console is running on, not the currently
>> active console.
>
> That's not what Documentation/ABI/sysfs-tty says:
>	 Shows the list of currently configured
>	 console devices, like 'tty1 ttyS0'.
>	 The last entry in the file is the active
>	 device connected to /dev/console.
>	 The file supports poll() to detect virtual
>	 console switches.
>
The problem is indeed with 'console devices'. There is no such
thing; you only have tty devices which the console is running on.

>> The console structure doesn't refer to any device in sysfs,
>> only the tty the console is running on has.
>
> That sentence doesn't make sense.
>
>> So we need to print out the tty names in 'active', not
>> the console names.
>
> But that doesn't match the documentation.
>
> What exactly are you trying to fix here?  What is the problem that the
> current file has that is broken?  And as you are changing what this file
> means, what will break if the information in the file changes?
>
systemd is using the 'active' sysfs attribute to figure out on which
_tty_ device to start a getty on.
As soon as the console name and the tty name are different
you have no means of figuring out which _device_ to open.
AFAICS the console 'device' (ie the current entry in 'active')
doesn't have _any_ equivalent in sysfs; it just so happens that for
most console drivers the tty driver name is identical.
But this is not a requirement, and fails for drivers which have a
different device for the console and the tty.

EG on S/390 the 3270 tty has the devices

/dev/3270/tty1

but the console driver announces the name 'tty3270'.
So as per current rules the 'active' attribute contains

tty32700

which is correct as per the documentation, but doesn't have _any_
equivalent in sysfs.

Martin has the grubby details here.

But of course, the documentation should be updated to match the new
behavior.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries  Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


[systemd-devel] [PATCH] tty: Set correct tty name in 'active' sysfs attribute

2014-02-05 Thread Hannes Reinecke
The 'active' sysfs attribute should refer to the currently
active tty devices the console is running on, not the currently
active console.
The console structure doesn't refer to any device in sysfs,
only the tty the console is running on has.
So we need to print out the tty names in 'active', not
the console names.

Cc: Lennart Poettering <lenn...@poettering.net>
Cc: Kay Sievers <k...@vrfy.org>
Signed-off-by: Werner Fink <wer...@suse.de>
Signed-off-by: Hannes Reinecke <h...@suse.de>
---
 drivers/tty/tty_io.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index c74a00a..17db8ca 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -3545,9 +3545,19 @@ static ssize_t show_cons_active(struct device *dev,
 		if (i >= ARRAY_SIZE(cs))
 			break;
 	}
-	while (i--)
+	while (i--) {
+		const struct tty_driver *driver;
+		const char *name = cs[i]->name;
+		int index = cs[i]->index;
+
+		driver = cs[i]->device(cs[i], &index);
+		if (driver) {
+			index += driver->name_base;
+			name = driver->name;
+		}
 		count += sprintf(buf + count, "%s%d%c",
-				 cs[i]->name, cs[i]->index, i ? ' ':'\n');
+				 name, index, i ? ' ':'\n');
+	}
 	console_unlock();
 
 	return count;
-- 
1.7.12.4



Re: [systemd-devel] [PATCH] s390/getty-generator: initialize essential system terminals/consoles

2014-02-03 Thread Hannes Reinecke
On 01/31/2014 05:08 PM, Hendrik Brueckner wrote:
> Ensure to start getty programs on all essential system consoles on Linux on
> System z.  Add these essential devices to the list of virtualization_consoles
> to always generate getty configurations.
>
> For the sake of completion, the list of essential consoles is:
>
>   /dev/sclp_line0 - Operating system messages applet (LPAR)
>   /dev/ttysclp0 - Integrated ASCII console applet (z/VM and LPAR)
>   /dev/ttyS0 - Already handled by systemd (3215 console on z/VM)
>   /dev/hvc0  - Already handled by systemd (IUCV HVC terminal on z/VM)
>
> Depending on the environment, z/VM or LPAR, only a subset of these terminals
> are available.
>
> See also RH BZ 860158[1] Cannot login via Operating System Console into RHEL7
> instance installed on a LPAR.  This bugzilla actually blocks the installation
> of Linux on System z instances in LPAR mode.
>
Hehe.

Nice try, but sadly incomplete.

When switching to a real 3270 console (try 'conmode=3270' on the
kernel command line) systemd isn't able to open a console, either.

Which opens up a can of worms:

- The S/390 3270 tty device is using a device node
  /dev/3270/ttyX _and_ an offset '1' to the minor node.
  So the first tty here is in fact /dev/3270/tty1
- systemd is using the 'active' sysfs attribute in
  /sys/class/tty/console to figure out the active
  console; for the 3270 console this contains the string
  'tty32700'.
  Which of course doesn't exist and confuses systemd
  getty-generator.

The reason for the slightly weird string in 'active' is the way it's
generated (check drivers/tty/tty_io.c:show_cons_active()):
	count += sprintf(buf + count, "%s%d%c",
			 cs[i]->name, cs[i]->index, i ? ' ':'\n');

where 'cs' is the _console_ structure, not the tty structure.
So we're getting the _console_ name plus the _console_ index here.
And the console name for the 3270 console is 'tty3270', with the
index '0'.

Which raises the question: what exactly should 'active' contain?
The console name (which doesn't have any equivalent in sysfs), or
the tty name (which has)?

And, more importantly, how is one supposed to _find_ the
corresponding sysfs entry for the current 'active' attribute?

From what I've seen most drivers work by virtue of the happy accident
that the console index equals the tty index.
But that's not a requirement anywhere in the console code.
Quite the contrary; tty drivers have the 'first_minor' entry
to explicitly request an offset other than '0'.

(And the console driver has an explicit '->device' callback
which allows the tty driver to return the correct index.
Not that it's used here; would've been too easy).
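
To illustrate the mismatch, here is a small stand-alone sketch (plain
userspace C, not kernel code). The two structures and the field name
'name_offset' are simplified stand-ins invented for this example; the
driver name "3270/tty" and the offset 1 are assumptions taken from the
/dev/3270/tty1 example above.

```c
#include <stdio.h>

/* Simplified stand-ins for the kernel structures; only the fields
 * discussed above are modelled. */
struct tty_driver_sketch {
        const char *name;       /* device name prefix, e.g. "3270/tty" */
        int name_offset;        /* offset added to the line index */
};

struct console_sketch {
        const char *name;       /* console name, e.g. "tty3270" */
        int index;              /* console index */
};

/* What 'active' is built from: console name plus console index. */
static int console_name(const struct console_sketch *c, char *buf, size_t len) {
        return snprintf(buf, len, "%s%d", c->name, c->index);
}

/* What the device node is called: driver name plus index plus offset. */
static int tty_name(const struct tty_driver_sketch *d, int index, char *buf, size_t len) {
        return snprintf(buf, len, "%s%d", d->name, index + d->name_offset);
}
```

For a console { "tty3270", 0 } the first helper produces "tty32700",
while for a driver { "3270/tty", 1 } and index 0 the second produces
"3270/tty1". Only the latter corresponds to a device node.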

So how to fix this?
Update the driver to adhere to the (broken) current behaviour?
Or modify 'active' to return the correct tty name?
Or add a workaround to systemd?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries  Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


Re: [systemd-devel] udev rules environment variable

2013-12-17 Thread Hannes Reinecke
On 12/17/2013 11:52 AM, Kay Sievers wrote:
 On Tue, Dec 17, 2013 at 8:56 AM, Hannes Reinecke h...@suse.de wrote:
 On 12/17/2013 08:52 AM, Robert Milasan wrote:
 Hello,
   got a small question about creating a rule, like this:

 ACTION=="add", , ENV{test_device}="1"

 ACTION=="remove", , ENV{test_device}=="1",
 RUN+="/path/to/some/script"

 Does udev save test_device variable someplace and then it can be used
 later on, when have ACTION==remove ?
 
 Typically not.
 
 There is a standard REMOVE_CMD in 95-udev-late.rules to use for this
 
 All variables are stored in the database and are contained in the
 remove event, without the need to be added or queried. Sysfs though
 is not available, all of it is gone at remove.
 
Ah.

Curious.

So 'remove' contains the environment variables set from the 'add' event?

What about the 'change' event?

I would have thought the same rule applies to this one, but
experience shows that 'change' events have to generate the same
environment variables as the 'add' event.
Otherwise unexpected results happen ...

So why are the environment variables / db entries not merged
into the 'change' environment?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries  Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


Re: [systemd-devel] udev rules environment variable

2013-12-16 Thread Hannes Reinecke
On 12/17/2013 08:52 AM, Robert Milasan wrote:
> Hello,
>   got a small question about creating a rule, like this:
>
> ACTION=="add", , ENV{test_device}="1"
>
> ACTION=="remove", , ENV{test_device}=="1",
> RUN+="/path/to/some/script"
>
> Does udev save test_device variable someplace and then it can be used
> later on, when have ACTION==remove ?
>

Typically not.
I would be using external tools for that, like 'collect'

Might be biased, though, what with me having written it ...

And the alternative of querying the udev database from within an
udev event is just plain ugly and prone to races.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries  Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


Re: [systemd-devel] Question about a udev rule

2013-12-08 Thread Hannes Reinecke
On 12/08/2013 07:14 PM, Robert Milasan wrote:
> I've got this small rule which seems to not work at all:
>
> ACTION!="add|change", GOTO="end_root_symlink"
> SUBSYSTEM!="block", GOTO="end_root_symlink"
> ENV{DEVTYPE}!="partition", GOTO="end_root_symlink"
>
> IMPORT{program}="/usr/bin/udevadm info --export --export-prefix=ROOT_
> --device-id-of-file=/" ENV{MAJOR}!="0", ENV{MAJOR}=="$env{ROOT_MAJOR}",
> ENV{MINOR}=="$env{ROOT_MINOR}", SYMLINK+="root"
>
> LABEL="end_root_symlink"
>
> Can anyone tell me what I'm doing wrong?
>
'udevadm info' pulls information from the udev database, which is
filled with information _after_ the rule has been processed.
So for the current event the udev database will be empty.

Also, calling 'udevadm info' from a udev rule is inherently wrong,
for two reasons:

1) udev rules deal with single events only. So all information for
the current event is directly available (set in environment
variables etc), and there is no need to call 'udevadm info'
2) For cross-event mechanisms (ie if you need to synchronize
between several events) you should be using appropriate
mechanisms like collect.

And the 'root' symlink is generated directly by dracut; maybe you
should look there ...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries  Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


[systemd-devel] Hard-coded /bin/mount in systemd

2013-11-27 Thread Hannes Reinecke
Hi all,

for some reason systemd has /bin/mount hardcoded in
src/core/mount.c:mount_enter_mounting()

Which is a bit odd, seeing that everything moved to /usr/bin.
So we always have to do a symlink here, which really is a bit annoying.

Is this by design or a simple left-over?
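
For comparison, one alternative to a hard-coded location would be a
lookup over a list of directories. The following is only a sketch of
that idea, not something systemd does; the helper name find_in_path()
and the directory list are assumptions made up for illustration.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: search the colon-separated directory list 'path'
 * for an entry called 'name' that the caller may execute. On success the
 * full path is written to 'out' and 0 is returned; otherwise -1.
 * Note: this only checks access(X_OK), not that it is a regular file. */
static int find_in_path(const char *name, const char *path, char *out, size_t len) {
        const char *p = path;

        while (*p) {
                size_t n = strcspn(p, ":");     /* length of this directory entry */

                if (n > 0 &&
                    (size_t) snprintf(out, len, "%.*s/%s", (int) n, p, name) < len &&
                    access(out, X_OK) == 0)
                        return 0;
                p += n;
                if (*p == ':')
                        p++;
        }
        return -1;
}
```

Called as find_in_path("mount", "/usr/bin:/bin", buf, sizeof(buf)),
this would pick up the binary from whichever of the two directories
provides it.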

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries  Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


Re: [systemd-devel] Hard-coded /bin/mount in systemd

2013-11-27 Thread Hannes Reinecke
On 11/27/2013 02:58 PM, Mantas Mikulėnas wrote:
> On Wed, Nov 27, 2013 at 3:31 PM, Hannes Reinecke <h...@suse.de> wrote:
>
>> Hi all,
>>
>> for some reason systemd has /bin/mount hardcoded in
>> src/core/mount.c:mount_enter_mounting()
>>
>> Which is a bit odd, seeing that everything moved to /usr/bin.
>> So we always have to do a symlink here, which really is a bit annoying.
>>
>> Is this by design or a simple left-over?
>
> If *everything* moved to /usr/bin, then /bin itself has to be a
> symlink anyway (as many tools expect and some standards require
> specific commands to be in /bin).
>
Ah. IIRC it was _systemd_ which initiated the move to /usr, so I
found it slightly odd to rely on a location which it has just
obsoleted ...
Or, rather, to have a hard-coded location to start with.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries  Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)