Re: [systemd-devel] Errorneous detection of degraded array

2017-02-07 Thread NeilBrown
On Tue, Jan 31 2017, Andrei Borzenkov wrote:

>
>>>> Changing the
>>>>   Conflicts=sys-devices-virtual-block-%i.device
>>>> lines to
>>>>   ConditionPathExists=/sys/devices/virtual/block/%i
>>>> might make the problem go away, without any negative consequences.
>>>>
>>>
>>> Ugly, but yes, maybe this is the only way using current systemd.
>>>
>
> This won't work. The sysfs node appears as soon as the very first array
> member is found, while the array is still inactive; what we need is the
> condition "array is active".

Of course, you are right.
A suitable "array is active" test is the existence of
  .../md/sync_action
which appears when an array is activated (except for RAID0 and Linear,
which don't need last-resort support).
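
For reference, that condition can be checked by hand from a shell; a minimal
sketch, assuming md0 as the array name:

  test -e /sys/devices/virtual/block/md0/md/sync_action \
      && echo "md0 is active; last-resort handling not needed" \
      || echo "md0 is not (yet) active"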

So this is what I propose to post upstream.  Could you please confirm
that it works for you?  It appears to work for me.

Thanks,
NeilBrown

From: NeilBrown 
Date: Wed, 8 Feb 2017 15:01:05 +1100
Subject: [PATCH] systemd/mdadm-last-resort: use ConditionPathExists instead of
 Conflicts

Commit cec72c071bbe ("systemd/mdadm-last-resort: add Conflicts to .service file.")
added a 'Conflicts' directive to the mdadm-last-resort@.service file in
the hope that this would make sure the service didn't run after the device
was active, even if the timer managed to get started, which is possible in
race conditions.

This seemed to work in testing, but it isn't clear why, and it is known
to cause problems.
If systemd happens to know that the mentioned device is a dependency of a
mount point, the Conflicts can unmount that mount point, which is certainly
not wanted.

So remove the "Conflicts" and instead use
 ConditionPathExists=!/sys/devices/virtual/block/%i/md/sync_action

The "sync_action" file exists for any array which require last-resort
handling, and only appear when the array is activated.  So it is safe
to reliy on it to determine if the last-resort is really needed.

Fixes: cec72c071bbe ("systemd/mdadm-last-resort: add Conflicts to .service file.")
Signed-off-by: NeilBrown 
---
 systemd/mdadm-last-resort@.service | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/systemd/mdadm-last-resort@.service b/systemd/mdadm-last-resort@.service
index e93d72b2b45e..f9d4d12738a3 100644
--- a/systemd/mdadm-last-resort@.service
+++ b/systemd/mdadm-last-resort@.service
@@ -1,7 +1,7 @@
 [Unit]
 Description=Activate md array even though degraded
 DefaultDependencies=no
-Conflicts=sys-devices-virtual-block-%i.device
+ConditionPathExists=!/sys/devices/virtual/block/%i/md/sync_action
 
 [Service]
 Type=oneshot
-- 
2.11.0





Re: [systemd-devel] Errorneous detection of degraded array

2017-01-31 Thread Andrei Borzenkov
31.01.2017 01:19, NeilBrown wrote:
> On Mon, Jan 30 2017, Andrei Borzenkov wrote:
> 
>> On Mon, Jan 30, 2017 at 9:36 AM, NeilBrown  wrote:
>> ...
>>>
>>> systemd[1]: Created slice system-mdadm\x2dlast\x2dresort.slice.
>>> systemd[1]: Starting system-mdadm\x2dlast\x2dresort.slice.
>>> systemd[1]: Starting Activate md array even though degraded...
>>> systemd[1]: Stopped target Local File Systems.
>>> systemd[1]: Stopping Local File Systems.
>>> systemd[1]: Unmounting /share...
>>> systemd[1]: Stopped (with error) /dev/md0.
>
>> ...
>>>
>>> The race is, I think, that one I mentioned.  If the md device is started
>>> before udev tells systemd to start the timer, the Conflicts dependency
>>> goes the "wrong" way and stops the wrong thing.
>>>
>>
>> From the logs provided it is unclear whether it is the *timer* or the
>> *service*. If it is the timer, I do not understand why it is started
>> exactly 30 seconds after the device apparently appears; that would match
>> the service starting.
> 
> My guess is that the timer is triggered immediately after the device is
> started, but before it is mounted.
> The Conflicts directive tries to stop the device, but it cannot stop the
> device and there are no dependencies yet, so nothing happens.
> After the timer fires (30 seconds later) the .service starts.  It also
> has a Conflicts directive, so systemd tries to stop the device again.
> Now that it has been mounted, there is a dependency that can be
> stopped, and the device gets unmounted.
> 
>>
>> Yet another case where system logging is hopelessly unfriendly for
>> troubleshooting :(
>>
>>> It would be nice to be able to reliably stop the timer when the device
>>> starts, without risking having the device get stopped when the timer
>>> starts, but I don't think we can reliably do that.
>>>
>>
>> Well, let's wait until we can get some more information about what happens.
>>

Not much more, but we have at least confirmed that it was indeed the
last-resort service which was fired off by the last-resort timer.
Unfortunately there is no trace of the timer itself.

>>> Changing the
>>>   Conflicts=sys-devices-virtual-block-%i.device
>>> lines to
>>>   ConditionPathExists=/sys/devices/virtual/block/%i
>>> might make the problem go away, without any negative consequences.
>>>
>>
>> Ugly, but yes, maybe this is the only way using current systemd.
>>

This won't work. The sysfs node appears as soon as the very first array
member is found, while the array is still inactive; what we need is the
condition "array is active".

The Conflicts line works because the array is not announced to systemd
(SYSTEMD_READY) until it is active, which in turn is derived from the
content of md/array_state.
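
For illustration, the pieces involved can be inspected by hand (a sketch only;
md0 is an assumed array name, and the grep pattern is an assumption about the
properties the mdadm udev rules export):

  cat /sys/devices/virtual/block/md0/md/array_state           # e.g. "inactive" vs "clean"/"active"
  udevadm info --name=/dev/md0 | grep -E 'SYSTEMD_READY|MD_'  # what udev announced to systemd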

>>> The primary purpose of having the 'Conflicts' directives was so that
>>> systemd wouldn't log
>>>   Starting Activate md array even though degraded
>>> after the array was successfully started.
>>

Yes, I understand it.

>> This looks like a cosmetic problem. What will happen if the last-resort
>> service is started when the array is fully assembled? Will it do any harm?
> 
> Yes, it could be seen as cosmetic, but cosmetic issues can be important
> too.  Confusing messages in logs can be harmful.
> 
> In all likely cases, running the last-resort service won't cause any
> harm.
> If, during the 30 seconds, the array is started, then deliberately
> stopped, then partially assembled again, then when the last-resort
> service finally starts it might do the wrong thing.
> So it would be cleanest if the timer was killed as soon as the device
> is started.  But I don't think there is a practical concern.
> 
> I guess I could make a udev rule that fires when the array is started, and
> that runs "systemctl stop mdadm-last-resort@md0.timer"
> 


Well ... what we really need is a unidirectional dependency. Actually, the
way Conflicts is used here *is* unidirectional anyway - nobody seriously
expects that starting foo.service will stop a currently running
shutdown.target. But that is the semantics we have currently.

But this will probably do to mitigate the issue until something more
generic can be implemented.
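
For illustration, the action behind such a udev rule could be as small as the
following shell fragment (a sketch only; the array name and the use of
--no-block are assumptions, not a tested rule):

  ARRAY=md0
  if [ -e "/sys/devices/virtual/block/${ARRAY}/md/sync_action" ]; then
      systemctl --no-block stop "mdadm-last-resort@${ARRAY}.timer"
  fi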





Re: [systemd-devel] Errorneous detection of degraded array

2017-01-30 Thread Luke Pyzowski
> Does
>   systemctl  list-dependencies  sys-devices-virtual-block-md0.device
> report anything interesting?  I get
>
> sys-devices-virtual-block-md0.device
> ● └─mdmonitor.service

Nothing interesting, the same output as you have above.



> Could you try running with systemd.log_level=debug on the kernel command line
> and upload the journal again. We can only hope that it will not skew timings
> enough to hide the race, but it may prove my hypothesis.

I've uploaded the full debug logs to: 
https://gist.github.com/Kryai/8273322c8a61347e2300e476c70b4d05
In around 20 reboots the error appeared only twice; with debug enabled it is
certainly rarer, but it does still occur. As you correctly guessed, debug
logging does affect how the race condition manifests.

Reminder of key things in the log:
# cat /etc/systemd/system/mdadm-last-resort@.timer 
[Unit]
Description=Timer to wait for more drives before activating degraded array.
DefaultDependencies=no
Conflicts=sys-devices-virtual-block-%i.device

[Timer]
OnActiveSec=30



# cat /etc/systemd/system/share.mount 
[Unit]
Description=Mount /share RAID partition explicitly
Before=nfs-server.service

[Mount]
What=/dev/disk/by-uuid/2b9114be-3d5a-41d7-8d4b-e5047d223129
Where=/share
Type=ext4
Options=defaults
TimeoutSec=120

[Install]
WantedBy=multi-user.target


Again, if any more information is needed please let me know and I'll provide it.


Many thanks,
Luke Pyzowski


Re: [systemd-devel] Errorneous detection of degraded array

2017-01-29 Thread Andrei Borzenkov
On Mon, Jan 30, 2017 at 9:36 AM, NeilBrown  wrote:
...
>
> systemd[1]: Created slice system-mdadm\x2dlast\x2dresort.slice.
> systemd[1]: Starting system-mdadm\x2dlast\x2dresort.slice.
> systemd[1]: Starting Activate md array even though degraded...
> systemd[1]: Stopped target Local File Systems.
> systemd[1]: Stopping Local File Systems.
> systemd[1]: Unmounting /share...
> systemd[1]: Stopped (with error) /dev/md0.
>>>
...
>
> The race is, I think, that one I mentioned.  If the md device is started
> before udev tells systemd to start the timer, the Conflicts dependency
> goes the "wrong" way and stops the wrong thing.
>

From the logs provided it is unclear whether it is the *timer* or the
*service*. If it is the timer, I do not understand why it is started
exactly 30 seconds after the device apparently appears; that would match
the service starting.

Yet another case where system logging is hopelessly unfriendly for
troubleshooting :(

> It would be nice to be able to reliably stop the timer when the device
> starts, without risking having the device get stopped when the timer
> starts, but I don't think we can reliably do that.
>

Well, let's wait until we can get some more information about what happens.

> Changing the
>   Conflicts=sys-devices-virtual-block-%i.device
> lines to
>   ConditionPathExists=/sys/devices/virtual/block/%i
> might make the problem go away, without any negative consequences.
>

Ugly, but yes, maybe this is the only way using current systemd.

> The primary purpose of having the 'Conflicts' directives was so that
> systemd wouldn't log
>   Starting Activate md array even though degraded
> after the array was successfully started.

This looks like a cosmetic problem. What will happen if the last-resort
service is started when the array is fully assembled? Will it do any harm?

> Hopefully it won't do that when the Condition fails.
>


Re: [systemd-devel] Errorneous detection of degraded array

2017-01-29 Thread Reindl Harald



On 30.01.2017 at 07:36, NeilBrown wrote:

By virtue of "Following" attribute. dev-md0.device is Following
sys-devices-virtual-block-md0.device so stopping the latter will also
stop the former.


Ahh.. I see why I never saw this now.
Two reasons.
 1/ My /etc/fstab has UUID=d1711227-c9fa-4883-a904-7cd7a3eb865c rather
than /dev/md0
systemd doesn't manage to intuit a 'Following' dependency between
the UUID and the mount point.
 2/ I use partitions of md arrays: that UUID is actually /dev/md0p3.
systemd doesn't intuit that md0p3.device is Following md0.device.

So you only hit a problem if you have "/dev/md0" or similar in
/etc/fstab.


the fstab in the virtual machine which had that issue around 3 times in
the last year is using UUID - so it's not hit often at all, but I wouldn't
have expected it to hit a VM in particular (the reason for that raid is just
testing RAID10 with the write-mostly option)


UUID=f0b27a5c-7e3d-45ad-8b7f-617820860379  /mnt/raid10 ext4 
defaults,commit=30,inode_readahead_blks=16,relatime,lazytime,barrier=0  0 1



Re: [systemd-devel] Errorneous detection of degraded array

2017-01-29 Thread NeilBrown
On Mon, Jan 30 2017, Andrei Borzenkov wrote:

> 30.01.2017 04:53, NeilBrown wrote:
>> On Fri, Jan 27 2017, Andrei Borzenkov wrote:
>> 
>>> 26.01.2017 21:02, Luke Pyzowski wrote:
>>>> Hello,
>>>> I have a large RAID6 device with 24 local drives on CentOS7.3. Randomly
>>>> (around 50% of the time) systemd will unmount my RAID device thinking it
>>>> is degraded after the mdadm-last-resort@.timer expires, however the device
>>>> is working normally by all accounts, and I can immediately mount it
>>>> manually upon boot completion. In the logs below /share is the RAID
>>>> device. I can increase the timer in
>>>> /usr/lib/systemd/system/mdadm-last-resort@.timer from 30 to 60 seconds,
>>>> but this problem can randomly still occur.
>>>>
>>>> systemd[1]: Created slice system-mdadm\x2dlast\x2dresort.slice.
>>>> systemd[1]: Starting system-mdadm\x2dlast\x2dresort.slice.
>>>> systemd[1]: Starting Activate md array even though degraded...
>>>> systemd[1]: Stopped target Local File Systems.
>>>> systemd[1]: Stopping Local File Systems.
>>>> systemd[1]: Unmounting /share...
>>>> systemd[1]: Stopped (with error) /dev/md0.
>> 
>> This line perplexes me.
>> 
>> The last-resort.service (and .timer) files have a Conflict= directive
>> against sys-devices-virtual-block-md$DEV.device 
>> Normally a Conflicts= directive means that if this service starts, that
>> one is stopped, and if that one starts, this is stopped.
>> However .device units cannot be stopped:
>> 
>> $ systemctl show sys-devices-virtual-block-md0.device | grep Can
>> CanStart=no
>> CanStop=no
>> CanReload=no
>> CanIsolate=no
>> 
>> so presumably the attempt to stop the device fails, so the Conflict=
>> dependency cannot be met, so the last-resort service (or timer) doesn't
>> get started.
>
> As I explained in the other mail, to me it looks like the last-resort timer
> does get started, and then the last-resort service is started, which attempts
> to stop the device; and because the mount point depends on the device, it also
> stops the mount point. So somehow we have bad timing where both the device and
> the timer start without canceling each other.
>
> The fact that stopping the device itself fails is irrelevant here -
> dependencies are evaluated at the time the job is submitted, so if
> share.mount Requires dev-md0.device and you attempt to Stop
> dev-md0.device, systemd still queues a job to Stop share.mount.
>
>> At least, that is what I see happening in my tests.
>> 
>
> Yes, we have a race condition here; I cannot reproduce this either. It
> does not mean it does not exist :) Let's hope debug logging will show
> something more useful (it is entirely possible that with debug logging
> turned on this race does not happen).
>
>> But your log doesn't mention sys-devices-virtual-block-md0, it
>> mentions /dev/md0.
>> How does systemd know about /dev/md0, or the connection it has with
>> sys-devices-virtual-block-md0 ??
>> 
>
> By virtue of "Following" attribute. dev-md0.device is Following
> sys-devices-virtual-block-md0.device so stopping the latter will also
> stop the former.

Ahh.. I see why I never saw this now.
Two reasons.
 1/ My /etc/fstab has UUID=d1711227-c9fa-4883-a904-7cd7a3eb865c rather
than /dev/md0
systemd doesn't manage to intuit a 'Following' dependency between
the UUID and the mount point.
 2/ I use partitions of md arrays: that UUID is actually /dev/md0p3.
systemd doesn't intuit that md0p3.device is Following md0.device.

So you only hit a problem if you have "/dev/md0" or similar in
/etc/fstab.

The race is, I think, that one I mentioned.  If the md device is started
before udev tells systemd to start the timer, the Conflicts dependency
goes the "wrong" way and stops the wrong thing.

It would be nice to be able to reliably stop the timer when the device
starts, without risking having the device get stopped when the timer
starts, but I don't think we can reliably do that.

Changing the
  Conflicts=sys-devices-virtual-block-%i.device
lines to
  ConditionPathExists=/sys/devices/virtual/block/%i
might make the problem go away, without any negative consequences.

The primary purpose of having the 'Conflicts' directives was so that
systemd wouldn't log
  Starting Activate md array even though degraded
after the array was successfully started.
Hopefully it won't do that when the Condition fails.
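
For anyone who wants to try such a change locally before it lands in mdadm, one
way (a sketch, assuming the usual unit locations) is to copy the shipped units
into /etc/systemd/system, which takes precedence, and edit the copies:

  cp /usr/lib/systemd/system/mdadm-last-resort@.timer \
     /usr/lib/systemd/system/mdadm-last-resort@.service /etc/systemd/system/
  # in both copies, replace the Conflicts= line with:
  #   ConditionPathExists=/sys/devices/virtual/block/%i
  systemctl daemon-reload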

Thanks,
NeilBrown


 
  

>
>> Does
>>   systemctl  list-dependencies  sys-devices-virtual-block-md0.device
>> 
>> report anything interesting?  I get
>> 
>> sys-devices-virtual-block-md0.device
>> ● └─mdmonitor.service
>> 




Re: [systemd-devel] Errorneous detection of degraded array

2017-01-29 Thread Andrei Borzenkov
30.01.2017 04:53, NeilBrown wrote:
> On Fri, Jan 27 2017, Andrei Borzenkov wrote:
> 
>> 26.01.2017 21:02, Luke Pyzowski wrote:
>>> Hello,
>>> I have a large RAID6 device with 24 local drives on CentOS7.3. Randomly 
>>> (around 50% of the time) systemd will unmount my RAID device thinking it is 
>>> degraded after the mdadm-last-resort@.timer expires, however the device is 
>>> working normally by all accounts, and I can immediately mount it manually 
>>> upon boot completion. In the logs below /share is the RAID device. I can 
>>> increase the timer in /usr/lib/systemd/system/mdadm-last-resort@.timer from 
>>> 30 to 60 seconds, but this problem can randomly still occur.
>>>
>>> systemd[1]: Created slice system-mdadm\x2dlast\x2dresort.slice.
>>> systemd[1]: Starting system-mdadm\x2dlast\x2dresort.slice.
>>> systemd[1]: Starting Activate md array even though degraded...
>>> systemd[1]: Stopped target Local File Systems.
>>> systemd[1]: Stopping Local File Systems.
>>> systemd[1]: Unmounting /share...
>>> systemd[1]: Stopped (with error) /dev/md0.
> 
> This line perplexes me.
> 
> The last-resort.service (and .timer) files have a Conflict= directive
> against sys-devices-virtual-block-md$DEV.device 
> Normally a Conflicts= directive means that if this service starts, that
> one is stopped, and if that one starts, this is stopped.
> However .device units cannot be stopped:
> 
> $ systemctl show sys-devices-virtual-block-md0.device | grep Can
> CanStart=no
> CanStop=no
> CanReload=no
> CanIsolate=no
> 
> so presumably the attempt to stop the device fails, so the Conflict=
> dependency cannot be met, so the last-resort service (or timer) doesn't
> get started.

As I explained in the other mail, to me it looks like the last-resort timer
does get started, and then the last-resort service is started, which attempts
to stop the device; and because the mount point depends on the device, it also
stops the mount point. So somehow we have bad timing where both the device and
the timer start without canceling each other.

The fact that stopping the device itself fails is irrelevant here -
dependencies are evaluated at the time the job is submitted, so if
share.mount Requires dev-md0.device and you attempt to Stop
dev-md0.device, systemd still queues a job to Stop share.mount.
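
An illustrative way to see that relationship on an affected system (the unit
names follow the hypothetical share.mount/dev-md0.device example above):

  systemctl show share.mount -p Requires -p After        # the device units share.mount pulls in
  systemctl list-dependencies --reverse dev-md0.device   # what a stop job on the device would drag along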

> At least, that is what I see happening in my tests.
> 

Yes, we have a race condition here; I cannot reproduce this either. It
does not mean it does not exist :) Let's hope debug logging will show
something more useful (it is entirely possible that with debug logging
turned on this race does not happen).

> But your log doesn't mention sys-devices-virtual-block-md0, it
> mentions /dev/md0.
> How does systemd know about /dev/md0, or the connection it has with
> sys-devices-virtual-block-md0 ??
> 

By virtue of "Following" attribute. dev-md0.device is Following
sys-devices-virtual-block-md0.device so stopping the latter will also
stop the former.

> Does
>   systemctl  list-dependencies  sys-devices-virtual-block-md0.device
> 
> report anything interesting?  I get
> 
> sys-devices-virtual-block-md0.device
> ● └─mdmonitor.service
> 






Re: [systemd-devel] Errorneous detection of degraded array

2017-01-29 Thread NeilBrown
On Fri, Jan 27 2017, Andrei Borzenkov wrote:

> 26.01.2017 21:02, Luke Pyzowski wrote:
>> Hello,
>> I have a large RAID6 device with 24 local drives on CentOS7.3. Randomly 
>> (around 50% of the time) systemd will unmount my RAID device thinking it is 
>> degraded after the mdadm-last-resort@.timer expires, however the device is 
>> working normally by all accounts, and I can immediately mount it manually 
>> upon boot completion. In the logs below /share is the RAID device. I can 
>> increase the timer in /usr/lib/systemd/system/mdadm-last-resort@.timer from 
>> 30 to 60 seconds, but this problem can randomly still occur.
>> 
>> systemd[1]: Created slice system-mdadm\x2dlast\x2dresort.slice.
>> systemd[1]: Starting system-mdadm\x2dlast\x2dresort.slice.
>> systemd[1]: Starting Activate md array even though degraded...
>> systemd[1]: Stopped target Local File Systems.
>> systemd[1]: Stopping Local File Systems.
>> systemd[1]: Unmounting /share...
>> systemd[1]: Stopped (with error) /dev/md0.

This line perplexes me.

The last-resort.service (and .timer) files have a Conflict= directive
against sys-devices-virtual-block-md$DEV.device 
Normally a Conflicts= directive means that if this service starts, that
one is stopped, and if that one starts, this is stopped.
However .device units cannot be stopped:

$ systemctl show sys-devices-virtual-block-md0.device | grep Can
CanStart=no
CanStop=no
CanReload=no
CanIsolate=no

so presumably the attempt to stop the device fails, so the Conflict=
dependency cannot be met, so the last-resort service (or timer) doesn't
get started.
At least, that is what I see happening in my tests.
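
One way to check which of the two actually got started on an affected boot (a
suggestion only; unit names assume md0):

  systemctl list-timers --all | grep mdadm-last-resort       # is the last-resort timer pending?
  systemctl status 'mdadm-last-resort@md0.timer' --no-pager  # current state plus recent journal lines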

But your log doesn't mention sys-devices-virtual-block-md0, it
mentions /dev/md0.
How does systemd know about /dev/md0, or the connection it has with
sys-devices-virtual-block-md0 ??

Does
  systemctl  list-dependencies  sys-devices-virtual-block-md0.device

report anything interesting?  I get

sys-devices-virtual-block-md0.device
● └─mdmonitor.service

NeilBrown




Re: [systemd-devel] Errorneous detection of degraded array

2017-01-28 Thread Andrei Borzenkov
27.01.2017 22:44, Luke Pyzowski wrote:
...
> Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: raid level 6 active with 24 out 
> of 24 devices, algorithm 2
...
> Jan 27 11:33:14 lnxnfs01 kernel: md0: detected capacity change from 0 to 
> 45062020923392
> Jan 27 11:33:14 lnxnfs01 systemd[1]: Found device 
> /dev/disk/by-uuid/2b9114be-3d5a-41d7-8d4b-e5047d223129.
> Jan 27 11:33:14 lnxnfs01 systemd[1]: Started udev Wait for Complete Device 
> Initialization.
> Jan 27 11:33:14 lnxnfs01 systemd[1]: Started Timer to wait for more drives 
> before activating degraded array..
> Jan 27 11:33:14 lnxnfs01 systemd[1]: Starting Timer to wait for more drives 
> before activating degraded array..
...
> 
> ... + 31 seconds from disk initialization, expiration of 30 second timer from 
> mdadm-last-resort@.timer
> 
> Jan 27 11:33:45 lnxnfs01 systemd[1]: Created slice 
> system-mdadm\x2dlast\x2dresort.slice.
> Jan 27 11:33:45 lnxnfs01 systemd[1]: Starting 
> system-mdadm\x2dlast\x2dresort.slice.
> Jan 27 11:33:45 lnxnfs01 systemd[1]: Stopped target Local File Systems.
> Jan 27 11:33:45 lnxnfs01 systemd[1]: Stopping Local File Systems.
> Jan 27 11:33:45 lnxnfs01 systemd[1]: Unmounting Mount /share RAID partition 
> explicitly...
> Jan 27 11:33:45 lnxnfs01 systemd[1]: Starting Activate md array even though 
> degraded...
> Jan 27 11:33:45 lnxnfs01 systemd[1]: Stopped (with error) /dev/md0.
> Jan 27 11:33:45 lnxnfs01 systemd[1]: Started Activate md array even though 
> degraded.
> Jan 27 11:33:45 lnxnfs01 systemd[1]: Unmounted Mount /share RAID partition 
> explicitly.
> 

Here is my educated guess.

Both mdadm-last-resort@.timer and mdadm-last-resort@.service conflict
with MD device:

bor@bor-Latitude-E5450:~/src/systemd$ cat ../mdadm/systemd/
mdadm-grow-continue@.service  mdadm.shutdown
SUSE-mdadm_env.sh
mdadm-last-resort@.service    mdmonitor.service
mdadm-last-resort@.timer  mdmon@.service
bor@bor-Latitude-E5450:~/src/systemd$ cat ../mdadm/systemd/mdadm-last-resort@.timer
[Unit]
Description=Timer to wait for more drives before activating degraded array.
DefaultDependencies=no
Conflicts=sys-devices-virtual-block-%i.device

[Timer]
OnActiveSec=30
bor@bor-Latitude-E5450:~/src/systemd$ cat ../mdadm/systemd/mdadm-last-resort@.service
[Unit]
Description=Activate md array even though degraded
DefaultDependencies=no
Conflicts=sys-devices-virtual-block-%i.device

[Service]
Type=oneshot
ExecStart=BINDIR/mdadm --run /dev/%i

I presume the intention is to stop these units when the MD device is finally
assembled as complete. This is indeed what happens on my (test) system:

Jan 28 14:18:04 linux-ffk5 kernel: md: bind
Jan 28 14:18:04 linux-ffk5 kernel: md: bind
Jan 28 14:18:05 linux-ffk5 kernel: md/raid1:md0: active with 2 out of 2
mirrors
Jan 28 14:18:05 linux-ffk5 kernel: md0: detected capacity change from 0
to 5363466240
Jan 28 14:18:06 linux-ffk5 systemd[1]: mdadm-last-resort@md0.timer:
Installed new job mdadm-last-resort@md0.timer/start as 287
Jan 28 14:18:06 linux-ffk5 systemd[1]: mdadm-last-resort@md0.timer:
Enqueued job mdadm-last-resort@md0.timer/start as 287
Jan 28 14:18:06 linux-ffk5 systemd[1]: dev-ttyS9.device: Changed dead ->
plugged
Jan 28 14:18:07 linux-ffk5 systemd[1]: mdadm-last-resort@md0.timer:
Changed dead -> waiting
Jan 28 14:18:12 linux-ffk5 systemd[1]:
sys-devices-virtual-block-md0.device: Changed dead -> plugged
Jan 28 14:18:12 linux-ffk5 systemd[1]: mdadm-last-resort@md0.timer:
Trying to enqueue job mdadm-last-resort@md0.timer/stop/replace
Jan 28 14:18:12 linux-ffk5 systemd[1]: mdadm-last-resort@md0.timer:
Installed new job mdadm-last-resort@md0.timer/stop as 292
Jan 28 14:18:12 linux-ffk5 systemd[1]: mdadm-last-resort@md0.timer:
Enqueued job mdadm-last-resort@md0.timer/stop as 292
Jan 28 14:18:12 linux-ffk5 systemd[1]: mdadm-last-resort@md0.timer:
Changed waiting -> dead
Jan 28 14:18:12 linux-ffk5 systemd[1]: mdadm-last-resort@md0.timer: Job
mdadm-last-resort@md0.timer/stop finished, result=done
Jan 28 14:18:12 linux-ffk5 systemd[1]: Stopped Timer to wait for more
drives before activating degraded array..
Jan 28 14:19:34 10 systemd[1692]: dev-vda1.device: Changed dead -> plugged
Jan 28 14:19:34 10 systemd[1692]: dev-vdb1.device: Changed dead -> plugged


On your system, apparently, the timer is not stopped when the md device
appears, so that when the last-resort service later runs, it causes an
attempt to stop the md device (due to the conflict) and, transitively, the
mount on top of it.

Could you try running with systemd.log_level=debug on the kernel command line
and upload the journal again. We can only hope that it will not skew timings
enough to hide the race, but it may prove my hypothesis.


Re: [systemd-devel] Errorneous detection of degraded array

2017-01-27 Thread Luke Pyzowski
I've modified a number of settings to try to resolve this, so far without success.
I've created an explicit mount file for the RAID array: 
/etc/systemd/system/share.mount
Inside there I've experimented with TimeoutSec=

In /etc/systemd/system/mdadm-last-resort@.timer I've worked with
OnActiveSec=

I've also tried (without an explicit mount file) to add 
x-systemd.device-timeout to /etc/fstab for the mount. 

Here are a few more system logs showing perhaps more detail. I've edited them to
show only the relevant parts; the full pastebin of logs: http://pastebin.com/sL8nKt7j
These logs were generated with TimeoutSec=120 in /etc/systemd/system/share.mount
(the description of the mount in the logs is "Mount /share RAID partition explicitly")
and OnActiveSec=30 in /etc/systemd/system/mdadm-last-resort@.timer.
From blkid:
/dev/md0: UUID="2b9114be-3d5a-41d7-8d4b-e5047d223129" TYPE="ext4"
/dev/md0 is the /share partition.

From /etc/mdadm.conf:
ARRAY /dev/md/0  metadata=1.2 UUID=97566d2f:ae7a169b:966f5840:3e8267f9 
name=lnxnfs01:0

Boot begins at Jan 27 11:33:10
+4 seconds from boot:
Jan 27 11:33:14 lnxnfs01 systemd[1]: Found device 
/dev/disk/by-uuid/283669e9-f32c-498d-b848-c6f91738c959.
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdc operational as raid 
disk 2
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdx operational as raid 
disk 23
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdu operational as raid 
disk 20
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdt operational as raid 
disk 19
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdo operational as raid 
disk 14
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdn operational as raid 
disk 13
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdd operational as raid 
disk 3
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdv operational as raid 
disk 21
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sda operational as raid 
disk 0
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdf operational as raid 
disk 5
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdm operational as raid 
disk 12
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sde operational as raid 
disk 4
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdp operational as raid 
disk 15
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdi operational as raid 
disk 8
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdl operational as raid 
disk 11
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdk operational as raid 
disk 10
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sds operational as raid 
disk 18
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdb operational as raid 
disk 1
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdj operational as raid 
disk 9
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdg operational as raid 
disk 6
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdr operational as raid 
disk 17
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdh operational as raid 
disk 7
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdq operational as raid 
disk 16
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: device sdw operational as raid 
disk 22
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: allocated 25534kB
Jan 27 11:33:14 lnxnfs01 kernel: md/raid:md0: raid level 6 active with 24 out 
of 24 devices, algorithm 2
Jan 27 11:33:14 lnxnfs01 kernel: RAID conf printout:
Jan 27 11:33:14 lnxnfs01 kernel:  --- level:6 rd:24 wd:24
Jan 27 11:33:14 lnxnfs01 kernel:  disk 0, o:1, dev:sda
Jan 27 11:33:14 lnxnfs01 kernel:  disk 1, o:1, dev:sdb
Jan 27 11:33:14 lnxnfs01 kernel:  disk 2, o:1, dev:sdc
Jan 27 11:33:14 lnxnfs01 kernel:  disk 3, o:1, dev:sdd
Jan 27 11:33:14 lnxnfs01 kernel:  disk 4, o:1, dev:sde
Jan 27 11:33:14 lnxnfs01 kernel:  disk 5, o:1, dev:sdf
Jan 27 11:33:14 lnxnfs01 kernel:  disk 6, o:1, dev:sdg
Jan 27 11:33:14 lnxnfs01 kernel:  disk 7, o:1, dev:sdh
Jan 27 11:33:14 lnxnfs01 kernel:  disk 8, o:1, dev:sdi
Jan 27 11:33:14 lnxnfs01 kernel:  disk 9, o:1, dev:sdj
Jan 27 11:33:14 lnxnfs01 kernel:  disk 10, o:1, dev:sdk
Jan 27 11:33:14 lnxnfs01 kernel:  disk 11, o:1, dev:sdl
Jan 27 11:33:14 lnxnfs01 kernel:  disk 12, o:1, dev:sdm
Jan 27 11:33:14 lnxnfs01 kernel:  disk 13, o:1, dev:sdn
Jan 27 11:33:14 lnxnfs01 kernel:  disk 14, o:1, dev:sdo
Jan 27 11:33:14 lnxnfs01 kernel:  disk 15, o:1, dev:sdp
Jan 27 11:33:14 lnxnfs01 kernel:  disk 16, o:1, dev:sdq
Jan 27 11:33:14 lnxnfs01 kernel:  disk 17, o:1, dev:sdr
Jan 27 11:33:14 lnxnfs01 kernel:  disk 18, o:1, dev:sds
Jan 27 11:33:14 lnxnfs01 kernel:  disk 19, o:1, dev:sdt
Jan 27 11:33:14 lnxnfs01 kernel:  disk 20, o:1, dev:sdu
Jan 27 11:33:14 lnxnfs01 kernel:  disk 21, o:1, dev:sdv
Jan 27 11:33:14 lnxnfs01 kernel:  disk 22, o:1, dev:sdw
Jan 27 11:33:14 lnxnfs01 kernel:  disk 23, o:1, dev:sdx
Jan 27 11:33:14 lnxnfs01 kernel: md0: detected capacity change from 0 to 
45062020923392
Jan 27 11:33:14 lnxnfs01 systemd[1]: Found device 

Re: [systemd-devel] Errorneous detection of degraded array

2017-01-27 Thread Martin Wilck
> 26.01.2017 21:02, Luke Pyzowski wrote:
> > Hello,
> > I have a large RAID6 device with 24 local drives on CentOS7.3.
> > Randomly (around 50% of the time) systemd will unmount my RAID
> > device thinking it is degraded after the mdadm-last-resort@.timer
> > expires, however the device is working normally by all accounts,
> > and I can immediately mount it manually upon boot completion. In
> > the logs below /share is the RAID device. I can increase the timer
> > in /usr/lib/systemd/system/mdadm-last-resort@.timer from 30 to 60
> > seconds, but this problem can randomly still occur.

It seems to me that you rather need to decrease the timeout value, or
(more reasonably) increase x-systemd.device-timeout for the /share
mount point.
Unfortunately your log excerpt contains no time stamps, but I suppose
you're facing a race where the device times out before the "last
resort" timer starts it (and before the last devices appear).

Martin

-- 
Dr. Martin Wilck , Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)



Re: [systemd-devel] Errorneous detection of degraded array

2017-01-26 Thread Andrei Borzenkov
26.01.2017 21:02, Luke Pyzowski wrote:
> Hello,
> I have a large RAID6 device with 24 local drives on CentOS7.3. Randomly 
> (around 50% of the time) systemd will unmount my RAID device thinking it is 
> degraded after the mdadm-last-resort@.timer expires, however the device is 
> working normally by all accounts, and I can immediately mount it manually 
> upon boot completion. In the logs below /share is the RAID device. I can 
> increase the timer in /usr/lib/systemd/system/mdadm-last-resort@.timer from 
> 30 to 60 seconds, but this problem can randomly still occur.
> 
> systemd[1]: Created slice system-mdadm\x2dlast\x2dresort.slice.
> systemd[1]: Starting system-mdadm\x2dlast\x2dresort.slice.
> systemd[1]: Starting Activate md array even though degraded...
> systemd[1]: Stopped target Local File Systems.
> systemd[1]: Stopping Local File Systems.
> systemd[1]: Unmounting /share...
> systemd[1]: Stopped (with error) /dev/md0.
> systemd[1]: Started Activate md array even though degraded.
> systemd[1]: Unmounted /share.
> 
> When the system boots normally the following is in the logs:
> systemd[1]: Started Timer to wait for more drives before activating degraded 
> array..
> systemd[1]: Starting Timer to wait for more drives before activating degraded 
> array..
> ...
> systemd[1]: Stopped Timer to wait for more drives before activating degraded 
> array..
> systemd[1]: Stopping Timer to wait for more drives before activating degraded 
> array..
> 
> The above occurs within the same second according to the timestamps and the 
> timer ends prior to mounting any local filesystems, it properly detects that 
> the RAID is valid and everything continues normally. The other RAID device - 
> a RAID1 of 2 disks containing swap and / have never exhibited this failure.
> 
> My question is, what are the conditions where systemd detects the RAID6 as 
> being degraded? It seems to be a race condition somewhere, but I am not sure 
> what configuration should be modified if any. If needed I can provide more 
> verbose logs, just let me know if they might be useful.
> 

It is not directly related to systemd. When a block device that is part of
an MD array is detected by the kernel, a udev rule queries the array to see
whether it is complete. If it is, it starts the array (subject to the general
rules of which arrays are auto-started); and if not, it (the udev rule) starts
a timer to assemble the degraded array.

See udev-md-raid-assembly.rules in mdadm sources:

ACTION=="add|change", ENV{MD_STARTED}=="*unsafe*",
ENV{MD_FOREIGN}=="no",
ENV{SYSTEMD_WANTS}+="mdadm-last-resort@$env{MD_DEVICE}.timer"

So it looks like events for some array members either got lost or are
delivered late.
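
A few illustrative commands for checking that on an affected system (device
names are examples only):

  udevadm settle                                # wait for the udev event queue to drain
  udevadm info --name=/dev/sda | grep '^E: MD'  # MD_* properties recorded for one member, if any
  mdadm --detail /dev/md0                       # does the kernel itself see all members?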

Note that there was a discussion on the openSUSE list where arrays would not
be auto-assembled on boot, even though triggering a device change *after* the
initial boot would correctly run these rules. That situation was triggered by
adding an extra disk to the system (i.e. booting with 3 disks worked, with 4
disks it did not). I could not find any hints even after enabling full udev
and systemd debug logs. The logs are available if anyone wants to look into
it.


Re: [systemd-devel] Errorneous detection of degraded array

2017-01-26 Thread Reindl Harald



On 26.01.2017 at 19:02, Luke Pyzowski wrote:

> I have a large RAID6 device with 24 local drives on CentOS7.3. Randomly (around
> 50% of the time) systemd will unmount my RAID device thinking it is degraded
> after the mdadm-last-resort@.timer expires, however the device is working
> normally by all accounts, and I can immediately mount it manually upon boot
> completion. In the logs below /share is the RAID device. I can increase the
> timer in /usr/lib/systemd/system/mdadm-last-resort@.timer from 30 to 60
> seconds, but this problem can randomly still occur.
>
> systemd[1]: Created slice system-mdadm\x2dlast\x2dresort.slice.
> systemd[1]: Starting system-mdadm\x2dlast\x2dresort.slice.
> systemd[1]: Starting Activate md array even though degraded...
> systemd[1]: Stopped target Local File Systems.
> systemd[1]: Stopping Local File Systems.
> systemd[1]: Unmounting /share...
> systemd[1]: Stopped (with error) /dev/md0.
> systemd[1]: Started Activate md array even though degraded.
> systemd[1]: Unmounted /share.


that also happens randomly in my Fedora 24 testing VM with a RAID10, and
you can be sure that in a virtual machine drives don't disappear or take
long to appear


