Re: Shutdown errors and timeout

2020-11-18 Thread Mateusz Piotrowski

On 11/16/20 7:16 PM, Johan Hendriks wrote:

> On 14/11/2020 13:03, Mateusz Piotrowski wrote:



> > I've committed r367291 and r367546.
> >
> > I am not sure if I can think of a proper fix for the described issues, so I guess the best idea
> > would be to revert those changes for now until we figure out how to do it properly.
>
> I can tell that after reverting the mentioned commit I no longer see the symptoms when
> I reboot my servers.
> Thank you all for your time, and no sorry needed ;-)


I'll revert the commit then. I'm just waiting for an approval from a src 
committer.

Best,

Mateusz

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Shutdown errors and timeout

2020-11-16 Thread Johan Hendriks


On 14/11/2020 13:03, Mateusz Piotrowski wrote:

> Hi,



> I've committed r367291 and r367546.
>
> I am not sure if I can think of a proper fix for the described issues,
> so I guess the best idea would be to revert those changes for now
> until we figure out how to do it properly.
>
> Sorry for the regression.
>
> Best,
>
> Mateusz



I can tell that after reverting the mentioned commit I no longer see the
symptoms when I reboot my servers.

Thank you all for your time, and no sorry needed ;-)

regards,
Johan



Re: Shutdown errors and timeout

2020-11-14 Thread Mateusz Piotrowski

Hi,

On 11/14/20 1:19 AM, Tomoaki AOKI wrote:


> This happens on stable/12, too.
> As a workaround, reverting r367291 on head (r367546 on stable/12)
> stops the issue until it is properly fixed.
>
> If you have shared datasets or jails mounting datasets, this workaround
> is discouraged. Read the commit message for details.


I've committed r367291 and r367546.

I am not sure if I can think of a proper fix for the described issues, so I guess the best idea 
would be to revert those changes for now until we figure out how to do it properly.


Sorry for the regression.

Best,

Mateusz



Re: Shutdown errors and timeout

2020-11-13 Thread Tomoaki AOKI
On Fri, 13 Nov 2020 20:04:59 +0900 (JST)
Yasuhiro KIMURA  wrote:

> I'm facing the same problem with 13.0-CURRENT amd64 r367487 on
> VirtualBox. In my case I use autofs to mount a remote file system from
> a 12.2-RELEASE amd64 server over NFSv4. When a file system mounted by
> autofs is still present, the watchdog timeout happens during shutdown.
> The watchdog timeout can be worked around by executing `automount -fu`
> before shutting down, but the 'cannot unmount ...' error messages are
> still displayed.
> 
> I added 'rc_debug="YES"' to /etc/rc.conf and checked which rc script
> causes this message. It is displayed when the following `zfs_stop`
> function of /etc/rc.d/zfs is executed.
> 
> --
> zfs_stop_main()
> {
>   zfs unshare -a
>   zfs unmount -a
> }
> --
> 
> At this point the syslog process is still running and has some files
> open under /var/log, so it makes sense that `zfs unmount -a` results in
> the message.
> 
> Probably the order in which rc scripts are executed at shutdown should
> be changed so that `/etc/rc.d/zfs faststop` is executed after all
> processes other than `init` have exited.
> 
> ---
> Yasuhiro KIMURA

This happens on stable/12, too.
As a workaround, reverting r367291 on head (r367546 on stable/12)
stops the issue until it is properly fixed.

If you have shared datasets or jails mounting datasets, this workaround
is discouraged. Read the commit message for details.

I couldn't determine the problematic commit on head, as there were
multiple commits since my previous build, but I could determine it on
stable/12, as there was no other ZFS-related commit since my previous
build.
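
For readers who want to try the workaround locally, a reverse merge of a single revision might look like the sketch below. It assumes the source tree is a Subversion working copy (the default path /usr/src is an assumption) and that an `svn` client is installed; `revert_revision` is a hypothetical helper name, not something from the thread. The change is only applied to the working copy, nothing is committed.

```sh
#!/bin/sh
# Sketch only: reverse-merge one revision into a local Subversion working copy.
# revert_revision is a hypothetical helper; /usr/src is an assumed default path.
revert_revision() {
    rev="$1"                # e.g. 367291 on head, 367546 on stable/12
    src="${2:-/usr/src}"    # location of the working copy (assumption)

    # "-c -REV" asks svn to undo exactly that revision in the working copy.
    # Run in a subshell so the caller's current directory is left unchanged.
    ( cd "$src" && svn merge -c "-$rev" . )
}

# Usage: revert_revision 367291 /usr/src
```

The affected files still have to be rebuilt or reinstalled afterwards; which files those are is not stated in this thread.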


-- 
Tomoaki AOKI


Re: Shutdown errors and timeout

2020-11-13 Thread Johan Hendriks


On 13/11/2020 11:55, Andrea Venturoli wrote:

> On 11/13/20 11:35 AM, Johan Hendriks wrote:
>
> Just my 2c...
>
> > The virtual machine seems to fail stopping something and gives a
> > timeout after 90 sec.
>
> I've seen this on many (physical) boxes and I solved it by increasing
> the shutdown timeout. Sometimes 90s is just too little (especially, but
> not only, if you have VMs running).
>
> E.g. I have rcshutdown_timeout="600" in /etc/rc.conf and
> kern.init_shutdown_timeout=900 in /etc/sysctl.conf.
>
> > On the bare metal machine I see the following.
> > Writing entropy file: .
> > Writing early boot entropy file: .
> > cannot unmount '/var/run': umount failed
> > cannot unmount '/var/log': umount failed
> > cannot unmount '/var': umount failed
> > cannot unmount '/usr/home': umount failed
> > cannot unmount '/usr': umount failed
> > cannot unmount '/': umount failed
>
> Probably a process is still running and that's why those filesystems
> cannot be (unforcibly) unmounted.
>
> Logs can help identify which process it is.
> Perhaps putting rc_debug="YES" in /etc/rc.conf can be useful.
>
>  bye
> av.


Thank you for your answer.
The rc_debug output showed me that the VirtualBox machine hangs on
zfs_stop. If I do not enable bastille and reboot, the shutdown goes fine,
so I think the jail ZFS datasets are interfering.
The bare-metal server does not wait for anything; it just throws the
umount errors directly.


regards.






Re: Shutdown errors and timeout

2020-11-13 Thread Yasuhiro KIMURA
From: Johan Hendriks 
Subject: Shutdown errors and timeout
Date: Fri, 13 Nov 2020 11:35:53 +0100

> Hello all, I have two FreeBSD 13 machines, one bare metal and one
> VirtualBox machine, both of which I update about once a week.
> 
> The virtual machine seems to fail stopping something and gives a
> timeout after 90 sec.
> 
> The console ends with
> 
> Writing entropy file: .
> Writing early boot entropy file: .
> 
> 90 second watchdog timeout expired. Shutdown terminated.
> Fri Nov13 11:20:40 CEST 2020
> Nov 13 11:20:40 test-head init[1]: /etc/rc.shutdown terminated
> abnormally, going to single user mode
> ...
> 
> On the bare metal machine I see the following.
> Writing entropy file: .
> Writing early boot entropy file: .
> cannot unmount '/var/run': umount failed
> cannot unmount '/var/log': umount failed
> cannot unmount '/var': umount failed
> cannot unmount '/usr/home': umount failed
> cannot unmount '/usr': umount failed
> cannot unmount '/': umount failed
> 
(snip)
> 
> The pools have not been upgraded after the latest openzfs import,
> maybe that is related?
> 
> FreeBSD test-freebsd-head 13.0-CURRENT FreeBSD 13.0-CURRENT #2
> r367585:
> 
> I first noticed this about a week ago.

I'm facing the same problem with 13.0-CURRENT amd64 r367487 on
VirtualBox. In my case I use autofs to mount a remote file system from
a 12.2-RELEASE amd64 server over NFSv4. When a file system mounted by
autofs is still present, the watchdog timeout happens during shutdown.
The watchdog timeout can be worked around by executing `automount -fu`
before shutting down, but the 'cannot unmount ...' error messages are
still displayed.

I added 'rc_debug="YES"' to /etc/rc.conf and checked which rc script
causes this message. It is displayed when the following `zfs_stop`
function of /etc/rc.d/zfs is executed.

--
zfs_stop_main()
{
  zfs unshare -a
  zfs unmount -a
}
--

At this point the syslog process is still running and has some files
open under /var/log, so it makes sense that `zfs unmount -a` results in
the message.

Probably the order in which rc scripts are executed at shutdown should
be changed so that `/etc/rc.d/zfs faststop` is executed after all
processes other than `init` have exited.
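
One way to confirm which processes still hold files open on a file system such as /var/log is fstat(1) with its -f option, which restricts the listing to the file system containing the given path. The helper below is only a sketch: `open_holders` is a hypothetical name, and the parsing assumes fstat's default column layout (USER, CMD, PID, ...) with a single header line.

```sh
#!/bin/sh
# Sketch only: list "command pid" once for every process that has a file open
# on the file system containing the given path. Relies on FreeBSD's fstat(1).
# open_holders is a hypothetical helper name.
open_holders() {
    # NR > 1 skips the header line; $2 is the CMD column, $3 the PID column.
    fstat -f "$1" | awk 'NR > 1 { print $2, $3 }' | sort -u
}

# Usage: open_holders /var/log
```

Running this shortly before shutdown should show syslogd among the holders of /var/log if the explanation above is right.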

---
Yasuhiro KIMURA


Re: Shutdown errors and timeout

2020-11-13 Thread Andrea Venturoli

On 11/13/20 11:35 AM, Johan Hendriks wrote:

Just my 2c...



> The virtual machine seems to fail stopping something and gives a timeout
> after 90 sec.

I've seen this on many (physical) boxes and I solved it by increasing
the shutdown timeout. Sometimes 90s is just too little (especially, but not
only, if you have VMs running).


E.g. I have rcshutdown_timeout="600" in /etc/rc.conf and 
kern.init_shutdown_timeout=900 in /etc/sysctl.conf.
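
The two settings above can also be applied from a root shell, roughly as sketched here. `raise_shutdown_timeouts` is a hypothetical helper name, and 600 and 900 are simply the values quoted in this message, not recommendations. rcshutdown_timeout is the rc.shutdown watchdog (90 seconds by default, which matches the console message in this thread); kern.init_shutdown_timeout is how long init(8) waits for rc.shutdown.

```sh
#!/bin/sh
# Sketch only: raise both shutdown timeouts on FreeBSD (run as root).
# raise_shutdown_timeouts is a hypothetical helper name.
raise_shutdown_timeouts() {
    rc_timeout="$1"     # seconds before the rc.shutdown watchdog fires
    init_timeout="$2"   # seconds init(8) waits for rc.shutdown to finish

    # sysrc(8) writes the variable to /etc/rc.conf, so it persists.
    sysrc rcshutdown_timeout="$rc_timeout"

    # sysctl(8) changes the running system only; the /etc/sysctl.conf line
    # quoted above is still needed for the value to survive a reboot.
    sysctl kern.init_shutdown_timeout="$init_timeout"
}

# Usage: raise_shutdown_timeouts 600 900
```

Note that a longer timeout only gives a slow shutdown more time; it does not help if a script hangs indefinitely.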





> On the bare metal machine I see the following.
> Writing entropy file: .
> Writing early boot entropy file: .
> cannot unmount '/var/run': umount failed
> cannot unmount '/var/log': umount failed
> cannot unmount '/var': umount failed
> cannot unmount '/usr/home': umount failed
> cannot unmount '/usr': umount failed
> cannot unmount '/': umount failed


Probably a process is still running and that's why those filesystems 
cannot be (unforcibly) unmounted.

Logs can help identify which process it is.
Perhaps putting rc_debug="YES" in /etc/rc.conf can be useful.



 bye
av.


Shutdown errors and timeout

2020-11-13 Thread Johan Hendriks
Hello all, I have two FreeBSD 13 machines, one bare metal and one
VirtualBox machine, both of which I update about once a week.

The virtual machine seems to fail stopping something and gives a timeout
after 90 sec.


The console ends with

Writing entropy file: .
Writing early boot entropy file: .

90 second watchdog timeout expired. Shutdown terminated.
Fri Nov13 11:20:40 CEST 2020
Nov 13 11:20:40 test-head init[1]: /etc/rc.shutdown terminated 
abnormally, going to single user mode

...

On the bare metal machine I see the following.
Writing entropy file: .
Writing early boot entropy file: .
cannot unmount '/var/run': umount failed
cannot unmount '/var/log': umount failed
cannot unmount '/var': umount failed
cannot unmount '/usr/home': umount failed
cannot unmount '/usr': umount failed
cannot unmount '/': umount failed

The virtual machine has the following mount points set.
zroot/ROOT/default on / (zfs, local, noatime, nfsv4acls)
devfs on /dev (devfs)
fdescfs on /dev/fd (fdescfs)
zroot/var/log on /var/log (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/usr/home on /usr/home (zfs, local, noatime, nfsv4acls)
zroot/jails on /jails (zfs, local, noatime, nfsv4acls)
zroot/tmp on /tmp (zfs, local, noatime, nosuid, nfsv4acls)
zroot/var/audit on /var/audit (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot on /zroot (zfs, local, noatime, nfsv4acls)
zroot/var/crash on /var/crash (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/tmp on /var/tmp (zfs, local, noatime, nosuid, nfsv4acls)
zroot/usr/ports on /usr/ports (zfs, local, noatime, nosuid, nfsv4acls)
zroot/var/mail on /var/mail (zfs, local, nfsv4acls)

The bare-metal one:
zroot/ROOT/default on / (zfs, local, noatime, nfsv4acls)
devfs on /dev (devfs)
fdescfs on /dev/fd (fdescfs)
zroot/usr on /usr (zfs, local, noatime, nfsv4acls)
zroot/var on /var (zfs, local, noatime, nfsv4acls)
zroot/tmp on /tmp (zfs, local, noatime, nosuid, nfsv4acls)
zroot/usr/home on /usr/home (zfs, local, noatime, nfsv4acls)
zroot/var/log on /var/log (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/run on /var/run (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/mail on /var/mail (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/crash on /var/crash (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/db on /var/db (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/empty on /var/empty (zfs, local, noatime, noexec, nosuid, read-only, nfsv4acls)
zroot/usr/src on /usr/src (zfs, NFS exported, local, noatime, noexec, nosuid, nfsv4acls)
zroot/usr/ports on /usr/ports (zfs, local, noatime, nosuid, nfsv4acls)
zroot/usr/obj on /usr/obj (zfs, NFS exported, local, noatime, nfsv4acls)
zroot/var/tmp on /var/tmp (zfs, local, noatime, nosuid, nfsv4acls)
zroot/usr/ports/distfiles on /usr/ports/distfiles (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/usr/ports/packages on /usr/ports/packages (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/db/pkg on /var/db/pkg (zfs, local, noatime, nosuid, nfsv4acls)

The pools have not been upgraded after the latest OpenZFS import, maybe
that is related?

FreeBSD test-freebsd-head 13.0-CURRENT FreeBSD 13.0-CURRENT #2 r367585:

I first noticed this about a week ago.





