Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-22 Thread Tom Smyth
Hi Jan, sorry for the late reply.
Thanks for your comments and questions.
Replies are inline.

On Wed, 17 Jul 2024 at 13:12, Jan Stary  wrote:
>
> On Jul 10 17:05:55, tom.sm...@wirelessconnect.eu wrote:
> > Hi Jan
> > thanks for your Reply and feedback,
> >  please find my replies  in line ,
> >
> > On Wed, 10 Jul 2024 at 16:28, Jan Stary  wrote:
> > >
> > > On Jul 10 14:44:28, tom.sm...@wirelessconnect.eu wrote:
> > > > we have been using  mfs mounted /var /dev and /tmp for years
> > >
> > > Why?
> > so any writes to disk would be simply written a memory filesystem and
> > if  there was a power cut
>
> How often do you get these power cuts?
Not often in Ireland, but I want to avoid truck rolls.
>
> > there would be no changes happening to the
> > disk because it is being just written to memory
>
> To be clear, you are concerned with changes to the filesystem
> (not disk as such), which makes a dirty fs and invokes fsck
> at reboot, right?

One of the main reasons, which I didn't articulate, was the repeated log
writes of chatty daemons, e.g. OpenVPN with a very short timeout / retry
interval writing to disk and causing flash / SSD wear.

>
> > > > however  the impact of mfs (/var in particular) on upgrades has been
> > > > quite painful,
> > >
> > > How?
> > Losing new files in /var if the box is rebooted without first copying
> > the /var (in memory) to where the persistent storage is  (on shutdown)
>
> What do you mean by "new files"? Those coming to exist
> during regular operation (as in /var/run), or "new" if
> they get installed under /var on an upgrade?
Yes, the new files that would be installed in /var on upgrade.

>
> The above (losing the nonpersistent mfs storage) is exactly
> what would happen on a power outage; but what does that have
> to do with upgrades? If you reboot (cleanly) after an upgrade,
> the content of /var gets stored to persistent storage.
Oh OK, I had not seen that behaviour ... and sometimes the relinked
kernel hash in /var/db would throw a relink failure ...

>
> > > > #cat /etc/fstab
> > > > ff0023511d131fc2.a / ffs rw,softdep,noatime 1 1
> > > > ff0023511d131fc2.b /usr/local ffs rw,wxallowed,nodev,softdep,noatime 1 2
> > > > ff0023511d131fc2.d /var ffs rw,nodev,nosuid,softdep,noatime 1 2
> > >
> > > So you _don't_ have /var on mfs ...
> > > Also, softdep no longer exists.
> > Thanks  it was an older option (now a noop (for backward compatibility
> > ) just checked the manual there...  Ill drop it off the deployment
> > script
> >
> > > > swap /tmp mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/tmp 0 0
> > > > swap /var/log mfs rw,nosuid,noexec,nodev,-s=524288,-P=/persist-fs/var/log 0 0
> > > > swap /var/run mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/var/run 0 0
> > > > swap /dev mfs rw,nosuid,noexec,-P=/persist-fs/dev,-i=2048,-s=32768 0 0
> > >
> > > Why do you need /tmp to persist?
> > Fair point  I was more interested in getting /tmp to be memory mounted
> > (dont care about persistence) in that case
> > checking manual
> >
> > > Why do you have a separate /dev?
> > when programs write to /dev/blah  is there a possibility of the
> > filesystem being updated...
>
> Above you talk about an upgrade, here about an update.
> What you mean is just a write to the filesystem?
>
> I never saw a / filesystem (holding /dev)
> get screwed up in a way that fsck couldn't get out of
> because a file under /dev was being written ...
I think I took this advice so that / could be mounted read-only (if I
recall correctly),
so I was being cautious on this front.

>
> > > Why don't you have a separate /home?
> > it is a router /firewall / network appliance  /not a standard desktop
> > / server ...  users are admins... etc .
> > >
> > > > ###
> > > > This seems to solve problems with  upgrades and  package updates,
> > basically if the partition was not synced with a copy on shutdown you
> > would lose the updated files ...
>
> Well, you wouldn't have this problem
> if you were not using mfs :-)

indeed yes :)
>
> Filesystem inconsistency after a power outage is normal;
> fsck will deal with it. You might lose some files -
Sometimes you lose the wrong file ... and the system requires manual
intervention / a truck roll. I want to avoid that.

> with mfs, you lose everything.

So I wanted to have a reliable boot, and I prefer losing logs and tmp
files to risking corruption.

>
> Jan
>
Thanks for your feedback Jan
really appreciate it



Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-17 Thread Jan Stary
On Jul 10 17:05:55, tom.sm...@wirelessconnect.eu wrote:
> Hi Jan
> thanks for your Reply and feedback,
>  please find my replies  in line ,
> 
> On Wed, 10 Jul 2024 at 16:28, Jan Stary  wrote:
> >
> > On Jul 10 14:44:28, tom.sm...@wirelessconnect.eu wrote:
> > > we have been using  mfs mounted /var /dev and /tmp for years
> >
> > Why?
> so any writes to disk would be simply written a memory filesystem and
> if  there was a power cut

How often do you get these power cuts?

> there would be no changes happening to the
> disk because it is being just written to memory

To be clear, you are concerned with changes to the filesystem
(not disk as such), which makes a dirty fs and invokes fsck
at reboot, right?

> > > however  the impact of mfs (/var in particular) on upgrades has been
> > > quite painful,
> >
> > How?
> Losing new files in /var if the box is rebooted without first copying
> the /var (in memory) to where the persistent storage is  (on shutdown)

What do you mean by "new files"? Those coming to exist
during regular operation (as in /var/run), or "new" if
they get installed under /var on an upgrade?

The above (losing the nonpersistent mfs storage) is exactly
what would happen on a power outage; but what does that have
to do with upgrades? If you reboot (cleanly) after an upgrade,
the content of /var gets stored to persistent storage.

> > > #cat /etc/fstab
> > > ff0023511d131fc2.a / ffs rw,softdep,noatime 1 1
> > > ff0023511d131fc2.b /usr/local ffs rw,wxallowed,nodev,softdep,noatime 1 2
> > > ff0023511d131fc2.d /var ffs rw,nodev,nosuid,softdep,noatime 1 2
> >
> > So you _don't_ have /var on mfs ...
> > Also, softdep no longer exists.
> Thanks  it was an older option (now a noop (for backward compatibility
> ) just checked the manual there...  Ill drop it off the deployment
> script
> 
> > > swap /tmp mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/tmp 0 0
> > > swap /var/log mfs rw,nosuid,noexec,nodev,-s=524288,-P=/persist-fs/var/log 0 0
> > > swap /var/run mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/var/run 0 0
> > > swap /dev mfs rw,nosuid,noexec,-P=/persist-fs/dev,-i=2048,-s=32768 0 0
> >
> > Why do you need /tmp to persist?
> Fair point  I was more interested in getting /tmp to be memory mounted
> (dont care about persistence) in that case
> checking manual
> 
> > Why do you have a separate /dev?
> when programs write to /dev/blah  is there a possibility of the
> filesystem being updated...

Above you talk about an upgrade, here about an update.
What you mean is just a write to the filesystem?

I never saw a / filesystem (holding /dev)
get screwed up in a way that fsck couldn't get out of
because a file under /dev was being written ...

> > Why don't you have a separate /home?
> it is a router /firewall / network appliance  /not a standard desktop
> / server ...  users are admins... etc .
> >
> > > ###
> > > This seems to solve problems with  upgrades and  package updates,
> basically if the partition was not synced with a copy on shutdown you
> would lose the updated files ...

Well, you wouldn't have this problem
if you were not using mfs :-)

Filesystem inconsistency after a power outage is normal;
fsck will deal with it. You might lose some files -
with mfs, you lose everything.

Jan



Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-11 Thread Raul Miller
On Mon, Jun 17, 2019 at 4:22 AM Mogens Jensen
 wrote:
> Even after many tries, I have not yet been able to corrupt the
> filesystem so fsck cannot repair it without manual intervention.
> However, if power is removed  while the 'reorder_kernel' script runs,
> the system will become completely unbootable. I could do this multiple
> times.

Given that file system corruption in these cases is more likely with
some hardware and less likely with other hardware, it might be
interesting to know what you were using.

Granted, this kind of information might be inadequate - there are too
many choices and too many challenges. But even hints are probably
better than nothing.

Thanks,

-- 
Raul



Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-10 Thread Kenneth Gober
On Wed, Jul 10, 2024 at 9:45 AM Tom Smyth 
wrote:

> are there other directories that contain files that regularly change
> that should be mfs mounted ?
>

Logs for cron go into /var/cron by default. This can be changed by modifying
/etc/syslog.conf, but if you do this don't forget to update
/etc/newsyslog.conf
as well.

-ken


Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-10 Thread Tom Smyth
Hi Stuart, I heard that having no swap stops dumps in the event of a panic.

On Wed, 10 Jul 2024 at 21:46, Stuart Henderson
 wrote:
>
> On 2024-07-10, Tom Smyth  wrote:
> > I don't include a swap partition on the routers  in the field as I
> > don't want them swapping to disk, we over specify the hardware so that
> > memory exhaustion is (should be anyway)  not a concern.
>
> fwiw I don't know if they're (still? ever?) valid, but I've heard
> comments in the past that not having any swap can sometimes cause
> problems.
>
>


-- 
Kindest regards,
Tom Smyth.



Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-10 Thread Brian Conway
On Wed, Jul 10, 2024, at 3:49 PM, Crystal Kolipe wrote:
> On Wed, Jul 10, 2024 at 08:32:38PM -, Stuart Henderson wrote:
>> On 2024-07-10, Tom Smyth  wrote:
>> > I don't include a swap partition on the routers  in the field as I
>> > don't want them swapping to disk, we over specify the hardware so that
>> > memory exhaustion is (should be anyway)  not a concern.
>> 
>> fwiw I don't know if they're (still? ever?) valid, but I've heard
>> comments in the past that not having any swap can sometimes cause
>> problems.
>
> Yeah, we discussed it with Jan on -arm back in 2021:
>
> https://marc.info/?l=openbsd-arm&m=163500865502090&w=2

Thanks for the link. I ran into the same issue with swapless arm64 around the 
same time and didn't investigate it further. Apparently I missed that thread, 
probably because I don't usually read arm@.

I definitely never encountered it on swapless amd64, i386, octeon, or macppc 
(the latter two being in years long past).

Brian



Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-10 Thread Crystal Kolipe
On Wed, Jul 10, 2024 at 08:32:38PM -, Stuart Henderson wrote:
> On 2024-07-10, Tom Smyth  wrote:
> > I don't include a swap partition on the routers  in the field as I
> > don't want them swapping to disk, we over specify the hardware so that
> > memory exhaustion is (should be anyway)  not a concern.
> 
> fwiw I don't know if they're (still? ever?) valid, but I've heard
> comments in the past that not having any swap can sometimes cause
> problems.

Yeah, we discussed it with Jan on -arm back in 2021:

https://marc.info/?l=openbsd-arm&m=163500865502090&w=2



Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-10 Thread Crystal Kolipe
On Wed, Jul 10, 2024 at 03:29:47PM -0500, Brian Conway wrote:
> On Wed, Jul 10, 2024, at 2:48 PM, Tom Smyth wrote:
> > Hi Kirill,
> > I don't include a swap partition on the routers  in the field as I
> > don't want them swapping to disk, we over specify the hardware so that
> > memory exhaustion is (should be anyway)  not a concern.
> >
> > so im assuming the lack of a swap partition means that this would not
> > be an issue (in my deployment scenario)
> 
> That matches my experience.

If you're using X86 then lack of a swap partition shouldn't be an issue.
We've been running X86 machines in production without swap for many years.

On some non-X86 archs, (E.G. arm64), we have seen issues with having no swap
configured, even without memory exhaustion.  This has been discussed on the
lists before.



Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-10 Thread Stuart Henderson
On 2024-07-10, Tom Smyth  wrote:
> I don't include a swap partition on the routers  in the field as I
> don't want them swapping to disk, we over specify the hardware so that
> memory exhaustion is (should be anyway)  not a concern.

fwiw I don't know if they're (still? ever?) valid, but I've heard
comments in the past that not having any swap can sometimes cause
problems.




Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-10 Thread Stuart Henderson
On 2024-07-10, Marcus MERIGHI  wrote:
> Hello Tom, 
>
> tom.sm...@wirelessconnect.eu (Tom Smyth), 2024.07.10 (Wed) 18:40 (CEST):
>> swap /var/log mfs rw,nosuid,noexec,nodev,-s=524288,-P=/persist-fs/var/log 0 0
>> mfs:97883 on /var/log type mfs (asynchronous, local, nodev, noexec,
>>   nosuid, size=524288 512-blocks)
>
> as you do not save the logs, why not syslog "to an in-memory buffer that may be
> read using syslogc(8)" (text taken from syslog.conf(5))?
>
> I have everything commented out in syslog.conf(5), except for: 
> *.* :256:full
>
> And in rc.conf.local(8):
> syslogd_flags=-s /var/run/syslogd.sock
>
> You can then read the logs with 
> $ syslogc -f full

That can be useful, but there are some gotchas. For example you can't
use syslogc twice at the same time.



Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-10 Thread Brian Conway
On Wed, Jul 10, 2024, at 2:48 PM, Tom Smyth wrote:
> Hi Kirill,
> I don't include a swap partition on the routers  in the field as I
> don't want them swapping to disk, we over specify the hardware so that
> memory exhaustion is (should be anyway)  not a concern.
>
> so im assuming the lack of a swap partition means that this would not
> be an issue (in my deployment scenario)

That matches my experience.

Brian Conway
Owner
RCE Software, LLC



Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-10 Thread Tom Smyth
Hi Kirill,
I don't include a swap partition on the routers  in the field as I
don't want them swapping to disk, we over specify the hardware so that
memory exhaustion is (should be anyway)  not a concern.

so im assuming the lack of a swap partition means that this would not
be an issue (in my deployment scenario)


Thanks
Tom Smyth

On Wed, 10 Jul 2024 at 18:39, Kirill A. Korinsky  wrote:
>
> On Wed, 10 Jul 2024 17:40:17 +0100,
> Tom Smyth  wrote:
> >
> > swap /tmp mfs rw,nosuid,noexec,nodev,-s=262144 0 0
> > swap /var/log mfs rw,nosuid,noexec,nodev,-s=524288,-P=/persist-fs/var/log 0 0
> > swap /var/run mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/var/run 0 0
> > swap /dev mfs rw,nosuid,noexec,-P=/persist-fs/dev,-i=2048,-s=32768 0 0
> >
>
> I'd like to share https://marc.info/?l=openbsd-bugs&m=171959901216119&w=2
>
> Here I have a pretty simple way to block mfs when the system starts to use swap.
>
> Not sure if it is achievable by you, but still worth mentioning
>
> --
> wbr, Kirill



-- 
Kindest regards,
Tom Smyth.



Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-10 Thread Marcus MERIGHI
Hello Tom, 

tom.sm...@wirelessconnect.eu (Tom Smyth), 2024.07.10 (Wed) 18:40 (CEST):
> swap /var/log mfs rw,nosuid,noexec,nodev,-s=524288,-P=/persist-fs/var/log 0 0
> mfs:97883 on /var/log type mfs (asynchronous, local, nodev, noexec,
>   nosuid, size=524288 512-blocks)

as you do not save the logs, why not syslog "to an in-memory buffer that may be
read using syslogc(8)" (text taken from syslog.conf(5))?

I have everything commented out in syslog.conf(5), except for: 
*.* :256:full

And in rc.conf.local(8):
syslogd_flags=-s /var/run/syslogd.sock

You can then read the logs with 
$ syslogc -f full

Marcus

> On Wed, 10 Jul 2024 at 17:07, Tom Smyth  wrote:
> >
> > Hi Kirill,
> > Ill give sync a go ... and see how  it impacts performance...
> > thanks for the suggestion,
> >
> > On Wed, 10 Jul 2024 at 16:30, Kirill A. Korinsky  wrote:
> > >
> > > On Wed, 10 Jul 2024 14:44:28 +0100,
> > > Tom Smyth  wrote:
> > > >
> > > > #cat /etc/fstab
> > > >
> > > > ff0023511d131fc2.a / ffs rw,softdep,noatime 1 1
> > > > ff0023511d131fc2.b /usr/local ffs rw,wxallowed,nodev,softdep,noatime 1 2
> > > > ff0023511d131fc2.d /var ffs rw,nodev,nosuid,softdep,noatime 1 2
> > > > swap /tmp mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/tmp 0 0
> > > > swap /var/log mfs rw,nosuid,noexec,nodev,-s=524288,-P=/persist-fs/var/log 0 0
> > > > swap /var/run mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/var/run 0 0
> > > > swap /dev mfs rw,nosuid,noexec,-P=/persist-fs/dev,-i=2048,-s=32768 0 0
> > > >
> > >
> > > You can dramatically reduce the probability of errors that can't be fixed by
> > > fsck on boot by adding sync. Especially with noatime, this seems like a
> > > bulletproof setup.
> > >
> > > --
> > > wbr, Kirill
> >
> >
> >
> > --
> > Kindest regards,
> > Tom Smyth.
> 
> 
> 
> -- 
> Kindest regards,
> Tom Smyth.
> 



Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-10 Thread Kirill A . Korinsky
On Wed, 10 Jul 2024 17:40:17 +0100,
Tom Smyth  wrote:
> 
> swap /tmp mfs rw,nosuid,noexec,nodev,-s=262144 0 0
> swap /var/log mfs rw,nosuid,noexec,nodev,-s=524288,-P=/persist-fs/var/log 0 0
> swap /var/run mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/var/run 0 0
> swap /dev mfs rw,nosuid,noexec,-P=/persist-fs/dev,-i=2048,-s=32768 0 0
> 

I'd like to share https://marc.info/?l=openbsd-bugs&m=171959901216119&w=2

Here I have a pretty simple way to block mfs when the system starts to use swap.

Not sure if it is achievable by you, but still worth mentioning

-- 
wbr, Kirill



Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-10 Thread Tom Smyth
Thanks Kirill and Jan,
Based on your feedback, I dropped the pointless persistence on /tmp (which is
cleared on startup anyway). I'll check permissions, now that I think
of it.
Updated fstab now:
cat /etc/fstab
ff0023511d131fc2.a / ffs rw,sync,noatime 1 1
ff0023511d131fc2.b /usr/local ffs rw,wxallowed,nodev,sync,noatime 1 2
ff0023511d131fc2.d /var ffs rw,nodev,nosuid,sync,noatime 1 2
swap /tmp mfs rw,nosuid,noexec,nodev,-s=262144 0 0
swap /var/log mfs rw,nosuid,noexec,nodev,-s=524288,-P=/persist-fs/var/log 0 0
swap /var/run mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/var/run 0 0
swap /dev mfs rw,nosuid,noexec,-P=/persist-fs/dev,-i=2048,-s=32768 0 0


mount output
/dev/sd0a on / type ffs (local, noatime, synchronous)
/dev/sd0b on /usr/local type ffs (local, noatime, nodev, wxallowed, synchronous)
/dev/sd0d on /var type ffs (local, noatime, nodev, nosuid, synchronous)
mfs:95138 on /tmp type mfs (asynchronous, local, nodev, noexec,
nosuid, size=262144 512-blocks)
mfs:97883 on /var/log type mfs (asynchronous, local, nodev, noexec,
nosuid, size=524288 512-blocks)
mfs:12839 on /var/run type mfs (asynchronous, local, nodev, noexec,
nosuid, size=262144 512-blocks)
mfs:77475 on /dev type mfs (asynchronous, local, noexec, nosuid,
size=32768 512-blocks)
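
As a quick sanity check on the sizes above: the -s values are counted in
512-byte blocks, as the mount output shows. The small Python sketch below
only restates the fstab values (nothing in it is measured on a real box)
and converts them to MiB, giving a rough upper bound on what the mfs
mounts could hold if they were all filled. The names MFS_BLOCKS and
blocks_to_mib are made up for this illustration.

```python
# -s values from the fstab above, in 512-byte blocks (per the mount output).
MFS_BLOCKS = {
    "/tmp": 262144,
    "/var/log": 524288,
    "/var/run": 262144,
    "/dev": 32768,
}
BLOCK_SIZE = 512  # bytes per block


def blocks_to_mib(blocks):
    """Convert a count of 512-byte blocks to MiB."""
    return blocks * BLOCK_SIZE / (1024 * 1024)


sizes_mib = {mnt: blocks_to_mib(n) for mnt, n in MFS_BLOCKS.items()}
total_mib = sum(sizes_mib.values())

for mnt, mib in sizes_mib.items():
    print(f"{mnt}: {mib:.0f} MiB")
print(f"total: {total_mib:.0f} MiB")
```

On a router without swap, that total is worth comparing against the
installed RAM.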


Thanks again

On Wed, 10 Jul 2024 at 17:07, Tom Smyth  wrote:
>
> Hi Kirill,
> Ill give sync a go ... and see how  it impacts performance...
> thanks for the suggestion,
>
> On Wed, 10 Jul 2024 at 16:30, Kirill A. Korinsky  wrote:
> >
> > On Wed, 10 Jul 2024 14:44:28 +0100,
> > Tom Smyth  wrote:
> > >
> > > #cat /etc/fstab
> > >
> > > ff0023511d131fc2.a / ffs rw,softdep,noatime 1 1
> > > ff0023511d131fc2.b /usr/local ffs rw,wxallowed,nodev,softdep,noatime 1 2
> > > ff0023511d131fc2.d /var ffs rw,nodev,nosuid,softdep,noatime 1 2
> > > swap /tmp mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/tmp 0 0
> > > swap /var/log mfs rw,nosuid,noexec,nodev,-s=524288,-P=/persist-fs/var/log 0 0
> > > swap /var/run mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/var/run 0 0
> > > swap /dev mfs rw,nosuid,noexec,-P=/persist-fs/dev,-i=2048,-s=32768 0 0
> > >
> >
> > You can dramatically reduce the probability of errors that can't be fixed by
> > fsck on boot by adding sync. Especially with noatime, this seems like a
> > bulletproof setup.
> >
> > --
> > wbr, Kirill
>
>
>
> --
> Kindest regards,
> Tom Smyth.



-- 
Kindest regards,
Tom Smyth.



Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-10 Thread Tom Smyth
Hi Kirill,
I'll give sync a go and see how it impacts performance.
Thanks for the suggestion.

On Wed, 10 Jul 2024 at 16:30, Kirill A. Korinsky  wrote:
>
> On Wed, 10 Jul 2024 14:44:28 +0100,
> Tom Smyth  wrote:
> >
> > #cat /etc/fstab
> >
> > ff0023511d131fc2.a / ffs rw,softdep,noatime 1 1
> > ff0023511d131fc2.b /usr/local ffs rw,wxallowed,nodev,softdep,noatime 1 2
> > ff0023511d131fc2.d /var ffs rw,nodev,nosuid,softdep,noatime 1 2
> > swap /tmp mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/tmp 0 0
> > swap /var/log mfs rw,nosuid,noexec,nodev,-s=524288,-P=/persist-fs/var/log 0 0
> > swap /var/run mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/var/run 0 0
> > swap /dev mfs rw,nosuid,noexec,-P=/persist-fs/dev,-i=2048,-s=32768 0 0
> >
>
> You can dramatically reduce the probability of errors that can't be fixed by
> fsck on boot by adding sync. Especially with noatime, this seems like a
> bulletproof setup.
>
> --
> wbr, Kirill



-- 
Kindest regards,
Tom Smyth.



Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-10 Thread Tom Smyth
Hi Jan,
thanks for your reply and feedback;
please find my replies inline.

On Wed, 10 Jul 2024 at 16:28, Jan Stary  wrote:
>
> On Jul 10 14:44:28, tom.sm...@wirelessconnect.eu wrote:
> > we have been using  mfs mounted /var /dev and /tmp for years
>
> Why?
So that any writes to disk would simply be written to a memory filesystem, and
if there was a power cut there would be no changes happening to the
disk, because it is just being written to memory.


>
> > however  the impact of mfs (/var in particular) on upgrades has been
> > quite painful,
>
> How?
We lose new files in /var if the box is rebooted without first copying
the in-memory /var to the persistent storage (on shutdown).


>
> > my latest iteration for fstab is to  have
> >  / ,  /var /usr/local  and /tmp with different mount points to support
> > different mount options, (wxallowed for /usr/local)
> >
> > and to
> > mfs mount  /var/run,  /var/logs  /dev and /tmp
>
> I assume you mean /var/log (not /var/logs).
Yes (sorry )
>
> > #cat /etc/fstab
> >
> > ff0023511d131fc2.a / ffs rw,softdep,noatime 1 1
> > ff0023511d131fc2.b /usr/local ffs rw,wxallowed,nodev,softdep,noatime 1 2
> > ff0023511d131fc2.d /var ffs rw,nodev,nosuid,softdep,noatime 1 2
>
> So you _don't_ have /var on mfs ...
> Also, softdep no longer exists.
Thanks, it was an older option (now a no-op, kept for backward compatibility);
I just checked the manual there. I'll drop it from the deployment
script.



>
> > swap /tmp mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/tmp 0 0
> > swap /var/log mfs rw,nosuid,noexec,nodev,-s=524288,-P=/persist-fs/var/log 0 0
> > swap /var/run mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/var/run 0 0
> > swap /dev mfs rw,nosuid,noexec,-P=/persist-fs/dev,-i=2048,-s=32768 0 0
>
> Why do you need /tmp to persist?
Fair point. I was more interested in getting /tmp memory-mounted
(I don't care about persistence in that case).
Checking the manual.

> Why do you have a separate /dev?
When programs write to /dev/blah, is there a possibility of the
filesystem being updated?


> Why don't you have a separate /home?
It is a router / firewall / network appliance, not a standard desktop
or server; the users are admins, etc.
>
> > ###
> > This seems to solve problems with  upgrades and  package updates,
basically if the partition was not synced with a copy on shutdown you
would lose the updated files ...

>
> What problem?


>
> Jan
>


-- 
Kindest regards,
Tom Smyth.



Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-10 Thread Kirill A . Korinsky
On Wed, 10 Jul 2024 14:44:28 +0100,
Tom Smyth  wrote:
> 
> #cat /etc/fstab
> 
> ff0023511d131fc2.a / ffs rw,softdep,noatime 1 1
> ff0023511d131fc2.b /usr/local ffs rw,wxallowed,nodev,softdep,noatime 1 2
> ff0023511d131fc2.d /var ffs rw,nodev,nosuid,softdep,noatime 1 2
> swap /tmp mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/tmp 0 0
> swap /var/log mfs rw,nosuid,noexec,nodev,-s=524288,-P=/persist-fs/var/log 0 0
> swap /var/run mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/var/run 0 0
> swap /dev mfs rw,nosuid,noexec,-P=/persist-fs/dev,-i=2048,-s=32768 0 0
> 

You can dramatically reduce the probability of errors that can't be fixed by
fsck on boot by adding sync. Especially with noatime, this seems like a
bulletproof setup.

-- 
wbr, Kirill



Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-10 Thread Jan Stary
On Jul 10 14:44:28, tom.sm...@wirelessconnect.eu wrote:
> we have been using  mfs mounted /var /dev and /tmp for years

Why?

> however  the impact of mfs (/var in particular) on upgrades has been
> quite painful,

How?

> my latest iteration for fstab is to  have
>  / ,  /var /usr/local  and /tmp with different mount points to support
> different mount options, (wxallowed for /usr/local)
> 
> and to
> mfs mount  /var/run,  /var/logs  /dev and /tmp

I assume you mean /var/log (not /var/logs).

> #cat /etc/fstab
> 
> ff0023511d131fc2.a / ffs rw,softdep,noatime 1 1
> ff0023511d131fc2.b /usr/local ffs rw,wxallowed,nodev,softdep,noatime 1 2
> ff0023511d131fc2.d /var ffs rw,nodev,nosuid,softdep,noatime 1 2

So you _don't_ have /var on mfs ...
Also, softdep no longer exists.

> swap /tmp mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/tmp 0 0
> swap /var/log mfs rw,nosuid,noexec,nodev,-s=524288,-P=/persist-fs/var/log 0 0
> swap /var/run mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/var/run 0 0
> swap /dev mfs rw,nosuid,noexec,-P=/persist-fs/dev,-i=2048,-s=32768 0 0

Why do you need /tmp to persist?
Why do you have a separate /dev?
Why don't you have a separate /home?

> ###
> This seems to solve problems with  upgrades and  package updates,

What problem?

Jan



Re: Filesystem corruption on OpenBSD routers after power outage?

2024-07-10 Thread Tom Smyth
Folks,
sorry to revive an old thread, but this concerns OpenBSD routers in the field,
where power availability and graceful shutdowns / restarts are far
from guaranteed.

We have been using mfs-mounted /var, /dev and /tmp for years and it
has been quite successful (a few hundred devices running for a few
years), with only 1 or 2 failures (attributable to filesystem issues) in that time.

However, the impact of mfs (/var in particular) on upgrades has been
quite painful.
My latest iteration for fstab is to have
/, /var, /usr/local and /tmp on separate mount points to support
different mount options (wxallowed for /usr/local),

and to
mfs mount /var/run, /var/logs, /dev and /tmp.

#cat /etc/fstab

ff0023511d131fc2.a / ffs rw,softdep,noatime 1 1
ff0023511d131fc2.b /usr/local ffs rw,wxallowed,nodev,softdep,noatime 1 2
ff0023511d131fc2.d /var ffs rw,nodev,nosuid,softdep,noatime 1 2
swap /tmp mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/tmp 0 0
swap /var/log mfs rw,nosuid,noexec,nodev,-s=524288,-P=/persist-fs/var/log 0 0
swap /var/run mfs rw,nosuid,noexec,nodev,-s=262144,-P=/persist-fs/var/run 0 0
swap /dev mfs rw,nosuid,noexec,-P=/persist-fs/dev,-i=2048,-s=32768 0 0

###
This seems to solve problems with  upgrades and  package updates,

I have left /var/www/logs/  out as we are not using httpd /
webservices  on the boxes in the field

are there other directories that contain files that regularly change
that should be mfs mounted ?


Any thoughts / feedback welcome

Thanks
Tom Smyth



On Sun, 15 Mar 2020 at 15:26, Maurice McCarthy  wrote:
>
> There is a discussion about softdeps here
> http://openbsd-archive.7691.n7.nabble.com/What-are-the-disadvantages-of-soft-updates-td264283.html
>


-- 
Kindest regards,
Tom Smyth.



Re: Filesystem corruption on OpenBSD routers after power outage?

2020-03-15 Thread Maurice McCarthy
There is a discussion about softdeps here
http://openbsd-archive.7691.n7.nabble.com/What-are-the-disadvantages-of-soft-updates-td264283.html



Re: Filesystem corruption on OpenBSD routers after power outage?

2020-03-15 Thread jeanfrancois

Good day,

I have a few questions: why are soft updates not on by default, and
do they help consistency in case of failure? Are they recommended
to be turned on only in specific cases? Unless I am mistaken,
they help keep the drive consistent, avoid the need for fsck
in most hard-failure cases, and only risk leaving unused spare
sectors, which can be recovered later.

Would they not help with the file system in general, or are there
drawbacks to their use?

Regards, J.F.


Sidenote: Filesystem corruption on OpenBSD routers after power outage?

2019-06-18 Thread Kevin Chadwick


> Even after many tries, I have not yet been able to corrupt the
> filesystem so fsck cannot repair it without manual intervention.

Another, less severe corner failure case I have found is on a couple of buggy
386 laptops (that will be replaced soon anyway) with temperamental over-temperature
shutdowns on some bootups (and a now-failing host controller, I'm guessing due to
age/damage): LOST+FOUND can fill up the filesystem, preventing
library and kernel randomisation from taking effect. I have a script to check
and remove older LOST+FOUND files. It's unlikely that there would be anything
important lost on these browsing machines anyway.
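
The script itself is not shown in this thread. A cleanup of the kind
described might look roughly like the Python sketch below. Everything in
it is an assumption for illustration: the function name prune_old_files,
the 30-day default, and the choice to remove only regular files (fsck can
also leave directories behind, which this sketch does not touch).

```python
import os
import time


def prune_old_files(directory, max_age_days=30, now=None):
    """Remove regular files in `directory` older than max_age_days.

    Hypothetical helper: the directory and the age threshold are
    assumptions for illustration. Returns the list of removed paths.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for entry in os.scandir(directory):
        # Only touch regular files; skip subdirectories and symlinks.
        if entry.is_file(follow_symlinks=False):
            if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                os.remove(entry.path)
                removed.append(entry.path)
    return removed
```

It could be run from cron against the relevant directory on each
filesystem, for example prune_old_files("/lost+found") (path assumed).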

I guess a proper solution would be thorny?



Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-17 Thread Theo de Raadt
Ted Unangst  wrote:

> Theo de Raadt wrote:
> > How does sync() fix this?  Please explain this.  Look at the source
> > code.
> > 
> > sync() is an asynchronous call requesting synchronization, and once
> > it has marked the blocks that should be pushed, it returns before
> > the work has been done.
> 
> Ah, indeed.
> 
> > > 2. cp could do an fsync call. There was a thread about this a while ago?
> > 
> > How does fsync fix this?  What if it returns an error.  What do you do next.
> > Should cp spin until fsync returns non-error, or what should it do.
> 
> Exit with an error? Same as if it got EIO or ENOSPC from a write? Then the
> command chain will stop before the mv.

The OP demonstrated a problem.  Any solution presented should try to fix
the problem, not just "maybe".




Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-17 Thread Ted Unangst
Theo de Raadt wrote:
> How does sync() fix this?  Please explain this.  Look at the source
> code.
> 
> sync() is an asynchronous call requesting synchronization, and once
> it has marked the blocks that should be pushed, it returns before
> the work has been done.

Ah, indeed.

> > 2. cp could do an fsync call. There was a thread about this a while ago?
> 
> How does fsync fix this?  What if it returns an error.  What do you do next.
> Should cp spin until fsync returns non-error, or what should it do.

Exit with an error? Same as if it got EIO or ENOSPC from a write? Then the
command chain will stop before the mv.
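
For readers following the cp/mv discussion, here is a minimal Python
sketch of the "copy, fsync, then rename" idea. It is an illustration
only, not how reorder_kernel or cp(1) actually behave, and it does not
settle the question raised above of whether fsync is a sufficient fix.
The function name install_atomically and the ".new" suffix are
hypothetical. If the copy or the fsync fails, the OSError propagates and
the rename never happens, which mirrors the "exit with an error"
behaviour suggested above.

```python
import os
import shutil


def install_atomically(src, dst):
    """Copy src to a temporary name beside dst, fsync it, then rename over dst.

    Hypothetical helper for illustration. Any OSError (EIO, ENOSPC, a
    failed fsync) is raised before the rename, so dst is left untouched.
    """
    tmp = dst + ".new"
    with open(src, "rb") as fin, open(tmp, "wb") as fout:
        shutil.copyfileobj(fin, fout)
        fout.flush()             # push Python's userspace buffer to the kernel
        os.fsync(fout.fileno())  # ask the kernel to write the file data out
    os.replace(tmp, dst)         # rename(2): atomic within one filesystem
    # Also fsync the containing directory so the new name is written out.
    dirfd = os.open(os.path.dirname(os.path.abspath(dst)), os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)
```

The temporary file must live on the same filesystem as the destination,
otherwise the rename is not atomic.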



Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-17 Thread Theo de Raadt
Ted Unangst  wrote:

> Mogens Jensen wrote:
> > Even after many tries, I have not yet been able to corrupt the
> > filesystem so fsck cannot repair it without manual intervention.
> > However, if power is removed  while the 'reorder_kernel' script runs,
> > the system will become completely unbootable. I could do this multiple
> > times.
> 
> The new kernel is installed like this:
> umask 077 && cp bsd /nbsd && mv /nbsd /bsd
> 
> So a crash during compilation or linking shouldn't affect reboot. However,
> there's a window between cp and mv where some of the new kernel may reside in
> unwritten dirty buffers. Then mv will rewrite the directory entry. A crash at
> this point will leave you with a broken kernel.

I agree, dirty buffers are being pushed too slowly.

> A few possible fixes.
> 
> 1. Insert a sync call in there. Kinda heavyweight, but works.

How does sync() fix this?  Please explain this.  Look at the source
code.

sync() is an asynchronous call requesting synchronization, and once
it has marked the blocks that should be pushed, it returns before
the work has been done.

BUGS
 sync() may return before the buffers are completely flushed.

This has been marked as a bug for over half your life, to the point
where it isn't actually a bug.  It is the design of the system call.

So it doesn't solve what you want to solve.

Who will be the first person to say sync(); sync(); sync();

> 2. cp could do an fsync call. There was a thread about this a while ago?

How does fsync fix this?  What if it returns an error.  What do you do next.
Should cp spin until fsync returns non-error, or what should it do.

> 3. mv could do the fsync instead, to make sure it doesn't move incomplete
> files.

Same question.



Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-17 Thread Ted Unangst
Mogens Jensen wrote:
> Even after many tries, I have not yet been able to corrupt the
> filesystem so fsck cannot repair it without manual intervention.
> However, if power is removed  while the 'reorder_kernel' script runs,
> the system will become completely unbootable. I could do this multiple
> times.

The new kernel is installed like this:
umask 077 && cp bsd /nbsd && mv /nbsd /bsd

So a crash during compilation or linking shouldn't affect reboot. However,
there's a window between cp and mv where some of the new kernel may reside in
unwritten dirty buffers. Then mv will rewrite the directory entry. A crash at
this point will leave you with a broken kernel.

A few possible fixes.

1. Insert a sync call in there. Kinda heavyweight, but works.

2. cp could do an fsync call. There was a thread about this a while ago?

3. mv could do the fsync instead, to make sure it doesn't move incomplete
files.



Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-17 Thread Mogens Jensen
Since posting this question I have been trying to intentionally corrupt
the router filesystem, by simulating power outages while writing files
and various other things.

Even after many tries, I have not yet been able to corrupt the
filesystem so fsck cannot repair it without manual intervention.
However, if power is removed  while the 'reorder_kernel' script runs,
the system will become completely unbootable. I could do this multiple
times.

A situation could be that the electric grid is unstable, so power will
return and a new outage will occur shortly after, if this happens
during the boot process at the exact time 'reorder_kernel' is running,
the system will break because of a corrupt kernel and repair is not
possible remotely.

Is there a way to avoid 'reorder_kernel' during the boot process and
run it manually instead?

Thanks in advance.

Mogens Jensen


Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-07 Thread Consus
On 19:30 Tue 04 Jun, Mogens Jensen wrote:
> I'm going to build a router for use in a remote location, and I have
> chosen OpenBSD 6.5 for the task. Unfortunately, it's not possible to
> protect the router with an UPS, so it will have to be resilient enough
> to survive sudden power outages and still boot without manual
> intervention.
> 
> In the past I have built a few Linux based routers and they were
> configured to run from RAM. I have made some research to see if this is
> also possible on OpenBSD and found that, while there are solutions to
> have / read-only, none of this is officially supported.
> 
> Can anyone with experience running OpenBSD routers without UPS, tell if
> filesystem corruption is going to be a problem after power outages, or
> if there are any officially supported ways to make the system resilient
> enough to not break after a power outage?
> 
> I'm using an mSATA disk with MLC flash in the router.
> 
> Thanks in advance.

I've had a couple of issues with my APU2-based router on 6.4. After the
power outage the newly linked kernel was corrupted, and some files ended
up in lost+found.



Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-06 Thread Tom Smyth
Yeah Marko,
this blog did help me when I was researching the issue ...
Cheers,


On Thu, 6 Jun 2019 at 10:07, Marko Cupać  wrote:

> On Tue, 04 Jun 2019 19:30:08 +
> Mogens Jensen  wrote:
>
> > Can anyone with experience running OpenBSD routers without UPS, tell
> > if filesystem corruption is going to be a problem after power
> > outages, or if there are any officially supported ways to make the
> > system resilient enough to not break after a power outage?
>
> I have described my !!!UNSUPPORTED!!! setup !!!WARNING, BLATANT
> SELF-PROMOTION!!! here:
>
>
> https://www.mimar.rs/blog/how-to-increase-openbsds-resilience-to-power-outages
>
> So far I have two 6.5's on PCengine's apu2d4 (~20 6.2-6.4's). The only
> "problem" I have since 6.4 is that I have to mount / rw when tcpdumping
> because unveil does not like ro /etc.
>
> HTH,
> --
> Before enlightenment - chop wood, draw water.
> After  enlightenment - chop wood, draw water.
>
> Marko Cupać
> https://www.mimar.rs/
>
>

-- 
Kindest regards,
Tom Smyth.


Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-06 Thread Marko Cupać
On Tue, 04 Jun 2019 19:30:08 +
Mogens Jensen  wrote:

> Can anyone with experience running OpenBSD routers without UPS, tell
> if filesystem corruption is going to be a problem after power
> outages, or if there are any officially supported ways to make the
> system resilient enough to not break after a power outage?

I have described my !!!UNSUPPORTED!!! setup !!!WARNING, BLATANT
SELF-PROMOTION!!! here:

https://www.mimar.rs/blog/how-to-increase-openbsds-resilience-to-power-outages

So far I have two 6.5's on PCengine's apu2d4 (~20 6.2-6.4's). The only
"problem" I have since 6.4 is that I have to mount / rw when tcpdumping
because unveil does not like ro /etc.

HTH,
-- 
Before enlightenment - chop wood, draw water.
After  enlightenment - chop wood, draw water.

Marko Cupać
https://www.mimar.rs/



Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-05 Thread Jan Stary
On Jun 04 19:30:08, mogens-jen...@protonmail.com wrote:
> Can anyone with experience running OpenBSD routers without UPS, tell if
> filesystem corruption is going to be a problem after power outages

I have been using various ALIXes with a CF card as storage,
and in the 10+ years, I had to do a manual fsck on them about
four times after a power outage. (Most often, the only indication
of an outage is the SMS the router sends me upon reboot.)

Jan



Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-05 Thread Kenneth Gober
On Tue, Jun 4, 2019 at 3:34 PM Mogens Jensen wrote:

> Can anyone with experience running OpenBSD routers without UPS, tell if
> filesystem corruption is going to be a problem after power outages, or
> if there are any officially supported ways to make the system resilient
> enough to not break after a power outage?
>
> I'm using an mSATA disk with MLC flash in the router.
>

I have some OpenBSD routers without UPS protection (Soekris net6501 devices)
and after using them for some years, I think it's not possible to have
absolute 100% protection from filesystem corruption due to power problems,
without causing other problems such as making the system overly fragile to
upgrade or maintain.

However, it works reasonably well to put /var/log on an MFS file system, and
set up a cron job (as well as a line in /etc/rc.shutdown) to periodically
rsync /var/log to another directory (so that logs will be preserved after a
reboot).  This works fairly well, and the system comes up properly after
power failures easily over 98% of the time.  Very rarely (i.e. I have seen it
happen twice in a decade) you will get unlucky and have corruption anyway
that requires you to run "fsck -y" manually.  This is rare enough that I
haven't bothered trying to automate it away.

To accomplish this, I installed OpenBSD with /var/log being a separate
filesystem, then edited /etc/fstab to rename /var/log to /mfs/log, and add a
new entry for /var/log:

swap /var/log mfs rw,nodev,nosuid,-s=128M,-P=/mfs/log 0 0

Initializing the MFS /var/log by loading from /mfs/log, combined with an
rsync command in /etc/rc.shutdown, is what gives the illusion of /var/log
being preserved across reboots.

-ken
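
A minimal sketch of the save step described above, assuming rsync is
installed and using the /var/log and /mfs/log paths from the fstab entry.
The save_logs name and the RSYNC override variable are illustrative and not
part of the original setup; the -a flag is an assumption, since the message
does not say which rsync options were used.

```shell
#!/bin/sh
# Hypothetical helper: copy the in-memory log directory back to its
# on-disk counterpart. RSYNC may be overridden, e.g. for testing.
RSYNC=${RSYNC:-rsync}

# save_logs SRC DST: copy the contents of SRC into DST. The trailing
# slashes make rsync copy the directory contents rather than the
# directory itself.
save_logs() {
	$RSYNC -a "$1/" "$2/"
}
```

With this sourced from /etc/rc.shutdown, the call would be
"save_logs /var/log /mfs/log"; the same call could be placed in a small
script run from root's crontab at whatever interval of log loss is
acceptable.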


Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-05 Thread Federico Giannici

Is there any way to tell the boot script to use the "-y" flag in fsck?

If something goes wrong with simple fsck, I always simply do a "fsck -y".
There is no other option for me. So, it would be VERY useful if this could
be done automatically instead of interrupting the router startup.


Thanks.



On 6/5/19 1:30 AM, Nick Holland wrote:

> On 6/4/19 1:29 PM, Mogens Jensen wrote:
> > I'm going to build a router for use in a remote location, and I have
> > chosen OpenBSD 6.5 for the task. Unfortunately, it's not possible to
> > protect the router with an UPS, so it will have to be resilient enough
> > to survive sudden power outages and still boot without manual
> > intervention.
> >
> > In the past I have built a few Linux based routers and they were
> > configured to run from RAM. I have made some research to see if this is
> > also possible on OpenBSD and found that, while there are solutions to
> > have / read-only, none of this is officially supported.
> >
> > Can anyone with experience running OpenBSD routers without UPS, tell if
> > filesystem corruption is going to be a problem after power outages, or
> > if there are any officially supported ways to make the system resilient
> > enough to not break after a power outage?
> >
> > I'm using an mSATA disk with MLC flash in the router.
>
> I realized a few decades ago that consumer UPSs are a bad investment.
> Industrial UPSs are a dubious idea in business unless you have a
> dual-power supply machine and can hook each PS to a DIFFERENT UPS -- in
> my area, grid power is more reliable than cheap UPSes (your mileage may
> vary).  And you have to MAINTAIN your UPSs, otherwise after a few years,
> UPSs turn minor glitches into power outages (thank you very much).
>
> I'm also fond of proving my own claims, so I very often just yank the
> cord on my systems rather than doing orderly shutdowns.
>
> Yes, if you drop power on an OpenBSD system, you will get an fsck on
> reboot.  Solution: Make your partitions as small as reasonable.  Just
> because you got a 500G disk for cheap, no reason to allocate all 500G.
> For a router, 10G is PLENTY, and will fsck quickly.  If you have slow
> media (i.e., flash drives), you might want to aim for 1G.  Every once in
> a long while, you might catch a really bad time for the power to go out,
> and have to manually say "Fix it!" to fsck, but for the most part, the
> system will just come back up after the power comes back on.
>
> The less you write to disk, the less risk you have of having to manually
> intervene in your system's reboot.  IF you want to do some fancy
> logging, keep the logging partition out of the fstab file, and have a
> script that brings it up with a "fsck -y" AFTER the system comes up, and
> start the fancy logging AFTER the big logging partition successfully mounts.
>
> But don't do stupid games to try to improve your chances, just make sure
> there's a monitor and keyboard available to fix any problems that might
> happen.  Simple systems have simple problems.  Complex systems break in
> complex ways.  You want me to swear you'll never have to manually
> intervene in boot after an "event"?  Nope.  But I've walked
> non-technical people through single-user fsck's over the phone; when
> your bastardized system breaks, you will be down for a lot longer and
> you will be going on-site to fix.
>
> Nick.





Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-05 Thread Marc Espie
On Wed, Jun 05, 2019 at 05:12:20AM +, Roderick wrote:
> 
> "-o union" was last in 3.7, disappeared in 3.8. Was there a reason?
> 
> https://man.openbsd.org/OpenBSD-3.7/mount

Yes, the developers felt we couldn't make it work without bugs in a sane
way.

Locks over locks is insanely hard to get right.



Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-04 Thread Otto Moerbeek
On Wed, Jun 05, 2019 at 05:12:20AM +, Roderick wrote:

> 
> "-o union" was last in 3.7, disappeared in 3.8. Was there a reason?

It was broken and complicated the filesystem code beyond measure.

-Otto
> 
> https://man.openbsd.org/OpenBSD-3.7/mount
> 
> Rodrigo
> 
> > What also would be practical is a "mount -o union" like in FreeBSD,
> > but unfortunately I do not see it in OpenBSD.
> > 
> > Then one could mount a mfs system over a normal one, only to be read.
> > 
> > Rodrigo
> 



Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-04 Thread Roderick



"-o union" was last in 3.7, disappeared in 3.8. Was there a reason?

https://man.openbsd.org/OpenBSD-3.7/mount

Rodrigo


> What also would be practical is a "mount -o union" like in FreeBSD,
> but unfortunately I do not see it in OpenBSD.
>
> Then one could mount a mfs system over a normal one, only to be read.
>
> Rodrigo




Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-04 Thread Roderick



What also would be practical is a "mount -o union" like in FreeBSD,
but unfortunately I do not see it in OpenBSD.

Then one could mount a mfs system over a normal one, only to be read.

Rodrigo



Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-04 Thread Roderick



Look at -P option in mount_mfs.

Rodrigo



Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-04 Thread gwes




On 6/4/19 3:30 PM, Mogens Jensen wrote:

> I'm going to build a router for use in a remote location, and I have
> chosen OpenBSD 6.5 for the task. Unfortunately, it's not possible to
> protect the router with an UPS, so it will have to be resilient enough
> to survive sudden power outages and still boot without manual
> intervention.
>
> In the past I have built a few Linux based routers and they were
> configured to run from RAM. I have made some research to see if this is
> also possible on OpenBSD and found that, while there are solutions to
> have / read-only, none of this is officially supported.
>
> Can anyone with experience running OpenBSD routers without UPS, tell if
> filesystem corruption is going to be a problem after power outages, or
> if there are any officially supported ways to make the system resilient
> enough to not break after a power outage?
>
> I'm using an mSATA disk with MLC flash in the router.
>
> Thanks in advance.
>
> Mogens Jensen

As Mr. Holland points out, a UPS doesn't really help overall reliability.

In practice, /, /bin, and /usr are effectively read-only except for
kernel and shared library randomization at boot time.
/var gets written infrequently for logs, etc.
/tmp, of course, is frequently written but its contents are irrelevant
after a reboot.

An important way to reduce disk activity is to mount all
filesystems "noatime". This suppresses effectively all writes
to /, /bin, and /usr after boot. Changes to /var get pushed to
disk fairly quickly.
The likelihood of significant corruption is very small.

In practice, I knock my router off-line once or twice a month by
messing with power cables nearby. The only way I find out is by
looking at the logs. I've never had to manually fsck any of my
routers except after electrical storms - and only then after moving
the disk to a non-smoking chassis.

Physical access to a console by a trusted person or remote console
access is required. Not for any failings of OpenBSD in particular but for
the guaranteed perversity of electronic devices and unforeseeable
acts of nature and man messing up the local environment.

You will [should] access the system twice a year to install the latest
release.

[ insert standard disclaimers here ]

Geoff Steckel



Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-04 Thread Nick Holland
On 6/4/19 1:29 PM, Mogens Jensen wrote:
> I'm going to build a router for use in a remote location, and I have
> chosen OpenBSD 6.5 for the task. Unfortunately, it's not possible to
> protect the router with an UPS, so it will have to be resilient enough
> to survive sudden power outages and still boot without manual
> intervention.
> 
> In the past I have built a few Linux based routers and they were
> configured to run from RAM. I have made some research to see if this is
> also possible on OpenBSD and found that, while there are solutions to
> have / read-only, none of this is officially supported.
> 
> Can anyone with experience running OpenBSD routers without UPS, tell if
> filesystem corruption is going to be a problem after power outages, or
> if there are any officially supported ways to make the system resilient
> enough to not break after a power outage?
> 
> I'm using an mSATA disk with MLC flash in the router.

I realized a few decades ago that consumer UPSs are a bad investment.
Industrial UPSs are a dubious idea in business unless you have a
dual-power supply machine and can hook each PS to a DIFFERENT UPS -- in
my area, grid power is more reliable than cheap UPSes (your mileage may
vary).  And you have to MAINTAIN your UPSs, otherwise after a few years,
UPSs turn minor glitches into power outages (thank you very much).

I'm also fond of proving my own claims, so I very often just yank the
cord on my systems rather than doing orderly shutdowns.

Yes, if you drop power on an OpenBSD system, you will get an fsck on
reboot.  Solution: Make your partitions as small as reasonable.  Just
because you got a 500G disk for cheap, no reason to allocate all 500G.
For a router, 10G is PLENTY, and will fsck quickly.  If you have slow
media (i.e., flash drives), you might want to aim for 1G.  Every once in
a long while, you might catch a really bad time for the power to go out,
and have to manually say "Fix it!" to fsck, but for the most part, the
system will just come back up after the power comes back on.

The less you write to disk, the less risk you have of having to manually
intervene in your system's reboot.  IF you want to do some fancy
logging, keep the logging partition out of the fstab file, and have a
script that brings it up with a "fsck -y" AFTER the system comes up, and
start the fancy logging AFTER the big logging partition successfully mounts.

But don't do stupid games to try to improve your chances, just make sure
there's a monitor and keyboard available to fix any problems that might
happen.  Simple systems have simple problems.  Complex systems break in
complex ways.  You want me to swear you'll never have to manually
intervene in boot after an "event"?  Nope.  But I've walked
non-technical people through single-user fsck's over the phone; when
your bastardized system breaks, you will be down for a lot longer and
you will be going on-site to fix.

Nick.
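
A minimal sketch of the late-mount approach described above: the logging
partition is kept out of /etc/fstab and brought up by a script once the
system is running. The function name, the FSCK and MOUNT override variables,
and any device or mount point passed in are hypothetical. The sketch assumes
that a non-zero exit status from fsck means the partition should not be
mounted.

```shell
#!/bin/sh
# Hypothetical sketch: bring up a logging partition that is NOT listed
# in /etc/fstab, after the system has booted.
FSCK=${FSCK:-fsck}
MOUNT=${MOUNT:-mount}

# mount_log_partition DEV MNT
# Check and repair DEV without prompting, then mount it on MNT.
# Returns non-zero (and mounts nothing) if fsck reports a failure.
mount_log_partition() {
	$FSCK -y "$1" && $MOUNT "$1" "$2"
}
```

A caller would start the heavier logging only when the function succeeds,
for example "mount_log_partition DEV MNT && start-the-logger", so a damaged
log partition never blocks the router itself from coming up.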



Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-04 Thread Tom Smyth
There is also an option for setting fsck to approve fixes without a prompt,
but I can't think of it off the top of my head... this would be useful
to set on your routers also.


On Tue, 4 Jun 2019 at 21:05, Tom Smyth  wrote:

> Hi Mogens,
>
> there are a number of threads on this if you search the misc archives on
> marc.info,
>
> but setting the softdep,noatime mount options in /etc/fstab is advisable
>
> for routers I tend to use mfs for partitions that tend to get written to
> a lot
>
> the following entries (/etc/fstab) show how I use mfs on my routers...
> swap /tmp mfs rw,nosuid,noexec,nodev,-s=512000,-P=/directorythatcontainsfilesthatwillbecopiedtomemoryatbootup/tmp 0 0
> swap /var mfs rw,nosuid,noexec,nodev,-s=1024000,-P=/directorythatcontainsfilesthatwillbecopiedtomemoryatbootup/var 0 0
> swap /dev mfs rw,nosuid,noexec,-P=/directorythatcontainsfilesthatwillbecopiedtomemoryatbootup/dev,-i=2048,-s=102400 0 0
>
> but bear in mind that this uses up to 1.6GB of RAM ... so you might
> want to tweak it to suit your needs...
>
> check out Conway's resflash and Cappuccio's flashrd also
>
> https://www.packetmischief.ca/openbsd-compact-flash-firewall/
>
>  I hope this helps
> Tom Smyth
>
> On Tue, 4 Jun 2019 at 20:31, Mogens Jensen wrote:
>
>> I'm going to build a router for use in a remote location, and I have
>> chosen OpenBSD 6.5 for the task. Unfortunately, it's not possible to
>> protect the router with an UPS, so it will have to be resilient enough
>> to survive sudden power outages and still boot without manual
>> intervention.
>>
>> In the past I have built a few Linux based routers and they were
>> configured to run from RAM. I have made some research to see if this is
>> also possible on OpenBSD and found that, while there are solutions to
>> have / read-only, none of this is officially supported.
>>
>> Can anyone with experience running OpenBSD routers without UPS, tell if
>> filesystem corruption is going to be a problem after power outages, or
>> if there are any officially supported ways to make the system resilient
>> enough to not break after a power outage?
>>
>> I'm using an mSATA disk with MLC flash in the router.
>>
>> Thanks in advance.
>>
>> Mogens Jensen
>>
>
>
> --
> Kindest regards,
> Tom Smyth.
>


-- 
Kindest regards,
Tom Smyth.


Re: Filesystem corruption on OpenBSD routers after power outage?

2019-06-04 Thread Tom Smyth
Hi Mogens,

there are a number of threads on this if you search the misc archives on
marc.info,

but setting the softdep,noatime mount options in /etc/fstab is advisable

for routers I tend to use mfs for partitions that tend to get written to
a lot

the following entries (/etc/fstab) show how I use mfs on my routers...
swap /tmp mfs rw,nosuid,noexec,nodev,-s=512000,-P=/directorythatcontainsfilesthatwillbecopiedtomemoryatbootup/tmp 0 0
swap /var mfs rw,nosuid,noexec,nodev,-s=1024000,-P=/directorythatcontainsfilesthatwillbecopiedtomemoryatbootup/var 0 0
swap /dev mfs rw,nosuid,noexec,-P=/directorythatcontainsfilesthatwillbecopiedtomemoryatbootup/dev,-i=2048,-s=102400 0 0

but bear in mind that this uses up to 1.6GB of RAM ... so you might want
to tweak it to suit your needs...

check out Conway's resflash and Cappuccio's flashrd also

https://www.packetmischief.ca/openbsd-compact-flash-firewall/

 I hope this helps
Tom Smyth

On Tue, 4 Jun 2019 at 20:31, Mogens Jensen wrote:

> I'm going to build a router for use in a remote location, and I have
> chosen OpenBSD 6.5 for the task. Unfortunately, it's not possible to
> protect the router with an UPS, so it will have to be resilient enough
> to survive sudden power outages and still boot without manual
> intervention.
>
> In the past I have built a few Linux based routers and they were
> configured to run from RAM. I have made some research to see if this is
> also possible on OpenBSD and found that, while there are solutions to
> have / read-only, none of this is officially supported.
>
> Can anyone with experience running OpenBSD routers without UPS, tell if
> filesystem corruption is going to be a problem after power outages, or
> if there are any officially supported ways to make the system resilient
> enough to not break after a power outage?
>
> I'm using an mSATA disk with MLC flash in the router.
>
> Thanks in advance.
>
> Mogens Jensen
>


-- 
Kindest regards,
Tom Smyth.


Filesystem corruption on OpenBSD routers after power outage?

2019-06-04 Thread Mogens Jensen
I'm going to build a router for use in a remote location, and I have
chosen OpenBSD 6.5 for the task. Unfortunately, it's not possible to
protect the router with an UPS, so it will have to be resilient enough
to survive sudden power outages and still boot without manual
intervention.

In the past I have built a few Linux based routers and they were
configured to run from RAM. I have made some research to see if this is
also possible on OpenBSD and found that, while there are solutions to
have / read-only, none of this is officially supported.

Can anyone with experience running OpenBSD routers without UPS, tell if
filesystem corruption is going to be a problem after power outages, or
if there are any officially supported ways to make the system resilient
enough to not break after a power outage?

I'm using an mSATA disk with MLC flash in the router.

Thanks in advance.

Mogens Jensen