Re: [ptxdist] [PATCH] rootfs: keep /var writable, even if the rootfs is read-only

2019-06-05 Thread Ulrich Ölmann
Hi Jürgen,

please find some adjustments inline.

On Tue, Jun 04 2019 at 18:00 +0200, Juergen Borleis wrote:
> Having a read-only root filesystem is always a source of pain and trouble.
> Many applications and tools expect to be able to store their state or
> caching data or at least their logs somewhere in the filesystem.
>
> The '/var' directory tree has a well known structure according to the
> "File System Hierarchy Standard" and is used by all carefully designed
> programs. Thus, this change provides a way to have this '/var' directory
> tree writable, even if the main root filesystem is mounted read-only. It
> uses an overlay filesystem and by default a RAM disk to store changed and
> added data to this directory tree in a non persistent manner.
>
> Due to the nature of the overlay filesystem the underlaying files from the
> main root filesystem can still be accessed.
>
> This approach requires the overlay filesystem support from the Linux
> kernel. In order to use it, the feature CONFIG_OVERLAY_FS must be enabled.
>
> A BSP can change the overlaying filesystem by providing its own
> 'run-varoverlay.mount' in order to restrict the used RAM disk differently
> or switch to a different local storage.
>
> Signed-off-by: Juergen Borleis 
> ---
>  doc/daily_work.inc                            | 97 +++
>  projectroot/etc/fstab                         |  6 +-
>  .../lib/systemd/system/run-varoverlayfs.mount | 10 ++
>  projectroot/usr/lib/systemd/system/var.mount  |  9 ++
>  projectroot/usr/sbin/mount.varoverlayfs       | 11 +++
>  rules/rootfs.in                               | 15 +++
>  rules/rootfs.make                             | 23 -
>  7 files changed, 164 insertions(+), 7 deletions(-)
>  create mode 100644 projectroot/usr/lib/systemd/system/run-varoverlayfs.mount
>  create mode 100644 projectroot/usr/lib/systemd/system/var.mount
>  create mode 100644 projectroot/usr/sbin/mount.varoverlayfs
>
> diff --git a/doc/daily_work.inc b/doc/daily_work.inc
> index 74da11953..093f069bf 100644
> --- a/doc/daily_work.inc
> +++ b/doc/daily_work.inc
> @@ -1371,3 +1371,100 @@ in the build machine's filesystem also for the target filesystem image. With
>  a different ``umask`` than ``0022`` at build-time this may fail badly at
>  run-time with strange erroneous behaviour (for example some daemons with
>  regular user permissions cannot acces their own configuration files).
> +
> +Read Only Filesystem
> +--------------------
> +
> +A system can run a read-only root filesystem in order to have a unit which
> +can be powered off at any time, without any previous shutting down sequence.

s/shutting/shut/

> +
> +But many applications and tools are still expecting a writable filesystem to
> +temporarely store some kind of data or logging information for example. All

s/temporarely/temporarily/

> +these write attempts will fail and thus, the applications and tools will fail,
> +too.
> +
> +According to the *Filesystem Hierarchy Standard 2.3* the directory tree in
> +'/var/' is traditionally writable and its content is persistent across system
> +restarts. Thus, this directory tree is used by most applications and tools to
> +store their data.
> +
> +The *Filesystem Hierarchy Standard 2.3* defines the following directories
> +below '/var':
> +
> +- 'cache/': Application specific cache data
> +- 'crash/': System crash dumps
> +- 'lib/':   Application specific variable state information
> +- 'lock/':  Lock files
> +- 'log/':   Log files and directories
> +- 'run/':   Data relevant to run processes

s/run/running/

> +- 'spool/': Application spool data
> +- 'tmp/':   Temporary files preserved between system reboots
> +
> +Since this writable directory tree is useful and valid for full blown host

s/Since/Although/ ?

> +machines, an embedded system can behave differently here: For example the

s/For example the/for example a/

> +requirement can drop the persistency of changed data across reboots and always

s/persistency/persistence/

> +start with empty directories.
> +
> +Partially RAM Disks
> +~~~
> +
> +This is the default behaviour of PTXdist: it mounts a couple of RAM disks over
> +directories in ``/var`` expected to be writable by various applications and
> +tools. These RAM disks start alway in an empty state and are defined as follows:

s/alway/always/

> +
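> ++-------------+----------------------------------------+
> +| mount point | mount options                          |
> ++=============+========================================+
> +| /var/log    | nosuid,nodev,noexec,mode=0755,size=10% |
> ++-------------+----------------------------------------+
> +| /var/lock   | nosuid,nodev,noexec,mode=0755,size=1M  |
> ++-------------+----------------------------------------+
> +| /var/tmp    | nosuid,node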

[ptxdist] [PATCH] rootfs: keep /var writable, even if the rootfs is read-only

2019-06-04 Thread Juergen Borleis
Having a read-only root filesystem is always a source of pain and trouble.
Many applications and tools expect to be able to store their state or
caching data or at least their logs somewhere in the filesystem.

The '/var' directory tree has a well known structure according to the
"File System Hierarchy Standard" and is used by all carefully designed
programs. Thus, this change provides a way to have this '/var' directory
tree writable, even if the main root filesystem is mounted read-only. It
uses an overlay filesystem and by default a RAM disk to store changed and
added data to this directory tree in a non persistent manner.

Due to the nature of the overlay filesystem the underlaying files from the
main root filesystem can still be accessed.

This approach requires the overlay filesystem support from the Linux
kernel. In order to use it, the feature CONFIG_OVERLAY_FS must be enabled.

A BSP can change the overlaying filesystem by providing its own
'run-varoverlay.mount' in order to restrict the used RAM disk differently
or switch to a different local storage.

Signed-off-by: Juergen Borleis 
---
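Note (not part of the commit message): the overlay setup described above
amounts to a tmpfs that holds the overlay's upper and work directories, with
the read-only '/var' from the rootfs as the lower layer. Below is a minimal
dry-run sketch of that idea. The staging directory '/run/varoverlayfs', the
tmpfs options and the helper name 'overlay_opts' are assumptions for
illustration only and may differ from what 'mount.varoverlayfs' in this patch
really does.

```shell
#!/bin/sh
# Hypothetical sketch: paths and options are assumptions, not taken from the patch.
OVL_BASE="${OVL_BASE:-/run/varoverlayfs}"   # assumed tmpfs staging directory

# Build the overlayfs option string.
#   $1: lower (read-only) directory
#   $2: directory on the writable filesystem holding 'upper' and 'work'
overlay_opts() {
    printf 'lowerdir=%s,upperdir=%s/upper,workdir=%s/work\n' "$1" "$2" "$2"
}

# Dry run: print the commands instead of executing them (no root required).
echo "mount -t tmpfs -o nosuid,nodev,mode=0755 tmpfs $OVL_BASE"
echo "mkdir -p $OVL_BASE/upper $OVL_BASE/work"
echo "mount -t overlay -o $(overlay_opts /var "$OVL_BASE") overlay /var"
```

The commands are only printed, so the sketch can be tried on a host machine.
Overlayfs needs upperdir and workdir on the same filesystem, which is why both
are placed on the one RAM disk here.
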
 doc/daily_work.inc                            | 97 +++
 projectroot/etc/fstab                         |  6 +-
 .../lib/systemd/system/run-varoverlayfs.mount | 10 ++
 projectroot/usr/lib/systemd/system/var.mount  |  9 ++
 projectroot/usr/sbin/mount.varoverlayfs       | 11 +++
 rules/rootfs.in                               | 15 +++
 rules/rootfs.make                             | 23 -
 7 files changed, 164 insertions(+), 7 deletions(-)
 create mode 100644 projectroot/usr/lib/systemd/system/run-varoverlayfs.mount
 create mode 100644 projectroot/usr/lib/systemd/system/var.mount
 create mode 100644 projectroot/usr/sbin/mount.varoverlayfs

diff --git a/doc/daily_work.inc b/doc/daily_work.inc
index 74da11953..093f069bf 100644
--- a/doc/daily_work.inc
+++ b/doc/daily_work.inc
@@ -1371,3 +1371,100 @@ in the build machine's filesystem also for the target filesystem image. With
 a different ``umask`` than ``0022`` at build-time this may fail badly at
 run-time with strange erroneous behaviour (for example some daemons with
 regular user permissions cannot acces their own configuration files).
+
+Read Only Filesystem
+--------------------
+
+A system can run a read-only root filesystem in order to have a unit which
+can be powered off at any time, without any previous shutting down sequence.
+
+But many applications and tools are still expecting a writable filesystem to
+temporarely store some kind of data or logging information for example. All
+these write attempts will fail and thus, the applications and tools will fail,
+too.
+
+According to the *Filesystem Hierarchy Standard 2.3* the directory tree in
+'/var/' is traditionally writable and its content is persistent across system
+restarts. Thus, this directory tree is used by most applications and tools to
+store their data.
+
+The *Filesystem Hierarchy Standard 2.3* defines the following directories
+below '/var':
+
+- 'cache/': Application specific cache data
+- 'crash/': System crash dumps
+- 'lib/':   Application specific variable state information
+- 'lock/':  Lock files
+- 'log/':   Log files and directories
+- 'run/':   Data relevant to run processes
+- 'spool/': Application spool data
+- 'tmp/':   Temporary files preserved between system reboots
+
+Since this writable directory tree is useful and valid for full blown host
+machines, an embedded system can behave differently here: For example the
+requirement can drop the persistency of changed data across reboots and always
+start with empty directories.
+
+Partially RAM Disks
+~~~
+
+This is the default behaviour of PTXdist: it mounts a couple of RAM disks over
+directories in ``/var`` expected to be writable by various applications and
+tools. These RAM disks start alway in an empty state and are defined as follows:
+
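++-------------+----------------------------------------+
+| mount point | mount options                          |
++=============+========================================+
+| /var/log    | nosuid,nodev,noexec,mode=0755,size=10% |
++-------------+----------------------------------------+
+| /var/lock   | nosuid,nodev,noexec,mode=0755,size=1M  |
++-------------+----------------------------------------+
+| /var/tmp    | nosuid,nodev,mode=1777,size=20%        |
++-------------+----------------------------------------+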
+
+This is a very simple and optimistic approach and works for surprisingly many use
+cases. But some applications expect a writable ``/var/lib`` and will fail due
+to this setup. Using an additional RAM disk for ``/var/lib`` might not help in
+this use case, because it will bury at build-time generated data already present
+in this director