Re: [PATCH] ext2: add store_sync in strategic places.
Hello, Milos Nikic, le mer. 28 janv. 2026 11:29:29 -0800, a ecrit: > Fair...here is some "real world" test (even though its highly skewed towards a > developer-type person) Thanks for benchmarking! > These tests confirm that the patch follows the "pay for what you use" > principle. Users will not see a degradation in git, tar, or compilation times. > They will see a moderate (12%) slowdown in apt, which buys them protection > against corruption during power loss. Yes, that's what I was hoping to see :) I have applied the two getblk and write_disknode bits, thanks! Samuel
Re: [PATCH] ext2: add store_sync in strategic places.
Hello Samuel, Fair...here is some "real world" test (even though its highly skewed towards a developer-type person) Test Setup Storage: SSD-backed QEMU VM (cache:none). Comparison: Vanilla (Unsafe) vs. Full Sync Patch (Safe). 1. Unpacking Linux Kernel (tar -xf linux-5.15.tar.xz) Vanilla: 2m 39s Full Sync: 2m 45s Delta: +6s (~3.8%) Conclusion: Minimal regression. This confirms that standard file creation/write paths are not being penalized by the barrier logic. 2. Git Clone (git clone .../hurd.git) Vanilla: 2.12s Full Sync: 2.06s Delta: 0s (Noise) Conclusion: Zero regression for standard development tasks. 3. APT Reinstall (coreutils grep sed bash) Vanilla: 37.7s (Install time) Full Sync: 42.3s (Install time, adjusted for download) Delta: +4.6s (~12%) Conclusion: This 12% slowdown is the "Safety Tax." dpkg uses fsync to ensure database consistency. The Full Sync patch respects these requests, adding physical durability at a reasonable performance cost. These tests confirm that the patch follows the "pay for what you use" principle. Users will not see a degradation in git, tar, or compilation times. They will see a moderate (12%) slowdown in apt, which buys them protection against corruption during power loss. I believe this acts as a solid baseline for safety. The 12% overhead on synchronous workloads might be acceptable especially if it will be mitigated with JBD2 work, if we go there. Regards, Milos On Wed, Jan 28, 2026 at 12:20 AM Samuel Thibault wrote: > Hello, > > Milos Nikic, le mar. 27 janv. 2026 21:57:27 -0800, a ecrit: > > I have benchmarked the patch with "Full Sync" (sync in > diskfs_write_disknode > > and in block_getblk) on a clean build (latest GNUMach/Rumpdisk/Ext2fs) > > vs. the "Vanilla" (no sync in diskfs_write_disknode and block_getblk) > with > > the exact same setup to measure the impact of the barrier logic. > > > > Test Setup > > Workload: A run is 1,000 iterations of open() -> write() -> fsync() -> > close > > () > > (small C program compiled and ran inside Hurd). > > Storage: QEMU VM (cache:none) backed by SSD. > > I was rather thinking about the usual workload that people use: > installing debian packages, cloning a large git repo locally, etc., > typically. > > Synthetic benchmarks will always show the overhead with large > amplification, so they are not representative of what people will > actually see: we don't usually have fsync() called that often. > > Samuel >
Re: [PATCH] ext2: add store_sync in strategic places.
Hello, Milos Nikic, le mar. 27 janv. 2026 21:57:27 -0800, a ecrit: > I have benchmarked the patch with "Full Sync" (sync in diskfs_write_disknode > and in block_getblk) on a clean build (latest GNUMach/Rumpdisk/Ext2fs) > vs. the "Vanilla" (no sync in diskfs_write_disknode and block_getblk) with > the exact same setup to measure the impact of the barrier logic. > > Test Setup > Workload: A run is 1,000 iterations of open() -> write() -> fsync() -> close > () > (small C program compiled and ran inside Hurd). > Storage: QEMU VM (cache:none) backed by SSD. I was rather thinking about the usual workload that people use: installing debian packages, cloning a large git repo locally, etc., typically. Synthetic benchmarks will always show the overhead with large amplification, so they are not representative of what people will actually see: we don't usually have fsync() called that often. Samuel
Re: [PATCH] ext2: add store_sync in strategic places.
Hi Samuel,
I have benchmarked the patch with "Full Sync" (sync in
diskfs_write_disknode
and in block_getblk) on a clean build (latest GNUMach/Rumpdisk/Ext2fs)
vs. the "Vanilla" (no sync in diskfs_write_disknode and block_getblk) with
the exact same setup to measure the impact of the barrier logic.
Test Setup
Workload: A run is 1,000 iterations of open() -> write() -> fsync() ->
close()
(small C program compiled and ran inside Hurd).
Storage: QEMU VM (cache:none) backed by SSD.
Results
Vanilla (Unsafe): 0.75s per run (Avg over many independent runs)
Full Sync (Safe): 1.25s per run (Avg over many independent runs)
Analysis
The "Full Sync" patch introduces a ~66% regression in synchronous write
throughput. The increase of 0.5ms per operation corresponds to the
physical flush latency of the underlying SSD storage.
While this performance penalty is significant, it is necessary to guarantee
full data persistence. On rotational media (HDD), I project this
penalty would be significantly higher (adding ~10ms per operation), though
I do not have a rotational HDD handy to confirm that hypothesis.
I also confirmed that enabling these syncs incurs no cost on legacy systems
(those returning EOPNOTSUPP) while enforcing the physical barrier on
modern ones.
Conclusion
There is an undeniable cost to full safety. However, this patch provides
the
necessary baseline. We can later adopt the industry-standard "middle of the
road" path with JBD2 Journaling.
Journaling will allow us to batch many writes into a single transaction
and sync only the commit of that transaction. Allowing for full safety of
every change
that is contained in that transaction, at a fraction of the cost.
In my initial JBD2 prototype, this commit happens periodically (e.g., every
5 seconds) or when the journal
is full, rather than per every metadata write. This should allow us to find
the right balance between performance and safety. Please take a look at the
patch when you get the time.
It needs updates, but the original idea is there and it is totally
compatible with e2fsk and other linux tools.
Regards,
Milos
On Tue, Jan 27, 2026 at 3:43 PM Samuel Thibault
wrote:
> Hello,
>
> Milos Nikic, le mar. 27 janv. 2026 15:16:41 -0800, a ecrit:
> > Now that we have a write barrier lets use it at some
> > critical points.
>
> I have applied only the parts that sync on shutdown/syncfs/readonly,
> I have not applied the getblk and inode write parts: could you
> benchmark it a bit to check so we have an idea how much it can affect
> i/o-intensive performance?
>
> Samuel
>
> > ---
> > ext2fs/getblk.c | 9 -
> > ext2fs/hyper.c | 8 +++-
> > ext2fs/inode.c | 7 ++-
> > ext2fs/pager.c | 9 -
> > 4 files changed, 29 insertions(+), 4 deletions(-)
> >
> > diff --git a/ext2fs/getblk.c b/ext2fs/getblk.c
> > index ed6e3e09..49ab5740 100644
> > --- a/ext2fs/getblk.c
> > +++ b/ext2fs/getblk.c
> > @@ -234,7 +234,14 @@ block_getblk (struct node *node, block_t block, int
> nr, int create, int zero,
> >bh[nr] = *result;
> >
> >if (diskfs_synchronous || diskfs_node_disknode (node)->info.i_osync)
> > -sync_global_ptr (bh, 1);
> > +{
> > + sync_global_ptr (bh, 1);
> > + /* We just wrote a new indirect block pointer.
> > + If this doesn't hit the platter, the file is corrupt. */
> > + error_t err = store_sync (store);
> > + if (err && err != EOPNOTSUPP)
> > + ext2_warning ("indirect block flush failed: %s", strerror (err));
> > +}
> >else
> > record_indir_poke (node, bh);
> >
> > diff --git a/ext2fs/hyper.c b/ext2fs/hyper.c
> > index 59458724..847f9f2b 100644
> > --- a/ext2fs/hyper.c
> > +++ b/ext2fs/hyper.c
> > @@ -220,7 +220,13 @@ diskfs_set_hypermetadata (int wait, int clean)
> > }
> >
> >sync_global (wait);
> > -
> > + if (wait)
> > +{
> > + error_t err = store_sync (store);
> > + /* Ignore EOPNOTSUPP (legacy drivers), but warn on real I/O
> errors */
> > + if (err && err != EOPNOTSUPP)
> > +ext2_warning ("device flush failed: %s", strerror (err));
> > +}
> >return 0;
> > }
> >
> > diff --git a/ext2fs/inode.c b/ext2fs/inode.c
> > index 8d1656f6..8d10af01 100644
> > --- a/ext2fs/inode.c
> > +++ b/ext2fs/inode.c
> > @@ -569,7 +569,12 @@ diskfs_write_disknode (struct node *np, int wait)
> >if (di)
> > {
> >if (wait)
> > - sync_global_ptr (di, 1);
> > +{
> > + sync_global_ptr (di, 1);
> > + error_t err = store_sync (store);
> > + if (err && err != EOPNOTSUPP)
> > +ext2_warning ("inode flush failed: %s", strerror (err));
> > +}
> >else
> > record_global_poke (di);
> > }
> > diff --git a/ext2fs/pager.c b/ext2fs/pager.c
> > index a7801bea..1c795784 100644
> > --- a/ext2fs/pager.c
> > +++ b/ext2fs/pager.c
> > @@ -1575,7 +1575,7 @@ diskfs_shutdown_pager (void)
> >
> >/* Sync everything on the the disk pager. */
> >sync_global (1
Re: [PATCH] ext2: add store_sync in strategic places.
Hello,
Milos Nikic, le mar. 27 janv. 2026 15:16:41 -0800, a ecrit:
> Now that we have a write barrier lets use it at some
> critical points.
I have applied only the parts that sync on shutdown/syncfs/readonly,
I have not applied the getblk and inode write parts: could you
benchmark it a bit to check so we have an idea how much it can affect
i/o-intensive performance?
Samuel
> ---
> ext2fs/getblk.c | 9 -
> ext2fs/hyper.c | 8 +++-
> ext2fs/inode.c | 7 ++-
> ext2fs/pager.c | 9 -
> 4 files changed, 29 insertions(+), 4 deletions(-)
>
> diff --git a/ext2fs/getblk.c b/ext2fs/getblk.c
> index ed6e3e09..49ab5740 100644
> --- a/ext2fs/getblk.c
> +++ b/ext2fs/getblk.c
> @@ -234,7 +234,14 @@ block_getblk (struct node *node, block_t block, int nr,
> int create, int zero,
>bh[nr] = *result;
>
>if (diskfs_synchronous || diskfs_node_disknode (node)->info.i_osync)
> -sync_global_ptr (bh, 1);
> +{
> + sync_global_ptr (bh, 1);
> + /* We just wrote a new indirect block pointer.
> + If this doesn't hit the platter, the file is corrupt. */
> + error_t err = store_sync (store);
> + if (err && err != EOPNOTSUPP)
> + ext2_warning ("indirect block flush failed: %s", strerror (err));
> +}
>else
> record_indir_poke (node, bh);
>
> diff --git a/ext2fs/hyper.c b/ext2fs/hyper.c
> index 59458724..847f9f2b 100644
> --- a/ext2fs/hyper.c
> +++ b/ext2fs/hyper.c
> @@ -220,7 +220,13 @@ diskfs_set_hypermetadata (int wait, int clean)
> }
>
>sync_global (wait);
> -
> + if (wait)
> +{
> + error_t err = store_sync (store);
> + /* Ignore EOPNOTSUPP (legacy drivers), but warn on real I/O errors */
> + if (err && err != EOPNOTSUPP)
> +ext2_warning ("device flush failed: %s", strerror (err));
> +}
>return 0;
> }
>
> diff --git a/ext2fs/inode.c b/ext2fs/inode.c
> index 8d1656f6..8d10af01 100644
> --- a/ext2fs/inode.c
> +++ b/ext2fs/inode.c
> @@ -569,7 +569,12 @@ diskfs_write_disknode (struct node *np, int wait)
>if (di)
> {
>if (wait)
> - sync_global_ptr (di, 1);
> +{
> + sync_global_ptr (di, 1);
> + error_t err = store_sync (store);
> + if (err && err != EOPNOTSUPP)
> +ext2_warning ("inode flush failed: %s", strerror (err));
> +}
>else
> record_global_poke (di);
> }
> diff --git a/ext2fs/pager.c b/ext2fs/pager.c
> index a7801bea..1c795784 100644
> --- a/ext2fs/pager.c
> +++ b/ext2fs/pager.c
> @@ -1575,7 +1575,7 @@ diskfs_shutdown_pager (void)
>
>/* Sync everything on the the disk pager. */
>sync_global (1);
> -
> + store_sync (store);
>/* Despite the name of this function, we never actually shutdown the disk
> pager, just make sure it's synced. */
> }
> @@ -1596,6 +1596,13 @@ diskfs_sync_everything (int wait)
>
>/* Do things on the the disk pager. */
>sync_global (wait);
> + if (wait)
> +{
> + error_t err = store_sync (store);
> + /* Ignore EOPNOTSUPP (drivers), but warn on real I/O errors */
> + if (err && err != EOPNOTSUPP)
> +ext2_warning ("device flush failed: %s", strerror (err));
> +}
> }
>
> static void
> --
> 2.52.0
>
>
--
Samuel
/* Amuse the user in a SPARC fashion */
if (err) printk(
KERN_CRIT " ___ \n"
KERN_CRIT " < Your System ate a SPARC! Gah! >\n"
KERN_CRIT " --- \n"
KERN_CRIT " \\ ^__^\n"
KERN_CRIT " \\ (xx)\\___\n"
KERN_CRIT " (__)\\ )\\/\\\n"
KERN_CRIT " U ||w |\n"
KERN_CRIT " || ||\n");
(From linux/arch/parisc/kernel/traps.c:die_if_kernel())
[PATCH] ext2: add store_sync in strategic places.
Now that we have a write barrier lets use it at some
critical points.
---
ext2fs/getblk.c | 9 -
ext2fs/hyper.c | 8 +++-
ext2fs/inode.c | 7 ++-
ext2fs/pager.c | 9 -
4 files changed, 29 insertions(+), 4 deletions(-)
diff --git a/ext2fs/getblk.c b/ext2fs/getblk.c
index ed6e3e09..49ab5740 100644
--- a/ext2fs/getblk.c
+++ b/ext2fs/getblk.c
@@ -234,7 +234,14 @@ block_getblk (struct node *node, block_t block, int nr,
int create, int zero,
bh[nr] = *result;
if (diskfs_synchronous || diskfs_node_disknode (node)->info.i_osync)
-sync_global_ptr (bh, 1);
+{
+ sync_global_ptr (bh, 1);
+ /* We just wrote a new indirect block pointer.
+ If this doesn't hit the platter, the file is corrupt. */
+ error_t err = store_sync (store);
+ if (err && err != EOPNOTSUPP)
+ ext2_warning ("indirect block flush failed: %s", strerror (err));
+}
else
record_indir_poke (node, bh);
diff --git a/ext2fs/hyper.c b/ext2fs/hyper.c
index 59458724..847f9f2b 100644
--- a/ext2fs/hyper.c
+++ b/ext2fs/hyper.c
@@ -220,7 +220,13 @@ diskfs_set_hypermetadata (int wait, int clean)
}
sync_global (wait);
-
+ if (wait)
+{
+ error_t err = store_sync (store);
+ /* Ignore EOPNOTSUPP (legacy drivers), but warn on real I/O errors */
+ if (err && err != EOPNOTSUPP)
+ext2_warning ("device flush failed: %s", strerror (err));
+}
return 0;
}
diff --git a/ext2fs/inode.c b/ext2fs/inode.c
index 8d1656f6..8d10af01 100644
--- a/ext2fs/inode.c
+++ b/ext2fs/inode.c
@@ -569,7 +569,12 @@ diskfs_write_disknode (struct node *np, int wait)
if (di)
{
if (wait)
- sync_global_ptr (di, 1);
+{
+ sync_global_ptr (di, 1);
+ error_t err = store_sync (store);
+ if (err && err != EOPNOTSUPP)
+ext2_warning ("inode flush failed: %s", strerror (err));
+}
else
record_global_poke (di);
}
diff --git a/ext2fs/pager.c b/ext2fs/pager.c
index a7801bea..1c795784 100644
--- a/ext2fs/pager.c
+++ b/ext2fs/pager.c
@@ -1575,7 +1575,7 @@ diskfs_shutdown_pager (void)
/* Sync everything on the the disk pager. */
sync_global (1);
-
+ store_sync (store);
/* Despite the name of this function, we never actually shutdown the disk
pager, just make sure it's synced. */
}
@@ -1596,6 +1596,13 @@ diskfs_sync_everything (int wait)
/* Do things on the the disk pager. */
sync_global (wait);
+ if (wait)
+{
+ error_t err = store_sync (store);
+ /* Ignore EOPNOTSUPP (drivers), but warn on real I/O errors */
+ if (err && err != EOPNOTSUPP)
+ext2_warning ("device flush failed: %s", strerror (err));
+}
}
static void
--
2.52.0
