Re: [PATCH] base-files: call "sync" after initial setup

2022-03-07 Thread Sergey Ryazanov
On Tue, Mar 1, 2022 at 8:51 PM Rafał Miłecki  wrote:
> OpenWrt uses a lot of (b)ash scripts for initial setup. This isn't the
> best solution as they almost never consider syncing files / data. Still
> this is what we have and we need to try living with it.
>
> Without proper syncing OpenWrt can easily get into an inconsistent state
> on power cut. It's because:
> 1. Actual (flash) inode and data writes are not synchronized
> 2. Data writeback can take up to 30 seconds (dirty_expire_centisecs)
> 3. ubifs adds extra 5 seconds (dirty_writeback_centisecs) "delay"
>
> Some possible cases (examples) for new files:
> 1. A power cut within 5 seconds after write() can result in the loss of
>    all data
> 2. A power cut between 5 and 35 seconds after write() can result in an
>    empty file (inode flushed after 5 seconds, data flush still queued)
>
> The above affects e.g. uci-defaults. After a migration script is
> executed, it may get deleted (whited out) before the data it generated
> is actually written. A power cut then results in missing data and a
> deleted file.
>
> There are three ways of dealing with that:
> 1. Rewriting all user-space init to proper C with syncs
> 2. Trying bash hacks (like creating tmp files & moving them)
> 3. Adding sync and hoping for no power cut during critical section
>
> This change introduces the last solution, which is the simplest. It
> reduces the time during which things may go wrong from ~35 seconds to
> probably less than a second. Of course it applies only to I/O operations
> performed before /etc/init.d/boot. That is probably the stage when
> most new files get created.
>
> All later changes are usually done using smarter C apps (e.g. busybox or
> uci) that create tmp files and use rename(), which is expected to be
> atomic.
>
> Signed-off-by: Rafał Miłecki 

Acked-by: Sergey Ryazanov 

And thank you for such a detailed analysis of the situation!

--
Sergey

___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


Re: [PATCH] base-files: call "sync" after initial setup

2022-03-02 Thread Richard Weinberger
- Original Message -
> From: "Koen Vandeputte"
>>> Signed-off-by: Rafał Miłecki 
>>
>> Acked-by: Hauke Mehrtens 
>>
>> This is not the best solution as you said but a simple one.
>>
>> How do we handle the situation in the first boot when the overlay file
>> system is not ready yet and we are in a ramdisk in the beginning?
>>
>> Hauke
> 
> As a small addendum on this topic:
> 
> There is another way:
> 
> I have also had issues with data loss on power cuts using ubifs for a
> few years now, exactly as described above.
>
> As writes in my use case only happen at the absolute minimum (the
> user changing a config setting), I 'solved' it by simply adding
> rootflags=sync to the kernel cmdline.
>
> This seems to force immediate flushes to NAND (at the cost of maybe
> slightly faster wear) and reduced the issue by a huge factor.
>
> Before using this flag, some file got corrupted on nearly every
> power cut. Since adding it, I can only recall a single case in
> roughly 3 years.

I guess you faced a situation like the one described here?
http://www.linux-mtd.infradead.org/doc/ubifs.html#L_sync_semantics

Thanks,
//richard



Re: [PATCH] base-files: call "sync" after initial setup

2022-03-02 Thread Koen Vandeputte


On 01.03.22 19:57, Hauke Mehrtens wrote:
> On 3/1/22 18:46, Rafał Miłecki wrote:
>> From: Rafał Miłecki
>>
>> OpenWrt uses a lot of (b)ash scripts for initial setup. This isn't the
>> best solution as they almost never consider syncing files / data. Still
>> this is what we have and we need to try living with it.
>>
>> Without proper syncing OpenWrt can easily get into an inconsistent state
>> on power cut. It's because:
>> 1. Actual (flash) inode and data writes are not synchronized
>> 2. Data writeback can take up to 30 seconds (dirty_expire_centisecs)
>> 3. ubifs adds extra 5 seconds (dirty_writeback_centisecs) "delay"
>>
>> Some possible cases (examples) for new files:
>> 1. A power cut within 5 seconds after write() can result in the loss of
>>    all data
>> 2. A power cut between 5 and 35 seconds after write() can result in an
>>    empty file (inode flushed after 5 seconds, data flush still queued)
>>
>> The above affects e.g. uci-defaults. After a migration script is
>> executed, it may get deleted (whited out) before the data it generated
>> is actually written. A power cut then results in missing data and a
>> deleted file.
>>
>> There are three ways of dealing with that:
>> 1. Rewriting all user-space init to proper C with syncs
>> 2. Trying bash hacks (like creating tmp files & moving them)
>> 3. Adding sync and hoping for no power cut during critical section
>>
>> This change introduces the last solution, which is the simplest. It
>> reduces the time during which things may go wrong from ~35 seconds to
>> probably less than a second. Of course it applies only to I/O operations
>> performed before /etc/init.d/boot. That is probably the stage when
>> most new files get created.
>>
>> All later changes are usually done using smarter C apps (e.g. busybox or
>> uci) that create tmp files and use rename(), which is expected to be
>> atomic.
>>
>> Signed-off-by: Rafał Miłecki
>
> Acked-by: Hauke Mehrtens
>
> This is not the best solution as you said but a simple one.
>
> How do we handle the situation in the first boot when the overlay file
> system is not ready yet and we are in a ramdisk in the beginning?
>
> Hauke


As a small addendum on this topic:

There is another way:

I have also had issues with data loss on power cuts using ubifs for a
few years now, exactly as described above.

As writes in my use case only happen at the absolute minimum (the
user changing a config setting), I 'solved' it by simply adding
rootflags=sync to the kernel cmdline.

This seems to force immediate flushes to NAND (at the cost of maybe
slightly faster wear) and reduced the issue by a huge factor.

Before using this flag, some file got corrupted on nearly every
power cut. Since adding it, I can only recall a single case in
roughly 3 years.


[    0.00] Kernel command line: console=ttymxc4,115200 ubi0:ubi ubi.mtd=2 rootfstype=squashfs,ubifs rootflags=sync pci=nomsi

[    2.298922] UBIFS: parse sync
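
To check whether a given device was booted this way, the kernel command
line can be inspected from a script. The following is only a minimal
sketch: has_rootflags_sync is a hypothetical helper name, and it assumes
the command line is available in /proc/cmdline.

```shell
#!/bin/sh
# Sketch: report whether a kernel command line requests rootflags=sync.
# has_rootflags_sync is a hypothetical helper, not an OpenWrt function.

# has_rootflags_sync CMDLINE: succeed if a rootflags= option lists "sync".
has_rootflags_sync() {
	for arg in $1; do
		case "$arg" in
			rootflags=*)
				# Wrap the comma-separated flag list in commas so that
				# "sync" only matches as a whole flag, not "async".
				case ",${arg#rootflags=}," in
					*,sync,*) return 0 ;;
				esac
				;;
		esac
	done
	return 1
}

if [ -r /proc/cmdline ] && has_rootflags_sync "$(cat /proc/cmdline)"; then
	echo "rootflags=sync is set"
else
	echo "rootflags=sync is not set"
fi
```

The helper only looks at the command line; it does not verify that the
root file system was actually mounted with the sync option.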


Regards,

Koen




Re: [PATCH] base-files: call "sync" after initial setup

2022-03-01 Thread Hauke Mehrtens

On 3/1/22 18:46, Rafał Miłecki wrote:
> From: Rafał Miłecki
>
> OpenWrt uses a lot of (b)ash scripts for initial setup. This isn't the
> best solution as they almost never consider syncing files / data. Still
> this is what we have and we need to try living with it.
>
> Without proper syncing OpenWrt can easily get into an inconsistent state
> on power cut. It's because:
> 1. Actual (flash) inode and data writes are not synchronized
> 2. Data writeback can take up to 30 seconds (dirty_expire_centisecs)
> 3. ubifs adds extra 5 seconds (dirty_writeback_centisecs) "delay"
>
> Some possible cases (examples) for new files:
> 1. A power cut within 5 seconds after write() can result in the loss of
>    all data
> 2. A power cut between 5 and 35 seconds after write() can result in an
>    empty file (inode flushed after 5 seconds, data flush still queued)
>
> The above affects e.g. uci-defaults. After a migration script is
> executed, it may get deleted (whited out) before the data it generated
> is actually written. A power cut then results in missing data and a
> deleted file.
>
> There are three ways of dealing with that:
> 1. Rewriting all user-space init to proper C with syncs
> 2. Trying bash hacks (like creating tmp files & moving them)
> 3. Adding sync and hoping for no power cut during critical section
>
> This change introduces the last solution, which is the simplest. It
> reduces the time during which things may go wrong from ~35 seconds to
> probably less than a second. Of course it applies only to I/O operations
> performed before /etc/init.d/boot. That is probably the stage when
> most new files get created.
>
> All later changes are usually done using smarter C apps (e.g. busybox or
> uci) that create tmp files and use rename(), which is expected to be
> atomic.
>
> Signed-off-by: Rafał Miłecki


Acked-by: Hauke Mehrtens 

This is not the best solution as you said but a simple one.

How do we handle the situation in the first boot when the overlay file 
system is not ready yet and we are in a ramdisk in the beginning?


Hauke



[PATCH] base-files: call "sync" after initial setup

2022-03-01 Thread Rafał Miłecki
From: Rafał Miłecki 

OpenWrt uses a lot of (b)ash scripts for initial setup. This isn't the
best solution as they almost never consider syncing files / data. Still
this is what we have and we need to try living with it.

Without proper syncing OpenWrt can easily get into an inconsistent state
on power cut. It's because:
1. Actual (flash) inode and data writes are not synchronized
2. Data writeback can take up to 30 seconds (dirty_expire_centisecs)
3. ubifs adds extra 5 seconds (dirty_writeback_centisecs) "delay"

Some possible cases (examples) for new files:
1. A power cut within 5 seconds after write() can result in the loss of
   all data
2. A power cut between 5 and 35 seconds after write() can result in an
   empty file (inode flushed after 5 seconds, data flush still queued)
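
The 5 and 35 second figures above follow from the two write-back
timers. As a rough sketch, the window on a given device can be estimated
by reading both sysctls and adding them up. The helper names are
hypothetical, and the fallback values (3000 and 500 centiseconds) are
the usual Linux defaults, assumed here for systems where /proc/sys/vm
cannot be read.

```shell
#!/bin/sh
# Sketch: estimate the worst-case delay before dirty data reaches flash.
# Assumption: the two vm sysctls named in the commit message are the
# relevant timers; both are expressed in centiseconds.

# read_centisecs NAME DEFAULT: print the sysctl value, or DEFAULT if
# /proc/sys/vm/NAME is not readable.
read_centisecs() {
	if [ -r "/proc/sys/vm/$1" ]; then
		cat "/proc/sys/vm/$1"
	else
		echo "$2"
	fi
}

# writeback_window_secs EXPIRE_CS WRITEBACK_CS: print the sum of both
# timers in whole seconds.
writeback_window_secs() {
	echo $(( ($1 + $2) / 100 ))
}

expire=$(read_centisecs dirty_expire_centisecs 3000)
writeback=$(read_centisecs dirty_writeback_centisecs 500)
echo "worst-case window: $(writeback_window_secs "$expire" "$writeback") s"
```

With the default values this gives the ~35 seconds mentioned in this
message; a device with tuned sysctls will report a different window.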

The above affects e.g. uci-defaults. After a migration script is
executed, it may get deleted (whited out) before the data it generated
is actually written. A power cut then results in missing data and a
deleted file.

There are three ways of dealing with that:
1. Rewriting all user-space init to proper C with syncs
2. Trying bash hacks (like creating tmp files & moving them)
3. Adding sync and hoping for no power cut during critical section

This change introduces the last solution, which is the simplest. It
reduces the time during which things may go wrong from ~35 seconds to
probably less than a second. Of course it applies only to I/O operations
performed before /etc/init.d/boot. That is probably the stage when
most new files get created.

All later changes are usually done using smarter C apps (e.g. busybox or
uci) that create tmp files and use rename(), which is expected to be
atomic.

Signed-off-by: Rafał Miłecki 
---
Related:

[RFC] base-files: call "sync" before removing uci-defaults file
https://patchwork.ozlabs.org/project/lede/patch/20160905202931.3013-1-zaj...@gmail.com/

https://www.slideshare.net/nan1nan1/eat-my-data
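
For comparison, option 2 from the commit message (tmp files plus a move)
could look roughly like the sketch below. atomic_write is a hypothetical
helper and not part of base-files. The sketch assumes that the temporary
file lives in the same directory as the target, so that mv ends up as a
rename(), and that a plain "sync" is an acceptable way to flush.

```shell
#!/bin/sh
# Sketch: write a file via a temporary file, flush, then rename into place.
# atomic_write is a hypothetical helper, not part of OpenWrt's base-files.

# atomic_write TARGET: read new content from stdin and install it as TARGET.
atomic_write() {
	target=$1
	tmp="$target.tmp.$$"   # same directory as TARGET, so mv is a rename()

	# Write the new content to the temporary file first.
	cat > "$tmp" || { rm -f "$tmp"; return 1; }
	sync                   # flush the data before the rename is issued
	mv -f "$tmp" "$target" || { rm -f "$tmp"; return 1; }
	sync                   # flush the directory update as well
}

# Example: atomic_write /etc/config/example < /tmp/new-config
```

Every script creating files would need to be converted to such a helper,
which is why this patch goes for the single sync instead.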
---
 package/base-files/files/etc/init.d/boot | 1 +
 1 file changed, 1 insertion(+)

diff --git a/package/base-files/files/etc/init.d/boot b/package/base-files/files/etc/init.d/boot
index e1c60c1c2f..749d9e9711 100755
--- a/package/base-files/files/etc/init.d/boot
+++ b/package/base-files/files/etc/init.d/boot
@@ -48,6 +48,7 @@ boot() {
 
 	/bin/config_generate
 	uci_apply_defaults
+	sync
 
 	# temporary hack until configd exists
 	/sbin/reload_config
-- 
2.34.1

