Re: [PATCH] base-files: call "sync" after initial setup
On Tue, Mar 1, 2022 at 8:51 PM Rafał Miłecki wrote:
> OpenWrt uses a lot of (b)ash scripts for initial setup. This isn't the
> best solution as they almost never consider syncing files / data. Still
> this is what we have and we need to try living with it.
>
> Without proper syncing OpenWrt can easily get into an inconsistent state
> on power cut. It's because:
> 1. Actual (flash) inode and data writes are not synchronized
> 2. Data writeback can take up to 30 seconds (dirty_expire_centisecs)
> 3. ubifs adds extra 5 seconds (dirty_writeback_centisecs) "delay"
>
> Some possible cases (examples) for new files:
> 1. A power cut within 5 seconds after write() can result in the loss of
>    all data
> 2. A power cut between 5 and 35 seconds after write() can result in an
>    empty file (inode flushed after 5 seconds, data flush queued)
>
> The above affects e.g. uci-defaults. After executing some migration
> script it may get deleted (whited out) without generated data getting
> actually written. A power cut will then result in missing data and a
> deleted file.
>
> There are three ways of dealing with that:
> 1. Rewriting all user-space init to proper C with syncs
> 2. Trying bash hacks (like creating tmp files & moving them)
> 3. Adding sync and hoping for no power cut during critical section
>
> This change introduces the last solution, which is the simplest. It
> reduces the time during which things may go wrong from ~35 seconds to
> probably less than a second. Of course it applies only to IO operations
> performed before /etc/init.d/boot. It's probably the stage when the
> most new files get created.
>
> All later changes are usually done using smarter C apps (e.g. busybox or
> uci) that create tmp files and use rename(), which is expected to be
> atomic.
>
> Signed-off-by: Rafał Miłecki

Acked-by: Sergey Ryazanov

And thank you for such a detailed analysis of the situation!
-- 
Sergey
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel
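
The "~35 seconds" figure in the quoted analysis is the sum of the two writeback tunables it names. A minimal sketch of that arithmetic follows, assuming the values are read from /proc/sys/vm and that the worst case can be approximated as a simple sum, as the patch description does. The function name writeback_window_secs and its optional directory argument are hypothetical, added here for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: estimate the worst-case writeback window in seconds,
# following the reasoning in the patch description (expire + writeback).
# Usage: writeback_window_secs [sysctl-dir]   (default: /proc/sys/vm)
writeback_window_secs() {
	dir="${1:-/proc/sys/vm}"

	# Both tunables are expressed in hundredths of a second.
	expire=$(cat "$dir/dirty_expire_centisecs") || return 1
	wb=$(cat "$dir/dirty_writeback_centisecs") || return 1

	# Integer division; fractions of a second are rounded down.
	echo $(( (expire + wb) / 100 ))
}
```

With the usual kernel defaults of 3000 and 500 this prints 35, which matches the window the patch aims to shrink.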
Re: [PATCH] base-files: call "sync" after initial setup
----- Original Message -----
> From: "Koen Vandeputte"
>>> Signed-off-by: Rafał Miłecki
>>
>> Acked-by: Hauke Mehrtens
>>
>> This is not the best solution, as you said, but a simple one.
>>
>> How do we handle the situation in the first boot when the overlay file
>> system is not ready yet and we are in a ramdisk in the beginning?
>>
>> Hauke
>
> As a small addendum on this topic:
>
> There is another way:
>
> I have also had issues with data loss on power cuts using ubifs for a
> few years now, exactly as described above.
>
> As in my use case writes only happen at the absolute minimum (the
> user changing a config setting), I 'solved' it by simply adding
> rootflags=sync to the kernel cmdline.
>
> This seems to force immediate flushes to nand (at the cost of maybe a
> little bit faster wear) and reduced the issue by a huge factor.
>
> Before using this flag, some file got corrupted on nearly every power
> cut. Since using it, I can only recall a single case in roughly 3 years.

I guess you faced a situation like the one described here?
http://www.linux-mtd.infradead.org/doc/ubifs.html#L_sync_semantics

Thanks,
//richard
Re: [PATCH] base-files: call "sync" after initial setup
On 01.03.22 19:57, Hauke Mehrtens wrote:
> On 3/1/22 18:46, Rafał Miłecki wrote:
>> From: Rafał Miłecki
>>
>> OpenWrt uses a lot of (b)ash scripts for initial setup. This isn't the
>> best solution as they almost never consider syncing files / data. Still
>> this is what we have and we need to try living with it.
>>
>> Without proper syncing OpenWrt can easily get into an inconsistent state
>> on power cut. It's because:
>> 1. Actual (flash) inode and data writes are not synchronized
>> 2. Data writeback can take up to 30 seconds (dirty_expire_centisecs)
>> 3. ubifs adds extra 5 seconds (dirty_writeback_centisecs) "delay"
>>
>> Some possible cases (examples) for new files:
>> 1. A power cut within 5 seconds after write() can result in the loss of
>>    all data
>> 2. A power cut between 5 and 35 seconds after write() can result in an
>>    empty file (inode flushed after 5 seconds, data flush queued)
>>
>> The above affects e.g. uci-defaults. After executing some migration
>> script it may get deleted (whited out) without generated data getting
>> actually written. A power cut will then result in missing data and a
>> deleted file.
>>
>> There are three ways of dealing with that:
>> 1. Rewriting all user-space init to proper C with syncs
>> 2. Trying bash hacks (like creating tmp files & moving them)
>> 3. Adding sync and hoping for no power cut during critical section
>>
>> This change introduces the last solution, which is the simplest. It
>> reduces the time during which things may go wrong from ~35 seconds to
>> probably less than a second. Of course it applies only to IO operations
>> performed before /etc/init.d/boot. It's probably the stage when the
>> most new files get created.
>>
>> All later changes are usually done using smarter C apps (e.g. busybox or
>> uci) that create tmp files and use rename(), which is expected to be
>> atomic.
>>
>> Signed-off-by: Rafał Miłecki
>
> Acked-by: Hauke Mehrtens
>
> This is not the best solution, as you said, but a simple one.
>
> How do we handle the situation in the first boot when the overlay file
> system is not ready yet and we are in a ramdisk in the beginning?
>
> Hauke

As a small addendum on this topic:

There is another way:

I have also had issues with data loss on power cuts using ubifs for a
few years now, exactly as described above.

As in my use case writes only happen at the absolute minimum (the user
changing a config setting), I 'solved' it by simply adding
rootflags=sync to the kernel cmdline.

This seems to force immediate flushes to nand (at the cost of maybe a
little bit faster wear) and reduced the issue by a huge factor.

Before using this flag, some file got corrupted on nearly every power
cut. Since using it, I can only recall a single case in roughly 3 years.

[ 0.00] Kernel command line: console=ttymxc4,115200 ubi0:ubi ubi.mtd=2 rootfstype=squashfs,ubifs rootflags=sync pci=nomsi
[ 2.298922] UBIFS: parse sync

Regards,

Koen
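
One way to confirm that a flag such as rootflags=sync actually took effect is to look for the "sync" option on the mounted filesystem. The sketch below assumes the option is listed in the fourth field of /proc/mounts. The function name has_sync_opt and its optional second argument are hypothetical, and which mount point to check (for example / or /overlay) depends on the device layout and is not stated in this thread.

```shell
#!/bin/sh
# Hypothetical helper: report whether a mount point is mounted with "sync".
# Usage: has_sync_opt <mountpoint> [mounts-file]   (default: /proc/mounts)
# Exit status: 0 = sync option present, 1 = absent, 2 = mount point not found.
has_sync_opt() {
	mp="$1"
	mounts="${2:-/proc/mounts}"

	# Fields: device mountpoint fstype options dump pass.
	# If the mount point is listed more than once, the last entry wins.
	awk -v mp="$mp" '
		$2 == mp { opts = $4; found = 1 }
		END {
			if (!found) exit 2
			n = split(opts, o, ",")
			for (i = 1; i <= n; i++)
				if (o[i] == "sync") exit 0
			exit 1
		}
	' "$mounts"
}
```

For example, `has_sync_opt / && echo "root is mounted sync"` prints the message only when the option is present on the root mount.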
Re: [PATCH] base-files: call "sync" after initial setup
On 3/1/22 18:46, Rafał Miłecki wrote:
> From: Rafał Miłecki
>
> OpenWrt uses a lot of (b)ash scripts for initial setup. This isn't the
> best solution as they almost never consider syncing files / data. Still
> this is what we have and we need to try living with it.
>
> Without proper syncing OpenWrt can easily get into an inconsistent state
> on power cut. It's because:
> 1. Actual (flash) inode and data writes are not synchronized
> 2. Data writeback can take up to 30 seconds (dirty_expire_centisecs)
> 3. ubifs adds extra 5 seconds (dirty_writeback_centisecs) "delay"
>
> Some possible cases (examples) for new files:
> 1. A power cut within 5 seconds after write() can result in the loss of
>    all data
> 2. A power cut between 5 and 35 seconds after write() can result in an
>    empty file (inode flushed after 5 seconds, data flush queued)
>
> The above affects e.g. uci-defaults. After executing some migration
> script it may get deleted (whited out) without generated data getting
> actually written. A power cut will then result in missing data and a
> deleted file.
>
> There are three ways of dealing with that:
> 1. Rewriting all user-space init to proper C with syncs
> 2. Trying bash hacks (like creating tmp files & moving them)
> 3. Adding sync and hoping for no power cut during critical section
>
> This change introduces the last solution, which is the simplest. It
> reduces the time during which things may go wrong from ~35 seconds to
> probably less than a second. Of course it applies only to IO operations
> performed before /etc/init.d/boot. It's probably the stage when the
> most new files get created.
>
> All later changes are usually done using smarter C apps (e.g. busybox or
> uci) that create tmp files and use rename(), which is expected to be
> atomic.
>
> Signed-off-by: Rafał Miłecki

Acked-by: Hauke Mehrtens

This is not the best solution, as you said, but a simple one.

How do we handle the situation in the first boot when the overlay file
system is not ready yet and we are in a ramdisk in the beginning?

Hauke
[PATCH] base-files: call "sync" after initial setup
From: Rafał Miłecki

OpenWrt uses a lot of (b)ash scripts for initial setup. This isn't the
best solution as they almost never consider syncing files / data. Still
this is what we have and we need to try living with it.

Without proper syncing OpenWrt can easily get into an inconsistent state
on power cut. It's because:
1. Actual (flash) inode and data writes are not synchronized
2. Data writeback can take up to 30 seconds (dirty_expire_centisecs)
3. ubifs adds extra 5 seconds (dirty_writeback_centisecs) "delay"

Some possible cases (examples) for new files:
1. A power cut within 5 seconds after write() can result in the loss of
   all data
2. A power cut between 5 and 35 seconds after write() can result in an
   empty file (inode flushed after 5 seconds, data flush queued)

The above affects e.g. uci-defaults. After executing some migration
script it may get deleted (whited out) without generated data getting
actually written. A power cut will then result in missing data and a
deleted file.

There are three ways of dealing with that:
1. Rewriting all user-space init to proper C with syncs
2. Trying bash hacks (like creating tmp files & moving them)
3. Adding sync and hoping for no power cut during critical section

This change introduces the last solution, which is the simplest. It
reduces the time during which things may go wrong from ~35 seconds to
probably less than a second. Of course it applies only to IO operations
performed before /etc/init.d/boot. It's probably the stage when the
most new files get created.

All later changes are usually done using smarter C apps (e.g. busybox or
uci) that create tmp files and use rename(), which is expected to be
atomic.

Signed-off-by: Rafał Miłecki
---
Related:
[RFC] base-files: call "sync" before removing uci-defaults file
https://patchwork.ozlabs.org/project/lede/patch/20160905202931.3013-1-zaj...@gmail.com/
https://www.slideshare.net/nan1nan1/eat-my-data
---
 package/base-files/files/etc/init.d/boot | 1 +
 1 file changed, 1 insertion(+)

diff --git a/package/base-files/files/etc/init.d/boot b/package/base-files/files/etc/init.d/boot
index e1c60c1c2f..749d9e9711 100755
--- a/package/base-files/files/etc/init.d/boot
+++ b/package/base-files/files/etc/init.d/boot
@@ -48,6 +48,7 @@ boot() {
 
 	/bin/config_generate
 	uci_apply_defaults
+	sync
 
 	# temporary hack until configd exists
 	/sbin/reload_config
-- 
2.34.1
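
For comparison, the second option from the commit message (creating tmp files and moving them) could look roughly like the sketch below. This is not part of the patch. The function name atomic_write is hypothetical, and the sketch assumes that the temporary file lives in the same directory as the target so that mv ends up as a rename() within one filesystem.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): replace a file so that a
# power cut leaves either the old or the new content, not an empty file.
# Usage: atomic_write <target>   (new content is read from stdin)
atomic_write() {
	target="$1"
	tmp="${target}.tmp.$$"

	# Write the new content next to the target, on the same filesystem.
	cat > "$tmp" || { rm -f "$tmp"; return 1; }

	# Flush the data before it becomes visible under the final name.
	sync

	# Within one filesystem, mv is a rename() of the temporary file.
	mv -f "$tmp" "$target" || { rm -f "$tmp"; return 1; }

	# Flush again so that the rename itself reaches storage.
	sync
}
```

A script could then call, for instance, `printf 'value\n' | atomic_write /tmp/example.conf`. Compared with the single sync added by the patch, this costs two full syncs per file, which is one reason the one-line change is the simpler choice.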