Ciaran McCreesh wrote:
> On Sun, 04 Oct 2009 18:25:14 +0200
> Rodolphe Rocca <[email protected]> wrote:
>> I understand that in case of a machine crashing, a package gets
>> partially installed, as long as the package files content is either
>> the old one or the new one. This can potentially cause trouble, but
>> has a higher resilience than ending with some empty files in VDB
>> causing paludis to be unusable until a manual intervention
>> in /var/cache/db.
> 
> No, it can also result in partially written new files being installed.
> 
>> I tried to turn the auto_da_alloc ext4 mount option on as it is
>> supposed to fix the "zero file length" issue for some file replacing
>> patterns. A few minutes later I got a new crash and empty files
>> again. So I guess the corruption is unrelated or paludis uses a
>> rename pattern which is not detected by ext4.
> 
> The problem is not a rename pattern, and it has nothing to do with
> allocation. It's quite simple: if you don't cleanly unmount a
> filesystem, 

OK I understand that...

> or if you don't cleanly power off your computer after
> unmounting a filesystem, things will break.

but not this one.

AFAIK, when a filesystem is unmounted, a sync happens on the underlying
device, and the unmount call blocks until all data has been flushed to
the disk (that is, to its write cache). After that, it is up to the hard
drive firmware to flush the write cache, if it is enabled. So normally,
when unmount returns, the data is at least in the drive's write cache.
A power-off while the drive is still flushing its write cache will cause
data loss. I'm not sure whether a reboot will too.

> Where Paludis does renames for merging, it does so to prevent a
> partially written executable from existing. If anyone tried to launch
> an executable when it was partially written, weird things would happen;
> using a rename removes that case, although there is a small amount of
> time between when the old executable is removed and the new one is
> renamed into place. Handling unclean unmounts is not a consideration.

Concerning the merged files I agree that not much more can be done to
make paludis more resilient to a system crash.

But what about /var/cache/db?

What I'm thinking about is letting paludis work as much as possible in a
temporary vdb directory and rename this directory in the safest possible
way once everything is done.

For a "paludis -i pkg" command:

1. d1 = "/var/cache/db/cat/pkg"
2. d2 = "/var/cache/db/cat/.pkgtmp"
3. cp -R $d1/. $d2
4. compile, merge files etc.
5. update DB entries in $d2
6. fsync files in $d2
7. remove $d1
8. rename $d2 $d1
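
A minimal Python sketch of steps 1 to 8, for illustration only; it is not
how paludis works today. The names fsync_path, fsync_tree, staged_update
and the update callback are hypothetical. Step 4 (compile and merge) is
assumed to happen elsewhere, and the final fsync of the parent directory
is an extra step I am assuming is wanted so that the rename itself
reaches the disk.

```python
import os
import shutil


def fsync_path(path):
    """fsync a single file or directory, given its path."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def fsync_tree(root):
    """Step 6: fsync every regular file and every directory under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files carry data.
            if os.path.isfile(path) and not os.path.islink(path):
                fsync_path(path)
        fsync_path(dirpath)


def staged_update(d1, d2, update):
    """Update the VDB entry d1 through the staging directory d2.

    update is a hypothetical callback that rewrites the DB entries
    inside the staging directory (step 5).
    """
    if os.path.lexists(d2):
        shutil.rmtree(d2)                       # stale staging dir from an earlier run
    if os.path.isdir(d1):
        shutil.copytree(d1, d2, symlinks=True)  # step 3
    else:
        os.makedirs(d2)                         # first install: nothing to copy
    update(d2)                                  # step 5
    fsync_tree(d2)                              # step 6
    if os.path.isdir(d1):
        shutil.rmtree(d1)                       # step 7
    os.rename(d2, d1)                           # step 8
    # Assumption: also flush the parent directory so the rename is durable.
    fsync_path(os.path.dirname(os.path.abspath(d1)))
```

Note that a crash between steps 7 and 8 still leaves no $d1 at all in
this version.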

I can even imagine a mechanism that would make a crash happening between
steps 7 and 8 recoverable.

Something like:

7. move $d1 $d11 (d11 = "/var/cache/db/cat/.pkg.original")
8. rename $d2 $d1
9. remove $d11

The next time paludis runs, it could detect such inconsistencies and
fix them automatically. Something like:

if [ -e "$d11" ]; then
    if [ ! -e "$d1" ]; then
        # crash between 7 and 8: put the old entry back
        mv "$d11" "$d1"
    else
        # crash between 8 and 9: the new entry is in place, drop the old one
        rm -rf "$d11"
    fi
fi
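
The same idea as a small Python sketch, again only an illustration under
my own assumptions: commit and recover are hypothetical names, and the
status strings returned by recover exist only to make the three cases
visible. It follows the logic above literally, so a leftover staging
directory ($d2) after a crash between 7 and 8 is not handled here.

```python
import os
import shutil


def commit(d1, d2, d11):
    """Steps 7 to 9: set the old entry aside, publish the new one, drop the old."""
    os.rename(d1, d11)   # step 7
    os.rename(d2, d1)    # step 8
    shutil.rmtree(d11)   # step 9


def recover(d1, d11):
    """Repair the state left behind by an interrupted commit()."""
    if not os.path.exists(d11):
        return "clean"           # no backup around: nothing to do
    if not os.path.exists(d1):
        os.rename(d11, d1)       # crash between 7 and 8: restore the old entry
        return "rolled back"
    shutil.rmtree(d11)           # crash between 8 and 9: remove the old entry
    return "cleaned up"
```

recover() would have to run before anything reads the VDB, for example
at startup.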

I'm getting a bit tired, so I have probably missed some cases, but you
get the picture.

Would that be insane?

>> Now I'm at the point where I disabled ext4 delayed allocation
>> (nodelalloc mount option). Let's see what happens.
> 
> Things will still break if you randomly power off your computer. The
> only difference is that the breakage may display itself slightly
> differently. You can still end up with partially written or empty
> files; you may just not notice them as frequently.

Agreed.
_______________________________________________
paludis-user mailing list
[email protected]
http://lists.pioto.org/mailman/listinfo/paludis-user
