On 02/14/2010 05:50 PM, Raphael Hertzog wrote:
> tag 567089 + patch
> thanks
>
>
> Please make it conform the coding style (see doc/coding-style.txt in git
> repo), I saw several stylistic problems:
> - missing space before comma, operator, or =
> - unwanted spaces after ( or before )
> - invalid indenting (be consistent with the rest of the file)
> - use "" or '' quotes in new strings (and not `')
Thanks for the reference, I'll provide a more stylish patch then.

>
> On question though, is that the minimal set of fsync needed to get the
> right result? I fear that we loose lots of performance in particular for
> the change in src/archive.c (fsync newly extracted files). Maybe it's best
> to not do it at this point but add a generic sync() somewhere when the
> database is synced to disk. Not sure though. Also even ext4 implicitely
> sync files on rename() IIRC so it's not strictly needed AFAIK.
We need to fsync all extracted files, info directory and database
directory. I've made extensive crash tests, and I think this is the
minimal set of fsync to ensure that control files and archive files are
there and renaming is done correctly.

I did some profiling and the perfomance hit is less than 10%. Installing
4 packages/1601 files takes an average of 10.04s over 10 runs with the
current version of dpkg and 10.931s with the `fsync' version (9.27%) (on
a laptop with an ATA 5400rpm hard drive)
Initially, I didn't synced the files extracted from the data area but
they were all 0 bytes in case of crash. The situation would be the
same as today.
Using sync instead of fsync won't make a big difference I guess, sooner
or later we need do pages need to be flushed to disk and it will have an
effect on system's performance. On the other hand, calling fsync on each
file makes operation atomic and prevent data loss if the crash occurs
during the installation.
The call to rename doesn't fsync files nor guarantees which file will be
visible after a crash so we need to call fsync on the directory.

>
> Can you explain your changes in more details, because I'm not sure I
> understand it completely:
>
>> +  * Add fsync on files and directories to prevent delayed allocation
and the
>> +    zero-length file problem after a system crash LP: #512096
(Debian #567089)
>> +    - dpkg-deb/extract.c: fsync control files
>
> You modify extracthalf() which is used for extracting control files but
> also the main files of the package itself (depending on whether admin is
> true or not). You call fsync on all files of the directory indicated.
>
> In the case of the control archives (control.tar.gz), it's fine we only
> have files at the root directory but in the case of data.tar.gz you
> definitely won't fsync all installed files.
>
> This lead me to believe that it's not the proper place for this code.
> Instead I think (but I would want Guillem's opinion) that it should
> be in process_archive() just after the call to dpkg-deb --control.
>
The idea was to fsync the files as soon as they're extracted. That's why
I've put it there instead of inside dpkg's code. In fact the problem is
more with tar than dpkg.

>From what I've understood of the behavior of dpkg, when called from
dpkg, the argument `directory' is not null only when we extract the
control files. When dpkg-deb is run directly with the extract or
vextract command we want to fsync files.
I was unsure where to fsync the control file so I will clearly follow
your advice here.

> And you probably want a separate function to sync all files of a
> directory.
>
>> +    - src/archive.c: fsync newly extracted files
>
> See comment in first para.
>
>> +    - lib/dpkg/cleanup.c: fsync directory before closing
>
> I doubt that this is the proper place. What directory needs to be
> fsynced() and at what time of the installation/upgrade process ?
We need to fsync INFODIR after the rename operations to ensure that the
entries in that directory are flushed to disk.

>
> Probably only the pop_cleanup() calls in src/processarc.c and not the
others
> so it really needs to be moved there directly.
OK. just before the last call to "pop_cleanup(ehflag_normaltidy); /*
closedir */" should be fine.

>
>> +    - lib/dpkg/dump.c: fsync db directory
>
> This one looks okay.
>
Great :)

>
> Code review:
>
>> +    while ((de = readdir(dsd)) != NULL){
>> +      if (strchr(de->d_name,'.'))
>> +        continue;
>
> Why skipping files containing a dot? Maybe you wanted to skip "." and ".."
> but that's not what this code does.
>
Right, I will change it.

>> +
>> +  if ( (dirp = opendir(fname))==NULL)
>> +    ohshite(_("failed to opendir `%.250s'"), fname );
>> +  if ( fsync(dirfd(dirp)) == -1 )
>> +    ohshite(_("failed to fsync `%.250s'"), fname );
>> +  closedir( dirp );
>> +  free( fname );
>
> Can't we open the directory directly instead of using opendir + dirfd? Or
> is that non-portable (I don't know)?
>
That's fine. I'll correct that too.

Thanks for reviewing the code. I'll submit a corrected patch for debian
within a week.





-- 
To UNSUBSCRIBE, email to [email protected]
with a subject of "unsubscribe". Trouble? Contact [email protected]
Archive: http://lists.debian.org/[email protected]

Reply via email to