Hi,
On 02/17/2010 09:09 PM, Guillem Jover wrote:
The patch (attached) is completely untested, it builds though :), and
there's still few things to polish, but I think it handles most of the
problematic cases now (except for syncing dirs on removal), it is also
I tested the fsync patch. There is one issue in processarc.c. The call
to dir_sync_contents fails because at this point 'cidir' is the full
path of the control file, not the name of the directory containing the
control file.
Following your work, I added a dir_sync_contents_parent in
lib/dpkg/dir.{c,h} and replaced the call to dir_sync_contents in
process_archive. I'm not sure if it's done the way you're expecting, but
it builds and run fine. The patch is attached.
I tested the following cases: install, upgrade, removal and purge. Then
ran a sha256sum on the files (archive, control files, and dpkg db) and
compared to the expected checksum. All the files are correctly synced or
removed without residual files left behind (some .dpkg-new or status-new)
doing too many fsync()s on directories, and I'll probably switch that
to mark them as dirty and only fsync() them then.
I confirm that there are too many syncs on directories. Some directories
are sync as many times as the number of files it contains. Marking them
as dirty when the status of the package is set 'unpacked' should fix it.
You'll find the number of sync/file for bsdgames at
http://www.pastebin.Com/NdHNWsyQ
I think I'll split it in few pieces, one for the missing fsync()/fflush()
on files, another one for dir fsync()s on db paths, and a last one to
be worked on for fsync()s on the installed directories. This way we get
incremental improvements, also I've not yet checked the impact of the
additional fsync()s, which might be notibable for the installed files.
The major issue seems to be the performance. I tested
texlive-latex-extra-doc (more than 8000 files) some directories are sync
around 500 times. I've only tested on a VM up to now, but the
performance loss is a factor 3. Take this number with caution, it's a VM
with a virtual disk, not a raw device hence disk access does not means
much. I'll do some profiling on a real machine this week end.
Don't hesitate to tell me if you need few more tests.
Thanks.
--
-JB
diff -Nru dpkg.patched/lib/dpkg/dir.c dpkg.patched.02/lib/dpkg/dir.c
--- dpkg.patched/lib/dpkg/dir.c 2010-02-24 17:05:35.207917256 +0100
+++ dpkg.patched.02/lib/dpkg/dir.c 2010-02-26 21:20:02.870717405 +0100
@@ -120,3 +120,19 @@
free(dirname);
}
+
+void
+dir_sync_contents_parent(const char *path)
+{
+ char *dirname, *slash;
+
+ dirname = m_strdup(path);
+
+ slash = strrchr(dirname, '/');
+ if (slash != NULL) {
+ *slash = '\0';
+ dir_sync_contents(dirname);
+ }
+
+ free(dirname);
+}
diff -Nru dpkg.patched/lib/dpkg/dir.h dpkg.patched.02/lib/dpkg/dir.h
--- dpkg.patched/lib/dpkg/dir.h 2010-02-24 17:05:35.207917256 +0100
+++ dpkg.patched.02/lib/dpkg/dir.h 2010-02-26 21:20:10.154742899 +0100
@@ -28,6 +28,7 @@
DPKG_BEGIN_DECLS
void dir_sync_contents(const char *path);
+void dir_sync_contents_parent(const char *path);
void dir_sync(DIR *dir, const char *path);
void dir_sync_path(const char *path);
void dir_sync_path_parent(const char *path);
diff -Nru dpkg.patched/src/processarc.c dpkg.patched.02/src/processarc.c
--- dpkg.patched/src/processarc.c 2010-02-24 17:05:35.215922475 +0100
+++ dpkg.patched.02/src/processarc.c 2010-02-26 21:24:49.930792586 +0100
@@ -222,7 +222,7 @@
* files. As neither dpkg-deb nor tar do explicit fsync()s, we have to do
* them here. XXX: We might replace this at some point with an internal
* tar implementation. */
- dir_sync_contents(cidir);
+ dir_sync_contents_parent(cidir);
parsedb(cidir, pdb_recordavailable | pdb_rejectstatus | pdb_ignorefiles,
&pkg,NULL,NULL);