After suffering from extremely bad fink performance on my iBook over the past couple of months (despite the nice improvements made recently), I am beginning to wonder whether we should reconsider our current package DB indexing behavior.

The current approach is limited in performance by the fact that it traverses the full /sw/fink/dists hierarchy to determine whether the DB is up to date. It takes longest when the DB is actually clean (since then it can't cut off the search early). This alone takes a lot of time. On my iBook, with a clean pkg DB and an empty HD cache, a simple "fink info foo" takes ~8 secs just to load the DB; a second call still takes 4 secs; and from our past profiling, a noticeable share of that is spent on directory traversal.
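To illustrate why the clean case is the worst case, here is a minimal sketch in Python (not Fink's actual Perl code). The function name index_is_stale and its arguments are made up for illustration; it assumes staleness is decided by comparing each .info file's mtime against the index's mtime.

```python
import os

def index_is_stale(dists_root, index_mtime):
    """Walk dists_root and report whether any .info file is newer than the index.

    Returns (stale, files_checked). Hypothetical helper, not Fink's real code.
    """
    checked = 0
    for dirpath, _dirnames, filenames in os.walk(dists_root):
        for name in filenames:
            if not name.endswith(".info"):
                continue
            checked += 1
            # One stat() per .info file: this is the part that scales with
            # the number of packages.
            if os.stat(os.path.join(dirpath, name)).st_mtime > index_mtime:
                return True, checked  # early exit is only possible when stale
    # A clean DB means every single file had to be stat'ed.
    return False, checked
```

With a clean DB, files_checked always equals the total number of .info files, which is why adding packages makes every fink invocation slower.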

The motivation for this approach (which triggers an automatic reindex whenever a .info file has changed) was, and is, that we wanted to be fail-safe: if the user (or developer, in our case) edits a .info file and the index is not updated, the results are unexpected (namely, the changes to the .info file are not honored).

While it's "fail safe", it's *slow*, and there is not much room to speed it up as long as we keep traversing the directories (the speed of that is determined by the HD, the FS driver in the OS, and the OS HD cache, none of which we can control). And with each new package it gets worse, as we have to stat more and more files.


Hence, I am thinking about changing to this behavior:

1) Stop auto-indexing when .info files change (i.e. don't traverse the /sw/fink/dists hierarchy, thus eliminating the speed hog).
2) Of course, keep the reindexing in selfupdate(-cvs).
3) Track the value of the "Trees" setting from fink.conf in the DB; if it changes, reindex.
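The check in point 3 could look roughly like the following Python sketch (again, not Fink's actual code). The names read_trees and needs_reindex are hypothetical, and it assumes fink.conf consists of simple "Key: value" lines and that the DB stores the list of trees it was built from.

```python
def read_trees(conf_text):
    """Return the list of trees from fink.conf-style text ("Key: value" lines)."""
    for line in conf_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "trees":
            return value.split()
    return []  # no Trees line found

def needs_reindex(indexed_trees, conf_text):
    """True if the Trees setting differs from the one stored with the index.

    The comparison is order-sensitive, on the assumption that tree order
    can matter; compare as sets instead if it does not.
    """
    return read_trees(conf_text) != list(indexed_trees)
```

This replaces thousands of stat() calls with one small file read and one list comparison per fink invocation.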

This way, nothing should change for the average user compared to the current setup (please correct me if I am wrong). The people for whom something does change are the developers: we would have to remember to reindex manually whenever we change a package. In my case, that is often a boon. I hate it when I fix a typo in a .info file, then need to quickly check some other package, type "fink info foo", and have to wait ages until it finishes indexing.

To ease the developer's pain somewhat, we could add a "fink index foo.info" command, which would add the specified .info file(s) to the index. This way, you could just (re)index the files you are working on, avoiding having to do full indexing too often.
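As a rough idea of what such an incremental update would do, here is a Python sketch under heavy assumptions: the index is modeled as a plain dict keyed by package name, only single-line "Package:" and "Version:" fields are read, and parse_info and index_files are hypothetical names. The real index stores far more than this.

```python
def parse_info(text):
    """Collect top-level "Key: value" fields from .info-style text.

    Simplified: indented lines and multi-line fields are ignored.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() and not line[0].isspace():
            # Keep the first occurrence of each field.
            fields.setdefault(key.strip().lower(), value.strip())
    return fields

def index_files(index, paths):
    """Add or refresh the index entries for just the given .info files."""
    for path in paths:
        with open(path, encoding="utf-8") as fh:
            fields = parse_info(fh.read())
        name = fields.get("package")
        if name is None:
            continue  # not a usable package description; skip it
        index[name] = {"version": fields.get("version"), "path": path}
    return index
```

Only the files named on the command line are read, so the cost is proportional to what the developer touched rather than to the size of the whole tree.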


I know this is quite a reversal from my previous stance on this, and I am still not sure whether I really like it, but I guess we should at least consider it. Comments? Thoughts? What did I miss?


Cheers,

Max


_______________________________________________
Fink-devel mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/fink-devel