On 01/11/12 15:12, Allan McRae wrote:
> I was thinking about the local database backend in pacman and how we
> could improve it.
>
> The tar-based backend for sync dbs has been quite a success. Ever since
> we did that, I have been wanting to do the same with the local database.
> But there were two issues:
>
> 1) tar is not designed for removing and updating files in place
>
> 2) with a directory/files per package structure, we are quite robust to
> corruption as a single file affects only a single package.
>
>
> Well... I have a cunning plan... How about we do both!
>
>
> Have the local database in a tarball but also extracted. All reading is
> done from the tarball, so -Q operations would be fast due to not requiring
> reads from lots of small files. With an operation that
> adds/removes/updates a package, the local database is still read from
> the tarball, but the modifications are done on the files backend and
> then the tarball is recreated at the end. We could even be efficient
> during the recreation and read all the old files from the old tarball
> and only read the new files from the filesystem (which will be in the
> kernel cache...).
>
> This would also give another use for "pacman -D" - an option could be
> added to recreate the local db tarball - in case it became corrupt or
> the files were manually edited.
>
>
> What do people think?

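To make the proposal above a bit more concrete, here is a rough sketch in Python. It is for illustration only: pacman itself is written in C, and the function names, the `changed`/`removed` arguments and the assumption that the tarball mirrors the on-disk `<pkg>/desc` layout are all hypothetical, not an existing API. Reads go to the tarball; after the files backend has been modified, the tarball is rebuilt by copying untouched entries from the old tarball and reading only the changed packages from the filesystem.

```python
import os
import tarfile


def read_desc(tar_path, pkg):
    """Read one package's desc entry straight from the tarball."""
    with tarfile.open(tar_path, "r") as tar:
        # extractfile() raises KeyError if the entry does not exist
        return tar.extractfile(f"{pkg}/desc").read().decode()


def rebuild_tarball(db_dir, tar_path, changed=(), removed=()):
    """Recreate the tarball after the files backend has been modified.

    Entries of untouched packages are copied from the old tarball;
    only the packages named in `changed` are read from the filesystem.
    """
    tmp_path = tar_path + ".tmp"
    skip = set(changed) | set(removed)
    with tarfile.open(tmp_path, "w") as new:
        if os.path.exists(tar_path):
            with tarfile.open(tar_path, "r") as old:
                for info in old:
                    # top-level directory name identifies the package
                    if info.name.split("/", 1)[0] in skip:
                        continue
                    data = old.extractfile(info) if info.isfile() else None
                    new.addfile(info, data)
        for pkg in sorted(changed):
            # adds the package directory and everything below it
            new.add(os.path.join(db_dir, pkg), arcname=pkg)
    # swap the new tarball into place only once it is complete
    os.replace(tmp_path, tar_path)
```

Recreating the whole tarball from scratch (the "pacman -D" case) is then just a call with every package directory listed in `changed`.
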
Just to be clear, I wanted comments on whether a dual local database (one "binary", one filesystem-based) would be a good solution: it should increase read speed (by avoiding many small files) while keeping the non-binary format's robustness to corruption. I.e., would the extra 10-20MB be an acceptable trade-off?

Given that I have absolutely no interest in using sqlite, bdb, etc., and I can almost guarantee that I will be the one to provide a patchset that changes the local backend, comments about choosing a relational database are not needed.

Allan
