* Benjamin Drung <benjamin.dr...@profitbricks.com> [140108 15:51]:

First, sorry for the late reply. I must have completely missed your mail.

> The first step is to agree on the database layout change. I came up with
> two alternatives:
>
> 1) Allow duplicate entries in packages.db and sort duplicate entries by
> their Debian version. They can be sorted a) upwards or b) downwards.
> Depending on the request, we will either search for all versions of a
> package, one specific version of the package, or for the latest version
> of a package.
>
> 2) Rename the key of packages.db to also contain the version of the
> package, e.g. "sl|3.03-17" or "hello_2.8-4" (which delimiter should we
> use?). This would allow us to check directly for a specific version of a
> package. We need to add a secondary table that allows us to access the
> database as described in 1) through the secondary table. This secondary
> table will allow duplicate entries and the values of the secondary table
> point to the key in packages.db. Depending on the task, we either query
> the first or secondary table. The secondary table will be kept in sync
> by BerkeleyDB.
>
> In the first case, we need to add a function to iterate over the
> duplicate packages to find a specific version. In the second case, we
> need to create the secondary table and transform the database.
>
> Which layout do you prefer?

I think the better layout is whichever one fits the code better. Not
having looked at the code yet, I cannot say which that is. I guess 1
might be simpler. In the case of 2 I think "|" is fine, as it is already
used elsewhere (though I guess one should make sure reprepro does not
allow "|" in package names).
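
To make the delimiter point concrete, here is a rough sketch of how a
"name|version" key could be built while rejecting names that contain
the delimiter. The names make_package_key() and name_is_safe_for_key()
are made up for illustration and are not existing reprepro functions;
where such a check would actually belong in reprepro is left open.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define KEY_DELIMITER '|'

/* Hypothetical helper: a name may only be used in a composite key if it
 * is non-empty and does not itself contain the delimiter. */
static int name_is_safe_for_key(const char *name) {
    assert(name != NULL);
    return name[0] != '\0' && strchr(name, KEY_DELIMITER) == NULL;
}

/* Hypothetical helper: build "name|version" in a freshly allocated
 * buffer. Returns NULL for an unsafe name or on allocation failure.
 * The caller frees the result. */
static char *make_package_key(const char *name, const char *version) {
    assert(version != NULL);
    if (!name_is_safe_for_key(name))
        return NULL;

    size_t len = strlen(name) + 1 + strlen(version) + 1;
    char *key = malloc(len);
    if (key == NULL)
        return NULL;

    snprintf(key, len, "%s%c%s", name, KEY_DELIMITER, version);
    return key;
}
```

With this, make_package_key("sl", "3.03-17") gives "sl|3.03-17", and a
name such as "bad|name" is refused instead of producing an ambiguous key.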

> Another issue is the sorting of the packages in the database. We need
> one function to sort all entries in the table. So we need one function
> to sort binary packages and source packages, but we have
> binaries_getversion() and source_getversion(). Here's the example code
> (without the error handling) of the sorting function:
> 
> static int debianversioncompare(UNUSED(DB *db), const DBT *a, const DBT *b) {
>     char *a_version, *b_version;
>     int versioncmp;
> 
>     binaries_getversion(a->data, &a_version);
>     binaries_getversion(b->data, &b_version);
>     dpkgversions_cmp(a_version, b_version, &versioncmp);
>     return versioncmp;
> }
> 
> Do you have a suggestion how to improve this function?

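That sounds quite slow either way, since the comparison function has to
extract the version from both entries and parse it on every call. Perhaps
the way to go is instead to change the data format, e.g. storing the
version first (perhaps even in a preparsed format to speed things up).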
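
As a rough sketch of the "version first" idea: if each value were stored
as "<version>\0<control chunk>", the comparison function could read the
version straight from the front of the record without searching the
chunk. Everything below is an assumption for illustration: the record
layout, struct record_view (a stand-in for the data/size fields of
BerkeleyDB's DBT) and all function names. compare_versions() is only a
placeholder using strcmp(); the real code would call dpkgversions_cmp()
there to get Debian version ordering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for BerkeleyDB's DBT: only the two fields used here. */
struct record_view {
    const void *data;
    size_t size;
};

/* Hypothetical layout: "<version>\0<control chunk>\0".
 * Returns a malloc'ed record and stores its size in *size_out. */
static void *make_record(const char *version, const char *chunk,
                         size_t *size_out) {
    size_t vlen = strlen(version) + 1;
    size_t clen = strlen(chunk) + 1;
    char *rec = malloc(vlen + clen);
    if (rec == NULL)
        return NULL;
    memcpy(rec, version, vlen);
    memcpy(rec + vlen, chunk, clen);
    *size_out = vlen + clen;
    return rec;
}

/* Return the version at the front of the record, or NULL if there is
 * no NUL terminator within the record's size. */
static const char *record_version(const struct record_view *r) {
    if (r->data == NULL || memchr(r->data, '\0', r->size) == NULL)
        return NULL;
    return r->data;
}

/* Placeholder ordering: plain strcmp(), NOT Debian version ordering. */
static int compare_versions(const char *a, const char *b) {
    return strcmp(a, b);
}

/* Comparison that never has to look into the control chunk. */
static int record_compare(const struct record_view *a,
                          const struct record_view *b) {
    const char *va = record_version(a);
    const char *vb = record_version(b);
    assert(va != NULL && vb != NULL);
    return compare_versions(va, vb);
}
```

A preparsed format would go one step further and store, for example,
the epoch as an integer ahead of the string parts, but that is left out
here to keep the sketch short.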


        Bernhard R. Link
-- 
F8AC 04D5 0B9B 064B 3383  C3DA AFFC 96D1 151D FFDC

