I would advocate for a local copy (if missing) and an environment variable
to override it, so that users can use a newer or different version.
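
A minimal sketch of that lookup order, assuming Python. The variable name
COMPONENTS_CIF and the default path are hypothetical placeholders, not
anything OpenStructure or Debian provides today, and the sketch only
resolves the path (it does not download anything):

```python
import os
from pathlib import Path

# Hypothetical names: neither this variable nor this path exists in
# OpenStructure or in any Debian package today.
ENV_VAR = "COMPONENTS_CIF"
DEFAULT_PATH = Path("/var/cache/openstructure/components.cif.gz")


def resolve_components_path(environ=os.environ, default=DEFAULT_PATH):
    """Return the user's override if set and non-empty, else the local copy."""
    override = environ.get(ENV_VAR)
    if override:
        return Path(override).expanduser()
    return Path(default)
```

With this order, each user on a shared machine can point at their own copy
without touching the system-wide file.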

I would also encourage upstream to find a way to embed a hash + download
date in their logs and outputs, if possible.
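
As a rough illustration of what could be logged, here is a sketch in
Python. The function name is made up, and the file's modification time is
only a stand-in for the download date; a real implementation would more
likely record the date at download time:

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def provenance(path):
    """Return (SHA-256 hex digest, mtime as ISO 8601 UTC) for a file."""
    path = Path(path)
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large files are not loaded into memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    # Assumption: mtime approximates the download date.
    mtime = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
    return digest.hexdigest(), mtime.isoformat(timespec="seconds")
```

Writing both values into every log and output header would let a reader
tell which copy of the file a given result was computed with.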

We should also ask PDB to version their files. Do they keep old versions
around?

--
Michael R. Crusoe

On Wed, Sep 8, 2021, 09:07 Andrius Merkys <mer...@debian.org> wrote:

> Hi all,
>
> On 2021-07-19 10:24, Nilesh Patra wrote:
> > On 19 July 2021 12:50:03 pm IST, Andrius Merkys <mer...@debian.org> wrote:
> >> Currently I am looking into ProMod3 [3], which seems to be the engine
> >> behind the great SWISS-MODEL service [4]. I seem to have figured out
> >> the dependencies, will go on to packaging next.
> > Let us know if you need help with packaging the chain, in case you need
> > helping hands :-)
>
> So here I am asking for help/suggestions :)
>
> Problem: OpenStructure, a dependency of ProMod3, requires the PDB
> components library, components.cif.gz, for some of its protein modeling
> routines. This library is provided by the PDB at [1] and is itself freely
> distributable (although the PDB discourages modifying it), but it is
> updated quite often and does not get a version number. Furthermore,
> people often prefer to obtain the most up-to-date copy of
> components.cif.gz for their research, so providing it in a Debian package
> of its own would not be very convenient.
>
> I am aware of solutions to similar problems, for example the libcifpp
> package, which keeps an up-to-date mmcif_pdbx_v50.dic.gz at
> /var/cache/libcifpp/mmcif_pdbx_v50.dic.gz. This could work for
> components.cif.gz as well, but my main concern is whether keeping a
> system-wide components.cif.gz up-to-date is what every user would want.
>
> As a researcher I do my best to perform reproducible science. Thus I
> want to know the precise versions/timestamps/checksums of my input
> databases, and having them suddenly change overnight is akin to a
> nightmare. What is more, there might be more than one user on a
> machine wanting different versions of components.cif.gz.
>
> Thus my candidate solution for providing components.cif.gz for
> OpenStructure would be to ask upstream to implement an environment
> variable for its location, allowing for greater flexibility. Or maybe
> there are other solutions?
>
> [1] ftp://ftp.wwpdb.org/pub/pdb/data/monomers/components.cif.gz
>
> Best,
> Andrius
>
>
