On Mon, Feb 20, 2012 at 09:43:09AM +0100, Andreas Tille wrote:

> there are tools which assemble information for Sources.gz files - I
> guess this could be implemented if, say, 20% of the packages contain
> such a file.
In such a model, the packages need to be uploaded so that Sources.gz is
updated. This is exactly what I aim to avoid by feeding the UDD with
Umegaya.

> > This is why I designed a push model. After updating debian/upstream
> > for the package 'foo', visit
> > http://upstream-metadata.debian.net/foo/YAML-URL, and Umegaya will
> > refresh its information. (This will work after I transfer the
> > service to debian-med.debian.net; I really hope to do it this
> > evening.)
>
> I admit I do not trust that a developer will really pay regular
> visits to http://upstream-metadata.debian.net/foo/YAML-URL or any
> similar URL.

Note that anybody can trigger a refresh. For instance, I ran the
following command to load the upstream metadata for all the packages
that are known to debcheckout and recommended by one of our tasks:

  for package in $(svn cat svn://svn.debian.org/blends/projects/med/trunk/debian-med/debian/control \
        | grep Recommends | sed -e 's/,//g' -e 's/|//g' -e 's/Recommends://g')
  do
    curl http://upstream-metadata.debian.net/$package/Name
  done

I can set up a cron job along these lines, in addition to VCS hooks.

> BTW, it came to my mind that we should also gather fields from
> debian/copyright if it is DEP5 compatible. I specifically consider
> Upstream-Contact a very valuable field, and at a later stage I would
> even ask for a lintian check "Upstream-Contact is missing" or
> something like this.

I actually opposed - with no success - the inclusion of the
Upstream-Contact and Upstream-Name fields in DEP 5, as they usually do
not contribute to respecting the package's redistribution terms, which
is the purpose of the Debian copyright file. The debian/upstream file
features Contact and Name fields that can be used for the same purpose.

> 1.
>    scripts/fetch_bibref.sh
>    fetches all available debian/upstream files and moves them to
>    /org/udd.debian.org/mirrors/upstream/package.upstream
>    I would like to stress the fact that I would fetch these files
>    *unchanged*, as they were edited by their authors
> 2. udd/bibref_gatherer.py
>    just parses the upstream files for bibliographic information and
>    pushes it into UDD
>    This is the really cheap part of the job, and I volunteer to do it
>    in one afternoon.

The problem with this approach is that it can only run on
udd.debian.org, which is quite loaded if I understand well. Regardless
of the means, I provide a table that can be downloaded daily and loaded
into the UDD. That is how the gatherers work, as far as I have seen.
That the data transits through a Berkeley DB is just a detail, as
unimportant as which programming language the data is processed with.
What matters is the final product, the table to be loaded.

> However, regarding practical usage of these data I do not see an
> application currently. You need a problem first which needs to be
> solved to invent something new.

The goal of the system is:

 - to let the maintainer update the data without uploading the package;
 - to gather data for our tasks pages.

In addition to the bibliography, I think that the Registration and
Donation fields, while rare, can be very useful to cooperate better
with Upstream.

  http://upstream-metadata.debian.net/table/registration
  http://upstream-metadata.debian.net/table/donation

> dh_bibref
>
> which turns debian/upstream data into a usable BibTeX database on the
> users system. This is technically definitely not hard - it just needs
> to be *done*.

The challenge will be to have it run by default by Debhelper, but I
think that this is indeed the right direction. In the meantime, such a
tool will need to produce a reference that is stored in the directory.

> A.
>    Gather *all* existing debian/upstream files and make sure they
>    will be updated at least every 24h at a place where they can be
>    fetched for UDD (I explicitly do not mention that we should do
>    this via the web service, and I would really prefer not to take
>    the detour through another database)

Currently I have the following cron job running on
debian-med.debian.net:

  @hourly for key in DOI PMID Reference-Author Reference-Eprint Reference-Journal Reference-Number Reference-Pages Reference-Title Reference-URL Reference-Volume Reference-Year References; do curl -s http://upstream-metadata.debian.net/yaml/$key; done > public_html/biblio.yaml

Therefore, the bibliographic data can now be accessed at the following
URL:

  http://upstream-metadata.debian.net/~plessy/biblio.yaml

[You may need to wait a bit for the DNS to propagate the new IP for
upstream-metadata.debian.net.]

Let's see how it goes before deciding to redo everything from scratch
with a new design.

Cheers,

-- 
Charles
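P.S. To make the dh_bibref idea a bit more concrete, here is a minimal
sketch of the conversion step in Python. The Reference-* field names
are the ones used in debian/upstream files; the function name, the
field-to-BibTeX mapping and the sample values are only illustrative,
and a real tool would also have to handle packages with more than one
reference.

```python
def reference_to_bibtex(fields, key):
    """Map the Reference-* fields of a debian/upstream file onto a
    BibTeX @article entry.  Fields that are absent are skipped."""
    mapping = [
        ("Reference-Author", "author"),
        ("Reference-Title", "title"),
        ("Reference-Journal", "journal"),
        ("Reference-Volume", "volume"),
        ("Reference-Number", "number"),
        ("Reference-Pages", "pages"),
        ("Reference-Year", "year"),
        ("Reference-URL", "url"),
        ("DOI", "doi"),
    ]
    lines = ["@article{%s," % key]
    for src, dst in mapping:
        if src in fields:
            lines.append("  %s = {%s}," % (dst, fields[src]))
    lines.append("}")
    return "\n".join(lines)

entry = reference_to_bibtex(
    {"Reference-Author": "Doe, J.", "Reference-Year": "2012"}, key="foo")
print(entry)
# @article{foo,
#   author = {Doe, J.},
#   year = {2012},
# }
```

Such a function could be driven either from the per-package files or
from the aggregated biblio.yaml dump mentioned above.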