On Sun, Feb 19, 2012 at 03:13:19PM +0100, Andreas Tille wrote:

> I somehow assumed that if I'm editing a debian/upstream file and
> commit it to our Vcs after some delay (say 1 day), this change would
> be reflected in Umegaya and (in the worst case one day later) the
> UDD bibref gatherer would fetch the changed status.

Hi Andreas,

Let's imagine that every source package in Debian has a debian/upstream
file.  Refreshing the information daily would then take more than
18,000 requests on Alioth.  On my computer, it takes 2-3 seconds to
interrogate a Subversion repository on Alioth, and 1-2 seconds for Git:

  $ time svn cat svn://svn.debian.org/svn/debian-med/trunk/packages/primer3/trunk/debian/upstream > /dev/null

  real    0m2.727s
  user    0m0.068s
  sys     0m0.020s

  $ time curl -L -s $(umegaya-guess-url git://git.debian.org/git/debian-med/emboss.git | cut -f2) > /dev/null

  real    0m1.218s
  user    0m0.012s
  sys     0m0.012s

It would take hours to check every package daily, and I worry about the
load on Alioth.  This is why I designed a push model: after updating
debian/upstream for the package 'foo', visit
http://upstream-metadata.debian.net/foo/YAML-URL and Umegaya will
refresh its information.  (This will work after I transfer the service
to debian-med.debian.net; I really hope to do it this evening.)

Nevertheless, as long as only Debian Med is using Umegaya, we can
forcibly refresh the information daily.  A better way would be to have
Subversion and Git commit hooks that do the job; I will work on this
after the transfer of upstream-metadata.d.n.
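To give an idea, here is a minimal, untested sketch of what the Git
side could look like; it assumes the repository directory is named
after the source package, and simply pings the push URL described
above:

  #!/bin/sh
  # Untested sketch of a Git post-receive hook: after each push, ping
  # Umegaya so that it refreshes its copy of debian/upstream.  Assumes
  # the repository directory is named after the source package.
  package=$(basename "$PWD" .git)
  curl -s "http://upstream-metadata.debian.net/$package/YAML-URL" > /dev/null

A Subversion post-commit hook could do the same after extracting the
package name from the changed paths with svnlook.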
> So how exactly will a package be registered in the Umegaya database?

Currently, one needs to log in on upstream-metadata.d.n and run
umegaya-adm --register.  Alternatively, a cron job can use a script
similar to the one you posted to monitor new additions and run
umegaya-adm --register on them.

Later, I would like to make this possible over the network; that is
what I meant by "HTTP interface" (I should have written "URL API").  I
want the CGI script to be able to receive new URLs to track.  To
prevent kiddies from tricking the system and making us upload illegal
stuff into the UDD, the system would for instance decline to track any
URL that is not in a "debian.org" domain; see the sketch below.
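As a hypothetical illustration of that check (the real script may well
look different, and the URL parsing here is deliberately simplistic):

  #!/bin/sh
  # Hypothetical sketch: accept a URL to track only if its host is in
  # the debian.org domain, and refuse anything else.
  url="$1"
  host=$(echo "$url" | sed 's!^[a-z+]*://!!; s!/.*$!!')
  case "$host" in
    debian.org|*.debian.org)
      echo "tracking $url" ;;
    *)
      echo "refused: $url is not in the debian.org domain" >&2
      exit 1 ;;
  esac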
Another alternative is to let Umegaya try to search for unknown
packages on svn.debian.org and git.debian.org.

> (BTW, I keep on cut-n-pasting even the short name - could we call the
> database the same as the file and name it upstream database? ;-))

Isn't "upstream database" too generic?  But within the scope of this
thread it is not a problem.

> I did not dive into PET but as far as I know this is more what I
> consider an automatic update driven by the data inside the VCS, and I
> wonder whether we should not rather somehow tweak the debian/upstream
> files into the PET mechanism.  Did you consider this?

PET could also be a good starting point for monitoring the VCS and
pinging Umegaya.

> When thinking twice about it: What is the sense of having this
> Berkeley DB at all if we have UDD?  Why not import the content of the
> upstream files straight into UDD?  For me this somehow looks like a
> detour, but as I said I might be a bit narrow-minded about the usage
> on the tasks pages.

If I understand the UDD correctly, it is updated by reloading whole
tables, and Umegaya is the table producer.  There could be other ways
to do it, but since I am aiming at a system that can cope with tens of
thousands of packages, I think that rules out alternatives such as
checking out all Alioth repositories every day.

I am sorry that I kept http://upstream-metadata.debian.net in a
miserable state this year.  I have done a lot of ground work this
week-end, and the transfer to debian-med.debian.net, hopefully today,
will be a fresh restart.

Cheers,

-- 
Charles

