Package: qa.debian.org
User: qa.debian....@packages.debian.org
Usertags: udd

Hi,

The upload_history importer works as follows:

1) /srv/udd.debian.org/email-archives/debian-devel-changes/ contains a
   copy of the email archives, copied manually from master.debian.org.
   The latest emails are received directly on ullmann, to
   /srv/udd.debian.org/email-archives/debian-devel-changes/debian-devel-changes.current

   This part is mostly OK. It would be better if DSA provided a way to
   access those archives from ullmann without having to copy them from
   time to time.

2) When started, the importer first runs 'make' in
   /srv/udd.debian.org/upload-history/. This:
   2.1) updates local copies of keyrings
   2.2) using 'munge_ddc.py', converts email archives into summarized
        versions, stored as, e.g.:
        /srv/udd.debian.org/upload-history/debian-devel-changes.201209.gz.out

3) The importer then reads *.out and imports them into postgres.

'munge_ddc.py' has the following issues:
- it's not version-controlled
- it doesn't support xz email archives, so it's broken for recent
  archives
- it's python2 (but the standard lzma module is python3-only)

Help would be welcome to port it to python3 and resolve the other
issues.

Also, the data files around the upload_history gatherer should probably
be reorganized with a cleaner separation between code (that should be
versioned in UDD) and data.

Lucas
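
PS: As a possible starting point for the python3 port, here is a minimal
sketch of how the archive-opening step could handle both gz and xz files.
It is not taken from munge_ddc.py: the helper name open_archive, the
suffix table, and the choice of UTF-8 with errors="replace" are all
assumptions made for illustration, and the real script may need different
decoding.

```python
import gzip
import lzma

# Hypothetical suffix table; not part of the existing munge_ddc.py.
_OPENERS = {
    ".gz": gzip.open,
    ".xz": lzma.open,
}


def open_archive(path):
    """Open a mail archive in text mode, decompressing by file suffix.

    Files without a known suffix (e.g. debian-devel-changes.current)
    are opened as plain text. The encoding choice is an assumption:
    undecodable bytes are replaced rather than raising an error.
    """
    for suffix, opener in _OPENERS.items():
        if path.endswith(suffix):
            return opener(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")
```

The returned object can be used like any text file, e.g.
"with open_archive(path) as f: for line in f: ...", so the parsing code
would not need to know which compression was used.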