Package: qa.debian.org
User: qa.debian....@packages.debian.org
Usertags: udd

Hi,

The upload_history importer works as follows:

1) /srv/udd.debian.org/email-archives/debian-devel-changes/ contains a copy
of the email archives, copied manually from master.debian.org. The
latest emails are received directly on ullmann, into
/srv/udd.debian.org/email-archives/debian-devel-changes/debian-devel-changes.current
This part is mostly OK, but it would be better if DSA provided a way to
access those archives from ullmann without having to copy them from time
to time.

2) When started, the importer first runs 'make' in 
/srv/udd.debian.org/upload-history/. This:
2.1) updates local copies of keyrings
2.2) using 'munge_ddc.py', converts email archives into summarized versions, 
stored as, e.g.:
/srv/udd.debian.org/upload-history/debian-devel-changes.201209.gz.out

3) Then the importer reads the *.out files and imports them into postgres.

'munge_ddc.py' has the following issues:
- it's not version-controlled
- it doesn't support xz email archives, so it's broken for recent
  archives
- it's python2 (but the standard-library lzma module is python3-only)
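
For the xz point, here is a minimal sketch of how a python3 port could
open archives regardless of compression. It is only an illustration:
'open_archive' and '_OPENERS' are hypothetical names, not taken from
munge_ddc.py, and it assumes the compression can be inferred from the
file extension (.gz, .xz, or anything else for uncompressed files such
as debian-devel-changes.current).

```python
import gzip
import lzma
from pathlib import Path

# Hypothetical helper, not part of munge_ddc.py: map a file extension
# to the function that opens it. Unknown extensions use plain open().
_OPENERS = {".gz": gzip.open, ".xz": lzma.open}


def open_archive(path):
    """Open a (possibly compressed) email archive for reading as text."""
    opener = _OPENERS.get(Path(path).suffix, open)
    # Old emails may contain bytes that are not valid UTF-8, so decode
    # leniently instead of failing on the first bad byte (assumption:
    # replacing such bytes is acceptable for the summarized output).
    return opener(path, mode="rt", encoding="utf-8", errors="replace")
```

The rest of the script could then iterate over the returned file object
line by line, without caring how the archive was compressed.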

Help would be welcome to port it to python3 and resolve the other
issues. Also, the data files around the upload_history gatherer should
probably be reorganized, with a cleaner separation between code (which
should be versioned in UDD) and data.

Lucas
