Package: qa.debian.org
Severity: wishlist
User: qa.debian....@packages.debian.org
Usertags: udd


Dear UDD heroes,

the publicly available udd.dump [1] is 1.5 GB. I'd like to ask whether it
could be switched from the custom archive format (PGDMP, `pg_dump -Fc`)
to the directory format (`pg_dump -Fd`). With the many tables within udd,
this might save bandwidth for mirrors of the database: when a periodic
dump leaves some tables unchanged, a mirror run can skip their files.
(It could potentially save time on dumps and restores as well, as -Fd
can dump and restore tables in parallel.)
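
To illustrate, a directory-format dump and a parallel restore might look
roughly like the sketch below. The database name `udd`, the directory
name `udd.dump.d` and the default of 4 jobs are my own placeholders, not
what the UDD scripts use. I have also not verified that the per-table
files keep the same names and bytes from one dump to the next, which the
bandwidth saving depends on.

```shell
# Sketch only: "udd", "udd.dump.d" and the job count are assumed placeholders.
dump_udd_dir() {
    # -Fd writes a directory with one file per table plus a toc.dat;
    # -j N dumps N tables in parallel (only possible with -Fd).
    pg_dump -Fd -j "${2:-4}" -f "${1:-udd.dump.d}" udd
}

restore_udd_dir() {
    # pg_restore can load a directory-format dump with several jobs too.
    pg_restore -j "${2:-4}" -d udd "${1:-udd.dump.d}"
}
```

Usage would be e.g. `dump_udd_dir udd.dump.d 4` on the server and
`restore_udd_dir udd.dump.d 4` on the mirror.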

Further savings (and easier mirroring) could possibly be achieved by
providing an rsync module for the dump directory [1], if it is possible
to set that up on ullmann.d.o.
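
On the mirror side, a run against such a module could then be as simple
as the sketch below. This is purely hypothetical: no such rsync module
exists today, and the module name `udd-dumps`, the directory name
`udd.dump.d` and the host argument are made up for illustration.

```shell
# Sketch only: the rsync module "udd-dumps" and its layout are hypothetical.
mirror_udd_dump() {
    # $1 = rsync host, $2 = local target directory.
    # -a preserves timestamps; rsync's delta transfer means files whose
    # content did not change cost almost no bandwidth.
    # --delete removes local files that vanished from the server-side dump.
    rsync -a --delete "rsync://$1/udd-dumps/udd.dump.d/" "$2"
}
```

A mirror would call this as `mirror_udd_dump <host> ./udd.dump.d/` from
cron, as often as it likes.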

Reasoning:

Since the database is constantly changing, a mirror may need to update
often. Grabbing a new 1.5 GB file, perhaps multiple times daily, can be
daunting on lower-bandwidth connections (while some tables actually get
updated only daily or less often).


After a chat with mapreri I don't have high hopes wrt the rsync module,
but I was encouraged to just put it out there anyway.

< mapreri> nyov: of course, if you wish a -Fd dump, it should be provided 
alongside the current -Fc one, not instead.

And that is probably the final nail in the coffin - having both
is really a necessity.


As an alternative option -- perhaps udd.dump could be dumped without
pg_dump internal compression (-Z0) and instead piped through lzma/xz [2],
as the popcon dump already does?
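
Roughly what I have in mind, as a sketch only: the database name `udd`
and the output file name `udd.dump.xz` are placeholders, and I have not
checked how this compares to how the existing scripts [2] invoke pg_dump.

```shell
# Sketch only: "udd" and "udd.dump.xz" are assumed placeholders.
dump_udd_xz() {
    # -Z0 turns off pg_dump's built-in compression so that xz
    # compresses the raw archive instead of already-deflated data.
    pg_dump -Fc -Z0 udd | xz -c > "${1:-udd.dump.xz}"
}

restore_udd_xz() {
    # pg_restore reads the archive from stdin when no file name is given.
    xz -dc "${1:-udd.dump.xz}" | pg_restore -d udd
}
```

The result is still a single file, so this would not let mirrors skip
unchanged tables; it only aims at a smaller download.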


[1] https://udd.debian.org/dumps/
[2] https://salsa.debian.org/qa/udd/blob/master/scripts/dump-db-more-frequently
    https://salsa.debian.org/qa/udd/blob/master/scripts/dump-db.sh


Cheers
