On Fri, Sep 12, 2008 at 3:23 PM, Dusty Phillips <[EMAIL PROTECTED]> wrote:
> 2008/9/12 Eric Belanger <[EMAIL PROTECTED]>:
>> Hi,
>>
>> I don't know if you remember but a while ago a huge part of extra i686 (IIRC
>> it was all packages from L to Z) were orphaned and erroneously showing up as
>> recently updated on the web site. This just happened again with packages in
>> extra x86_64. I don't know what could have caused that but it's very annoying
>> as we have to readopt all our packages back.
>
> Fuck.
>
> I remember Judd telling me not to swear at users but it's ok to swear
> at scripts right?
>
> This has to be happening in reporead.py. Fucking reporead.py. To the
> best of my knowledge, no other script updates the web database in
> any way, am I wrong?
>
> The actual db_update script splits the packages into those that are in
> the database and those that are not and processes them separately.
> Packages that are not currently in the database get added as orphans
> because apparently it's hard to interrogate the maintainer from the
> db.tar.gz. At first, I assumed that it is doing an add when it should
> be doing an update, which would add new packages with orphan
> maintainer. But this doesn't appear to be the case because there are
> not currently any duplicate x86_64 packages (that aren't in testing).
>
> My second, more likely hypothesis is race conditions. I don't know how
> the db scripts update exactly, but I suspect reporead is reading a
> db.tar.gz file that is either broken or not yet fully uploaded. It
> sees this broken db file and drops all the packages in the web
> interface that are not in that file. Then x minutes later (crontab),
> it runs again on a proper db and sees the missing packages again. It
> adds them to the database and sets the maintainer to orphan.
>
> Are such broken dbs possible/likely/happening? If it's a race
> condition, we need to put a lock on the database (maybe dbtools does
> this already) so that reporead isn't accessing it at the same time as
> dbtools. If it's just that when the database gets updated it sometimes
> breaks the database, well... that just needs to be fixed.

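On the locking idea, here is a rough sketch of what an advisory lock shared by the writer and the reader could look like. It is only an illustration: the `repo_lock` helper and the lock file path are made up, I don't know whether dbtools already takes a lock of its own, and `flock` only helps if both sides agree to use the same lock file.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def repo_lock(lock_path, exclusive):
    """Hold an advisory flock on lock_path while the with-block runs.

    Hypothetical helper: the writer (db scripts) would pass exclusive=True,
    the reader (reporead) would pass exclusive=False.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
        fcntl.flock(fd, mode)  # blocks until the lock can be granted
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

The writer would wrap the step that puts the new db in place in `with repo_lock(path, exclusive=True):`, and the reader would wrap its read in `with repo_lock(path, exclusive=False):`, so a read can never overlap a write.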
Hmmm, the DBs are constructed in /tmp and then moved live to /home/ftp/whatever. It's possible that reporead is opening one mid-move, but that doesn't seem right: the file is gzipped. Wouldn't gunzip balk if you handed it half of a DB file?

