On Mon, 31 Dec 2007 11:00:12 -0500, Dan Williams wrote:

> On Sun, 2007-12-30 at 17:54 +0100, Michael Schwendt wrote:
> > If in a failed job.log you see the message
> >
> >   Job waited too long for repo to unlock. Killing it...
> >
> > please notify me.
> >
> > It's a problem in the plague server code that results in a denial of
> > service for subsequent build jobs. I have a traceback from Dec 28th, but
> > in the context of the source code it doesn't make sense yet (because a few
> > lines earlier the code ensures that the files to be copied exist and are
> > readable). Buildsys runs a slightly modified version that adds a bit more
> > debug output in this area.
>
> Maybe just trap the exception, print it out, and continue? That way at
> least the server doesn't fall over, it just fails to copy one item.

The buildsys already runs such a patched Repo.py. It catches OSError and IOError, releases the locks, and prints/logs the results of the file access check that is done before the files are copied. I also added a debug line in the package job code to see when it starts deleting the copied files. Normally it waits until a callback tells it that all files are copied.

> It might also help debugging to see if only specific files can't be
> copied...

The offending file was copied, but shutil.copy() failed in its second part when trying to copy the file mode. It didn't find the source file it had just copied. :-}
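
For illustration only, here is a rough sketch of the "trap the exception, log it, and continue" idea. It is not the actual Repo.py patch: the function name copy_all and the logger name are made up for this example, and the locking is left out. It splits the copy into the two steps that shutil.copy() performs (copyfile, then copymode), so a source file that vanishes between the two steps results in a logged failure for that one item rather than an unhandled exception.

```python
import logging
import os
import shutil

# Hypothetical logger name, not taken from plague.
log = logging.getLogger("repo_copy_sketch")


def copy_all(pairs):
    """Copy each (src, dst) pair and keep going when one copy fails.

    Returns a list of (src, dst, exception) tuples for the failed items.
    This is an illustrative helper, not code from plague's Repo.py.
    """
    failed = []
    for src, dst in pairs:
        # Record the access check result so it can be logged next to the error.
        readable = os.access(src, os.R_OK)
        try:
            # shutil.copy() is copyfile() followed by copymode(); doing the
            # two steps explicitly makes the failure point easier to spot.
            shutil.copyfile(src, dst)
            shutil.copymode(src, dst)
        except OSError as exc:
            # On Python 3, IOError is an alias of OSError. On the Python 2
            # of 2007 both had to be caught: except (OSError, IOError).
            log.error("copy %s -> %s failed (readable before copy: %s): %s",
                      src, dst, readable, exc)
            failed.append((src, dst, exc))
    return failed
```

The caller can then decide what to do with the returned list, for example fail only the affected job while the lock is still released normally.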
