On 11/22/2015 12:47 AM, Tom Lane wrote:
> Andrew Dunstan <[email protected]> writes:
>> I have just released version 4.16 of the PostgreSQL Buildfarm client
>
> I updated my critters to 4.16, and since nothing much was happening in
> git, decided to test by doing "run_build.pl --nosend --verbose --force"
> manually on prairiedog. That run went fine, but the cron job firing
> run_branches.pl every few minutes was still live, and one of its runs
> went a tad nuts even though nothing was happening in git:
>
> Buildfarm member prairiedog failed on REL9_3_STABLE stage pgsql-Git
> Buildfarm member prairiedog failed on REL9_4_STABLE stage pgsql-Git
> Buildfarm member prairiedog failed on REL9_5_STABLE stage pgsql-Git
>
> That resulted in these reports uploaded to the server:
>
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2015-11-21%2018%3A27%3A29
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2015-11-21%2018%3A27%3A19
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2015-11-21%2018%3A26%3A12
>
> which contain the following failure reports, respectively:
>
> Missing checked out branch bf_REL9_5_STABLE:
> fatal: Not a git repository (or any of the parent directories): .git
>
> Missing checked out branch bf_REL9_4_STABLE:
> fatal: Not a git repository (or any of the parent directories): .git
>
> fatal: unable to read tree 719b1b413b507d0fc86162f6aa45b6e44e6d82a1
> Cannot rebase: Your index contains uncommitted changes.
> Please commit or stash them.
>
> None of that makes any possible sense, because I certainly wasn't touching
> the git tree by hand, and the run_build job was only touching HEAD.
> There's nothing really broken on the machine, because the next set of
> runs went through fine.
>
> Don't know what to make of this, except that probably the buildfarm
> script's concurrent-job interlocks need some attention.

Oh, ouch. Well, that message comes from us just doing "git branch" to
sanity check what branch we're on, and that happens before anything that
was changed in this release. I had assumed, possibly naively, that git
would lock against itself. Maybe not with multiple workdirs. This only
matters if you're using git_use_workdirs, like you are, since otherwise
the git repos are totally independent, and run_build is definitely
locked against itself on a given branch. I'll look at adding a global
wait lock, just while git checkout is running, to cover this case. In
normal operation we don't expect this to occur, since run_branches.pl
just runs branches one at a time, so I don't think we need to put out an
emergency fix, but you've uncovered a corner case that all my testing
has missed.
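
For illustration only, a global wait lock of the kind described above could look roughly like the sketch below. It is written in Python rather than the buildfarm client's Perl, it is not the actual fix, and the names global_wait_lock and checkout_branch as well as the lock file path are hypothetical. The idea is just that every job takes one shared, blocking lock for the duration of "git checkout", so jobs sharing a repository through workdirs cannot run that step at the same time.

```python
import fcntl
import os
import subprocess
from contextlib import contextmanager


@contextmanager
def global_wait_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block.

    Hypothetical helper: one lock file shared by all branches, so it is
    "global" across concurrent jobs on the same machine.
    """
    # Create the lock file if needed; its contents are irrelevant.
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_EX without LOCK_NB blocks until any other holder releases,
        # i.e. this waits instead of failing.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        # Closing the descriptor releases the lock.
        os.close(fd)


def checkout_branch(workdir, branch, lock_path):
    """Run 'git checkout <branch>' in workdir while holding the global lock."""
    with global_wait_lock(lock_path):
        subprocess.run(["git", "checkout", branch], cwd=workdir, check=True)
```

The lock is held only around the checkout, not the whole build, so a manual run and a cron-driven run would serialize on the git step and otherwise proceed independently. flock is advisory and Unix-specific; a real implementation would need to suit every platform the client supports.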
Thanks for the report.
cheers
andrew
--
Sent via pgsql-hackers mailing list ([email protected])