Hello everyone,

I'm cross-posting to the Solr and Lucene dev lists because we share the
ASF Jenkins boxes that run our Jenkins workflows.

There has been a series of odd exceptions from Jenkins, looking pretty much
like this:

...
> git reset --hard # timeout=10
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from https://github.com/apache/lucene.git
...
Caused by: hudson.plugins.git.GitException: Command "git reset --hard" returned status code 128:
stdout:
stderr: error: unable to read sha1 file of .asf.yaml (48083930a50d886827c1f87e23875e5da98c58c6)
error: unable to read sha1 file of .dir-locals.el (c51e1232603b85b8bc74fbed7d1de08186920379)
error: unable to read sha1 file of .git-blame-ignore-revs (945e687d0f3cdfcc99cb69b8b35ab631a70371a7)
error: unable to read sha1 file of .gitattributes (a3135003e80fa8f49fc0f2250f40b85cde12ebc5)
...

I've ssh'd into lucene3 and lucene4, and it seems that something leaves the
.git folder of a workspace checkout in a broken state, which makes every
subsequent git reset fail. I'm not sure whether it's a Jenkins task timeout
kicking in or something else.
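
A rough sketch of how one could look for checkouts in this state, in case
it is useful to anyone. This is not what was run on the boxes: the function
name is hypothetical, it assumes git is on the PATH, and it treats any
non-zero exit code from `git fsck` as "broken".

```python
import subprocess
from pathlib import Path


def find_broken_checkouts(workspace_root):
    """Return checkout directories under workspace_root where `git fsck` fails.

    Assumption: any directory that contains a `.git` directory is a checkout.
    `.git` files (as used by submodules and worktrees) are skipped.
    """
    broken = []
    for git_dir in sorted(Path(workspace_root).rglob(".git")):
        if not git_dir.is_dir():
            continue
        checkout = git_dir.parent
        # `git fsck` verifies the object database; unreadable or missing
        # objects make it exit with a non-zero status.
        result = subprocess.run(
            ["git", "-C", str(checkout), "fsck", "--no-dangling"],
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            broken.append(checkout)
    return broken
```

Calling find_broken_checkouts() on a workspace root returns the list of
suspicious checkouts, which could then be wiped individually instead of
removing every workspace.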

There seems to be enough disk space on both boxes, although they're close
to full. Judging by similar reports in the ASF INFRA Jira, this type of
exception has always been caused by disk space running low, so for now I've:

- removed all existing workspaces under /home/jenkins/jenkins-agent/workspace/Lucene
- removed all existing workspaces under /home/jenkins/jenkins-agent/workspace/Solr

I didn't know how to pause Jenkins for this, so apologies if you get an
error from a job that was running when I removed those folders.

Oddly enough, I can't see the same error appearing on Solr Jenkins runs.
It could be that all the jobs there have a "wipe workspace before you start"
checkbox enabled. I'm really not sure, sorry.

Right now, df -h shows:

Host     Filesystem  Size  Used  Avail  Use%  Mounted on
lucene3  /dev/sda2   503G   59G   419G   13%  /
lucene4  /dev/sda2   503G   99G   379G   21%  /
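
If low disk space does turn out to be the trigger, a small check along
these lines could warn before a box fills up again. It is only a sketch:
the function names and the 10% threshold are assumptions for illustration,
not anything ASF infra uses.

```python
import shutil


def free_fraction(path="/"):
    """Return the fraction of the filesystem holding `path` that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def is_low_on_space(path="/", min_free_fraction=0.10):
    """Return True if less than min_free_fraction of the filesystem is free.

    min_free_fraction is an assumed threshold chosen for illustration.
    """
    return free_fraction(path) < min_free_fraction
```

For example, is_low_on_space("/home/jenkins/jenkins-agent/workspace") would
report on the filesystem that holds the workspaces.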

Let's see if this fixes the problem.

Dawid
