This job finally finished.

Looks like the full-disk problem was triggered while writing logs out - the 
following stack trace is recorded in consoleText 1,478 times:

-----
   [junit4] java.io.IOException: No space left on device
   [junit4]     at java.io.RandomAccessFile.writeBytes(Native Method)
   [junit4]     at java.io.RandomAccessFile.write(RandomAccessFile.java:525)
   [junit4]     at com.carrotsearch.ant.tasks.junit4.LocalSlaveStreamHandler$1.write(LocalSlaveStreamHandler.java:74)
   [junit4]     at com.carrotsearch.ant.tasks.junit4.events.AppendStdErrEvent.copyTo(AppendStdErrEvent.java:24)
   [junit4]     at com.carrotsearch.ant.tasks.junit4.LocalSlaveStreamHandler.pumpEvents(LocalSlaveStreamHandler.java:252)
   [junit4]     at com.carrotsearch.ant.tasks.junit4.LocalSlaveStreamHandler$2.run(LocalSlaveStreamHandler.java:122)
-----
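
(For reference, that count can be reproduced with a plain grep against the 
console log; fetching it via Jenkins' consoleText URL, as below, is just one 
way to get it - any saved copy of the log works too:)

-----
# Fetch the console log for the build from this thread and count the error lines.
curl -s https://builds.apache.org/job/Lucene-Solr-NightlyTests-master/1100/consoleText \
  | grep -c 'java.io.IOException: No space left on device'
-----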

I ssh’d into lucene1-us-west.apache.org, where the lucene Jenkins slave is 
hosted, to look at the disk space situation.

-----
jenkins@lucene1-us-west:~$ df -k .
Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/sdb1      139204584 90449280  42237900  69% /x1
jenkins@lucene1-us-west:~$ df -k
Filesystem                             1K-blocks     Used Available Use% Mounted on
/dev/mapper/lucene1--us--west--vg-root  30582652 23554540   5451564  82% /
[…]
/dev/sdb1                              139204584 90449280  42237900  69% /x1
-----

All Jenkins workspaces are under /x1/jenkins/.

Separately (I think), I see that Uwe has the enwiki.random.lines.txt file 
checked out multiple times (it looks like once per job, of which there are 
currently 17, though I doubt all of them need this file), and each copy takes 
up 3GB:

-----
jenkins@lucene1-us-west:~/jenkins-slave$ ls -l workspace/*/test-data
workspace/Lucene-Solr-NightlyTests-6.x/test-data:
total 2966980
-rw-r--r-- 1 jenkins jenkins 3038178822 Aug 16 03:18 enwiki.random.lines.txt
-rw-r--r-- 1 jenkins jenkins        452 Aug 16 03:18 README.txt

workspace/Lucene-Solr-NightlyTests-master/test-data:
total 2966980
-rw-r--r-- 1 jenkins jenkins 3038178822 Aug 15 22:27 enwiki.random.lines.txt
-rw-r--r-- 1 jenkins jenkins        452 Aug 15 22:27 README.txt
-----

Uwe, is there any way we can just have one copy shared by all jobs?
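
(One possible approach, sketched below and untested: keep a single copy in a 
shared directory and symlink it into each workspace's test-data dir. The 
/x1/jenkins/shared/test-data location is made up, and this assumes the nightly 
jobs are happy following a symlink there.)

-----
# Move one existing copy to a shared location...
mkdir -p /x1/jenkins/shared/test-data
mv workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt \
   /x1/jenkins/shared/test-data/

# ...then replace each per-job copy with a symlink to it.
for ws in workspace/Lucene-Solr-NightlyTests-*; do
  rm -f "$ws/test-data/enwiki.random.lines.txt"
  ln -s /x1/jenkins/shared/test-data/enwiki.random.lines.txt \
        "$ws/test-data/enwiki.random.lines.txt"
done
-----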

Here are the disk footprints by job:

-----
jenkins@lucene1-us-west:~/jenkins-slave/workspace$ du -sh /x1/jenkins/jenkins-slave/workspace/*
28K     /x1/jenkins/jenkins-slave/workspace/infra-test-ant-ubuntu
44K     /x1/jenkins/jenkins-slave/workspace/infra-test-maven-ubuntu
971M    /x1/jenkins/jenkins-slave/workspace/Lucene-Artifacts-6.x
968M    /x1/jenkins/jenkins-slave/workspace/Lucene-Artifacts-master
368M    /x1/jenkins/jenkins-slave/workspace/Lucene-Ivy-Bootstrap
6.5G    /x1/jenkins/jenkins-slave/workspace/Lucene-Solr-Clover-master
1.7G    /x1/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-6.x
1.7G    /x1/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-master
6.5G    /x1/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-6.x
56G     /x1/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master
2.0G    /x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-6.x
2.0G    /x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master
1.2G    /x1/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-6.x
1.6G    /x1/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master
468M    /x1/jenkins/jenkins-slave/workspace/Lucene-Tests-MMAP-master
1.7G    /x1/jenkins/jenkins-slave/workspace/Solr-Artifacts-6.x
1.7G    /x1/jenkins/jenkins-slave/workspace/Solr-Artifacts-master
-----

Turns out there is a single *45GB* file in the job with the largest disk 
footprint (also the job that started this thread) - under 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/: 

  solr/build/solr-core/test/temp/junit4-J2-20160817_095505_593.events
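
(In case it's useful for future hunting, a find along these lines turns up the 
big offenders quickly; the 10GB threshold is arbitrary:)

-----
# List any file over 10GB under the workspaces, with sizes.
find /x1/jenkins/jenkins-slave/workspace -type f -size +10G -exec ls -lh {} +
-----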

Does anybody know if we can limit the size of these *.events files, which seem 
to be created under OOM conditions?
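
(I don't know of a junit4 switch for capping them; as a stopgap, something 
along these lines could be cron'd on the slave to sweep up old ones - the 
one-day cutoff is arbitrary, and it assumes finished builds no longer need 
these files:)

-----
# Delete junit4 events files more than a day old (cutoff is arbitrary;
# assumes completed builds don't need them anymore).
find /x1/jenkins/jenkins-slave/workspace -name 'junit4-*.events' -mtime +1 -delete
-----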

I ran ‘rm -rf solr/build’ to reclaim the disk space.

--
Steve
www.lucidworks.com

> On Aug 17, 2016, at 5:49 PM, Kevin Risden <[email protected]> wrote:
> 
> Usually the build takes 5-6 hours and now it's been ~14hrs.
> 
> https://builds.apache.org/job/Lucene-Solr-NightlyTests-master/1100
> 
> I saw in the console logs:
> 
> java.security.PrivilegedActionException: java.io.IOException: No space left 
> on device
> 
> Looks like it might be stuck here:
> 
> Archiving artifacts
> 
> Not sure if there is something that can be done about this?
> 
> Kevin Risden


