TestDataNodeHotSwapVolumes, TestDataNodeVolumeFailure, TestDataNodeVolumeFailureReporting, and TestDataNodeVolumeFailureToleration all remove executable permissions from directories like the one Colin mentioned to simulate disk failures at data nodes. I reviewed the code for all of those, and they all appear to be doing the necessary work to restore executable permissions at the end of the test. The only recent uncommitted patch I've seen that makes changes in these test suites is HDFS-7722. That patch still looks fine though. I don't know if there are other uncommitted patches that changed these test suites.
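For illustration, here is a minimal sketch of the revoke-then-restore pattern described above. It is not the actual test code: the class name, the `withSimulatedFailure` helper, and the use of plain `java.io.File` on a temporary directory are all assumptions made for the example.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class VolumeFailureSketch {

    // Hypothetical helper (not from the Hadoop test suites): removes the
    // executable bit from a directory to simulate a failed volume, runs the
    // test body, and restores the bit even if the body throws.
    static void withSimulatedFailure(File dir, Runnable body) {
        dir.setExecutable(false);
        try {
            body.run();
        } finally {
            // Restore permissions so later cleanup can delete the directory.
            dir.setExecutable(true);
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for a data node volume directory.
        File dir = Files.createTempDirectory("data3").toFile();
        try {
            withSimulatedFailure(dir, () -> {
                throw new RuntimeException("simulated test failure");
            });
        } catch (RuntimeException expected) {
            // The body failed, but the finally block has already run.
        }
        System.out.println("executable after test: " + dir.canExecute());
        dir.delete();
    }
}
```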
I suppose it's also possible that the JUnit process unexpectedly died after removing executable permissions but before restoring them. That always would have been a weakness of these test suites, regardless of any recent changes.

Chris Nauroth
Hortonworks
http://hortonworks.com/

On 3/10/15, 1:47 PM, "Aaron T. Myers" <a...@cloudera.com> wrote:

>Hey Colin,
>
>I asked Andrew Bayer, who works with Apache Infra, what's going on with
>these boxes. He took a look and concluded that some perms are being set in
>those directories by our unit tests which are precluding those files from
>getting deleted. He's going to clean up the boxes for us, but we should
>expect this to keep happening until we can fix the test in question to
>properly clean up after itself.
>
>To help narrow down which commit it was that started this, Andrew sent me
>this info:
>
>"/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-
>Build/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data3/ has
>500 perms, so I'm guessing that's the problem. Been that way since 9:32
>UTC
>on March 5th."
>
>--
>Aaron T. Myers
>Software Engineer, Cloudera
>
>On Tue, Mar 10, 2015 at 1:24 PM, Colin P. McCabe <cmcc...@apache.org>
>wrote:
>
>> Hi all,
>>
>> A very quick (and not thorough) survey shows that I can't find any
>> jenkins jobs that succeeded from the last 24 hours. Most of them seem
>> to be failing with some variant of this message:
>>
>> [ERROR] Failed to execute goal
>> org.apache.maven.plugins:maven-clean-plugin:2.5:clean (default-clean)
>> on project hadoop-hdfs: Failed to clean project: Failed to delete
>>
>>
>>/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/hadoop-hdfs-pr
>>oject/hadoop-hdfs/target/test/data/dfs/data/data3
>> -> [Help 1]
>>
>> Any ideas how this happened? Bad disk, unit test setting wrong
>> permissions?
>>
>> Colin
>>