Sorry for the resend. I figured this deserves a [DISCUSS] flag.
On Sat, Jun 6, 2015 at 10:39 PM, Sean Busbey <bus...@cloudera.com> wrote:

> Hi Folks!
>
> After working on test-patch with other folks for the last few months, I
> think we've reached the point where we can make the fastest progress
> towards the goal of a general use pre-commit patch tester by spinning
> things into a project focused on just that. I think we have a mature enough
> code base and a sufficient fledgling community, so I'm going to put
> together a tlp proposal.
>
> Thanks for the feedback thus far from use within Hadoop. I hope we can
> continue to make things more useful.
>
> -Sean
>
> On Wed, Mar 11, 2015 at 5:16 PM, Sean Busbey <bus...@cloudera.com> wrote:
>
>> HBase's dev-support folder is where the scripts and support files live.
>> We've only recently started adding anything to the maven builds that's
>> specific to jenkins[1]; so far it's diagnostic stuff, but that's where I'd
>> add in more if we ran into the same permissions problems y'all are having.
>>
>> There's also our precommit job itself, though it isn't large[2]. AFAIK,
>> we don't properly back this up anywhere, we just notify each other of
>> changes on a particular mail thread[3].
>>
>> [1]: https://github.com/apache/hbase/blob/master/pom.xml#L1687
>> [2]: https://builds.apache.org/job/PreCommit-HBASE-Build/ (they're all
>> red because I just finished fixing "mvn site" running out of permgen)
>> [3]: http://s.apache.org/NT0
>>
>> On Wed, Mar 11, 2015 at 4:51 PM, Chris Nauroth <cnaur...@hortonworks.com>
>> wrote:
>>
>>> Sure, thanks Sean! Do we just look in the dev-support folder in the
>>> HBase repo? Is there any additional context we need to be aware of?
>>>
>>> Chris Nauroth
>>> Hortonworks
>>> http://hortonworks.com/
>>>
>>> On 3/11/15, 2:44 PM, "Sean Busbey" <bus...@cloudera.com> wrote:
>>>
>>> >+dev@hbase
>>> >
>>> >HBase has recently been cleaning up our precommit jenkins jobs to make
>>> >them more robust.
>>> >From what I can tell our stuff started off as an earlier
>>> >version of what Hadoop uses for testing.
>>> >
>>> >Folks on either side open to an experiment of combining our precommit
>>> >check tooling? In principle we should be looking for the same kinds of
>>> >things.
>>> >
>>> >Naturally we'll still need different jenkins jobs to handle different
>>> >resource needs and we'd need to figure out where stuff eventually lives,
>>> >but that could come later.
>>> >
>>> >On Wed, Mar 11, 2015 at 4:34 PM, Chris Nauroth <cnaur...@hortonworks.com>
>>> >wrote:
>>> >
>>> >> The only thing I'm aware of is the failOnError option:
>>> >>
>>> >> http://maven.apache.org/plugins/maven-clean-plugin/examples/ignoring-errors.html
>>> >>
>>> >> I prefer that we don't disable this, because ignoring different kinds of
>>> >> failures could leave our build directories in an indeterminate state. For
>>> >> example, we could end up with an old class file on the classpath for test
>>> >> runs that was supposedly deleted.
>>> >>
>>> >> I think it's worth exploring Eddy's suggestion to try simulating failure
>>> >> by placing a file where the code expects to see a directory. That might
>>> >> even let us enable some of these tests that are skipped on Windows,
>>> >> because Windows allows access for the owner even after permissions have
>>> >> been stripped.
>>> >>
>>> >> Chris Nauroth
>>> >> Hortonworks
>>> >> http://hortonworks.com/
>>> >>
>>> >> On 3/11/15, 2:10 PM, "Colin McCabe" <cmcc...@alumni.cmu.edu> wrote:
>>> >>
>>> >> >Is there a maven plugin or setting we can use to simply remove
>>> >> >directories that have no executable permissions on them? Clearly we
>>> >> >have the permission to do this from a technical point of view (since
>>> >> >we created the directories as the jenkins user), it's simply that the
>>> >> >code refuses to do it.
>>> >> >
>>> >> >Otherwise I guess we can just fix those tests...
>>> >> >
>>> >> >Colin
>>> >> >
>>> >> >On Tue, Mar 10, 2015 at 2:43 PM, Lei Xu <l...@cloudera.com> wrote:
>>> >> >> Thanks a lot for looking into HDFS-7722, Chris.
>>> >> >>
>>> >> >> In HDFS-7722:
>>> >> >> TestDataNodeVolumeFailureXXX tests reset data dir permissions in
>>> >> >> TearDown().
>>> >> >> TestDataNodeHotSwapVolumes resets permissions in a finally clause.
>>> >> >>
>>> >> >> Also I ran mvn test several times on my machine and all tests passed.
>>> >> >>
>>> >> >> However, since DiskChecker#checkDirAccess() does the following:
>>> >> >>
>>> >> >> private static void checkDirAccess(File dir) throws DiskErrorException {
>>> >> >>   if (!dir.isDirectory()) {
>>> >> >>     throw new DiskErrorException("Not a directory: "
>>> >> >>         + dir.toString());
>>> >> >>   }
>>> >> >>
>>> >> >>   checkAccessByFileMethods(dir);
>>> >> >> }
>>> >> >>
>>> >> >> one potentially safer alternative is replacing the data dir with a
>>> >> >> regular file to simulate disk failures.
>>> >> >>
>>> >> >> On Tue, Mar 10, 2015 at 2:19 PM, Chris Nauroth
>>> >> >> <cnaur...@hortonworks.com> wrote:
>>> >> >>> TestDataNodeHotSwapVolumes, TestDataNodeVolumeFailure,
>>> >> >>> TestDataNodeVolumeFailureReporting, and
>>> >> >>> TestDataNodeVolumeFailureToleration all remove executable permissions
>>> >> >>> from directories like the one Colin mentioned to simulate disk failures
>>> >> >>> at data nodes. I reviewed the code for all of those, and they all appear
>>> >> >>> to be doing the necessary work to restore executable permissions at the
>>> >> >>> end of the test. The only recent uncommitted patch I've seen that makes
>>> >> >>> changes in these test suites is HDFS-7722. That patch still looks fine
>>> >> >>> though. I don't know if there are other uncommitted patches that changed
>>> >> >>> these test suites.
>>> >> >>>
>>> >> >>> I suppose it's also possible that the JUnit process unexpectedly died
>>> >> >>> after removing executable permissions but before restoring them. That
>>> >> >>> always would have been a weakness of these test suites, regardless of
>>> >> >>> any recent changes.
>>> >> >>>
>>> >> >>> Chris Nauroth
>>> >> >>> Hortonworks
>>> >> >>> http://hortonworks.com/
>>> >> >>>
>>> >> >>> On 3/10/15, 1:47 PM, "Aaron T. Myers" <a...@cloudera.com> wrote:
>>> >> >>>
>>> >> >>>>Hey Colin,
>>> >> >>>>
>>> >> >>>>I asked Andrew Bayer, who works with Apache Infra, what's going on with
>>> >> >>>>these boxes. He took a look and concluded that some perms are being
>>> >> >>>>set in those directories by our unit tests which are precluding those
>>> >> >>>>files from getting deleted. He's going to clean up the boxes for us, but
>>> >> >>>>we should expect this to keep happening until we can fix the test in
>>> >> >>>>question to properly clean up after itself.
>>> >> >>>>
>>> >> >>>>To help narrow down which commit it was that started this, Andrew sent
>>> >> >>>>me this info:
>>> >> >>>>
>>> >> >>>>"/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data3/
>>> >> >>>>has 500 perms, so I'm guessing that's the problem. Been that way since
>>> >> >>>>9:32 UTC on March 5th."
>>> >> >>>>
>>> >> >>>>--
>>> >> >>>>Aaron T. Myers
>>> >> >>>>Software Engineer, Cloudera
>>> >> >>>>
>>> >> >>>>On Tue, Mar 10, 2015 at 1:24 PM, Colin P. McCabe <cmcc...@apache.org>
>>> >> >>>>wrote:
>>> >> >>>>
>>> >> >>>>> Hi all,
>>> >> >>>>>
>>> >> >>>>> A very quick (and not thorough) survey shows that I can't find any
>>> >> >>>>> jenkins jobs that succeeded from the last 24 hours.
>>> >> >>>>> Most of them seem
>>> >> >>>>> to be failing with some variant of this message:
>>> >> >>>>>
>>> >> >>>>> [ERROR] Failed to execute goal
>>> >> >>>>> org.apache.maven.plugins:maven-clean-plugin:2.5:clean (default-clean)
>>> >> >>>>> on project hadoop-hdfs: Failed to clean project: Failed to delete
>>> >> >>>>> /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data3
>>> >> >>>>> -> [Help 1]
>>> >> >>>>>
>>> >> >>>>> Any ideas how this happened? Bad disk, unit test setting wrong
>>> >> >>>>> permissions?
>>> >> >>>>>
>>> >> >>>>> Colin
>>> >> >>>>>
>>> >> >>
>>> >> >> --
>>> >> >> Lei (Eddy) Xu
>>> >> >> Software Engineer, Cloudera
>>> >
>>> >--
>>> >Sean
>>
>> --
>> Sean
>
> --
> Sean

--
Sean
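
For anyone picking up Eddy's suggestion from the quoted thread (simulate a failed volume by putting a regular file where a directory is expected, instead of stripping permissions), here is a minimal sketch of the idea. It is not taken from HDFS-7722 or any Hadoop test: the class name, the helper methods, and the looksHealthy() check are all hypothetical stand-ins, and it assumes the data directory is empty when it is swapped out.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not Hadoop code: simulate a failed data volume
// without ever changing permissions on anything under target/.
public class SimulatedVolumeFailure {

    // Stand-in for the isDirectory() test quoted from
    // DiskChecker#checkDirAccess; the real method does more than this.
    public static boolean looksHealthy(File dir) {
        return dir.isDirectory() && dir.canRead()
            && dir.canWrite() && dir.canExecute();
    }

    // Replace an EMPTY data directory with a regular file of the same name.
    // (Assumption: a real test would first have to move the contents aside.)
    public static void failVolume(Path dataDir) throws IOException {
        Files.delete(dataDir);      // throws if the directory is not empty
        Files.createFile(dataDir);  // same path is now a plain file
    }

    // Undo failVolume(): remove the placeholder file, recreate the directory.
    public static void restoreVolume(Path dataDir) throws IOException {
        Files.deleteIfExists(dataDir);
        Files.createDirectory(dataDir);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("volume-failure-sketch");
        Path dataDir = Files.createDirectory(root.resolve("data3"));
        try {
            System.out.println("before:  healthy=" + looksHealthy(dataDir.toFile()));
            failVolume(dataDir);
            System.out.println("failed:  healthy=" + looksHealthy(dataDir.toFile()));
        } finally {
            // Cleanup only deletes and recreates paths, so nothing is left
            // with stripped permissions behind it.
            restoreVolume(dataDir);
            Files.delete(dataDir);
            Files.delete(root);
        }
    }
}
```

The point of the design is the failure mode Chris describes: if the JVM dies between failVolume() and restoreVolume(), what is left behind is an ordinary file with default permissions, which a later clean should still be able to delete.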