Great work indeed! Agreed: occasional failed runs may not be that bad, but fairly regular failed runs ruin the idea of CI, especially for released or otherwise supposedly stable branches.
-Mikhail

On Mon, Sep 12, 2016 at 4:53 PM, Sean Busbey <bus...@cloudera.com> wrote:

> awesome work Appy!
>
> That's certainly good news to hear.
>
> On Mon, Sep 12, 2016 at 2:14 PM, Apekshit Sharma <a...@cloudera.com> wrote:
>
> > On a separate note:
> > Trunk had 8 green runs in last 3 days! (https://builds.apache.org/job/HBase-Trunk_matrix/)
> > This was due to fixing just the mass failures on trunk and no change in
> > flaky infra. Which made me to conclude two things:
> > 1. Flaky infra works.
> > 2. It relies heavily on the post-commit build's stability (which every
> > project should anyways strive for). If the build fails catastrophically
> > once in a while, we can just exclude that one run using a flag and
> > everything will work, but if it happens frequently, then it won't work
> > right.
> >
> > I have re-enabled Flaky tests job
> > (https://builds.apache.org/view/H-L/view/HBase/job/HBASE-Flaky-Tests/) which
> > was disabled for almost a month due to trunk being on fire.
> > I will keep an eye on how things are going.
> >
> > On Mon, Sep 12, 2016 at 2:02 PM, Apekshit Sharma <a...@cloudera.com> wrote:
> >
> >> @Sean, Mikhail: I found the alternate solution. Using user defined axis,
> >> tool environment and env variable injection.
> >> See latest diff to https://builds.apache.org/job/HBase-Trunk_matrix/ job
> >> for reference.
> >>
> >> On Tue, Aug 30, 2016 at 7:39 PM, Mikhail Antonov <olorinb...@gmail.com> wrote:
> >>
> >>> FYI, I did the same for branch-1.3 builds. I've disabled hbase-1.3 and
> >>> hbase-1.3-IT jobs and instead created
> >>>
> >>> https://builds.apache.org/job/HBase-1.3-JDK8 and
> >>> https://builds.apache.org/job/HBase-1.3-JDK7
> >>>
> >>> This should work for now until we figure out how to move forward.
> >>>
> >>> Thanks,
> >>> Mikhail
> >>>
> >>> On Wed, Aug 17, 2016 at 1:41 PM, Sean Busbey <bus...@cloudera.com> wrote:
> >>>
> >>> > /me smacks forehead
> >>> >
> >>> > these replacement jobs, of course, also have special characters in
> >>> > their names which then show up in the working path.
> >>> >
> >>> > renaming them to skip spaces and parens.
> >>> >
> >>> > On Wed, Aug 17, 2016 at 1:34 PM, Sean Busbey <sean.bus...@gmail.com> wrote:
> >>> > > FYI, it looks like essentially our entire CI suite is red, probably due to
> >>> > > parts of our codebase not tolerating spaces or other special characters in
> >>> > > the working directory.
> >>> > >
> >>> > > I've made a stop-gap non-multi-configuration set of jobs for running unit
> >>> > > tests for the 1.2 branch against JDK 7 and JDK 8:
> >>> > >
> >>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%201.2%20(JDK%201.7)/
> >>> > >
> >>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%201.2%20(JDK%201.8)/
> >>> > >
> >>> > > Due to the lack of response from infra@ I suspect our only options for
> >>> > > continuing on ASF infra is to fix whatever part of our build doesn't
> >>> > > tolerate the new paths, or stop using multiconfiguration deployments. I am
> >>> > > obviously less than thrilled at the idea of having several multiples of
> >>> > > current jobs.
> >>> > >
> >>> > > On Wed, Aug 10, 2016 at 6:28 PM, Sean Busbey <bus...@cloudera.com> wrote:
> >>> > >
> >>> > >> Ugh.
> >>> > >>
> >>> > >> I sent a reply to Gav on builds@ about maybe getting names that don't
> >>> > >> have spaces in them:
> >>> > >>
> >>> > >> https://lists.apache.org/thread.html/8ac03dc62f9d6862d4f3d5eb37119c9c73b4059aaa3ebba52fc63bb6@%3Cbuilds.apache.org%3E
> >>> > >>
> >>> > >> In the mean time, is this an issue we need file with Hadoop or
> >>> > >> something we need to fix in our own code?
> >>> > >>
> >>> > >> On Wed, Aug 10, 2016 at 6:04 PM, Matteo Bertozzi <theo.berto...@gmail.com> wrote:
> >>> > >> > There are a bunch of builds that have most of the test failing.
> >>> > >> >
> >>> > >> > Example:
> >>> > >> > https://builds.apache.org/job/HBase-Trunk_matrix/1392/jdk=JDK%201.7%20(latest),label=yahoo-not-h2/testReport/junit/org.apache.hadoop.hbase/TestLocalHBaseCluster/testLocalHBaseCluster/
> >>> > >> >
> >>> > >> > from the stack trace looks like the problem is with the jdk name that has
> >>> > >> > spaces:
> >>> > >> > the hadoop FsVolumeImpl calls setNameFormat(... + fileName.toString() + ...)
> >>> > >> > and this seems to not be escaped
> >>> > >> > so we end up with JDK%25201.7%2520(latest) in the string format and we get
> >>> > >> > a IllegalFormatPrecisionException: 7
> >>> > >> >
> >>> > >> > 2016-08-10 22:07:46,108 WARN [DataNode:
> >>> > >> > [[[DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-9c88f385e6f1/dfs/data/data1/,
> >>> > >> > [DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-9c88f385e6f1/dfs/data/data2/]]
> >>> > >> > heartbeating to localhost/127.0.0.1:34629]
> >>> > >> > datanode.BPServiceActor(831): Unexpected exception in block pool Block
> >>> > >> > pool <registering> (Datanode Uuid unassigned) service to
> >>> > >> > localhost/127.0.0.1:34629
> >>> > >> > java.util.IllegalFormatPrecisionException: 7
> >>> > >> >         at java.util.Formatter$FormatSpecifier.checkText(Formatter.java:2984)
> >>> > >> >         at java.util.Formatter$FormatSpecifier.<init>(Formatter.java:2688)
> >>> > >> >         at java.util.Formatter.parse(Formatter.java:2528)
> >>> > >> >         at java.util.Formatter.format(Formatter.java:2469)
> >>> > >> >         at java.util.Formatter.format(Formatter.java:2423)
> >>> > >> >         at java.lang.String.format(String.java:2792)
> >>> > >> >         at com.google.common.util.concurrent.ThreadFactoryBuilder.setNameFormat(ThreadFactoryBuilder.java:68)
> >>> > >> >         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.initializeCacheExecutor(FsVolumeImpl.java:140)
> >>> > >> >
> >>> > >> > Matteo
> >>> > >> >
> >>> > >> > On Tue, Aug 9, 2016 at 9:55 AM, Stack <st...@duboce.net> wrote:
> >>> > >> >
> >>> > >> >> Good on you Sean.
> >>> > >> >> S
> >>> > >> >>
> >>> > >> >> On Mon, Aug 8, 2016 at 9:43 PM, Sean Busbey <bus...@apache.org> wrote:
> >>> > >> >>
> >>> > >> >> > I updated all of our jobs to use the updated JDK versions from infra.
> >>> > >> >> > These have spaces in the names, and those names end up in our
> >>> > >> >> > workspace path, so try to keep an eye out.
> >>> > >> >> >
> >>> > >> >> > On Mon, Aug 8, 2016 at 10:42 AM, Sean Busbey <bus...@cloudera.com> wrote:
> >>> > >> >> > > running in docker is the default now. relying on the default docker
> >>> > >> >> > > image that comes with Yetus means that our protoc checks are
> >>> > >> >> > > failing[1].
> >>> > >> >> > >
> >>> > >> >> > > [1]: https://issues.apache.org/jira/browse/HBASE-16373
> >>> > >> >> > >
> >>> > >> >> > > On Sat, Aug 6, 2016 at 5:03 PM, Sean Busbey <bus...@apache.org> wrote:
> >>> > >> >> > >> Hi folks!
> >>> > >> >> > >>
> >>> > >> >> > >> this morning I merged the patch that updates us to Yetus 0.3.0[1] and
> >>> > >> >> > >> updated the precommit job appropriately. I also changed it to use one of
> >>> > >> >> > >> the Java versions post the puppet changes to asf build.
> >>> > >> >> > >>
> >>> > >> >> > >> The last three builds look normal (#2975 - #2977). I'm gonna try
> >>> > >> >> > >> running things in docker next. I'll email again when I make it the
> >>> > >> >> > >> default.
> >>> > >> >> > >>
> >>> > >> >> > >> [1]: https://issues.apache.org/jira/browse/HBASE-15882
> >>> > >> >> > >>
> >>> > >> >> > >> On 2016-06-16 10:43 (-0500), Sean Busbey <bus...@apache.org> wrote:
> >>> > >> >> > >>> FYI, today our precommit jobs started failing because our chosen jdk
> >>> > >> >> > >>> (1.7.0.79) disappeared (mentioned on HBASE-16032).
> >>> > >> >> > >>>
> >>> > >> >> > >>> Initially we were doing something wrong, namely directly referencing
> >>> > >> >> > >>> the jenkins build tools area without telling jenkins to give us an env
> >>> > >> >> > >>> variable that stated where the jdk is located. However, after
> >>> > >> >> > >>> attempting to switch to the appropriate tooling variable for jdk
> >>> > >> >> > >>> 1.7.0.79, I found that it didn't point to a place that worked.
> >>> > >> >> > >>>
> >>> > >> >> > >>> I've now updated the job to rely on the latest 1.7 jdk, which is
> >>> > >> >> > >>> currently 1.7.0.80. I don't know how often "latest" updates.
> >>> > >> >> > >>>
> >>> > >> >> > >>> Personally, I think this is a sign that we need to prioritize
> >>> > >> >> > >>> HBASE-15882 so that we can switch back to using Docker. I won't have
> >>> > >> >> > >>> time this week, so if anyone else does please pick up the ticket.
> >>> > >> >> > >>>
> >>> > >> >> > >>> On Thu, Mar 17, 2016 at 5:19 PM, Stack <st...@duboce.net> wrote:
> >>> > >> >> > >>> > Thanks Sean.
> >>> > >> >> > >>> > St.Ack
> >>> > >> >> > >>> >
> >>> > >> >> > >>> > On Wed, Mar 16, 2016 at 12:04 PM, Sean Busbey <bus...@cloudera.com> wrote:
> >>> > >> >> > >>> >
> >>> > >> >> > >>> >> FYI, I updated the precommit job today to specify that only compile time
> >>> > >> >> > >>> >> checks should be done against jdks other than the primary jdk7 instance.
> >>> > >> >> > >>> >>
> >>> > >> >> > >>> >> On Mon, Mar 7, 2016 at 8:43 PM, Sean Busbey <bus...@cloudera.com> wrote:
> >>> > >> >> > >>> >>
> >>> > >> >> > >>> >> > I tested things out, and while YETUS-297[1] is present the default runs
> >>> > >> >> > >>> >> > all plugins that can do multiple jdks against those available (jdk7 and
> >>> > >> >> > >>> >> > jdk8 in our case).
> >>> > >> >> > >>> >> >
> >>> > >> >> > >>> >> > We can configure things to only do a single run of unit tests. They'll be
> >>> > >> >> > >>> >> > against jdk7, since that is our default jdk. That fine by everyone? It'll
> >>> > >> >> > >>> >> > save ~1.5 hours on any build that hits hbase-server.
> >>> > >> >> > >>> >> >
> >>> > >> >> > >>> >> > On Mon, Mar 7, 2016 at 1:22 PM, Stack <st...@duboce.net> wrote:
> >>> > >> >> > >>> >> >
> >>> > >> >> > >>> >> >> Hurray!
> >>> > >> >> > >>> >> >>
> >>> > >> >> > >>> >> >> It looks like YETUS-96 is in there and we are only running on jdk build
> >>> > >> >> > >>> >> >> now, the default (but testing compile against both).... Will keep an
> >>> > >> >> > >>> >> >> eye.
> >>> > >> >> > >>> >> >>
> >>> > >> >> > >>> >> >> St.Ack
> >>> > >> >> > >>> >> >>
> >>> > >> >> > >>> >> >> On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey <bus...@cloudera.com> wrote:
> >>> > >> >> > >>> >> >>
> >>> > >> >> > >>> >> >> > FYI, I've just updated our precommit jobs to use the 0.2.0 release of Yetus
> >>> > >> >> > >>> >> >> > that came out today.
> >>> > >> >> > >>> >> >> >
> >>> > >> >> > >>> >> >> > After keeping an eye out for strangeness today I'll turn docker mode back
> >>> > >> >> > >>> >> >> > on by default tonight.
> >>> > >> >> > >>> >> >> >
> >>> > >> >> > >>> >> >> > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey <bus...@apache.org> wrote:
> >>> > >> >> > >>> >> >> >
> >>> > >> >> > >>> >> >> > > FYI, I added a new parameter to the precommit job:
> >>> > >> >> > >>> >> >> > >
> >>> > >> >> > >>> >> >> > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the apache/yetus
> >>> > >> >> > >>> >> >> > > repo rather than our chosen release
> >>> > >> >> > >>> >> >> > >
> >>> > >> >> > >>> >> >> > > It defaults to inactive, but can be used in manually-triggered runs to
> >>> > >> >> > >>> >> >> > > test a solution to a problem in the yetus library. At the moment, I'm
> >>> > >> >> > >>> >> >> > > using it to test a solution to default module ordering as seen in
> >>> > >> >> > >>> >> >> > > HBASE-15075.
> >>> > >> >> > >>> >> >> > >
> >>> > >> >> > >>> >> >> > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey <bus...@cloudera.com> wrote:
> >>> > >> >> > >>> >> >> > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for precommit tests)
> >>> > >> >> > >>> >> >> > > > and updated our jenkins precommit build to use it.
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > > Jenkins job has some explanation:
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > > https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > > Release note from HBASE-13525 does as well.
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > > The old job will stick around here for a couple of weeks, in case we need
> >>> > >> >> > >>> >> >> > > > to refer back to it:
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > > https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > > If something looks awry, please drop a note on HBASE-13525 while it remains
> >>> > >> >> > >>> >> >> > > > open (and make a new issue after).
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > > On Wed, Dec 2, 2015 at 3:22 PM, Stack <st...@duboce.net> wrote:
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > >> As part of my continuing advocacy of builds.apache.org and that their
> >>> > >> >> > >>> >> >> > > >> results are now worthy of our trust and nurture, here are some highlights
> >>> > >> >> > >>> >> >> > > >> from the last few days of builds:
> >>> > >> >> > >>> >> >> > > >>
> >>> > >> >> > >>> >> >> > > >> + hadoopqa is now finding zombies before the patch is committed.
> >>> > >> >> > >>> >> >> > > >> HBASE-14888 showed "-1 core tests. The patch failed these unit tests:" but
> >>> > >> >> > >>> >> >> > > >> didn't have any failed tests listed (I'm trying to see if I can do anything
> >>> > >> >> > >>> >> >> > > >> about this...). Running our little ./dev-tools/findHangingTests.py against
> >>> > >> >> > >>> >> >> > > >> the consoleText, it showed a hanging test. Running locally, I see same
> >>> > >> >> > >>> >> >> > > >> hang. This is before the patch landed.
> >>> > >> >> > >>> >> >> > > >> + Our branch runs are now near totally zombie and flakey free -- still some
> >>> > >> >> > >>> >> >> > > >> work to do -- but a recent patch that seemed harmless was causing a
> >>> > >> >> > >>> >> >> > > >> reliable flake fail in the backport to branch-1* confirmed by local runs.
> >>> > >> >> > >>> >> >> > > >> The flakeyness was plain to see up in builds.apache.org.
> >>> > >> >> > >>> >> >> > > >> + In the last few days I've committed a patch that included javadoc
> >>> > >> >> > >>> >> >> > > >> warnings even though hadoopqa said the patch introduced javadoc issues (I
> >>> > >> >> > >>> >> >> > > >> missed it). This messed up life for folks subsequently as their patches now
> >>> > >> >> > >>> >> >> > > >> reported javadoc issues....
> >>> > >> >> > >>> >> >> > > >>
> >>> > >> >> > >>> >> >> > > >> In short, I suggest that builds.apache.org is worth keeping an eye on,
> >>> > >> >> > >>> >> >> > > >> make sure you get a clean build out of hadoopqa before committing anything,
> >>> > >> >> > >>> >> >> > > >> and lets all work together to try and keep our builds blue: it'll save us
> >>> > >> >> > >>> >> >> > > >> all work in the long run.
> >>> > >> >> > >>> >> >> > > >>
> >>> > >> >> > >>> >> >> > > >> St.Ack
> >>> > >> >> > >>> >> >> > > >>
> >>> > >> >> > >>> >> >> > > >> On Tue, Nov 4, 2014 at 9:38 AM, Stack <st...@duboce.net> wrote:
> >>> > >> >> > >>> >> >> > > >>
> >>> > >> >> > >>> >> >> > > >> > Branch-1 and master have stabilized and now run mostly blue (give or take
> >>> > >> >> > >>> >> >> > > >> > the odd failure) [1][2]. Having a mostly blue branch-1 has helped us
> >>> > >> >> > >>> >> >> > > >> > identify at least one destabilizing commit in the last few days, maybe two;
> >>> > >> >> > >>> >> >> > > >> > this is as it should be (smile).
> >>> > >> >> > >>> >> >> > > >> >
> >>> > >> >> > >>> >> >> > > >> > Lets keep our builds blue. If you commit a patch, make sure subsequent
> >>> > >> >> > >>> >> >> > > >> > builds stay blue. You can subscribe to bui...@hbase.apache.org to get
> >>> > >> >> > >>> >> >> > > >> > notice of failures if not already subscribed.
> >>> > >> >> > >>> >> >> > > >> >
> >>> > >> >> > >>> >> >> > > >> > Thanks,
> >>> > >> >> > >>> >> >> > > >> > St.Ack
> >>> > >> >> > >>> >> >> > > >> >
> >>> > >> >> > >>> >> >> > > >> > 1. https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/
> >>> > >> >> > >>> >> >> > > >> > 2. https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/
> >>> > >> >> > >>> >> >> > > >> >
> >>> > >> >> > >>> >> >> > > >> > On Mon, Oct 13, 2014 at 4:41 PM, Stack <st...@duboce.net> wrote:
> >>> > >> >> > >>> >> >> > > >> >
> >>> > >> >> > >>> >> >> > > >> >> A few notes on testing.
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >> Too long to read, infra is more capable now and after some work, we are
> >>> > >> >> > >>> >> >> > > >> >> seeing branch-1 and trunk mostly running blue. Lets try and keep it this
> >>> > >> >> > >>> >> >> > > >> >> way going forward.
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >> Apache Infra has new, more capable hardware.
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >> A recent spurt of test fixing combined with more capable hardware seems
> >>> > >> >> > >>> >> >> > > >> >> to have gotten us to a new place; tests are mostly passing now on branch-1
> >>> > >> >> > >>> >> >> > > >> >> and master. Lets try and keep it this way and start to trust our test runs
> >>> > >> >> > >>> >> >> > > >> >> again. Just a few flakies remain. Lets try and nail them.
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >> Our tests now run in parallel with other test suites where previous we
> >>> > >> >> > >>> >> >> > > >> >> ran alone. You can see this sometimes when our zombie detector reports
> >>> > >> >> > >>> >> >> > > >> >> tests from another project altogether as lingerers (To be fixed). Some of
> >>> > >> >> > >>> >> >> > > >> >> our tests are failing because a concurrent hbase run is undoing classes and
> >>> > >> >> > >>> >> >> > > >> >> data from under it. Also, lets fix.
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >> Our tests are brittle. It takes 75minutes for them to complete. Many are
> >>> > >> >> > >>> >> >> > > >> >> heavy-duty integration tests starting up multiple clusters and mapreduce
> >>> > >> >> > >>> >> >> > > >> >> all in the one JVM. It is a miracle they pass at all. Usually integration
> >>> > >> >> > >>> >> >> > > >> >> tests have been cast as unit tests because there was no where else for them
> >>> > >> >> > >>> >> >> > > >> >> to get an airing. We have the hbase-it suite now which would be a more apt
> >>> > >> >> > >>> >> >> > > >> >> place but until these are run on a regular basis in public for all to see,
> >>> > >> >> > >>> >> >> > > >> >> the fat integration tests disguised as unit tests will remain. A review of
> >>> > >> >> > >>> >> >> > > >> >> our current unit tests weeding the old cruft and the no longer relevant or
> >>> > >> >> > >>> >> >> > > >> >> duplicates would be a nice undertaking if someone is looking to contribute.
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >> Alex Newman has been working on making our tests work up on travis and
> >>> > >> >> > >>> >> >> > > >> >> circle-ci. That'll be sweet when it goes end-to-end. He also added in
> >>> > >> >> > >>> >> >> > > >> >> some "type" categorizations -- client, filter, mapreduce -- alongside our
> >>> > >> >> > >>> >> >> > > >> >> old "sizing" categorizations of small/medium/large.
> >>> > >> >> > >>> >> >> > > >> >> His thinking is that
> >>> > >> >> > >>> >> >> > > >> >> we can run these categorizations in parallel so we could run the total
> >>> > >> >> > >>> >> >> > > >> >> suite in about the time of the longest test, say 20-30minutes? We could
> >>> > >> >> > >>> >> >> > > >> >> even change Apache to run them this way.
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >> >> FYI,
> >>> > >> >> > >>> >> >> > > >> >> St.Ack
> >>> > >> >> > >>> >> >> > > >> >>
> >>> > >> >> > >>> >> >> > > >
> >>> > >> >> > >>> >> >> > > > --
> >>> > >> >> > >>> >> >> > > > Sean
> >>> > >> >> > >>> >> >> > >
> >>> > >> >> > >>> >> >> > --
> >>> > >> >> > >>> >> >> > busbey
> >>> > >> >> > >>> >> >> >
> >>> > >> >> > >>> >> >
> >>> > >> >> > >>> >> > --
> >>> > >> >> > >>> >> > busbey
> >>> > >> >> > >>> >> >
> >>> > >> >> > >>> >>
> >>> > >> >> > >>> >> --
> >>> > >> >> > >>> >> busbey
> >>> > >> >> > >>> >>
> >>> > >> >> > >
> >>> > >> >> > > --
> >>> > >> >> > > busbey
> >>> > >> >> >
> >>> > >>
> >>> > >> --
> >>> > >> busbey
> >>> > >>
> >>> > >
> >>> > > --
> >>> > > Sean
> >>> > >
> >>> >
> >>> > --
> >>> > busbey
> >>> >
> >>>
> >>> --
> >>> Thanks,
> >>> Michael Antonov
> >>>
> >>
> >> --
> >>
> >> -- Appy
> >>
> >
> > --
> >
> > -- Appy
>
> --
> busbey
>

--
Thanks,
Michael Antonov
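
For anyone reading this thread later: the IllegalFormatPrecisionException that Matteo reported can be reproduced outside Hadoop with plain String.format. What follows is only a minimal sketch, not the Hadoop or Guava code. The class name FormatEscapeDemo, the helper escapeForFormat, and the "worker-...-%d" pattern are made up for illustration; the only thing taken from the thread is the path fragment from the log. It assumes, as the stack trace suggests, that the workspace path ends up inside a format string unescaped.

```java
import java.util.IllegalFormatPrecisionException;

public class FormatEscapeDemo {

    // Hypothetical helper: doubles every '%' so String.format treats it literally.
    static String escapeForFormat(String raw) {
        return raw.replace("%", "%%");
    }

    public static void main(String[] args) {
        // Path fragment copied from the log quoted in the thread.
        String dir = "JDK%25201.7%2520(latest)";

        // Unescaped: "%25201.7%" is parsed as width 25201, precision 7,
        // conversion '%', and the '%' conversion does not accept a precision.
        try {
            String.format("worker-" + dir + "-%d", 0);
            System.out.println("no exception");
        } catch (IllegalFormatPrecisionException e) {
            System.out.println("IllegalFormatPrecisionException: " + e.getPrecision());
        }

        // Escaped: only the trailing %d is interpreted as a format specifier.
        System.out.println(String.format("worker-" + escapeForFormat(dir) + "-%d", 0));
    }
}
```

Whether the right fix is escaping in the caller or keeping spaces and parens out of the job names (which is what the thread ended up doing) is a separate question; the sketch only shows why the percent-encoded name trips the formatter.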