Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2018-02-01 Thread Stack
All seems to have settled now. Hadoopqa is running 'normally' again with
yetus 0.7.0 and some new configs (thanks to Allen Wittenauer for the
help/input...). That said, we need to work on curbing the resources used during
test runs.
St.Ack

On Wed, Jan 31, 2018 at 9:01 AM, Stack  wrote:

> I just set hadoopqa to be 0.7.0 again with an upped proclimit to see if
> this fixes our OOME failures. HadoopQA builds numbered 11295 and later
> will have this change.
> Thanks
> S
>
> On Wed, Jan 31, 2018 at 6:46 AM, Stack  wrote:
>
>> Note that I reverted our yetus version last night. It discombobulated our
>> builds (OOMEs). Meantime, you'll have to do the patch naming trick for
>> another day or so. Our test runs seem to use an ungodly number of file
>> descriptors. Stay tuned.
>> S
>>
>> On Mon, Jan 29, 2018 at 10:56 PM, Stack  wrote:
>>
>>> Our brothers and sisters over in yetus-land made a release that deals w/
>>> the changed JIRA behavior regards ordering attached-patches. No need of
>>> deleting all but the intended patch going forward nor gymnastics with
>>> prefixes when naming.  It seems to be working properly. The one-liner
>>> change that moves us from yetus 0.6.0 to 0.7.0 has been pushed to all
>>> active branches and our hadoopqa up on jenkins has been configured to use
>>> it going forward.
>>>
>>> FYI,
>>> S
>>>
>>> On Fri, Dec 8, 2017 at 12:13 PM, Stack  wrote:
>>>
 Thanks Andrew. I disabled the job. Use the nightly going forward. The
 jdk7 builds seem to run fine. The jdk8 has some timeout going on. Need to
 dig in. You can see here:

 https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.4/

 Thanks,
 M

 On Fri, Dec 8, 2017 at 11:29 AM, Andrew Purtell 
 wrote:

> Ok with me, Stack. Thanks for asking.
>
>
> On Thu, Nov 30, 2017 at 5:33 PM, Stack  wrote:
>
> > On the move over to nightly test runs:
> >
> > 1.2 nightly had a successful build last night after the branch-1
> > stabilization effort (HBASE-19204) and fixing a few unit test failures. See
> > build 150
> > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/
> > It then failed, 151, because of timed out test. Need to dig in. Clean up a
> > few more unit tests and branch-1.2 is probably ready for a release-cutting.
> >
> > 1.3 has a few flakies. The last build failed because of:
> >
> >   Test Result (1 failure / ±0)
> > org.apache.hadoop.hbase.regionserver.TestEncryptionKeyRotation.testCFKeyRotation
> >
> > Just a little effort should turn 1.3 green.
> >
> > I was going to disable the 1.4 job,
> > https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.4/, in favor of
> > the 1.4 nightly,
> > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.4/,
> > if ok w/ you Andrew Purtell... And move over the branch-1, branch-2, and
> > master too.
> >
> > Thanks,
> > S
> >
> >
> >
> > On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:
> >
> > > Example of the new nice reporting:
> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/
> > > S
> > >
> > > On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:
> > >
> > >> Note that I have disabled the HBase-1.2-JDK7, HBase-1.2-JDK8,
> > >> HBase-1.3-JDK7, and HBase-1.3-JDK8 jobs. They have been broken for a good
> > >> while now. In their place, refer to an ongoing Sean "Nightly" project, an
> > >> effort he has been at for a while. It does more checking with pretty
> > >> reports that will help figuring general stability over time. See under
> > >> https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/
> > >> See the nightly builds for 1.2 and 1.3. They have some teething issues
> > >> still but are almost there. See the 1.2 build from last night. In recent
> > >> days, the 1.2 branch went from trash-can fire to stable. See how all tests
> > >> passed in the last build but then we failed generating the src bundle on
> > >> the end (this is what I mean by 'teething' issue). Will work on fixing this
> > >> last step and moving over 1.4, etc., in the next few days.
> > >>
> > >> FYI,
> > >> St.Ack
> > >>
> > >>
> > >> On Tue, Nov 7, 2017 at 7:45 AM, Stack  wrote:
> > >>
> > >>> On Tue, Nov 7, 2017 at 6:10 AM, Sean Busbey wrote:
> > >>>
> >  > Should I be able to see the machine dir when I look at nightlies
> >  output?
> >  > (Was trying to see what else is running).
> > 
> >  Ah. we don't have the same machine sampling on nightly as we do in
> >  precommit. I am 80% on a

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2018-01-31 Thread Stack
I just set hadoopqa to use yetus 0.7.0 again, this time with an upped proclimit, to
see if that fixes our OOME failures. HadoopQA builds numbered 11295 and later
will have this change.
Thanks
S

On Wed, Jan 31, 2018 at 6:46 AM, Stack  wrote:

> Note that I reverted our yetus version last night. It discombobulated our
> builds (OOMEs). Meantime, you'll have to do the patch naming trick for
> another day or so. Our test runs seem to use an ungodly number of file
> descriptors. Stay tuned.
> S
>
> On Mon, Jan 29, 2018 at 10:56 PM, Stack  wrote:
>
>> Our brothers and sisters over in yetus-land made a release that deals w/
>> the changed JIRA behavior regards ordering attached-patches. No need of
>> deleting all but the intended patch going forward nor gymnastics with
>> prefixes when naming.  It seems to be working properly. The one-liner
>> change that moves us from yetus 0.6.0 to 0.7.0 has been pushed to all
>> active branches and our hadoopqa up on jenkins has been configured to use
>> it going forward.
>>
>> FYI,
>> S
>>
>> On Fri, Dec 8, 2017 at 12:13 PM, Stack  wrote:
>>
>>> Thanks Andrew. I disabled the job. Use the nightly going forward. The
>>> jdk7 builds seem to run fine. The jdk8 has some timeout going on. Need to
>>> dig in. You can see here:
>>>
>>> https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.4/
>>>
>>> Thanks,
>>> M
>>>
>>> On Fri, Dec 8, 2017 at 11:29 AM, Andrew Purtell 
>>> wrote:
>>>
 Ok with me, Stack. Thanks for asking.


 On Thu, Nov 30, 2017 at 5:33 PM, Stack  wrote:

 > On the move over to nightly test runs:
 >
 > 1.2 nightly had a successful build last night after the branch-1
 > stabilization effort (HBASE-19204) and fixing a few unit test failures. See
 > build 150
 > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/
 > It then failed, 151, because of timed out test. Need to dig in. Clean up a
 > few more unit tests and branch-1.2 is probably ready for a release-cutting.
 >
 > 1.3 has a few flakies. The last build failed because of:
 >
 >   Test Result (1 failure / ±0)
 > org.apache.hadoop.hbase.regionserver.TestEncryptionKeyRotation.testCFKeyRotation
 >
 > Just a little effort should turn 1.3 green.
 >
 > I was going to disable the 1.4 job,
 > https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.4/, in favor of
 > the 1.4 nightly,
 > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.4/,
 > if ok w/ you Andrew Purtell... And move over the branch-1, branch-2, and
 > master too.
 >
 > Thanks,
 > S
 >
 >
 >
 > On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:
 >
 > > Example of the new nice reporting:
 > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/
 > > S
 > >
 > > On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:
 > >
 > >> Note that I have disabled the HBase-1.2-JDK7, HBase-1.2-JDK8,
 > >> HBase-1.3-JDK7, and HBase-1.3-JDK8 jobs. They have been broken for a good
 > >> while now. In their place, refer to an ongoing Sean "Nightly" project, an
 > >> effort he has been at for a while. It does more checking with pretty
 > >> reports that will help figuring general stability over time. See under
 > >> https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/
 > >> See the nightly builds for 1.2 and 1.3. They have some teething issues
 > >> still but are almost there. See the 1.2 build from last night. In recent
 > >> days, the 1.2 branch went from trash-can fire to stable. See how all tests
 > >> passed in the last build but then we failed generating the src bundle on
 > >> the end (this is what I mean by 'teething' issue). Will work on fixing this
 > >> last step and moving over 1.4, etc., in the next few days.
 > >>
 > >> FYI,
 > >> St.Ack
 > >>
 > >>
 > >> On Tue, Nov 7, 2017 at 7:45 AM, Stack  wrote:
 > >>
 > >>> On Tue, Nov 7, 2017 at 6:10 AM, Sean Busbey wrote:
 > >>>
 >  > Should I be able to see the machine dir when I look at nightlies
 >  output?
 >  > (Was trying to see what else is running).
 > 
 >  Ah. we don't have the same machine sampling on nightly as we do in
 >  precommit. I am 80% on a patch for HBASE-19189 (run test ad-hoc
 >  repeatedly)  that includes pulling that information gathering into a
 >  place where we could also use it in nightly.
 > 
 > 
 > >>> Sweet.
 > >>>
 > >>>
 > >>>
 >  Did we ever figure out how many cores we expect our tests to need? It
 >  looks like the Hadoop nodes have 8 cores. (with 2 executors that means

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2018-01-31 Thread Stack
Note that I reverted our yetus version last night. It discombobulated our
builds (OOMEs). In the meantime, you'll have to do the patch naming trick for
another day or so. Our test runs seem to use an ungodly number of file
descriptors. Stay tuned.
S
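
For anyone who wants to see which limits a test run is actually subject to, here is a minimal sketch. It is not what hadoopqa or yetus does internally; it only assumes a Unix-like build host with Python available, and the function name `describe_limits` is made up for illustration. It reads the soft and hard limits for open file descriptors and for processes through Python's standard `resource` module.

```python
import resource


def describe_limits():
    """Return the soft/hard limits for open files and processes.

    Hypothetical helper for illustration only; the limits reported are
    those of the current process, so run it in the environment of interest.
    """
    limits = {}
    for name in ("RLIMIT_NOFILE", "RLIMIT_NPROC"):
        # getrlimit returns a (soft, hard) tuple for the given resource.
        soft, hard = resource.getrlimit(getattr(resource, name))
        limits[name] = {"soft": soft, "hard": hard}
    return limits


if __name__ == "__main__":
    for name, pair in describe_limits().items():
        print(name, pair)
```

Running it from the same shell or container that launches the tests shows the ceilings that apply there, which may differ from those of an interactive login on the same host.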

On Mon, Jan 29, 2018 at 10:56 PM, Stack  wrote:

> Our brothers and sisters over in yetus-land made a release that deals w/
> the changed JIRA behavior regards ordering attached-patches. No need of
> deleting all but the intended patch going forward nor gymnastics with
> prefixes when naming.  It seems to be working properly. The one-liner
> change that moves us from yetus 0.6.0 to 0.7.0 has been pushed to all
> active branches and our hadoopqa up on jenkins has been configured to use
> it going forward.
>
> FYI,
> S
>
> On Fri, Dec 8, 2017 at 12:13 PM, Stack  wrote:
>
>> Thanks Andrew. I disabled the job. Use the nightly going forward. The
>> jdk7 builds seem to run fine. The jdk8 has some timeout going on. Need to
>> dig in. You can see here:
>>
>> https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.4/
>>
>> Thanks,
>> M
>>
>> On Fri, Dec 8, 2017 at 11:29 AM, Andrew Purtell 
>> wrote:
>>
>>> Ok with me, Stack. Thanks for asking.
>>>
>>>
>>> On Thu, Nov 30, 2017 at 5:33 PM, Stack  wrote:
>>>
>>> > On the move over to nightly test runs:
>>> >
>>> > 1.2 nightly had a successful build last night after the branch-1
>>> > stabilization effort (HBASE-19204) and fixing a few unit test failures. See
>>> > build 150
>>> > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/
>>> > It then failed, 151, because of timed out test. Need to dig in. Clean up a
>>> > few more unit tests and branch-1.2 is probably ready for a release-cutting.
>>> >
>>> > 1.3 has a few flakies. The last build failed because of:
>>> >
>>> >   Test Result (1 failure / ±0)
>>> > org.apache.hadoop.hbase.regionserver.TestEncryptionKeyRotation.testCFKeyRotation
>>> >
>>> > Just a little effort should turn 1.3 green.
>>> >
>>> > I was going to disable the 1.4 job,
>>> > https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.4/, in favor of
>>> > the 1.4 nightly,
>>> > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.4/,
>>> > if ok w/ you Andrew Purtell... And move over the branch-1, branch-2, and
>>> > master too.
>>> >
>>> > Thanks,
>>> > S
>>> >
>>> >
>>> >
>>> > On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:
>>> >
>>> > > Example of the new nice reporting:
>>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/
>>> > > S
>>> > >
>>> > > On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:
>>> > >
>>> > >> Note that I have disabled the HBase-1.2-JDK7, HBase-1.2-JDK8,
>>> > >> HBase-1.3-JDK7, and HBase-1.3-JDK8 jobs. They have been broken for a good
>>> > >> while now. In their place, refer to an ongoing Sean "Nightly" project, an
>>> > >> effort he has been at for a while. It does more checking with pretty
>>> > >> reports that will help figuring general stability over time. See under
>>> > >> https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/
>>> > >> See the nightly builds for 1.2 and 1.3. They have some teething issues
>>> > >> still but are almost there. See the 1.2 build from last night. In recent
>>> > >> days, the 1.2 branch went from trash-can fire to stable. See how all tests
>>> > >> passed in the last build but then we failed generating the src bundle on
>>> > >> the end (this is what I mean by 'teething' issue). Will work on fixing this
>>> > >> last step and moving over 1.4, etc., in the next few days.
>>> > >>
>>> > >> FYI,
>>> > >> St.Ack
>>> > >>
>>> > >>
>>> > >> On Tue, Nov 7, 2017 at 7:45 AM, Stack  wrote:
>>> > >>
>>> > >>> On Tue, Nov 7, 2017 at 6:10 AM, Sean Busbey wrote:
>>> > >>>
>>> >  > Should I be able to see the machine dir when I look at nightlies
>>> >  output?
>>> >  > (Was trying to see what else is running).
>>> > 
>>> >  Ah. we don't have the same machine sampling on nightly as we do in
>>> >  precommit. I am 80% on a patch for HBASE-19189 (run test ad-hoc
>>> >  repeatedly)  that includes pulling that information gathering into a
>>> >  place where we could also use it in nightly.
>>> > 
>>> > 
>>> > >>> Sweet.
>>> > >>>
>>> > >>>
>>> > >>>
>>> >  Did we ever figure out how many cores we expect our tests to need? It
>>> >  looks like the Hadoop nodes have 8 cores. (with 2 executors that means
>>> >  4 is our fair share)
>>> > 
>>> > 
>>> > >>> At the end of the thread inquiry I suggested that we don't use enough
>>> > >>> cores, that we could up our fork counts and tests would complete in less
>>> > >>> time. I wanted to experiment some w/ high fork counts -- 16 or so -- to see
>>> > >>> if concurrent running brought on  more failure.
>>> > >>>
>>>

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2018-01-29 Thread Stack
Our brothers and sisters over in yetus-land made a release that deals with
the changed JIRA behavior regarding the ordering of attached patches. Going
forward, there is no need to delete all but the intended patch, nor to do
gymnastics with prefixes when naming. It seems to be working properly. The one-liner
change that moves us from yetus 0.6.0 to 0.7.0 has been pushed to all
active branches, and our hadoopqa up on jenkins has been configured to use
it going forward.

FYI,
S

On Fri, Dec 8, 2017 at 12:13 PM, Stack  wrote:

> Thanks Andrew. I disabled the job. Use the nightly going forward. The jdk7
> builds seem to run fine. The jdk8 has some timeout going on. Need to dig
> in. You can see here:
>
> https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.4/
>
> Thanks,
> M
>
> On Fri, Dec 8, 2017 at 11:29 AM, Andrew Purtell 
> wrote:
>
>> Ok with me, Stack. Thanks for asking.
>>
>>
>> On Thu, Nov 30, 2017 at 5:33 PM, Stack  wrote:
>>
>> > On the move over to nightly test runs:
>> >
>> > 1.2 nightly had a successful build last night after the branch-1
>> > stabilization effort (HBASE-19204) and fixing a few unit test failures. See
>> > build 150
>> > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/
>> > It then failed, 151, because of timed out test. Need to dig in. Clean up a
>> > few more unit tests and branch-1.2 is probably ready for a release-cutting.
>> >
>> > 1.3 has a few flakies. The last build failed because of:
>> >
>> >   Test Result (1 failure / ±0)
>> > org.apache.hadoop.hbase.regionserver.TestEncryptionKeyRotation.testCFKeyRotation
>> >
>> > Just a little effort should turn 1.3 green.
>> >
>> > I was going to disable the 1.4 job,
>> > https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.4/, in favor of
>> > the 1.4 nightly,
>> > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.4/,
>> > if ok w/ you Andrew Purtell... And move over the branch-1, branch-2, and
>> > master too.
>> >
>> > Thanks,
>> > S
>> >
>> >
>> >
>> > On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:
>> >
>> > > Example of the new nice reporting:
>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/
>> > > S
>> > >
>> > > On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:
>> > >
>> > >> Note that I have disabled the HBase-1.2-JDK7, HBase-1.2-JDK8,
>> > >> HBase-1.3-JDK7, and HBase-1.3-JDK8 jobs. They have been broken for a good
>> > >> while now. In their place, refer to an ongoing Sean "Nightly" project, an
>> > >> effort he has been at for a while. It does more checking with pretty
>> > >> reports that will help figuring general stability over time. See under
>> > >> https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/
>> > >> See the nightly builds for 1.2 and 1.3. They have some teething issues
>> > >> still but are almost there. See the 1.2 build from last night. In recent
>> > >> days, the 1.2 branch went from trash-can fire to stable. See how all tests
>> > >> passed in the last build but then we failed generating the src bundle on
>> > >> the end (this is what I mean by 'teething' issue). Will work on fixing this
>> > >> last step and moving over 1.4, etc., in the next few days.
>> > >>
>> > >> FYI,
>> > >> St.Ack
>> > >>
>> > >>
>> > >> On Tue, Nov 7, 2017 at 7:45 AM, Stack  wrote:
>> > >>
>> > >>> On Tue, Nov 7, 2017 at 6:10 AM, Sean Busbey wrote:
>> > >>>
>> >  > Should I be able to see the machine dir when I look at nightlies
>> >  output?
>> >  > (Was trying to see what else is running).
>> > 
>> >  Ah. we don't have the same machine sampling on nightly as we do in
>> >  precommit. I am 80% on a patch for HBASE-19189 (run test ad-hoc
>> >  repeatedly)  that includes pulling that information gathering into a
>> >  place where we could also use it in nightly.
>> > 
>> > 
>> > >>> Sweet.
>> > >>>
>> > >>>
>> > >>>
>> >  Did we ever figure out how many cores we expect our tests to need? It
>> >  looks like the Hadoop nodes have 8 cores. (with 2 executors that means
>> >  4 is our fair share)
>> > 
>> > 
>> > >>> At the end of the thread inquiry I suggested that we don't use enough
>> > >>> cores, that we could up our fork counts and tests would complete in less
>> > >>> time. I wanted to experiment some w/ high fork counts -- 16 or so -- to see
>> > >>> if concurrent running brought on  more failure.
>> > >>>
>> > >>> St.Ack
>> > >>>
>> > >>>
>> > >>>
>> > >>>
>> >  On Tue, Nov 7, 2017 at 8:05 AM, Sean Busbey wrote:
>> >  > surefire results get zipped up (we were filling the jenkins hosts with
>> >  > old test logs previously) and stored in a file called "test_logs.zip"
>> >  > for each jvm run. So if that happened in the jdk7 run for branch-1.2,
>> >  > it'd be in artifacts -> output-jdk7 -> test_logs.zip.
>> > 

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-12-08 Thread Stack
Thanks Andrew. I disabled the job. Use the nightly going forward. The jdk7
builds seem to run fine. The jdk8 build has some timeout going on; I need to dig
in. You can see it here:

https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.4/

Thanks,
M

On Fri, Dec 8, 2017 at 11:29 AM, Andrew Purtell  wrote:

> Ok with me, Stack. Thanks for asking.
>
>
> On Thu, Nov 30, 2017 at 5:33 PM, Stack  wrote:
>
> > On the move over to nightly test runs:
> >
> > 1.2 nightly had a successful build last night after the branch-1
> > stabilization effort (HBASE-19204) and fixing a few unit test failures. See
> > build 150
> > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/
> > It then failed, 151, because of timed out test. Need to dig in. Clean up a
> > few more unit tests and branch-1.2 is probably ready for a release-cutting.
> >
> > 1.3 has a few flakies. The last build failed because of:
> >
> >   Test Result (1 failure / ±0)
> > org.apache.hadoop.hbase.regionserver.TestEncryptionKeyRotation.testCFKeyRotation
> >
> > Just a little effort should turn 1.3 green.
> >
> > I was going to disable the 1.4 job,
> > https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.4/, in favor of
> > the 1.4 nightly,
> > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.4/,
> > if ok w/ you Andrew Purtell... And move over the branch-1, branch-2, and
> > master too.
> >
> > Thanks,
> > S
> >
> >
> >
> > On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:
> >
> > > Example of the new nice reporting:
> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/
> > > S
> > >
> > > On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:
> > >
> > >> Note that I have disabled the HBase-1.2-JDK7, HBase-1.2-JDK8,
> > >> HBase-1.3-JDK7, and HBase-1.3-JDK8 jobs. They have been broken for a good
> > >> while now. In their place, refer to an ongoing Sean "Nightly" project, an
> > >> effort he has been at for a while. It does more checking with pretty
> > >> reports that will help figuring general stability over time. See under
> > >> https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/
> > >> See the nightly builds for 1.2 and 1.3. They have some teething issues
> > >> still but are almost there. See the 1.2 build from last night. In recent
> > >> days, the 1.2 branch went from trash-can fire to stable. See how all tests
> > >> passed in the last build but then we failed generating the src bundle on
> > >> the end (this is what I mean by 'teething' issue). Will work on fixing this
> > >> last step and moving over 1.4, etc., in the next few days.
> > >>
> > >> FYI,
> > >> St.Ack
> > >>
> > >>
> > >> On Tue, Nov 7, 2017 at 7:45 AM, Stack  wrote:
> > >>
> > >>> On Tue, Nov 7, 2017 at 6:10 AM, Sean Busbey wrote:
> > >>>
> >  > Should I be able to see the machine dir when I look at nightlies
> >  output?
> >  > (Was trying to see what else is running).
> > 
> >  Ah. we don't have the same machine sampling on nightly as we do in
> >  precommit. I am 80% on a patch for HBASE-19189 (run test ad-hoc
> >  repeatedly)  that includes pulling that information gathering into a
> >  place where we could also use it in nightly.
> > 
> > 
> > >>> Sweet.
> > >>>
> > >>>
> > >>>
> >  Did we ever figure out how many cores we expect our tests to need? It
> >  looks like the Hadoop nodes have 8 cores. (with 2 executors that means
> >  4 is our fair share)
> > 
> > 
> > >>> At the end of the thread inquiry I suggested that we don't use enough
> > >>> cores, that we could up our fork counts and tests would complete in less
> > >>> time. I wanted to experiment some w/ high fork counts -- 16 or so -- to see
> > >>> if concurrent running brought on  more failure.
> > >>>
> > >>> St.Ack
> > >>>
> > >>>
> > >>>
> > >>>
> >  On Tue, Nov 7, 2017 at 8:05 AM, Sean Busbey wrote:
> >  > surefire results get zipped up (we were filling the jenkins hosts with
> >  > old test logs previously) and stored in a file called "test_logs.zip"
> >  > for each jvm run. So if that happened in the jdk7 run for branch-1.2,
> >  > it'd be in artifacts -> output-jdk7 -> test_logs.zip.
> >  >
> >  > I don't know if the archival process grabs things from surefire that
> >  > aren't the surefire XML files, but we can update it to do so if it
> >  > doesn't.
> >  >
> >  > On Mon, Nov 6, 2017 at 11:39 PM, Stack wrote:
> >  >> I see this in the 1.2 nightly just when it gives up the ghost
> >  >>
> >  >> [WARNING] Corrupted STDOUT by directly writing to native stream in
> >  >> forked JVM 2. See FAQ web page and the dump file
> >  >> /testptch/hbase/hbase-server/target/surefire-reports/2017-11-06T20-11-30_219-jvmRun2.dumpstream
> >  >>
> >  >> .. but the pointed to dumpstream doesn

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-12-08 Thread Andrew Purtell
Ok with me, Stack. Thanks for asking.


On Thu, Nov 30, 2017 at 5:33 PM, Stack  wrote:

> On the move over to nightly test runs:
>
> 1.2 nightly had a successful build last night after the branch-1
> stabilization effort (HBASE-19204) and fixing a few unit test failures. See
> build 150
> https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/
> It then failed, 151, because of timed out test. Need to dig in. Clean up a
> few more unit tests and branch-1.2 is probably ready for a release-cutting.
>
> 1.3 has a few flakies. The last build failed because of:
>
>   Test Result (1 failure / ±0)
> org.apache.hadoop.hbase.regionserver.TestEncryptionKeyRotation.testCFKeyRotation
>
> Just a little effort should turn 1.3 green.
>
> I was going to disable the 1.4 job,
> https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.4/,  in favor of
> the 1.4 nightly,
> https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.4/,
> if ok w/ you Andrew Purtell... And move over the branch-1, branch-2, and
> master too.
>
> Thanks,
> S
>
>
>
> On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:
>
> > Example of the new nice reporting:
> > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/
> > S
> >
> > On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:
> >
> >> Note that I have disabled the HBase-1.2-JDK7, HBase-1.2-JDK8,
> >> HBase-1.3-JDK7, and HBase-1.3-JDK8 jobs. They have been broken for a good
> >> while now. In their place, refer to an ongoing Sean "Nightly" project, an
> >> effort he has been at for a while. It does more checking with pretty
> >> reports that will help figuring general stability over time. See under
> >> https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/
> >> See the nightly builds for 1.2 and 1.3. They have some teething issues
> >> still but are almost there. See the 1.2 build from last night. In recent
> >> days, the 1.2 branch went from trash-can fire to stable. See how all tests
> >> passed in the last build but then we failed generating the src bundle on
> >> the end (this is what I mean by 'teething' issue). Will work on fixing this
> >> last step and moving over 1.4, etc., in the next few days.
> >>
> >> FYI,
> >> St.Ack
> >>
> >>
> >> On Tue, Nov 7, 2017 at 7:45 AM, Stack  wrote:
> >>
> >>> On Tue, Nov 7, 2017 at 6:10 AM, Sean Busbey  wrote:
> >>>
>  > Should I be able to see the machine dir when I look at nightlies
>  output?
>  > (Was trying to see what else is running).
> 
>  Ah. we don't have the same machine sampling on nightly as we do in
>  precommit. I am 80% on a patch for HBASE-19189 (run test ad-hoc
>  repeatedly)  that includes pulling that information gathering into a
>  place where we could also use it in nightly.
> 
> 
> >>> Sweet.
> >>>
> >>>
> >>>
>  Did we ever figure out how many cores we expect our tests to need? It
>  looks like the Hadoop nodes have 8 cores. (with 2 executors that means
>  4 is our fair share)
> 
> 
> >>> At the end of the thread inquiry I suggested that we don't use enough
> >>> cores, that we could up our fork counts and tests would complete in less
> >>> time. I wanted to experiment some w/ high fork counts -- 16 or so -- to see
> >>> if concurrent running brought on  more failure.
> >>>
> >>> St.Ack
> >>>
> >>>
> >>>
> >>>
>  On Tue, Nov 7, 2017 at 8:05 AM, Sean Busbey wrote:
>  > surefire results get zipped up (we were filling the jenkins hosts with
>  > old test logs previously) and stored in a file called "test_logs.zip"
>  > for each jvm run. So if that happened in the jdk7 run for branch-1.2,
>  > it'd be in artifacts -> output-jdk7 -> test_logs.zip.
>  >
>  > I don't know if the archival process grabs things from surefire that
>  > aren't the surefire XML files, but we can update it to do so if it
>  > doesn't.
>  >
>  > On Mon, Nov 6, 2017 at 11:39 PM, Stack  wrote:
>  >> I see this in the 1.2 nightly just when it gives up the ghost
>  >>
>  >> [WARNING] Corrupted STDOUT by directly writing to native stream in
>  >> forked JVM 2. See FAQ web page and the dump file
>  >> /testptch/hbase/hbase-server/target/surefire-reports/2017-11-06T20-11-30_219-jvmRun2.dumpstream
>  >>
>  >> .. but the pointed to dumpstream doesn't seem to be around post
>  build.
>  >> I am looking in wrong place?
>  >>
>  >>
>  >> Thanks,
>  >>
>  >> S
>  >>
>  >>
>  >> On Mon, Nov 6, 2017 at 8:20 PM, Stack  wrote:
>  >>
>  >>> On Mon, Nov 6, 2017 at 8:35 AM, Sean Busbey wrote:
>  >>>
>   Given that all of the old post-commit tests have been posting that
>   they're failing to JIRAs for what looks like a month, is there any
>   reason not to switch to the new tests that also say they're
>  failing?
> >>>

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-11-30 Thread Stack
On the move over to nightly test runs:

1.2 nightly had a successful build last night after the branch-1
stabilization effort (HBASE-19204) and fixing a few unit test failures. See
build 150:
https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/
Build 151 then failed because of a timed-out test. Need to dig in. Clean up a
few more unit tests and branch-1.2 is probably ready for a release-cutting.

1.3 has a few flakies. The last build failed because of:

  Test Result (1 failure / ±0)
org.apache.hadoop.hbase.regionserver.TestEncryptionKeyRotation.testCFKeyRotation

Just a little effort should turn 1.3 green.

I was going to disable the 1.4 job,
https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.4/,  in favor of
the 1.4 nightly,
https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.4/,
if ok w/ you Andrew Purtell... And move over the branch-1, branch-2, and
master too.

Thanks,
S



On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:

> Example of the new nice reporting:
> https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/
> S
>
> On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:
>
>> Note that I have disabled the HBase-1.2-JDK7, HBase-1.2-JDK8,
>> HBase-1.3-JDK7, and HBase-1.3-JDK8 jobs. They have been broken for a good
>> while now. In their place, refer to an ongoing Sean "Nightly" project, an
>> effort he has been at for a while. It does more checking with pretty
>> reports that will help figuring general stability over time. See under
>> https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/
>> See the nightly builds for 1.2 and 1.3. They have some teething issues
>> still but are almost there. See the 1.2 build from last night. In recent
>> days, the 1.2 branch went from trash-can fire to stable. See how all tests
>> passed in the last build but then we failed generating the src bundle on
>> the end (this is what I mean by 'teething' issue). Will work on fixing this
>> last step and moving over 1.4, etc., in the next few days.
>>
>> FYI,
>> St.Ack
>>
>>
>> On Tue, Nov 7, 2017 at 7:45 AM, Stack  wrote:
>>
>>> On Tue, Nov 7, 2017 at 6:10 AM, Sean Busbey  wrote:
>>>
 > Should I be able to see the machine dir when I look at nightlies
 output?
 > (Was trying to see what else is running).

 Ah. we don't have the same machine sampling on nightly as we do in
 precommit. I am 80% on a patch for HBASE-19189 (run test ad-hoc
 repeatedly)  that includes pulling that information gathering into a
 place where we could also use it in nightly.


>>> Sweet.
>>>
>>>
>>>
 Did we ever figure out how many cores we expect our tests to need? It
 looks like the Hadoop nodes have 8 cores. (with 2 executors that means
 4 is our fair share)


>>> At the end of the thread inquiry I suggested that we don't use enough
>>> cores, that we could up our fork counts and tests would complete in less
>>> time. I wanted to experiment some w/ high fork counts -- 16 or so -- to see
>>> if concurrent running brought on  more failure.
>>>
>>> St.Ack
>>>
>>>
>>>
>>>
 On Tue, Nov 7, 2017 at 8:05 AM, Sean Busbey  wrote:
 > surefire results get zipped up (we were filling the jenkins hosts with
 > old test logs previously) and stored in a file called "test_logs.zip"
 > for each jvm run. So if that happened in the jdk7 run for branch-1.2,
 > it'd be in artifacts -> output-jdk7 -> test_logs.zip.
 >
 > I don't know if the archival process grabs things from surefire that
 > aren't the surefire XML files, but we can update it to do so if it
 > doesn't.
 >
 > On Mon, Nov 6, 2017 at 11:39 PM, Stack  wrote:
 >> I see this in the 1.2 nightly just when it gives up the ghost
 >>
 >> [WARNING] Corrupted STDOUT by directly writing to native stream in
 >> forked JVM 2. See FAQ web page and the dump file
 >> /testptch/hbase/hbase-server/target/surefire-reports/2017-11
 -06T20-11-30_219-jvmRun2.dumpstream
 >>
 >> .. but the pointed to dumpstream doesn't seem to be around post
 build.
 >> I am looking in wrong place?
 >>
 >>
 >> Thanks,
 >>
 >> S
 >>
 >>
 >> On Mon, Nov 6, 2017 at 8:20 PM, Stack  wrote:
 >>
 >>> On Mon, Nov 6, 2017 at 8:35 AM, Sean Busbey 
 wrote:
 >>>
  Given that all of the old post-commit tests have been posting that
  they're failing to JIRAs for what looks like a month, is there any
  reason not to switch to the new tests that also say they're
 failing?
 
 
 >>> No reason.
 >>>
 >>>
 >>>
  The reason HBASE-18467 has been sitting on hold this whole time has
  been because the new nightly branch tests keep complaining about
  failures.
 
 
 >>> Looking just now, it looks like killed-off test runs.
 >>>
 >>> +1 on move to nightlies.
 >>>
 >>> Ca

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-11-29 Thread Apekshit Sharma
Oh, btw, the flaky dashboard job now needs to be changed to use those builds
instead.
I'll try to give it some time.

On Wed, Nov 29, 2017 at 1:19 PM, Apekshit Sharma  wrote:

> Yeah, i liked that breakup a lot!  One look, and you know which part needs
> fixing.
> fyi:  It might take few seconds before the table we are talking about
> shows up.
>
>  -- Appy
>
> On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:
>
>> Example of the new nice reporting:
>> https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/
>> S
>>
>> On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:
>>
>> > Note that I have disabled the HBase-1.2-JDK7, HBase-1.2-JDK8,
>> > HBase-1.3-JDK7, and HBase-1.3-JDK8 jobs. They have been broken for a
>> good
>> > while now. In their place, refer to an ongoing Sean "Nightly" project,
>> an
>> > effort he has been at for a while. It does more checking with pretty
>> > reports that will help figuring general stability over time. See under
>> > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/
>> > See the nightly builds for 1.2 and 1.3. They have some teething issues
>> > still but are almost there. See the 1.2 build from last night. In recent
>> > days, the 1.2 branch went from trash-can fire to stable. See how all
>> tests
>> > passed in the last build but then we failed generating the src bundle on
>> > the end (this is what I mean by 'teething' issue). Will work on fixing
>> this
>> > last step and moving over 1.4, etc., in the next few days.
>> >
>> > FYI,
>> > St.Ack
>> >
>> >
>> > On Tue, Nov 7, 2017 at 7:45 AM, Stack  wrote:
>> >
>> >> On Tue, Nov 7, 2017 at 6:10 AM, Sean Busbey  wrote:
>> >>
>> >>> > Should I be able to see the machine dir when I look at nightlies
>> >>> output?
>> >>> > (Was trying to see what else is running).
>> >>>
>> >>> Ah. we don't have the same machine sampling on nightly as we do in
>> >>> precommit. I am 80% on a patch for HBASE-19189 (run test ad-hoc
>> >>> repeatedly)  that includes pulling that information gathering into a
>> >>> place where we could also use it in nightly.
>> >>>
>> >>>
>> >> Sweet.
>> >>
>> >>
>> >>
>> >>> Did we ever figure out how many cores we expect our tests to need? It
>> >>> looks like the Hadoop nodes have 8 cores. (with 2 executors that means
>> >>> 4 is our fair share)
>> >>>
>> >>>
>> >> At the end of the thread inquiry I suggested that we don't use enough
>> >> cores, that we could up our fork counts and tests would complete in
>> less
>> >> time. I wanted to experiment some w/ high fork counts -- 16 or so --
>> to see
>> >> if concurrent running brought on  more failure.
>> >>
>> >> St.Ack
>> >>
>> >>
>> >>
>> >>
>> >>> On Tue, Nov 7, 2017 at 8:05 AM, Sean Busbey 
>> wrote:
>> >>> > surefire results get zipped up (we were filling the jenkins hosts
>> with
>> >>> > old test logs previously) and stored in a file called
>> "test_logs.zip"
>> >>> > for each jvm run. So if that happened in the jdk7 run for branch-1.2,
>> >>> > it'd be in artifacts -> output-jdk7 -> test_logs.zip.
>> >>> >
>> >>> > I don't know if the archival process grabs things from surefire that
>> >>> > aren't the surefire XML files, but we can update it to do so if it
>> >>> > doesn't.
>> >>> >
>> >>> > On Mon, Nov 6, 2017 at 11:39 PM, Stack  wrote:
>> >>> >> I see this in the 1.2 nightly just when it gives up the ghost
>> >>> >>
>> >>> >> [WARNING] Corrupted STDOUT by directly writing to native stream in
>> >>> >> forked JVM 2. See FAQ web page and the dump file
>> >>> >> /testptch/hbase/hbase-server/target/surefire-reports/2017-11
>> >>> -06T20-11-30_219-jvmRun2.dumpstream
>> >>> >>
>> >>> >> .. but the pointed to dumpstream doesn't seem to be around post
>> build.
>> >>> >> I am looking in wrong place?
>> >>> >>
>> >>> >>
>> >>> >> Thanks,
>> >>> >>
>> >>> >> S
>> >>> >>
>> >>> >>
>> >>> >> On Mon, Nov 6, 2017 at 8:20 PM, Stack  wrote:
>> >>> >>
>> >>> >>> On Mon, Nov 6, 2017 at 8:35 AM, Sean Busbey <
>> [email protected]>
>> >>> wrote:
>> >>> >>>
>> >>>  Given that all of the old post-commit tests have been posting
>> that
>> >>>  they're failing to JIRAs for what looks like a month, is there
>> any
>> >>>  reason not to switch to the new tests that also say they're
>> failing?
>> >>> 
>> >>> 
>> >>> >>> No reason.
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>>  The reason HBASE-18467 has been sitting on hold this whole time
>> has
>> >>>  been because the new nightly branch tests keep complaining about
>> >>>  failures.
>> >>> 
>> >>> 
>> >>> >>> Looking just now, it looks like killed-off test runs.
>> >>> >>>
>> >>> >>> +1 on move to nightlies.
>> >>> >>>
>> >>> >>> Can I help?
>> >>> >>>
>> >>> >>> Should I be able to see the machine dir when I look at nightlies
>> >>> output?
>> >>> >>> (Was trying to see what else is running).
>> >>> >>>
>> >>> >>> Thanks Sean,
>> >>> >>> St.Ack
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>>  On Mon, Nov 6, 2017 at 10:21

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-11-29 Thread Apekshit Sharma
Yeah, I liked that breakdown a lot!  One look, and you know which part needs
fixing.
FYI: it might take a few seconds before the table we are talking about shows
up.

 -- Appy

On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:

> Example of the new nice reporting:
> https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/
> S
>
> On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:
>
> > Note that I have disabled the HBase-1.2-JDK7, HBase-1.2-JDK8,
> > HBase-1.3-JDK7, and HBase-1.3-JDK8 jobs. They have been broken for a good
> > while now. In their place, refer to an ongoing Sean "Nightly" project, an
> > effort he has been at for a while. It does more checking with pretty
> > reports that will help figuring general stability over time. See under
> > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/
> > See the nightly builds for 1.2 and 1.3. They have some teething issues
> > still but are almost there. See the 1.2 build from last night. In recent
> > days, the 1.2 branch went from trash-can fire to stable. See how all
> tests
> > passed in the last build but then we failed generating the src bundle on
> > the end (this is what I mean by 'teething' issue). Will work on fixing
> this
> > last step and moving over 1.4, etc., in the next few days.
> >
> > FYI,
> > St.Ack
> >
> >
> > On Tue, Nov 7, 2017 at 7:45 AM, Stack  wrote:
> >
> >> On Tue, Nov 7, 2017 at 6:10 AM, Sean Busbey  wrote:
> >>
> >>> > Should I be able to see the machine dir when I look at nightlies
> >>> output?
> >>> > (Was trying to see what else is running).
> >>>
> >>> Ah. we don't have the same machine sampling on nightly as we do in
> >>> precommit. I am 80% on a patch for HBASE-19189 (run test ad-hoc
> >>> repeatedly)  that includes pulling that information gathering into a
> >>> place where we could also use it in nightly.
> >>>
> >>>
> >> Sweet.
> >>
> >>
> >>
> >>> Did we ever figure out how many cores we expect our tests to need? It
> >>> looks like the Hadoop nodes have 8 cores. (with 2 executors that means
> >>> 4 is our fair share)
> >>>
> >>>
> >> At the end of the thread inquiry I suggested that we don't use enough
> >> cores, that we could up our fork counts and tests would complete in less
> >> time. I wanted to experiment some w/ high fork counts -- 16 or so -- to
> see
> >> if concurrent running brought on  more failure.
> >>
> >> St.Ack
> >>
> >>
> >>
> >>
> >>> On Tue, Nov 7, 2017 at 8:05 AM, Sean Busbey  wrote:
> >>> > surefire results get zipped up (we were filling the jenkins hosts
> with
> >>> > old test logs previously) and stored in a file called "test_logs.zip"
> >>> > for each jvm run. So if that happened in the jdk7 run for branch-1.2,
> >>> > it'd be in artifacts -> output-jdk7 -> test_logs.zip.
> >>> >
> >>> > I don't know if the archival process grabs things from surefire that
> >>> > aren't the surefire XML files, but we can update it to do so if it
> >>> > doesn't.
> >>> >
> >>> > On Mon, Nov 6, 2017 at 11:39 PM, Stack  wrote:
> >>> >> I see this in the 1.2 nightly just when it gives up the ghost
> >>> >>
> >>> >> [WARNING] Corrupted STDOUT by directly writing to native stream in
> >>> >> forked JVM 2. See FAQ web page and the dump file
> >>> >> /testptch/hbase/hbase-server/target/surefire-reports/2017-11
> >>> -06T20-11-30_219-jvmRun2.dumpstream
> >>> >>
> >>> >> .. but the pointed to dumpstream doesn't seem to be around post
> build.
> >>> >> I am looking in wrong place?
> >>> >>
> >>> >>
> >>> >> Thanks,
> >>> >>
> >>> >> S
> >>> >>
> >>> >>
> >>> >> On Mon, Nov 6, 2017 at 8:20 PM, Stack  wrote:
> >>> >>
> >>> >>> On Mon, Nov 6, 2017 at 8:35 AM, Sean Busbey  >
> >>> wrote:
> >>> >>>
> >>>  Given that all of the old post-commit tests have been posting that
> >>>  they're failing to JIRAs for what looks like a month, is there any
> >>>  reason not to switch to the new tests that also say they're
> failing?
> >>> 
> >>> 
> >>> >>> No reason.
> >>> >>>
> >>> >>>
> >>> >>>
> >>>  The reason HBASE-18467 has been sitting on hold this whole time
> has
> >>>  been because the new nightly branch tests keep complaining about
> >>>  failures.
> >>> 
> >>> 
> >>> >>> Looking just now, it looks like killed-off test runs.
> >>> >>>
> >>> >>> +1 on move to nightlies.
> >>> >>>
> >>> >>> Can I help?
> >>> >>>
> >>> >>> Should I be able to see the machine dir when I look at nightlies
> >>> output?
> >>> >>> (Was trying to see what else is running).
> >>> >>>
> >>> >>> Thanks Sean,
> >>> >>> St.Ack
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>>
> >>>  On Mon, Nov 6, 2017 at 10:21 AM, Sean Busbey <
> [email protected]
> >>> >
> >>>  wrote:
> >>>  > It looks like old tests branch-1.2 and branch-1.3 are failing
> with
> >>>  > some maven enforcer problem that we thought we had fixed a few
> >>> times
> >>>  > before. It's probably fixable by changing the version of maven
> >>> they
> >>>  > use, but I'd much rat

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-11-29 Thread Stack
Example of the new nice reporting:
https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/
S

On Wed, Nov 29, 2017 at 8:06 AM, Stack  wrote:

> Note that I have disabled the HBase-1.2-JDK7, HBase-1.2-JDK8,
> HBase-1.3-JDK7, and HBase-1.3-JDK8 jobs. They have been broken for a good
> while now. In their place, refer to an ongoing Sean "Nightly" project, an
> effort he has been at for a while. It does more checking with pretty
> reports that will help figuring general stability over time. See under
> https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/
> See the nightly builds for 1.2 and 1.3. They have some teething issues
> still but are almost there. See the 1.2 build from last night. In recent
> days, the 1.2 branch went from trash-can fire to stable. See how all tests
> passed in the last build but then we failed generating the src bundle on
> the end (this is what I mean by 'teething' issue). Will work on fixing this
> last step and moving over 1.4, etc., in the next few days.
>
> FYI,
> St.Ack
>
>
> On Tue, Nov 7, 2017 at 7:45 AM, Stack  wrote:
>
>> On Tue, Nov 7, 2017 at 6:10 AM, Sean Busbey  wrote:
>>
>>> > Should I be able to see the machine dir when I look at nightlies
>>> output?
>>> > (Was trying to see what else is running).
>>>
>>> Ah. we don't have the same machine sampling on nightly as we do in
>>> precommit. I am 80% on a patch for HBASE-19189 (run test ad-hoc
>>> repeatedly)  that includes pulling that information gathering into a
>>> place where we could also use it in nightly.
>>>
>>>
>> Sweet.
>>
>>
>>
>>> Did we ever figure out how many cores we expect our tests to need? It
>>> looks like the Hadoop nodes have 8 cores. (with 2 executors that means
>>> 4 is our fair share)
>>>
>>>
>> At the end of the thread inquiry I suggested that we don't use enough
>> cores, that we could up our fork counts and tests would complete in less
>> time. I wanted to experiment some w/ high fork counts -- 16 or so -- to see
>> if concurrent running brought on  more failure.
>>
>> St.Ack
>>
>>
>>
>>
>>> On Tue, Nov 7, 2017 at 8:05 AM, Sean Busbey  wrote:
>>> > surefire results get zipped up (we were filling the jenkins hosts with
>>> > old test logs previously) and stored in a file called "test_logs.zip"
>>> > for each jvm run. So if that happened in the jdk7 run for branch-1.2,
>>> > it'd be in artifacts -> output-jdk7 -> test_logs.zip.
>>> >
>>> > I don't know if the archival process grabs things from surefire that
>>> > aren't the surefire XML files, but we can update it to do so if it
>>> > doesn't.
>>> >
>>> > On Mon, Nov 6, 2017 at 11:39 PM, Stack  wrote:
>>> >> I see this in the 1.2 nightly just when it gives up the ghost
>>> >>
>>> >> [WARNING] Corrupted STDOUT by directly writing to native stream in
>>> >> forked JVM 2. See FAQ web page and the dump file
>>> >> /testptch/hbase/hbase-server/target/surefire-reports/2017-11
>>> -06T20-11-30_219-jvmRun2.dumpstream
>>> >>
>>> >> .. but the pointed to dumpstream doesn't seem to be around post build.
>>> >> I am looking in wrong place?
>>> >>
>>> >>
>>> >> Thanks,
>>> >>
>>> >> S
>>> >>
>>> >>
>>> >> On Mon, Nov 6, 2017 at 8:20 PM, Stack  wrote:
>>> >>
>>> >>> On Mon, Nov 6, 2017 at 8:35 AM, Sean Busbey 
>>> wrote:
>>> >>>
>>>  Given that all of the old post-commit tests have been posting that
>>>  they're failing to JIRAs for what looks like a month, is there any
>>>  reason not to switch to the new tests that also say they're failing?
>>> 
>>> 
>>> >>> No reason.
>>> >>>
>>> >>>
>>> >>>
>>>  The reason HBASE-18467 has been sitting on hold this whole time has
>>>  been because the new nightly branch tests keep complaining about
>>>  failures.
>>> 
>>> 
>>> >>> Looking just now, it looks like killed-off test runs.
>>> >>>
>>> >>> +1 on move to nightlies.
>>> >>>
>>> >>> Can I help?
>>> >>>
>>> >>> Should I be able to see the machine dir when I look at nightlies
>>> output?
>>> >>> (Was trying to see what else is running).
>>> >>>
>>> >>> Thanks Sean,
>>> >>> St.Ack
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>>  On Mon, Nov 6, 2017 at 10:21 AM, Sean Busbey >> >
>>>  wrote:
>>>  > It looks like old tests branch-1.2 and branch-1.3 are failing with
>>>  > some maven enforcer problem that we thought we had fixed a few
>>> times
>>>  > before. It's probably fixable by changing the version of maven
>>> they
>>>  > use, but I'd much rather any test effort go into the last mile of
>>>  > getting our new nightly tests working.
>>>  >
>>>  > I'll start picking this up as soon as I close out HBASE-18784.
>>>  >
>>>  > Please consider branch-1.2 release blocked. :(
>>>  >
>>>  > On Mon, Nov 6, 2017 at 10:19 AM, Stack  wrote:
>>>  >> Our builds seem pretty sick up on builds.apache.org even after
>>> the
>>>  miracle
>>>  >> work by Allen W containing errant hadoop processes. Looking at
>>> 1.2 and
>>>  1.3,
>>> 

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-11-29 Thread Stack
Note that I have disabled the HBase-1.2-JDK7, HBase-1.2-JDK8,
HBase-1.3-JDK7, and HBase-1.3-JDK8 jobs. They have been broken for a good
while now. In their place, refer to Sean's ongoing "Nightly" project, an
effort he has been at for a while. It does more checking and produces pretty
reports that will help in figuring out general stability over time. See under
https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/ for the
nightly builds for 1.2 and 1.3. They still have some teething issues but
are almost there. See the 1.2 build from last night. In recent days, the
1.2 branch went from trash-can fire to stable. Note how all tests passed in
the last build but then we failed generating the src bundle at the end
(this is what I mean by a 'teething' issue). I will work on fixing this last
step and moving over 1.4, etc., in the next few days.

FYI,
St.Ack


On Tue, Nov 7, 2017 at 7:45 AM, Stack  wrote:

> On Tue, Nov 7, 2017 at 6:10 AM, Sean Busbey  wrote:
>
>> > Should I be able to see the machine dir when I look at nightlies output?
>> > (Was trying to see what else is running).
>>
>> Ah. we don't have the same machine sampling on nightly as we do in
>> precommit. I am 80% on a patch for HBASE-19189 (run test ad-hoc
>> repeatedly)  that includes pulling that information gathering into a
>> place where we could also use it in nightly.
>>
>>
> Sweet.
>
>
>
>> Did we ever figure out how many cores we expect our tests to need? It
>> looks like the Hadoop nodes have 8 cores. (with 2 executors that means
>> 4 is our fair share)
>>
>>
> At the end of the thread inquiry I suggested that we don't use enough
> cores, that we could up our fork counts and tests would complete in less
> time. I wanted to experiment some w/ high fork counts -- 16 or so -- to see
> if concurrent running brought on  more failure.
>
> St.Ack
>
>
>
>
>> On Tue, Nov 7, 2017 at 8:05 AM, Sean Busbey  wrote:
>> > surefire results get zipped up (we were filling the jenkins hosts with
>> > old test logs previously) and stored in a file called "test_logs.zip"
>> > for each jvm run. So if that happened in the jdk7 run for branch-1.2,
>> > it'd be in artifacts -> output-jdk7 -> test_logs.zip.
>> >
>> > I don't know if the archival process grabs things from surefire that
>> > aren't the surefire XML files, but we can update it to do so if it
>> > doesn't.
>> >
>> > On Mon, Nov 6, 2017 at 11:39 PM, Stack  wrote:
>> >> I see this in the 1.2 nightly just when it gives up the ghost
>> >>
>> >> [WARNING] Corrupted STDOUT by directly writing to native stream in
>> >> forked JVM 2. See FAQ web page and the dump file
>> >> /testptch/hbase/hbase-server/target/surefire-reports/2017-11
>> -06T20-11-30_219-jvmRun2.dumpstream
>> >>
>> >> .. but the pointed to dumpstream doesn't seem to be around post build.
>> >> I am looking in wrong place?
>> >>
>> >>
>> >> Thanks,
>> >>
>> >> S
>> >>
>> >>
>> >> On Mon, Nov 6, 2017 at 8:20 PM, Stack  wrote:
>> >>
>> >>> On Mon, Nov 6, 2017 at 8:35 AM, Sean Busbey 
>> wrote:
>> >>>
>>  Given that all of the old post-commit tests have been posting that
>>  they're failing to JIRAs for what looks like a month, is there any
>>  reason not to switch to the new tests that also say they're failing?
>> 
>> 
>> >>> No reason.
>> >>>
>> >>>
>> >>>
>>  The reason HBASE-18467 has been sitting on hold this whole time has
>>  been because the new nightly branch tests keep complaining about
>>  failures.
>> 
>> 
>> >>> Looking just now, it looks like killed-off test runs.
>> >>>
>> >>> +1 on move to nightlies.
>> >>>
>> >>> Can I help?
>> >>>
>> >>> Should I be able to see the machine dir when I look at nightlies
>> output?
>> >>> (Was trying to see what else is running).
>> >>>
>> >>> Thanks Sean,
>> >>> St.Ack
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>>  On Mon, Nov 6, 2017 at 10:21 AM, Sean Busbey 
>>  wrote:
>>  > It looks like old tests branch-1.2 and branch-1.3 are failing with
>>  > some maven enforcer problem that we thought we had fixed a few
>> times
>>  > before. It's probably fixable by changing the version of maven they
>>  > use, but I'd much rather any test effort go into the last mile of
>>  > getting our new nightly tests working.
>>  >
>>  > I'll start picking this up as soon as I close out HBASE-18784.
>>  >
>>  > Please consider branch-1.2 release blocked. :(
>>  >
>>  > On Mon, Nov 6, 2017 at 10:19 AM, Stack  wrote:
>>  >> Our builds seem pretty sick up on builds.apache.org even after
>> the
>>  miracle
>>  >> work by Allen W containing errant hadoop processes. Looking at
>> 1.2 and
>>  1.3,
>>  >> we don't even get off the ground. Anyone been taking a look?
>>  >>
>>  >> When I try to run the branch-1.2 and branch-1.3 unit tests
>> locally,
>>  about
>>  >> ten tests or so timeout. Have others tried branch-1 test runs
>> recently?
>>  >>
>>  >> Thanks,
>>  >> S
>>  >>
>>  >>
>> >

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-11-07 Thread Stack
On Tue, Nov 7, 2017 at 6:10 AM, Sean Busbey  wrote:

> > Should I be able to see the machine dir when I look at nightlies output?
> > (Was trying to see what else is running).
>
> Ah. we don't have the same machine sampling on nightly as we do in
> precommit. I am 80% on a patch for HBASE-19189 (run test ad-hoc
> repeatedly)  that includes pulling that information gathering into a
> place where we could also use it in nightly.
>
>
Sweet.



> Did we ever figure out how many cores we expect our tests to need? It
> looks like the Hadoop nodes have 8 cores. (with 2 executors that means
> 4 is our fair share)
>
>
At the end of the thread inquiry I suggested that we don't use enough
cores: we could up our fork counts and tests would complete in less
time. I wanted to experiment some w/ high fork counts -- 16 or so -- to see
if running concurrently brought on more failures.
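
As a rough illustration of the arithmetic above, here is a minimal Python
sketch. The 8 cores, 2 executors, and 16 forks come from this thread; the
function names are made up for illustration, and the assumption that each
forked test JVM keeps about one core busy is a simplification, not a
measurement.

```python
def fair_share(cores, executors):
    """Cores one executor can expect if the node is split evenly."""
    return cores // executors


def oversubscription(forks, cores, executors):
    """Forked JVMs per fair-share core (assumes ~1 busy core per fork)."""
    return forks / fair_share(cores, executors)


# Figures quoted in the thread: 8-core Hadoop nodes, 2 executors each.
print(fair_share(8, 2))            # 4
# The "16 or so" forks experiment, measured against that share.
print(oversubscription(16, 8, 2))  # 4.0
```

So a fork count of 16 would ask for roughly four times our nominal share of
a node, which is why it is worth checking whether it brings on more failures.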

St.Ack




> On Tue, Nov 7, 2017 at 8:05 AM, Sean Busbey  wrote:
> > surefire results get zipped up (we were filling the jenkins hosts with
> > old test logs previously) and stored in a file called "test_logs.zip"
> > for each jvm run. So if that happened in the jdk7 run for branch-1.2,
> > it'd be in artifacts -> output-jdk7 -> test_logs.zip.
> >
> > I don't know if the archival process grabs things from surefire that
> > aren't the surefire XML files, but we can update it to do so if it
> > doesn't.
> >
> > On Mon, Nov 6, 2017 at 11:39 PM, Stack  wrote:
> >> I see this in the 1.2 nightly just when it gives up the ghost
> >>
> >> [WARNING] Corrupted STDOUT by directly writing to native stream in
> >> forked JVM 2. See FAQ web page and the dump file
> >> /testptch/hbase/hbase-server/target/surefire-reports/2017-
> 11-06T20-11-30_219-jvmRun2.dumpstream
> >>
> >> .. but the pointed to dumpstream doesn't seem to be around post build.
> >> I am looking in wrong place?
> >>
> >>
> >> Thanks,
> >>
> >> S
> >>
> >>
> >> On Mon, Nov 6, 2017 at 8:20 PM, Stack  wrote:
> >>
> >>> On Mon, Nov 6, 2017 at 8:35 AM, Sean Busbey 
> wrote:
> >>>
>  Given that all of the old post-commit tests have been posting that
>  they're failing to JIRAs for what looks like a month, is there any
>  reason not to switch to the new tests that also say they're failing?
> 
> 
> >>> No reason.
> >>>
> >>>
> >>>
>  The reason HBASE-18467 has been sitting on hold this whole time has
>  been because the new nightly branch tests keep complaining about
>  failures.
> 
> 
> >>> Looking just now, it looks like killed-off test runs.
> >>>
> >>> +1 on move to nightlies.
> >>>
> >>> Can I help?
> >>>
> >>> Should I be able to see the machine dir when I look at nightlies
> output?
> >>> (Was trying to see what else is running).
> >>>
> >>> Thanks Sean,
> >>> St.Ack
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
>  On Mon, Nov 6, 2017 at 10:21 AM, Sean Busbey 
>  wrote:
>  > It looks like old tests branch-1.2 and branch-1.3 are failing with
>  > some maven enforcer problem that we thought we had fixed a few times
>  > before. It's probably fixable by changing the version of maven they
>  > use, but I'd much rather any test effort go into the last mile of
>  > getting our new nightly tests working.
>  >
>  > I'll start picking this up as soon as I close out HBASE-18784.
>  >
>  > Please consider branch-1.2 release blocked. :(
>  >
>  > On Mon, Nov 6, 2017 at 10:19 AM, Stack  wrote:
>  >> Our builds seem pretty sick up on builds.apache.org even after the
>  miracle
>  >> work by Allen W containing errant hadoop processes. Looking at 1.2
> and
>  1.3,
>  >> we don't even get off the ground. Anyone been taking a look?
>  >>
>  >> When I try to run the branch-1.2 and branch-1.3 unit tests locally,
>  about
>  >> ten tests or so timeout. Have others tried branch-1 test runs
> recently?
>  >>
>  >> Thanks,
>  >> S
>  >>
>  >>
>  >> On Mon, Aug 21, 2017 at 1:54 PM, Stack  wrote:
>  >>
>  >>> Loads of tests timing out in test runs -- then they all pass.
> Anyone
>  have
>  >>> an input? I'm trying to take a look as background task...
>  >>>
>  >>> S
>  >>>
>  >>> On Tue, Jul 11, 2017 at 7:05 PM, Stack  wrote:
>  >>>
>   Thanks Appy.
>  
>   Any one looking at the 'ERROR ExecutionException Java heap
> space...'
>   errors on patch builds or failed forking? Seems common enough.
> Here
>  are
>   complaints that remote JVM went away:
>  
>   https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-
>   HBASE-Build/7617/artifact/patchprocess/patch-unit-hbase-
> server.txt
>   https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-
>   HBASE-Build/7616/artifact/patchprocess/patch-unit-hbase-
> server.txt
>  
>   Then this succeeds
>  
>   https://builds.apache.org/view/H-L/view/HBase/job/PreCommit

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-11-07 Thread Sean Busbey
Okay, what gets saved from test runs is controlled by a parameter to
the Jenkins job called "ARCHIVE_PATTERN_LIST".

That gets used by Apache Yetus' archival feature[1]. The parameter is
essentially a comma-separated set of file name patterns to use with the
find command.

The default in the job is "TEST-*.xml,org.apache.h*.txt", which means
it will miss the .dumpstream files. We should expand it to include
'*.dumpstream'.

[1]:
http://yetus.apache.org/documentation/0.6.0/precommit-qbt/#archiving
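
To see why the default list misses the dump file, here is a minimal Python
sketch. It assumes the patterns behave like shell-style globs matched against
the bare file name (as with find -name); it is not how Yetus itself
implements archiving, and the helper name is made up for illustration. The
two pattern lists and the dumpstream file name are the ones from this thread.

```python
from fnmatch import fnmatchcase


def is_archived(filename, pattern_list):
    """True if filename matches any glob in a comma-separated pattern list."""
    patterns = [p.strip() for p in pattern_list.split(",") if p.strip()]
    return any(fnmatchcase(filename, p) for p in patterns)


DEFAULT_LIST = "TEST-*.xml,org.apache.h*.txt"
EXPANDED_LIST = DEFAULT_LIST + ",*.dumpstream"  # the proposed change
DUMP_FILE = "2017-11-06T20-11-30_219-jvmRun2.dumpstream"

print(is_archived(DUMP_FILE, DEFAULT_LIST))   # False
print(is_archived(DUMP_FILE, EXPANDED_LIST))  # True
```

With the extra '*.dumpstream' entry the surefire dump file would be picked
up alongside the XML reports.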

On Tue, Nov 7, 2017 at 8:05 AM, Sean Busbey  wrote:
> surefire results get zipped up (we were filling the jenkins hosts with
> old test logs previously) and stored in a file called "test_logs.zip"
> for each jvm run. So if that happened in the jdk7 run for branch-1.2,
> it'd be in artifacts -> output-jdk7 -> test_logs.zip.
>
> I don't know if the archival process grabs things from surefire that
> aren't the surefire XML files, but we can update it to do so if it
> doesn't.
>
> On Mon, Nov 6, 2017 at 11:39 PM, Stack  wrote:
>> I see this in the 1.2 nightly just when it gives up the ghost
>>
>> [WARNING] Corrupted STDOUT by directly writing to native stream in
>> forked JVM 2. See FAQ web page and the dump file
>> /testptch/hbase/hbase-server/target/surefire-reports/2017-11-06T20-11-30_219-jvmRun2.dumpstream
>>
>> .. but the pointed to dumpstream doesn't seem to be around post build.
>> I am looking in wrong place?
>>
>>
>> Thanks,
>>
>> S
>>
>>
>> On Mon, Nov 6, 2017 at 8:20 PM, Stack  wrote:
>>
>>> On Mon, Nov 6, 2017 at 8:35 AM, Sean Busbey  wrote:
>>>
 Given that all of the old post-commit tests have been posting that
 they're failing to JIRAs for what looks like a month, is there any
 reason not to switch to the new tests that also say they're failing?


>>> No reason.
>>>
>>>
>>>
 The reason HBASE-18467 has been sitting on hold this whole time has
 been because the new nightly branch tests keep complaining about
 failures.


>>> Looking just now, it looks like killed-off test runs.
>>>
>>> +1 on move to nightlies.
>>>
>>> Can I help?
>>>
>>> Should I be able to see the machine dir when I look at nightlies output?
>>> (Was trying to see what else is running).
>>>
>>> Thanks Sean,
>>> St.Ack
>>>
>>>
>>>
>>>
>>>
>>>
 On Mon, Nov 6, 2017 at 10:21 AM, Sean Busbey 
 wrote:
 > It looks like old tests branch-1.2 and branch-1.3 are failing with
 > some maven enforcer problem that we thought we had fixed a few times
 > before. It's probably fixable by changing the version of maven they
 > use, but I'd much rather any test effort go into the last mile of
 > getting our new nightly tests working.
 >
 > I'll start picking this up as soon as I close out HBASE-18784.
 >
 > Please consider branch-1.2 release blocked. :(
 >
 > On Mon, Nov 6, 2017 at 10:19 AM, Stack  wrote:
 >> Our builds seem pretty sick up on builds.apache.org even after the
 miracle
 >> work by Allen W containing errant hadoop processes. Looking at 1.2 and
 1.3,
 >> we don't even get off the ground. Anyone been taking a look?
 >>
 >> When I try to run the branch-1.2 and branch-1.3 unit tests locally,
 about
 >> ten tests or so timeout. Have others tried branch-1 test runs recently?
 >>
 >> Thanks,
 >> S
 >>
 >>
 >> On Mon, Aug 21, 2017 at 1:54 PM, Stack  wrote:
 >>
 >>> Loads of tests timing out in test runs -- then they all pass. Anyone
 have
 >>> an input? I'm trying to take a look as background task...
 >>>
 >>> S
 >>>
 >>> On Tue, Jul 11, 2017 at 7:05 PM, Stack  wrote:
 >>>
  Thanks Appy.
 
  Any one looking at the 'ERROR ExecutionException Java heap space...'
  errors on patch builds or failed forking? Seems common enough. Here
 are
  complaints that remote JVM went away:
 
  https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-
  HBASE-Build/7617/artifact/patchprocess/patch-unit-hbase-server.txt
  https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-
  HBASE-Build/7616/artifact/patchprocess/patch-unit-hbase-server.txt
 
  Then this succeeds
 
  https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-
  HBASE-Build/7614/artifact/patchprocess/patch-unit-hbase-server.txt
 
  And we are good for a while.
 
  Then heap issues:
 
  https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-
  HBASE-Build/7607/artifact/patchprocess/patch-unit-hbase-server.txt
 
  Are the zombies back?
 
  St.Ack
 
  On Tue, Jul 11, 2017 at 12:33 AM, Apekshit Sharma >>> >
  wrote:
 
 > Fixed 'trends' in flaky dashboard. Since i changed the test names
 in last
 > fix, the dots in the name were messing up with CSS selectors. :)

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-11-07 Thread Sean Busbey
> Should I be able to see the machine dir when I look at nightlies output?
> (Was trying to see what else is running).

Ah, we don't have the same machine sampling on nightly as we do in
precommit. I am about 80% done on a patch for HBASE-19189 (run test ad-hoc
repeatedly) that includes pulling that information gathering into a
place where we could also use it in nightly.

Did we ever figure out how many cores we expect our tests to need? It
looks like the Hadoop nodes have 8 cores. (With 2 executors, that means
4 is our fair share.)

On Tue, Nov 7, 2017 at 8:05 AM, Sean Busbey  wrote:
> surefire results get zipped up (we were filling the jenkins hosts with
> old test logs previously) and stored in a file called "test_logs.zip"
> for each jvm run. So if that happened in the jdk7 run for branch-1.2,
> it'd be in artifacts -> output-jdk7 -> test_logs.zip.
>
> I don't know if the archival process grabs things from surefire that
> aren't the surefire XML files, but we can update it to do so if it
> doesn't.
>
> On Mon, Nov 6, 2017 at 11:39 PM, Stack  wrote:
>> I see this in the 1.2 nightly just when it gives up the ghost
>>
>> [WARNING] Corrupted STDOUT by directly writing to native stream in
>> forked JVM 2. See FAQ web page and the dump file
>> /testptch/hbase/hbase-server/target/surefire-reports/2017-11-06T20-11-30_219-jvmRun2.dumpstream
>>
>> ... but the pointed-to dumpstream doesn't seem to be around post build.
>> Am I looking in the wrong place?
>>
>>
>> Thanks,
>>
>> S
>>
>>
>> On Mon, Nov 6, 2017 at 8:20 PM, Stack  wrote:
>>
>>> On Mon, Nov 6, 2017 at 8:35 AM, Sean Busbey  wrote:
>>>
>>>> Given that all of the old post-commit tests have been posting that
>>>> they're failing to JIRAs for what looks like a month, is there any
>>>> reason not to switch to the new tests that also say they're failing?
>>>>
>>>>
>>> No reason.
>>>
>>>
>>>
>>>> The reason HBASE-18467 has been sitting on hold this whole time has
>>>> been because the new nightly branch tests keep complaining about
>>>> failures.
>>>>
>>>>
>>> Looking just now, it looks like killed-off test runs.
>>>
>>> +1 on move to nightlies.
>>>
>>> Can I help?
>>>
>>> Should I be able to see the machine dir when I look at nightlies output?
>>> (Was trying to see what else is running).
>>>
>>> Thanks Sean,
>>> St.Ack

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-11-07 Thread Sean Busbey
surefire results get zipped up (we were filling the jenkins hosts with
old test logs previously) and stored in a file called "test_logs.zip"
for each jvm run. So if that happened in the jdk7 run for branch-1.2,
it'd be in artifacts -> output-jdk7 -> test_logs.zip.

I don't know if the archival process grabs things from surefire that
aren't the surefire XML files, but we can update it to do so if it
doesn't.
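
One way to check what the archival step actually captured is to list everything in the zip that is not an XML file. This is only a sketch: the function name is invented for illustration, and it assumes you have already downloaded test_logs.zip from the build artifacts to a local path.

```python
import zipfile
from typing import List


def non_xml_entries(zip_path: str) -> List[str]:
    """List archive members that are not XML files (directories skipped)."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            name
            for name in archive.namelist()
            if not name.endswith("/") and not name.lower().endswith(".xml")
        ]
```

Calling non_xml_entries("test_logs.zip") and getting an empty list back would suggest the archive holds only the surefire XML reports, so dump files would need to be added to the archival step.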

On Mon, Nov 6, 2017 at 11:39 PM, Stack  wrote:
> I see this in the 1.2 nightly just when it gives up the ghost
>
> [WARNING] Corrupted STDOUT by directly writing to native stream in
> forked JVM 2. See FAQ web page and the dump file
> /testptch/hbase/hbase-server/target/surefire-reports/2017-11-06T20-11-30_219-jvmRun2.dumpstream
>
> ... but the pointed-to dumpstream doesn't seem to be around post build.
> Am I looking in the wrong place?
>
>
> Thanks,
>
> S
>
>
> On Mon, Nov 6, 2017 at 8:20 PM, Stack  wrote:
>
>> On Mon, Nov 6, 2017 at 8:35 AM, Sean Busbey  wrote:
>>
>>> Given that all of the old post-commit tests have been posting that
>>> they're failing to JIRAs for what looks like a month, is there any
>>> reason not to switch to the new tests that also say they're failing?
>>>
>>>
>> No reason.
>>
>>
>>
>>> The reason HBASE-18467 has been sitting on hold this whole time has
>>> been because the new nightly branch tests keep complaining about
>>> failures.
>>>
>>>
>> Looking just now, it looks like killed-off test runs.
>>
>> +1 on move to nightlies.
>>
>> Can I help?
>>
>> Should I be able to see the machine dir when I look at nightlies output?
>> (Was trying to see what else is running).
>>
>> Thanks Sean,
>> St.Ack
>>
>>
>>
>>
>>
>>
>>> On Mon, Nov 6, 2017 at 10:21 AM, Sean Busbey 
>>> wrote:
>>> > It looks like old tests branch-1.2 and branch-1.3 are failing with
>>> > some maven enforcer problem that we thought we had fixed a few times
>>> > before. It's probably fixable by changing the version of maven they
>>> > use, but I'd much rather any test effort go into the last mile of
>>> > getting our new nightly tests working.
>>> >
>>> > I'll start picking this up as soon as I close out HBASE-18784.
>>> >
>>> > Please consider branch-1.2 release blocked. :(
>>> >
>>> > On Mon, Nov 6, 2017 at 10:19 AM, Stack  wrote:
>>> >> Our builds seem pretty sick up on builds.apache.org even after the miracle
>>> >> work by Allen W containing errant hadoop processes. Looking at 1.2 and 1.3,
>>> >> we don't even get off the ground. Anyone been taking a look?
>>> >>
>>> >> When I try to run the branch-1.2 and branch-1.3 unit tests locally, about
>>> >> ten tests or so time out. Have others tried branch-1 test runs recently?
>>> >>
>>> >> Thanks,
>>> >> S
>>> >>
>>> >>
>>> >> On Mon, Aug 21, 2017 at 1:54 PM, Stack  wrote:
>>> >>
>>> >>> Loads of tests timing out in test runs -- then they all pass. Anyone have
>>> >>> any input? I'm trying to take a look as a background task...
>>> >>>
>>> >>> S

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-11-06 Thread Stack
I see this in the 1.2 nightly just when it gives up the ghost

[WARNING] Corrupted STDOUT by directly writing to native stream in
forked JVM 2. See FAQ web page and the dump file
/testptch/hbase/hbase-server/target/surefire-reports/2017-11-06T20-11-30_219-jvmRun2.dumpstream

... but the pointed-to dumpstream doesn't seem to be around post build.
Am I looking in the wrong place?
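
For anyone with access to a workspace that has not been cleaned up (a local checkout, say), a sketch like the one below could confirm whether the dump file is written at all. The function name and the workspace argument are assumptions for illustration; only the surefire-reports directory and the .dumpstream suffix come from the warning above.

```python
from pathlib import Path
from typing import List


def find_dumpstreams(workspace: str) -> List[Path]:
    """Return every *.dumpstream file under any surefire-reports directory."""
    root = Path(workspace)
    return sorted(
        path
        for path in root.rglob("*.dumpstream")
        if path.is_file() and "surefire-reports" in path.parts
    )
```

If this finds the file locally but it is missing from the Jenkins artifacts, the gap is more likely in what gets archived than in what surefire writes.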


Thanks,

S


On Mon, Nov 6, 2017 at 8:20 PM, Stack  wrote:

> On Mon, Nov 6, 2017 at 8:35 AM, Sean Busbey  wrote:
>
>> Given that all of the old post-commit tests have been posting that
>> they're failing to JIRAs for what looks like a month, is there any
>> reason not to switch to the new tests that also say they're failing?
>>
>>
> No reason.
>
>
>
>> The reason HBASE-18467 has been sitting on hold this whole time has
>> been because the new nightly branch tests keep complaining about
>> failures.
>>
>>
> Looking just now, it looks like killed-off test runs.
>
> +1 on move to nightlies.
>
> Can I help?
>
> Should I be able to see the machine dir when I look at nightlies output?
> (Was trying to see what else is running).
>
> Thanks Sean,
> St.Ack
>
>
>
>
>
>
>> On Mon, Nov 6, 2017 at 10:21 AM, Sean Busbey 
>> wrote:
>> > It looks like old tests branch-1.2 and branch-1.3 are failing with
>> > some maven enforcer problem that we thought we had fixed a few times
>> > before. It's probably fixable by changing the version of maven they
>> > use, but I'd much rather any test effort go into the last mile of
>> > getting our new nightly tests working.
>> >
>> > I'll start picking this up as soon as I close out HBASE-18784.
>> >
>> > Please consider branch-1.2 release blocked. :(
>> >
>> > On Mon, Nov 6, 2017 at 10:19 AM, Stack  wrote:
>> >> Our builds seem pretty sick up on builds.apache.org even after the miracle
>> >> work by Allen W containing errant hadoop processes. Looking at 1.2 and 1.3,
>> >> we don't even get off the ground. Anyone been taking a look?
>> >>
>> >> When I try to run the branch-1.2 and branch-1.3 unit tests locally, about
>> >> ten tests or so time out. Have others tried branch-1 test runs recently?
>> >>
>> >> Thanks,
>> >> S
>> >>
>> >>
>> >> On Mon, Aug 21, 2017 at 1:54 PM, Stack  wrote:
>> >>
>> >>> Loads of tests timing out in test runs -- then they all pass. Anyone have
>> >>> any input? I'm trying to take a look as a background task...
>> >>>
>> >>> S

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-11-06 Thread Stack
On Mon, Nov 6, 2017 at 8:35 AM, Sean Busbey  wrote:

> Given that all of the old post-commit tests have been posting that
> they're failing to JIRAs for what looks like a month, is there any
> reason not to switch to the new tests that also say they're failing?
>
>
No reason.



> The reason HBASE-18467 has been sitting on hold this whole time has
> been because the new nightly branch tests keep complaining about
> failures.
>
>
Looking just now, it looks like killed-off test runs.

+1 on move to nightlies.

Can I help?

Should I be able to see the machine dir when I look at nightlies output?
(Was trying to see what else is running).

Thanks Sean,
St.Ack






> On Mon, Nov 6, 2017 at 10:21 AM, Sean Busbey 
> wrote:
> > It looks like old tests branch-1.2 and branch-1.3 are failing with
> > some maven enforcer problem that we thought we had fixed a few times
> > before. It's probably fixable by changing the version of maven they
> > use, but I'd much rather any test effort go into the last mile of
> > getting our new nightly tests working.
> >
> > I'll start picking this up as soon as I close out HBASE-18784.
> >
> > Please consider branch-1.2 release blocked. :(
> >
> > On Mon, Nov 6, 2017 at 10:19 AM, Stack  wrote:
> >> Our builds seem pretty sick up on builds.apache.org even after the miracle
> >> work by Allen W containing errant hadoop processes. Looking at 1.2 and 1.3,
> >> we don't even get off the ground. Anyone been taking a look?
> >>
> >> When I try to run the branch-1.2 and branch-1.3 unit tests locally, about
> >> ten tests or so time out. Have others tried branch-1 test runs recently?
> >>
> >> Thanks,
> >> S
> >>
> >>
> >> On Mon, Aug 21, 2017 at 1:54 PM, Stack  wrote:
> >>
> >>> Loads of tests timing out in test runs -- then they all pass. Anyone have
> >>> any input? I'm trying to take a look as a background task...
> >>>
> >>> S

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-11-06 Thread Sean Busbey
Given that all of the old post-commit tests have been posting that
they're failing to JIRAs for what looks like a month, is there any
reason not to switch to the new tests that also say they're failing?

The reason HBASE-18467 has been sitting on hold this whole time has
been because the new nightly branch tests keep complaining about
failures.

On Mon, Nov 6, 2017 at 10:21 AM, Sean Busbey  wrote:
> It looks like old tests branch-1.2 and branch-1.3 are failing with
> some maven enforcer problem that we thought we had fixed a few times
> before. It's probably fixable by changing the version of maven they
> use, but I'd much rather any test effort go into the last mile of
> getting our new nightly tests working.
>
> I'll start picking this up as soon as I close out HBASE-18784.
>
> Please consider branch-1.2 release blocked. :(
>
> On Mon, Nov 6, 2017 at 10:19 AM, Stack  wrote:
>> Our builds seem pretty sick up on builds.apache.org even after the miracle
>> work by Allen W containing errant hadoop processes. Looking at 1.2 and 1.3,
>> we don't even get off the ground. Anyone been taking a look?
>>
>> When I try to run the branch-1.2 and branch-1.3 unit tests locally, about
>> ten tests or so time out. Have others tried branch-1 test runs recently?
>>
>> Thanks,
>> S
>>
>>
>> On Mon, Aug 21, 2017 at 1:54 PM, Stack  wrote:
>>
>>> Loads of tests timing out in test runs -- then they all pass. Anyone have
>>> any input? I'm trying to take a look as a background task...
>>>
>>> S

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-11-06 Thread Sean Busbey
It looks like old tests branch-1.2 and branch-1.3 are failing with
some maven enforcer problem that we thought we had fixed a few times
before. It's probably fixable by changing the version of maven they
use, but I'd much rather any test effort go into the last mile of
getting our new nightly tests working.

I'll start picking this up as soon as I close out HBASE-18784.

Please consider branch-1.2 release blocked. :(

On Mon, Nov 6, 2017 at 10:19 AM, Stack  wrote:
> Our builds seem pretty sick up on builds.apache.org even after the miracle
> work by Allen W containing errant hadoop processes. Looking at 1.2 and 1.3,
> we don't even get off the ground. Anyone been taking a look?
>
> When I try to run the branch-1.2 and branch-1.3 unit tests locally, about
> ten tests or so time out. Have others tried branch-1 test runs recently?
>
> Thanks,
> S
>
>
> On Mon, Aug 21, 2017 at 1:54 PM, Stack  wrote:
>
>> Loads of tests timing out in test runs -- then they all pass. Anyone have
>> any input? I'm trying to take a look as a background task...
>>
>> S
>>
>> On Tue, Jul 11, 2017 at 7:05 PM, Stack  wrote:
>>
>>> Thanks Appy.
>>>
>>> Any one looking at the 'ERROR ExecutionException Java heap space...'
>>> errors on patch builds or failed forking? Seems common enough. Here are
>>> complaints that remote JVM went away:
>>>
>>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7617/artifact/patchprocess/patch-unit-hbase-server.txt
>>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7616/artifact/patchprocess/patch-unit-hbase-server.txt
>>>
>>> Then this succeeds
>>>
>>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7614/artifact/patchprocess/patch-unit-hbase-server.txt
>>>
>>> And we are good for a while.
>>>
>>> Then heap issues:
>>>
>>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7607/artifact/patchprocess/patch-unit-hbase-server.txt
>>>
>>> Are the zombies back?
>>>
>>> St.Ack

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-11-06 Thread Stack
Our builds seem pretty sick up on builds.apache.org even after the miracle
work by Allen W containing errant hadoop processes. Looking at 1.2 and 1.3,
we don't even get off the ground. Anyone been taking a look?

When I try to run the branch-1.2 and branch-1.3 unit tests locally, about
ten tests or so time out. Have others tried branch-1 test runs recently?

Thanks,
S


On Mon, Aug 21, 2017 at 1:54 PM, Stack  wrote:

> Loads of tests timing out in test runs -- then they all pass. Anyone have
> any input? I'm trying to take a look as a background task...
>
> S
>
> On Tue, Jul 11, 2017 at 7:05 PM, Stack  wrote:
>
>> Thanks Appy.
>>
>> Any one looking at the 'ERROR ExecutionException Java heap space...'
>> errors on patch builds or failed forking? Seems common enough. Here are
>> complaints that remote JVM went away:
>>
>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7617/artifact/patchprocess/patch-unit-hbase-server.txt
>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7616/artifact/patchprocess/patch-unit-hbase-server.txt
>>
>> Then this succeeds
>>
>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7614/artifact/patchprocess/patch-unit-hbase-server.txt
>>
>> And we are good for a while.
>>
>> Then heap issues:
>>
>> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7607/artifact/patchprocess/patch-unit-hbase-server.txt
>>
>> Are the zombies back?
>>
>> St.Ack
>>
>> On Tue, Jul 11, 2017 at 12:33 AM, Apekshit Sharma 
>> wrote:
>>
>>> Fixed 'trends' in flaky dashboard. Since i changed the test names in last
>>> fix, the dots in the name were messing up with CSS selectors. :)
>>>
>>>
>>> On Mon, Jul 10, 2017 at 11:34 AM, Apekshit Sharma 
>>> wrote:
>>>
>>> > Quick update on flaky dashboard:
>>> > Flaky dashboard wasn't working earlier because our trunk build was broken.
>>> > After trunk was fixed, the format of log lines in consoleText was not the
>>> > same, so findHangingTests.py was not able to parse it correctly for
>>> > broken/hanging/timeout tests. That's been fixed now (HBASE-18341).
>>> > Drob brought up in other thread that 'trends' isn't working. It's probably
>>> > because I changed test names (which are used as keys in python dicts) from
>>> > just class name to package name + class name (without the common
>>> > org.apache.hadoop.hbase prefix). I had to do it because we have some tests
>>> > with the same class name but in different packages.
>>> >
>>> > I'll take a look at it sometime this week (unless someone wants to take it
>>> > up and work on this beautiful piece of infra ;) )
>>> >
>>> >
>>> > On Thu, Jul 6, 2017 at 11:25 PM, Stack  wrote:
>>> >
>>> >> On Thu, Jul 6, 2017 at 3:45 PM, Sean Busbey wrote:
>>> >>
>>> >> > that sounds like our project structure is broken. Please make sure there's
>>> >> > a jira that tracks it and I'll take a look later.
>>> >> >
>>> >> >
>>> >>
>>> >> Filed HBASE-18331 for now.
>>> >>
>>> >> I can take a look too later.
>>> >>
>>> >> St.Ack
>>> >>
>>> >>
>>> >>
>>> >> > On Thu, Jul 6, 2017 at 6:15 PM, Stack  wrote:
>>> >> >
>>> >> > > I tried publishing hbase-3.0.0-SNAPSHOT... so hbase-checkstyle was up in
>>> >> > > repo (presuming it relied on an aged-out snapshot). Seems to have 'fixed'
>>> >> > > it for now
>>> >> > >
>>> >> > > St.Ack
>>> >> > >
>>> >> > > On Thu, Jul 6, 2017 at 12:50 PM, Stack  wrote:
>>> >> > >
>>> >> > > > The 3.0.0-SNAPSHOT looks suspicious ... the hbase version
>>> >> > > > St.Ack
>>> >> > > >
>>> >> > > > On Thu, Jul 6, 2017 at 12:49 PM, Stack wrote:
>>> >> > > >
>>> >> > > >> On Thu, Jul 6, 2017 at 12:48 PM, Stack wrote:
>>> >> > > >>
>>> >> > > >>> Checkstyle is currently broke on our builds... looking.
>>> >> > > >>> St.Ack
>>> >> > > >>>
>>> >> > > >>>
>>> >> > > >> Works if I run it locally (of course)
>>> >> > > >> St.Ack
>>> >> > > >>
>>> >> > > >>
>>> >> > > >>
>>> >> > > >>
>>> >> > > >>>
>>> >> > > >>>
>>> >> > > >>> [ERROR] Failed to execute goal
>>> >> > > >>> org.apache.maven.plugins:maven-checkstyle-plugin:2.17:checkstyle
>>> >> > > >>> (default-cli) on project hbase: Execution default-cli of goal
>>> >> > > >>> org.apache.maven.plugins:maven-checkstyle-plugin:2.17:checkstyle failed:
>>> >> > > >>> Plugin org.apache.maven.plugins:maven-checkstyle-plugin:2.17 or one of its
>>> >> > > >>> dependencies could not be resolved: Could not find artifact
>>> >> > > >>> org.apache.hbase:hbase-checkstyle:jar:3.0.0-SNAPSHOT in Nexus
>>> >> > > >>> (http://repository.apache.org/snapshots) -> [Help 1]
>>> >> > > >>> [ERROR]
>>> >> > > >>> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
>>> >> > > >>> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
>>> >> > > >>> [ERROR]
>>> >> > > >>> [ERROR] For more information about the errors and possible solutions,
>>> >> > > >>> please rea

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-08-21 Thread Stack
Loads of tests timing out in test runs -- then they all pass. Anyone have
any input? I'm trying to take a look as a background task...

S

On Tue, Jul 11, 2017 at 7:05 PM, Stack  wrote:

> Thanks Appy.
>
> Any one looking at the 'ERROR ExecutionException Java heap space...'
> errors on patch builds or failed forking? Seems common enough. Here are
> complaints that remote JVM went away:
>
> https://builds.apache.org/view/H-L/view/HBase/job/
> PreCommit-HBASE-Build/7617/artifact/patchprocess/patch-
> unit-hbase-server.txt
> https://builds.apache.org/view/H-L/view/HBase/job/
> PreCommit-HBASE-Build/7616/artifact/patchprocess/patch-
> unit-hbase-server.txt
>
> Then this succeeds
>
> https://builds.apache.org/view/H-L/view/HBase/job/
> PreCommit-HBASE-Build/7614/artifact/patchprocess/patch-
> unit-hbase-server.txt
>
> And we are good for a while.
>
> Then heap issues:
>
> https://builds.apache.org/view/H-L/view/HBase/job/
> PreCommit-HBASE-Build/7607/artifact/patchprocess/patch-
> unit-hbase-server.txt
>
> Are the zombies back?
>
> St.Ack
>
> On Tue, Jul 11, 2017 at 12:33 AM, Apekshit Sharma 
> wrote:
>
>> Fixed 'trends' in flaky dashboard. Since i changed the test names in last
>> fix, the dots in the name were messing up with CSS selectors. :)
>>
>>
>> On Mon, Jul 10, 2017 at 11:34 AM, Apekshit Sharma 
>> wrote:
>>
>> > Quick update on flaky dashboard:
>> > Flaky dashboard wasn't working earlier because our trunk build was
>> broken.
>> > After trunk was fixed, the format of log lines in consoleText was not
>> the
>> > same, so findHangingTests.py was not able to parse it correctly for
>> > broken/hanging/timeout tests. That's been fixed now HBASE-18341
>> > .
>> > Drob brought up in other thread that 'treads' isn't working. It's
>> probably
>> > because i changed tests names (which are used as keys in python dicts)
>> from
>> > just class name to package name+classname (without common
>> > org.apache.hadoop.hbase prefix). I had to do it because we have some
>> tests
>> > with same class name but in different packages.
>> >
>> > I'll take a look at it sometime this week (unless someone wants to take
>> it
>> > up and work on this beautiful piece of infra ;) )
>> >
>> >
>> > On Thu, Jul 6, 2017 at 11:25 PM, Stack  wrote:
>> >
>> >> On Thu, Jul 6, 2017 at 3:45 PM, Sean Busbey  wrote:
>> >>
>> >> > that sounds like our project structure is broken. Please make sure
>> >> there's
>> >> > a jira that tracks it and I'll take a look later.
>> >> >
>> >> >
>> >>
>> >> Filed HBASE-18331 for now.
>> >>
>> >> I can take a look too later.
>> >>
>> >> St.Ack
>> >>
>> >>
>> >>
>> >> > On Thu, Jul 6, 2017 at 6:15 PM, Stack  wrote:
>> >> >
>> >> > > I tried publishing hbase-3.0.0-SNAPSHOT... so hbase-checkstyle was
>> up
>> >> in
>> >> > > repo (presuming it relied on an aged-out snapshot). Seems to have
>> >> 'fixed'
>> >> > > it for now
>> >> > >
>> >> > > St.Ack
>> >> > >
>> >> > > On Thu, Jul 6, 2017 at 12:50 PM, Stack  wrote:
>> >> > >
>> >> > > > The 3.0.0-SNAPSHOT looks suspicious ... the hbase version
>> >> > > > St.Ack
>> >> > > >
>> >> > > > On Thu, Jul 6, 2017 at 12:49 PM, Stack  wrote:
>> >> > > >
>> >> > > >> On Thu, Jul 6, 2017 at 12:48 PM, Stack 
>> wrote:
>> >> > > >>
>> >> > > >>> Checkstyle is currently broke on our builds... looking.
>> >> > > >>> St.Ack
>> >> > > >>>
>> >> > > >>>
>> >> > > >> Works if I run it locally (of course)
>> >> > > >> St.Ack
>> >> > > >>
>> >> > > >>
>> >> > > >>
>> >> > > >>
>> >> > > >>>
>> >> > > >>>
>> >> > > >>> [ERROR] Failed to execute goal org.apache.maven.plugins:
>> >> > > maven-checkstyle-plugin:2.17:checkstyle (default-cli) on project
>> >> hbase:
>> >> > > Execution default-cli of goal org.apache.maven.plugins:
>> >> > > maven-checkstyle-plugin:2.17:checkstyle failed: Plugin
>> >> > > org.apache.maven.plugins:maven-checkstyle-plugin:2.17 or one of
>> its
>> >> > > dependencies could not be resolved: Could not find artifact
>> >> > > org.apache.hbase:hbase-checkstyle:jar:3.0.0-SNAPSHOT in Nexus (
>> >> > > http://repository.apache.org/snapshots) -> [Help 1][ERROR]
>> [ERROR] To
>> >> > see
>> >> > > the full stack trace of the errors, re-run Maven with the -e
>> >> > switch.[ERROR]
>> >> > > Re-run Maven using the -X switch to enable full debug
>> logging.[ERROR]
>> >> > > [ERROR] For more information about the errors and possible
>> solutions,
>> >> > > please read the following articles:[ERROR] [Help 1]
>> >> > > http://cwiki.apache.org/confluence/display/MAVEN/
>> >> > > PluginResolutionExceptionBuild step 'Invoke top-level Maven
>> targets'
>> >> > > marked build as failure
>> >> > > >>> Performing Post build task...
>> >> > > >>> Match found for :.* : True
>> >> > > >>> Logical operation result is TRUE
>> >> > > >>> Running script  : # Run zombie detector script
>> >> > > >>> ./dev-support/zombie-detector.sh --jenkins ${BUILD_ID}
>> >> > > >>> [a3159d73] $ /bin/bash -xe /tmp/hudson1697041977582083402.sh
>

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-07-11 Thread Stack
Thanks Appy.

Anyone looking at the 'ERROR ExecutionException Java heap space...' errors
on patch builds, or at the failed forking? Seems common enough. Here are
complaints that the remote JVM went away:

https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7617/artifact/patchprocess/patch-unit-hbase-server.txt
https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7616/artifact/patchprocess/patch-unit-hbase-server.txt

Then this succeeds

https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7614/artifact/patchprocess/patch-unit-hbase-server.txt

And we are good for a while.

Then heap issues:

https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/7607/artifact/patchprocess/patch-unit-hbase-server.txt

Are the zombies back?

St.Ack

On Tue, Jul 11, 2017 at 12:33 AM, Apekshit Sharma  wrote:

> Fixed 'trends' in flaky dashboard. Since i changed the test names in last
> fix, the dots in the name were messing up with CSS selectors. :)
>
>
> On Mon, Jul 10, 2017 at 11:34 AM, Apekshit Sharma 
> wrote:
>
> > Quick update on flaky dashboard:
> > Flaky dashboard wasn't working earlier because our trunk build was
> broken.
> > After trunk was fixed, the format of log lines in consoleText was not the
> > same, so findHangingTests.py was not able to parse it correctly for
> > broken/hanging/timeout tests. That's been fixed now HBASE-18341
> > .
> > Drob brought up in other thread that 'treads' isn't working. It's
> probably
> > because i changed tests names (which are used as keys in python dicts)
> from
> > just class name to package name+classname (without common
> > org.apache.hadoop.hbase prefix). I had to do it because we have some
> tests
> > with same class name but in different packages.
> >
> > I'll take a look at it sometime this week (unless someone wants to take
> it
> > up and work on this beautiful piece of infra ;) )
> >
> >
> > On Thu, Jul 6, 2017 at 11:25 PM, Stack  wrote:
> >
> >> On Thu, Jul 6, 2017 at 3:45 PM, Sean Busbey  wrote:
> >>
> >> > that sounds like our project structure is broken. Please make sure
> >> there's
> >> > a jira that tracks it and I'll take a look later.
> >> >
> >> >
> >>
> >> Filed HBASE-18331 for now.
> >>
> >> I can take a look too later.
> >>
> >> St.Ack
> >>
> >>
> >>
> >> > On Thu, Jul 6, 2017 at 6:15 PM, Stack  wrote:
> >> >
> >> > > I tried publishing hbase-3.0.0-SNAPSHOT... so hbase-checkstyle was
> up
> >> in
> >> > > repo (presuming it relied on an aged-out snapshot). Seems to have
> >> 'fixed'
> >> > > it for now
> >> > >
> >> > > St.Ack
> >> > >
> >> > > On Thu, Jul 6, 2017 at 12:50 PM, Stack  wrote:
> >> > >
> >> > > > The 3.0.0-SNAPSHOT looks suspicious ... the hbase version
> >> > > > St.Ack
> >> > > >
> >> > > > On Thu, Jul 6, 2017 at 12:49 PM, Stack  wrote:
> >> > > >
> >> > > >> On Thu, Jul 6, 2017 at 12:48 PM, Stack  wrote:
> >> > > >>
> >> > > >>> Checkstyle is currently broke on our builds... looking.
> >> > > >>> St.Ack
> >> > > >>>
> >> > > >>>
> >> > > >> Works if I run it locally (of course)
> >> > > >> St.Ack
> >> > > >>
> >> > > >>
> >> > > >>
> >> > > >>
> >> > > >>>
> >> > > >>>
> >> > > >>> [ERROR] Failed to execute goal org.apache.maven.plugins:
> >> > > maven-checkstyle-plugin:2.17:checkstyle (default-cli) on project
> >> hbase:
> >> > > Execution default-cli of goal org.apache.maven.plugins:
> >> > > maven-checkstyle-plugin:2.17:checkstyle failed: Plugin
> >> > > org.apache.maven.plugins:maven-checkstyle-plugin:2.17 or one of its
> >> > > dependencies could not be resolved: Could not find artifact
> >> > > org.apache.hbase:hbase-checkstyle:jar:3.0.0-SNAPSHOT in Nexus (
> >> > > http://repository.apache.org/snapshots) -> [Help 1][ERROR] [ERROR]
> To
> >> > see
> >> > > the full stack trace of the errors, re-run Maven with the -e
> >> > switch.[ERROR]
> >> > > Re-run Maven using the -X switch to enable full debug
> logging.[ERROR]
> >> > > [ERROR] For more information about the errors and possible
> solutions,
> >> > > please read the following articles:[ERROR] [Help 1]
> >> > > http://cwiki.apache.org/confluence/display/MAVEN/
> >> > > PluginResolutionExceptionBuild step 'Invoke top-level Maven targets'
> >> > > marked build as failure
> >> > > >>> Performing Post build task...
> >> > > >>> Match found for :.* : True
> >> > > >>> Logical operation result is TRUE
> >> > > >>> Running script  : # Run zombie detector script
> >> > > >>> ./dev-support/zombie-detector.sh --jenkins ${BUILD_ID}
> >> > > >>> [a3159d73] $ /bin/bash -xe /tmp/hudson1697041977582083402.sh
> >> > > >>> + ./dev-support/zombie-detector.sh --jenkins 3320
> >> > > >>> Thu Jul  6 01:37:09 UTC 2017 We're ok: there is no zombie test
> >> > > >>>
> >> > > >>>
> >> > > >>>
> >> > > >>>
> >> > > >>> On Fri, Jun 30, 2017 at 2:43 PM, Sean Busbey  >
> >> > > wrote:
> >> > > >>>
> >> > >  jacoco was added ages ago. I'd guess that something changed on
> >> the

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-07-10 Thread Apekshit Sharma
Fixed 'trends' in the flaky dashboard. Since I changed the test names in the
last fix, the dots in the names were messing up the CSS selectors. :)
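
For readers wondering why a dot breaks things: in a CSS selector such as
`#master.TestExample`, the dot starts a class selector, so the browser looks
for an element with id `master` and class `TestExample`. Below is a minimal
sketch of one way to avoid that, by mapping names to selector-safe ids. It is
not the actual dashboard fix (which is not shown in this thread); the function
name `css_safe_id` and the test name are hypothetical.

```python
import re


def css_safe_id(test_name: str) -> str:
    """Map a dotted test name to a string that is safe to use as an HTML id."""
    # Replace every character outside [A-Za-z0-9_-] (including dots) with "_".
    safe = re.sub(r"[^A-Za-z0-9_-]", "_", test_name)
    # A CSS identifier cannot start with a digit, so add a prefix in that case.
    # (Simplified: other edge cases of the CSS identifier grammar are ignored.)
    if safe[:1].isdigit():
        safe = "t_" + safe
    return safe


# Hypothetical package-qualified test name, as described in the message above.
print(css_safe_id("master.TestExample"))
```

The alternative is to keep the dots and escape them in the selector itself;
rewriting the id just keeps the selector strings simple.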


On Mon, Jul 10, 2017 at 11:34 AM, Apekshit Sharma  wrote:

> Quick update on flaky dashboard:
> Flaky dashboard wasn't working earlier because our trunk build was broken.
> After trunk was fixed, the format of log lines in consoleText was not the
> same, so findHangingTests.py was not able to parse it correctly for
> broken/hanging/timeout tests. That's been fixed now HBASE-18341
> .
> Drob brought up in other thread that 'treads' isn't working. It's probably
> because i changed tests names (which are used as keys in python dicts) from
> just class name to package name+classname (without common
> org.apache.hadoop.hbase prefix). I had to do it because we have some tests
> with same class name but in different packages.
>
> I'll take a look at it sometime this week (unless someone wants to take it
> up and work on this beautiful piece of infra ;) )
>
>
> On Thu, Jul 6, 2017 at 11:25 PM, Stack  wrote:
>
>> On Thu, Jul 6, 2017 at 3:45 PM, Sean Busbey  wrote:
>>
>> > that sounds like our project structure is broken. Please make sure
>> there's
>> > a jira that tracks it and I'll take a look later.
>> >
>> >
>>
>> Filed HBASE-18331 for now.
>>
>> I can take a look too later.
>>
>> St.Ack
>>
>>
>>
>> > On Thu, Jul 6, 2017 at 6:15 PM, Stack  wrote:
>> >
>> > > I tried publishing hbase-3.0.0-SNAPSHOT... so hbase-checkstyle was up
>> in
>> > > repo (presuming it relied on an aged-out snapshot). Seems to have
>> 'fixed'
>> > > it for now
>> > >
>> > > St.Ack
>> > >
>> > > On Thu, Jul 6, 2017 at 12:50 PM, Stack  wrote:
>> > >
>> > > > The 3.0.0-SNAPSHOT looks suspicious ... the hbase version
>> > > > St.Ack
>> > > >
>> > > > On Thu, Jul 6, 2017 at 12:49 PM, Stack  wrote:
>> > > >
>> > > >> On Thu, Jul 6, 2017 at 12:48 PM, Stack  wrote:
>> > > >>
>> > > >>> Checkstyle is currently broke on our builds... looking.
>> > > >>> St.Ack
>> > > >>>
>> > > >>>
>> > > >> Works if I run it locally (of course)
>> > > >> St.Ack
>> > > >>
>> > > >>
>> > > >>
>> > > >>
>> > > >>>
>> > > >>>
>> > > >>> [ERROR] Failed to execute goal org.apache.maven.plugins:
>> > > maven-checkstyle-plugin:2.17:checkstyle (default-cli) on project
>> hbase:
>> > > Execution default-cli of goal org.apache.maven.plugins:
>> > > maven-checkstyle-plugin:2.17:checkstyle failed: Plugin
>> > > org.apache.maven.plugins:maven-checkstyle-plugin:2.17 or one of its
>> > > dependencies could not be resolved: Could not find artifact
>> > > org.apache.hbase:hbase-checkstyle:jar:3.0.0-SNAPSHOT in Nexus (
>> > > http://repository.apache.org/snapshots) -> [Help 1][ERROR] [ERROR] To
>> > see
>> > > the full stack trace of the errors, re-run Maven with the -e
>> > switch.[ERROR]
>> > > Re-run Maven using the -X switch to enable full debug logging.[ERROR]
>> > > [ERROR] For more information about the errors and possible solutions,
>> > > please read the following articles:[ERROR] [Help 1]
>> > > http://cwiki.apache.org/confluence/display/MAVEN/
>> > > PluginResolutionExceptionBuild step 'Invoke top-level Maven targets'
>> > > marked build as failure
>> > > >>> Performing Post build task...
>> > > >>> Match found for :.* : True
>> > > >>> Logical operation result is TRUE
>> > > >>> Running script  : # Run zombie detector script
>> > > >>> ./dev-support/zombie-detector.sh --jenkins ${BUILD_ID}
>> > > >>> [a3159d73] $ /bin/bash -xe /tmp/hudson1697041977582083402.sh
>> > > >>> + ./dev-support/zombie-detector.sh --jenkins 3320
>> > > >>> Thu Jul  6 01:37:09 UTC 2017 We're ok: there is no zombie test
>> > > >>>
>> > > >>>
>> > > >>>
>> > > >>>
>> > > >>> On Fri, Jun 30, 2017 at 2:43 PM, Sean Busbey 
>> > > wrote:
>> > > >>>
>> > >  jacoco was added ages ago. I'd guess that something changed on
>> the
>> > >  machines
>> > >  we use to cause it to stop working.
>> > > 
>> > >  On Thu, Jun 29, 2017 at 12:02 PM, Stack 
>> wrote:
>> > > 
>> > >  > On Wed, Jun 28, 2017 at 8:43 AM, Josh Elser > >
>> > >  wrote:
>> > >  >
>> > >  > >
>> > >  > >
>> > >  > > On 6/27/17 7:20 PM, Stack wrote:
>> > >  > >
>> > >  > >> * test-patch's whitespace plugin can configured to ignore
>> some
>> > >  files
>> > >  > (but
>> > >  > >>> I
>> > >  > >>> can't think of any we'd care to so whitelist)
>> > >  > >>>
>> > >  > >>> Generated files.
>> > >  > >>
>> > >  > >
>> > >  > > Oh my goodness, yes, please. This has been such a pain in the
>> > rear
>> > >  for me
>> > >  > > as I've been rebasing space quota patches. Sometimes, the
>> spaces
>> > > in
>> > >  > > pb-gen'ed code are removed by folks before commit, other
>> times
>> > > they
>> > >  > aren't.
>> > >  > >
>> > >  >
>> > >  > Agree sir. Its a distraction at least.
>> > >  >
>> > >  > I see Jacoco report here now:
>> 

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-07-10 Thread Apekshit Sharma
Quick update on the flaky dashboard:
The flaky dashboard wasn't working earlier because our trunk build was broken.
After trunk was fixed, the format of the log lines in consoleText was no longer
the same, so findHangingTests.py was not able to parse it correctly for
broken/hanging/timeout tests. That's been fixed now (HBASE-18341).
Drob brought up in the other thread that 'trends' isn't working. It's probably
because I changed the test names (which are used as keys in Python dicts) from
just the class name to package name + class name (without the common
org.apache.hadoop.hbase prefix). I had to do it because we have some tests
with the same class name but in different packages.

I'll take a look at it sometime this week (unless someone wants to take it
up and work on this beautiful piece of infra ;) )
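
To illustrate the key change described above, here is a small sketch of
building dict keys from package name + class name with the common prefix
stripped. It is an illustration only, not the code in findHangingTests.py; the
helper name `dashboard_key` and the test class names are made up.

```python
COMMON_PREFIX = "org.apache.hadoop.hbase."


def dashboard_key(fully_qualified_name: str) -> str:
    """Return package + class name without the common HBase package prefix."""
    if fully_qualified_name.startswith(COMMON_PREFIX):
        return fully_qualified_name[len(COMMON_PREFIX):]
    return fully_qualified_name


# Two hypothetical tests sharing a class name but living in different packages.
names = [
    "org.apache.hadoop.hbase.master.TestExample",
    "org.apache.hadoop.hbase.client.TestExample",
]

results = {}
for name in names:
    # Keyed by class name alone, both entries would collapse into "TestExample".
    results.setdefault(dashboard_key(name), []).append("timeout")

print(sorted(results))
```

With keys like `master.TestExample` the two tests stay distinct, at the cost
of keys that now contain dots.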


On Thu, Jul 6, 2017 at 11:25 PM, Stack  wrote:

> On Thu, Jul 6, 2017 at 3:45 PM, Sean Busbey  wrote:
>
> > that sounds like our project structure is broken. Please make sure
> there's
> > a jira that tracks it and I'll take a look later.
> >
> >
>
> Filed HBASE-18331 for now.
>
> I can take a look too later.
>
> St.Ack
>
>
>
> > On Thu, Jul 6, 2017 at 6:15 PM, Stack  wrote:
> >
> > > I tried publishing hbase-3.0.0-SNAPSHOT... so hbase-checkstyle was up
> in
> > > repo (presuming it relied on an aged-out snapshot). Seems to have
> 'fixed'
> > > it for now
> > >
> > > St.Ack
> > >
> > > On Thu, Jul 6, 2017 at 12:50 PM, Stack  wrote:
> > >
> > > > The 3.0.0-SNAPSHOT looks suspicious ... the hbase version
> > > > St.Ack
> > > >
> > > > On Thu, Jul 6, 2017 at 12:49 PM, Stack  wrote:
> > > >
> > > >> On Thu, Jul 6, 2017 at 12:48 PM, Stack  wrote:
> > > >>
> > > >>> Checkstyle is currently broke on our builds... looking.
> > > >>> St.Ack
> > > >>>
> > > >>>
> > > >> Works if I run it locally (of course)
> > > >> St.Ack
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>>
> > > >>>
> > > >>> [ERROR] Failed to execute goal org.apache.maven.plugins:
> > > maven-checkstyle-plugin:2.17:checkstyle (default-cli) on project
> hbase:
> > > Execution default-cli of goal org.apache.maven.plugins:
> > > maven-checkstyle-plugin:2.17:checkstyle failed: Plugin
> > > org.apache.maven.plugins:maven-checkstyle-plugin:2.17 or one of its
> > > dependencies could not be resolved: Could not find artifact
> > > org.apache.hbase:hbase-checkstyle:jar:3.0.0-SNAPSHOT in Nexus (
> > > http://repository.apache.org/snapshots) -> [Help 1][ERROR] [ERROR] To
> > see
> > > the full stack trace of the errors, re-run Maven with the -e
> > switch.[ERROR]
> > > Re-run Maven using the -X switch to enable full debug logging.[ERROR]
> > > [ERROR] For more information about the errors and possible solutions,
> > > please read the following articles:[ERROR] [Help 1]
> > > http://cwiki.apache.org/confluence/display/MAVEN/
> > > PluginResolutionExceptionBuild step 'Invoke top-level Maven targets'
> > > marked build as failure
> > > >>> Performing Post build task...
> > > >>> Match found for :.* : True
> > > >>> Logical operation result is TRUE
> > > >>> Running script  : # Run zombie detector script
> > > >>> ./dev-support/zombie-detector.sh --jenkins ${BUILD_ID}
> > > >>> [a3159d73] $ /bin/bash -xe /tmp/hudson1697041977582083402.sh
> > > >>> + ./dev-support/zombie-detector.sh --jenkins 3320
> > > >>> Thu Jul  6 01:37:09 UTC 2017 We're ok: there is no zombie test
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Fri, Jun 30, 2017 at 2:43 PM, Sean Busbey 
> > > wrote:
> > > >>>
> > >  jacoco was added ages ago. I'd guess that something changed on the
> > >  machines
> > >  we use to cause it to stop working.
> > > 
> > >  On Thu, Jun 29, 2017 at 12:02 PM, Stack  wrote:
> > > 
> > >  > On Wed, Jun 28, 2017 at 8:43 AM, Josh Elser 
> > >  wrote:
> > >  >
> > >  > >
> > >  > >
> > >  > > On 6/27/17 7:20 PM, Stack wrote:
> > >  > >
> > >  > >> * test-patch's whitespace plugin can configured to ignore
> some
> > >  files
> > >  > (but
> > >  > >>> I
> > >  > >>> can't think of any we'd care to so whitelist)
> > >  > >>>
> > >  > >>> Generated files.
> > >  > >>
> > >  > >
> > >  > > Oh my goodness, yes, please. This has been such a pain in the
> > rear
> > >  for me
> > >  > > as I've been rebasing space quota patches. Sometimes, the
> spaces
> > > in
> > >  > > pb-gen'ed code are removed by folks before commit, other times
> > > they
> > >  > aren't.
> > >  > >
> > >  >
> > >  > Agree sir. Its a distraction at least.
> > >  >
> > >  > I see Jacoco report here now:
> > >  > https://builds.apache.org/job/HBase-Trunk_matrix/jdk=JDK%
> > >  > 201.8%20(latest),label=Hadoop/3277/
> > >  >
> > >  > Maybe it has been there always and I just haven't noticed.
> > >  >
> > >  > Its all 0%. We need to turn on stuff?
> > >  >
> > >  > St.Ack
> > >  >
> > > 
> > > >>>
> > > >>>
> > > >>
> > > >
>

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-07-06 Thread Stack
On Thu, Jul 6, 2017 at 3:45 PM, Sean Busbey  wrote:

> that sounds like our project structure is broken. Please make sure there's
> a jira that tracks it and I'll take a look later.
>
>

Filed HBASE-18331 for now.

I can take a look too later.

St.Ack



> On Thu, Jul 6, 2017 at 6:15 PM, Stack  wrote:
>
> > I tried publishing hbase-3.0.0-SNAPSHOT... so hbase-checkstyle was up in
> > repo (presuming it relied on an aged-out snapshot). Seems to have 'fixed'
> > it for now
> >
> > St.Ack
> >
> > On Thu, Jul 6, 2017 at 12:50 PM, Stack  wrote:
> >
> > > The 3.0.0-SNAPSHOT looks suspicious ... the hbase version
> > > St.Ack
> > >
> > > On Thu, Jul 6, 2017 at 12:49 PM, Stack  wrote:
> > >
> > >> On Thu, Jul 6, 2017 at 12:48 PM, Stack  wrote:
> > >>
> > >>> Checkstyle is currently broke on our builds... looking.
> > >>> St.Ack
> > >>>
> > >>>
> > >> Works if I run it locally (of course)
> > >> St.Ack
> > >>
> > >>
> > >>
> > >>
> > >>>
> > >>>
> > >>> [ERROR] Failed to execute goal org.apache.maven.plugins:
> > maven-checkstyle-plugin:2.17:checkstyle (default-cli) on project hbase:
> > Execution default-cli of goal org.apache.maven.plugins:
> > maven-checkstyle-plugin:2.17:checkstyle failed: Plugin
> > org.apache.maven.plugins:maven-checkstyle-plugin:2.17 or one of its
> > dependencies could not be resolved: Could not find artifact
> > org.apache.hbase:hbase-checkstyle:jar:3.0.0-SNAPSHOT in Nexus (
> > http://repository.apache.org/snapshots) -> [Help 1][ERROR] [ERROR] To
> see
> > the full stack trace of the errors, re-run Maven with the -e
> switch.[ERROR]
> > Re-run Maven using the -X switch to enable full debug logging.[ERROR]
> > [ERROR] For more information about the errors and possible solutions,
> > please read the following articles:[ERROR] [Help 1]
> > http://cwiki.apache.org/confluence/display/MAVEN/
> > PluginResolutionExceptionBuild step 'Invoke top-level Maven targets'
> > marked build as failure
> > >>> Performing Post build task...
> > >>> Match found for :.* : True
> > >>> Logical operation result is TRUE
> > >>> Running script  : # Run zombie detector script
> > >>> ./dev-support/zombie-detector.sh --jenkins ${BUILD_ID}
> > >>> [a3159d73] $ /bin/bash -xe /tmp/hudson1697041977582083402.sh
> > >>> + ./dev-support/zombie-detector.sh --jenkins 3320
> > >>> Thu Jul  6 01:37:09 UTC 2017 We're ok: there is no zombie test
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Fri, Jun 30, 2017 at 2:43 PM, Sean Busbey 
> > wrote:
> > >>>
> >  jacoco was added ages ago. I'd guess that something changed on the
> >  machines
> >  we use to cause it to stop working.
> > 
> >  On Thu, Jun 29, 2017 at 12:02 PM, Stack  wrote:
> > 
> >  > On Wed, Jun 28, 2017 at 8:43 AM, Josh Elser 
> >  wrote:
> >  >
> >  > >
> >  > >
> >  > > On 6/27/17 7:20 PM, Stack wrote:
> >  > >
> >  > >> * test-patch's whitespace plugin can configured to ignore some
> >  files
> >  > (but
> >  > >>> I
> >  > >>> can't think of any we'd care to so whitelist)
> >  > >>>
> >  > >>> Generated files.
> >  > >>
> >  > >
> >  > > Oh my goodness, yes, please. This has been such a pain in the
> rear
> >  for me
> >  > > as I've been rebasing space quota patches. Sometimes, the spaces
> > in
> >  > > pb-gen'ed code are removed by folks before commit, other times
> > they
> >  > aren't.
> >  > >
> >  >
> >  > Agree sir. Its a distraction at least.
> >  >
> >  > I see Jacoco report here now:
> >  > https://builds.apache.org/job/HBase-Trunk_matrix/jdk=JDK%
> >  > 201.8%20(latest),label=Hadoop/3277/
> >  >
> >  > Maybe it has been there always and I just haven't noticed.
> >  >
> >  > Its all 0%. We need to turn on stuff?
> >  >
> >  > St.Ack
> >  >
> > 
> > >>>
> > >>>
> > >>
> > >
> >
>


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-07-06 Thread Sean Busbey
That sounds like our project structure is broken. Please make sure there's
a JIRA that tracks it and I'll take a look later.

On Thu, Jul 6, 2017 at 6:15 PM, Stack  wrote:

> I tried publishing hbase-3.0.0-SNAPSHOT... so hbase-checkstyle was up in
> repo (presuming it relied on an aged-out snapshot). Seems to have 'fixed'
> it for now
>
> St.Ack
>
> On Thu, Jul 6, 2017 at 12:50 PM, Stack  wrote:
>
> > The 3.0.0-SNAPSHOT looks suspicious ... the hbase version
> > St.Ack
> >
> > On Thu, Jul 6, 2017 at 12:49 PM, Stack  wrote:
> >
> >> On Thu, Jul 6, 2017 at 12:48 PM, Stack  wrote:
> >>
> >>> Checkstyle is currently broke on our builds... looking.
> >>> St.Ack
> >>>
> >>>
> >> Works if I run it locally (of course)
> >> St.Ack
> >>
> >>
> >>
> >>
> >>>
> >>>
> >>> [ERROR] Failed to execute goal org.apache.maven.plugins:
> maven-checkstyle-plugin:2.17:checkstyle (default-cli) on project hbase:
> Execution default-cli of goal org.apache.maven.plugins:
> maven-checkstyle-plugin:2.17:checkstyle failed: Plugin
> org.apache.maven.plugins:maven-checkstyle-plugin:2.17 or one of its
> dependencies could not be resolved: Could not find artifact
> org.apache.hbase:hbase-checkstyle:jar:3.0.0-SNAPSHOT in Nexus (
> http://repository.apache.org/snapshots) -> [Help 1][ERROR] [ERROR] To see
> the full stack trace of the errors, re-run Maven with the -e switch.[ERROR]
> Re-run Maven using the -X switch to enable full debug logging.[ERROR]
> [ERROR] For more information about the errors and possible solutions,
> please read the following articles:[ERROR] [Help 1]
> http://cwiki.apache.org/confluence/display/MAVEN/
> PluginResolutionExceptionBuild step 'Invoke top-level Maven targets'
> marked build as failure
> >>> Performing Post build task...
> >>> Match found for :.* : True
> >>> Logical operation result is TRUE
> >>> Running script  : # Run zombie detector script
> >>> ./dev-support/zombie-detector.sh --jenkins ${BUILD_ID}
> >>> [a3159d73] $ /bin/bash -xe /tmp/hudson1697041977582083402.sh
> >>> + ./dev-support/zombie-detector.sh --jenkins 3320
> >>> Thu Jul  6 01:37:09 UTC 2017 We're ok: there is no zombie test
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, Jun 30, 2017 at 2:43 PM, Sean Busbey 
> wrote:
> >>>
>  jacoco was added ages ago. I'd guess that something changed on the
>  machines
>  we use to cause it to stop working.
> 
>  On Thu, Jun 29, 2017 at 12:02 PM, Stack  wrote:
> 
>  > On Wed, Jun 28, 2017 at 8:43 AM, Josh Elser 
>  wrote:
>  >
>  > >
>  > >
>  > > On 6/27/17 7:20 PM, Stack wrote:
>  > >
>  > >> * test-patch's whitespace plugin can configured to ignore some
>  files
>  > (but
>  > >>> I
>  > >>> can't think of any we'd care to so whitelist)
>  > >>>
>  > >>> Generated files.
>  > >>
>  > >
>  > > Oh my goodness, yes, please. This has been such a pain in the rear
>  for me
>  > > as I've been rebasing space quota patches. Sometimes, the spaces
> in
>  > > pb-gen'ed code are removed by folks before commit, other times
> they
>  > aren't.
>  > >
>  >
>  > Agree sir. Its a distraction at least.
>  >
>  > I see Jacoco report here now:
>  > https://builds.apache.org/job/HBase-Trunk_matrix/jdk=JDK%
>  > 201.8%20(latest),label=Hadoop/3277/
>  >
>  > Maybe it has been there always and I just haven't noticed.
>  >
>  > Its all 0%. We need to turn on stuff?
>  >
>  > St.Ack
>  >
> 
> >>>
> >>>
> >>
> >
>


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-07-06 Thread Stack
I tried publishing hbase-3.0.0-SNAPSHOT so that hbase-checkstyle is up in the
repo (presuming the build relied on an aged-out snapshot). Seems to have
'fixed' it for now.
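
For anyone triaging a similar failure, a rough sketch of pulling the missing
artifact coordinates out of a Maven log follows. It assumes the
"Could not find artifact ..." wording quoted further down in this thread; the
function name `missing_artifacts` and the `SAMPLE` string are illustrative
only.

```python
import re

# Matches the four-part coordinates Maven prints: group:artifact:packaging:version.
# Coordinates that also carry a classifier (five parts) are not handled here.
ARTIFACT_RE = re.compile(
    r"Could not find artifact\s+([\w.\-]+):([\w.\-]+):([\w.\-]+):([\w.\-]+)"
)


def missing_artifacts(log_text):
    """Return one dict per 'Could not find artifact' occurrence in the log."""
    found = []
    for group_id, artifact_id, packaging, version in ARTIFACT_RE.findall(log_text):
        found.append({
            "group_id": group_id,
            "artifact_id": artifact_id,
            "packaging": packaging,
            "version": version,
            # Snapshots can age out of a snapshot repository, unlike releases.
            "is_snapshot": version.endswith("-SNAPSHOT"),
        })
    return found


# Fragment of the error message quoted in this thread.
SAMPLE = (
    "dependencies could not be resolved: Could not find artifact "
    "org.apache.hbase:hbase-checkstyle:jar:3.0.0-SNAPSHOT in Nexus "
    "(http://repository.apache.org/snapshots)"
)

for artifact in missing_artifacts(SAMPLE):
    print(artifact["artifact_id"], artifact["version"], artifact["is_snapshot"])
```

A snapshot version on a project-internal module such as hbase-checkstyle is a
hint that the build is resolving it from the remote repo rather than from the
current reactor.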

St.Ack

On Thu, Jul 6, 2017 at 12:50 PM, Stack  wrote:

> The 3.0.0-SNAPSHOT looks suspicious ... the hbase version
> St.Ack
>
> On Thu, Jul 6, 2017 at 12:49 PM, Stack  wrote:
>
>> On Thu, Jul 6, 2017 at 12:48 PM, Stack  wrote:
>>
>>> Checkstyle is currently broke on our builds... looking.
>>> St.Ack
>>>
>>>
>> Works if I run it locally (of course)
>> St.Ack
>>
>>
>>
>>
>>>
>>>
>>> [ERROR] Failed to execute goal 
>>> org.apache.maven.plugins:maven-checkstyle-plugin:2.17:checkstyle 
>>> (default-cli) on project hbase: Execution default-cli of goal 
>>> org.apache.maven.plugins:maven-checkstyle-plugin:2.17:checkstyle failed: 
>>> Plugin org.apache.maven.plugins:maven-checkstyle-plugin:2.17 or one of its 
>>> dependencies could not be resolved: Could not find artifact 
>>> org.apache.hbase:hbase-checkstyle:jar:3.0.0-SNAPSHOT in Nexus 
>>> (http://repository.apache.org/snapshots) -> [Help 1][ERROR] [ERROR] To see 
>>> the full stack trace of the errors, re-run Maven with the -e switch.[ERROR] 
>>> Re-run Maven using the -X switch to enable full debug logging.[ERROR] 
>>> [ERROR] For more information about the errors and possible solutions, 
>>> please read the following articles:[ERROR] [Help 1] 
>>> http://cwiki.apache.org/confluence/display/MAVEN/PluginResolutionExceptionBuild
>>>  step 'Invoke top-level Maven targets' marked build as failure
>>> Performing Post build task...
>>> Match found for :.* : True
>>> Logical operation result is TRUE
>>> Running script  : # Run zombie detector script
>>> ./dev-support/zombie-detector.sh --jenkins ${BUILD_ID}
>>> [a3159d73] $ /bin/bash -xe /tmp/hudson1697041977582083402.sh
>>> + ./dev-support/zombie-detector.sh --jenkins 3320
>>> Thu Jul  6 01:37:09 UTC 2017 We're ok: there is no zombie test
>>>
>>>
>>>
>>>
>>> On Fri, Jun 30, 2017 at 2:43 PM, Sean Busbey  wrote:
>>>
>>>> jacoco was added ages ago. I'd guess that something changed on the
>>>> machines we use to cause it to stop working.
>>>>
>>>> On Thu, Jun 29, 2017 at 12:02 PM, Stack wrote:
>>>>
>>>> > On Wed, Jun 28, 2017 at 8:43 AM, Josh Elser wrote:
>>>> >
>>>> > >
>>>> > >
>>>> > > On 6/27/17 7:20 PM, Stack wrote:
>>>> > >
>>>> > >> * test-patch's whitespace plugin can configured to ignore some
>>>> files
>>>> > (but
>>>> > >>> I
>>>> > >>> can't think of any we'd care to so whitelist)
>>>> > >>>
>>>> > >>> Generated files.
>>>> > >>
>>>> > >
>>>> > > Oh my goodness, yes, please. This has been such a pain in the rear
>>>> for me
>>>> > > as I've been rebasing space quota patches. Sometimes, the spaces in
>>>> > > pb-gen'ed code are removed by folks before commit, other times they
>>>> > aren't.
>>>> > >
>>>> >
>>>> > Agree sir. Its a distraction at least.
>>>> >
>>>> > I see Jacoco report here now:
>>>> > https://builds.apache.org/job/HBase-Trunk_matrix/jdk=JDK%201.8%20(latest),label=Hadoop/3277/
>>>> >
>>>> > Maybe it has been there always and I just haven't noticed.
>>>> >
>>>> > Its all 0%. We need to turn on stuff?
>>>> >
>>>> > St.Ack
>>>> >
>>>>
>>>
>>>
>>
>


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-07-06 Thread Stack
The 3.0.0-SNAPSHOT (the hbase version) looks suspicious...
St.Ack

On Thu, Jul 6, 2017 at 12:49 PM, Stack  wrote:

> On Thu, Jul 6, 2017 at 12:48 PM, Stack  wrote:
>
>> Checkstyle is currently broke on our builds... looking.
>> St.Ack
>>
>>
> Works if I run it locally (of course)
> St.Ack
>
>
>
>
>>
>>
>> [ERROR] Failed to execute goal
>> org.apache.maven.plugins:maven-checkstyle-plugin:2.17:checkstyle
>> (default-cli) on project hbase: Execution default-cli of goal
>> org.apache.maven.plugins:maven-checkstyle-plugin:2.17:checkstyle failed:
>> Plugin org.apache.maven.plugins:maven-checkstyle-plugin:2.17 or one of its
>> dependencies could not be resolved: Could not find artifact
>> org.apache.hbase:hbase-checkstyle:jar:3.0.0-SNAPSHOT in Nexus
>> (http://repository.apache.org/snapshots) -> [Help 1]
>> [ERROR]
>> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
>> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
>> [ERROR]
>> [ERROR] For more information about the errors and possible solutions, please
>> read the following articles:
>> [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginResolutionException
>> Build step 'Invoke top-level Maven targets' marked build as failure
>> Performing Post build task...
>> Match found for :.* : True
>> Logical operation result is TRUE
>> Running script  : # Run zombie detector script
>> ./dev-support/zombie-detector.sh --jenkins ${BUILD_ID}
>> [a3159d73] $ /bin/bash -xe /tmp/hudson1697041977582083402.sh
>> + ./dev-support/zombie-detector.sh --jenkins 3320
>> Thu Jul  6 01:37:09 UTC 2017 We're ok: there is no zombie test
>>
>>
>>
>>
>> On Fri, Jun 30, 2017 at 2:43 PM, Sean Busbey  wrote:
>>
>>> jacoco was added ages ago. I'd guess that something changed on the
>>> machines
>>> we use to cause it to stop working.
>>>
>>> On Thu, Jun 29, 2017 at 12:02 PM, Stack  wrote:
>>>
>>> > On Wed, Jun 28, 2017 at 8:43 AM, Josh Elser  wrote:
>>> >
>>> > >
>>> > >
>>> > > On 6/27/17 7:20 PM, Stack wrote:
>>> > >
>>> > >> * test-patch's whitespace plugin can configured to ignore some files
>>> > (but
>>> > >>> I
>>> > >>> can't think of any we'd care to so whitelist)
>>> > >>>
>>> > >>> Generated files.
>>> > >>
>>> > >
>>> > > Oh my goodness, yes, please. This has been such a pain in the rear
>>> for me
>>> > > as I've been rebasing space quota patches. Sometimes, the spaces in
>>> > > pb-gen'ed code are removed by folks before commit, other times they
>>> > aren't.
>>> > >
>>> >
>>> > Agree sir. Its a distraction at least.
>>> >
>>> > I see Jacoco report here now:
>>> > https://builds.apache.org/job/HBase-Trunk_matrix/jdk=JDK%201.8%20(latest),label=Hadoop/3277/
>>> >
>>> > Maybe it has been there always and I just haven't noticed.
>>> >
>>> > Its all 0%. We need to turn on stuff?
>>> >
>>> > St.Ack
>>> >
>>>
>>
>>
>


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-07-06 Thread Stack
On Thu, Jul 6, 2017 at 12:48 PM, Stack  wrote:

> Checkstyle is currently broke on our builds... looking.
> St.Ack
>
>
Works if I run it locally (of course)
St.Ack




>
>
> [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-checkstyle-plugin:2.17:checkstyle
> (default-cli) on project hbase: Execution default-cli of goal
> org.apache.maven.plugins:maven-checkstyle-plugin:2.17:checkstyle failed:
> Plugin org.apache.maven.plugins:maven-checkstyle-plugin:2.17 or one of its
> dependencies could not be resolved: Could not find artifact
> org.apache.hbase:hbase-checkstyle:jar:3.0.0-SNAPSHOT in Nexus
> (http://repository.apache.org/snapshots) -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please read the
> following articles:
> [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginResolutionException
> Build step 'Invoke top-level Maven targets' marked build as failure
> Performing Post build task...
> Match found for :.* : True
> Logical operation result is TRUE
> Running script  : # Run zombie detector script
> ./dev-support/zombie-detector.sh --jenkins ${BUILD_ID}
> [a3159d73] $ /bin/bash -xe /tmp/hudson1697041977582083402.sh
> + ./dev-support/zombie-detector.sh --jenkins 3320
> Thu Jul  6 01:37:09 UTC 2017 We're ok: there is no zombie test
>
>
>
>
> On Fri, Jun 30, 2017 at 2:43 PM, Sean Busbey  wrote:
>
>> jacoco was added ages ago. I'd guess that something changed on the
>> machines
>> we use to cause it to stop working.
>>
>> On Thu, Jun 29, 2017 at 12:02 PM, Stack  wrote:
>>
>> > On Wed, Jun 28, 2017 at 8:43 AM, Josh Elser  wrote:
>> >
>> > >
>> > >
>> > > On 6/27/17 7:20 PM, Stack wrote:
>> > >
>> > >> * test-patch's whitespace plugin can configured to ignore some files
>> > (but
>> > >>> I
>> > >>> can't think of any we'd care to so whitelist)
>> > >>>
>> > >>> Generated files.
>> > >>
>> > >
>> > > Oh my goodness, yes, please. This has been such a pain in the rear
>> for me
>> > > as I've been rebasing space quota patches. Sometimes, the spaces in
>> > > pb-gen'ed code are removed by folks before commit, other times they
>> > aren't.
>> > >
>> >
>> > Agree sir. Its a distraction at least.
>> >
>> > I see Jacoco report here now:
>> > https://builds.apache.org/job/HBase-Trunk_matrix/jdk=JDK%201.8%20(latest),label=Hadoop/3277/
>> >
>> > Maybe it has been there always and I just haven't noticed.
>> >
>> > Its all 0%. We need to turn on stuff?
>> >
>> > St.Ack
>> >
>>
>
>


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-07-06 Thread Stack
Checkstyle is currently broken on our builds... looking.
St.Ack



[ERROR] Failed to execute goal
org.apache.maven.plugins:maven-checkstyle-plugin:2.17:checkstyle
(default-cli) on project hbase: Execution default-cli of goal
org.apache.maven.plugins:maven-checkstyle-plugin:2.17:checkstyle
failed: Plugin org.apache.maven.plugins:maven-checkstyle-plugin:2.17
or one of its dependencies could not be resolved: Could not find
artifact org.apache.hbase:hbase-checkstyle:jar:3.0.0-SNAPSHOT in Nexus
(http://repository.apache.org/snapshots) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions,
please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginResolutionException
Build step 'Invoke top-level Maven targets' marked build as failure
Performing Post build task...
Match found for :.* : True
Logical operation result is TRUE
Running script  : # Run zombie detector script
./dev-support/zombie-detector.sh --jenkins ${BUILD_ID}
[a3159d73] $ /bin/bash -xe /tmp/hudson1697041977582083402.sh
+ ./dev-support/zombie-detector.sh --jenkins 3320
Thu Jul  6 01:37:09 UTC 2017 We're ok: there is no zombie test
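
For anyone triaging similar console output, here is a minimal sketch (not part of the build, and not anything from dev-support) of pulling the missing artifact coordinates out of a Maven log like the one above. The function name and the regular expression are illustrative assumptions; the pattern only covers the "Could not find artifact ... in <repo> (<url>)" wording seen in this log.

```python
import re

# Matches Maven's "Could not find artifact <coords> in <repo-id> (<url>)" wording.
# Coordinates are assumed to be groupId:artifactId:type:version, as in the log above.
# \s+ is used between words because console output is often hard-wrapped.
MISSING_ARTIFACT = re.compile(
    r"Could not find\s+artifact\s+"
    r"(?P<group>[^:\s]+):(?P<artifact>[^:\s]+):(?P<type>[^:\s]+):(?P<version>[^:\s]+)"
    r"\s+in\s+(?P<repo>\S+)\s+\((?P<url>[^)]+)\)"
)


def find_missing_artifacts(log_text):
    """Return one dict per missing artifact reported in the log text."""
    return [m.groupdict() for m in MISSING_ARTIFACT.finditer(log_text)]


if __name__ == "__main__":
    sample = (
        "dependencies could not be resolved: Could not find\n"
        "artifact org.apache.hbase:hbase-checkstyle:jar:3.0.0-SNAPSHOT in Nexus\n"
        "(http://repository.apache.org/snapshots) -> [Help 1]"
    )
    for hit in find_missing_artifacts(sample):
        print("{group}:{artifact}:{version} not found in {repo} ({url})".format(**hit))
```

A SNAPSHOT version showing up here suggests the module was neither built in the same reactor nor published to the snapshot repository, which would fit with the goal working on a local checkout that already has the artifact installed.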




On Fri, Jun 30, 2017 at 2:43 PM, Sean Busbey  wrote:

> jacoco was added ages ago. I'd guess that something changed on the machines
> we use to cause it to stop working.
>
> On Thu, Jun 29, 2017 at 12:02 PM, Stack  wrote:
>
> > On Wed, Jun 28, 2017 at 8:43 AM, Josh Elser  wrote:
> >
> > >
> > >
> > > On 6/27/17 7:20 PM, Stack wrote:
> > >
> > >> * test-patch's whitespace plugin can configured to ignore some files
> > (but
> > >>> I
> > >>> can't think of any we'd care to so whitelist)
> > >>>
> > >>> Generated files.
> > >>
> > >
> > > Oh my goodness, yes, please. This has been such a pain in the rear for
> me
> > > as I've been rebasing space quota patches. Sometimes, the spaces in
> > > pb-gen'ed code are removed by folks before commit, other times they
> > aren't.
> > >
> >
> > Agree sir. Its a distraction at least.
> >
> > I see Jacoco report here now:
> > https://builds.apache.org/job/HBase-Trunk_matrix/jdk=JDK%201.8%20(latest),label=Hadoop/3277/
> >
> > Maybe it has been there always and I just haven't noticed.
> >
> > Its all 0%. We need to turn on stuff?
> >
> > St.Ack
> >
>


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-06-30 Thread Sean Busbey
jacoco was added ages ago. I'd guess that something changed on the machines
we use to cause it to stop working.

On Thu, Jun 29, 2017 at 12:02 PM, Stack  wrote:

> On Wed, Jun 28, 2017 at 8:43 AM, Josh Elser  wrote:
>
> >
> >
> > On 6/27/17 7:20 PM, Stack wrote:
> >
> >> * test-patch's whitespace plugin can configured to ignore some files
> (but
> >>> I
> >>> can't think of any we'd care to so whitelist)
> >>>
> >>> Generated files.
> >>
> >
> > Oh my goodness, yes, please. This has been such a pain in the rear for me
> > as I've been rebasing space quota patches. Sometimes, the spaces in
> > pb-gen'ed code are removed by folks before commit, other times they
> aren't.
> >
>
> Agree sir. Its a distraction at least.
>
> I see Jacoco report here now:
> https://builds.apache.org/job/HBase-Trunk_matrix/jdk=JDK%201.8%20(latest),label=Hadoop/3277/
>
> Maybe it has been there always and I just haven't noticed.
>
> Its all 0%. We need to turn on stuff?
>
> St.Ack
>


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-06-29 Thread Stack
On Wed, Jun 28, 2017 at 8:43 AM, Josh Elser  wrote:

>
>
> On 6/27/17 7:20 PM, Stack wrote:
>
>> * test-patch's whitespace plugin can configured to ignore some files (but
>>> I
>>> can't think of any we'd care to so whitelist)
>>>
>>> Generated files.
>>
>
> Oh my goodness, yes, please. This has been such a pain in the rear for me
> as I've been rebasing space quota patches. Sometimes, the spaces in
> pb-gen'ed code are removed by folks before commit, other times they aren't.
>

Agree sir. It's a distraction at least.

I see a Jacoco report here now:
https://builds.apache.org/job/HBase-Trunk_matrix/jdk=JDK%201.8%20(latest),label=Hadoop/3277/

Maybe it has always been there and I just haven't noticed.

It's all 0%. Do we need to turn something on?

St.Ack


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-06-28 Thread Josh Elser



On 6/27/17 7:20 PM, Stack wrote:

>> * test-patch's whitespace plugin can be configured to ignore some files (but I
>> can't think of any we'd care to so whitelist)
>
> Generated files.


Oh my goodness, yes, please. This has been such a pain in the rear for 
me as I've been rebasing space quota patches. Sometimes, the spaces in 
pb-gen'ed code are removed by folks before commit, other times they aren't.
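
To illustrate the idea being discussed (skip generated sources when checking for trailing whitespace), here is a small standalone sketch. It is not how Yetus implements its whitespace plugin; the ignore pattern, the file paths, and the function names are all hypothetical and chosen only for the example.

```python
import re

# Hypothetical ignore list: any Java file under a "generated" directory.
# The real location of protobuf-generated sources may differ.
IGNORE_PATTERNS = [re.compile(r".*/generated/.*\.java$")]


def is_ignored(path):
    """True if the path matches any ignore pattern."""
    return any(p.match(path) for p in IGNORE_PATTERNS)


def trailing_whitespace_hits(files):
    """Return (path, line_number) pairs for lines ending in spaces or tabs.

    `files` maps a path to that file's text. Ignored paths are skipped entirely.
    """
    hits = []
    for path, text in sorted(files.items()):
        if is_ignored(path):
            continue
        for number, line in enumerate(text.splitlines(), start=1):
            if line != line.rstrip(" \t"):
                hits.append((path, number))
    return hits


if __name__ == "__main__":
    files = {
        "src/main/java/Foo.java": "int a; \nint b;\n",
        "src/main/java/proto/generated/QuotaProtos.java": "int c; \n",
    }
    # Only the hand-written file is reported; the generated one is skipped.
    print(trailing_whitespace_hits(files))
```

With a filter like this, whether or not someone strips spaces from pb-gen'ed code before commit no longer affects the whitespace vote.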


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-06-27 Thread Stack
On Tue, Jun 27, 2017 at 10:28 AM, Sean Busbey  wrote:

> On Tue, Jun 27, 2017 at 11:38 AM, Stack  wrote:
>
> > On Tue, Jun 27, 2017 at 9:24 AM, Sean Busbey  wrote:
> >
> > > FYI, I've updated the precommit build to use Yetus 0.4.0 (which is the
> > > current release).
> > >
> > > Shouldn't impact much. if things look off ping me.
> > >
> > >
> > Thanks Sean.
> >
> > Whats new in 0.4.0?
> >
> >
> This change was spurred by YETUS-520, which EOL's versions earlier than
> 0.4.0.
>
> Couple of things that changed:
>
>
Helps. Thanks. See below.



> * test-patch got better about handling pylint (which rarely impacts us, but
> will help the effort to adopt a pylintrc for our dev-support stuff.)
> * docker cleanup got more robust (I think we've been sheltered from this
> because Hadoop's been running a more up-to-date version of Yetus more
> frequently than us on asf infra)
> * test-patch's whitespace plugin can be configured to ignore some files (but I
> can't think of any we'd care to so whitelist)
>

Generated files.


> * test-patch's nightly mode got a new email format that's not so verbose (I
> believe I'm the only one getting these emails, maybe the new format will be
> worth sending to the notification list)
>
>
Sure. Sounds good.

Thanks Sean.

S


> Here's the notes:
>
> http://yetus.apache.org/documentation/0.4.0/RELEASENOTES/
>


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-06-27 Thread Sean Busbey
On Tue, Jun 27, 2017 at 11:38 AM, Stack  wrote:

> On Tue, Jun 27, 2017 at 9:24 AM, Sean Busbey  wrote:
>
> > FYI, I've updated the precommit build to use Yetus 0.4.0 (which is the
> > current release).
> >
> > Shouldn't impact much. if things look off ping me.
> >
> >
> Thanks Sean.
>
> Whats new in 0.4.0?
>
>
This change was spurred by YETUS-520, which EOL's versions earlier than
0.4.0.

Couple of things that changed:

* test-patch got better about handling pylint (which rarely impacts us, but
will help the effort to adopt a pylintrc for our dev-support stuff.)
* docker cleanup got more robust (I think we've been sheltered from this
because Hadoop's been running a more up-to-date version of Yetus more
frequently than us on asf infra)
* test-patch's whitespace plugin can be configured to ignore some files (but I
can't think of any we'd care to so whitelist)
* test-patch's nightly mode got a new email format that's not so verbose (I
believe I'm the only one getting these emails, maybe the new format will be
worth sending to the notification list)

Here are the release notes:

http://yetus.apache.org/documentation/0.4.0/RELEASENOTES/


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-06-27 Thread Stack
On Tue, Jun 27, 2017 at 9:24 AM, Sean Busbey  wrote:

> FYI, I've updated the precommit build to use Yetus 0.4.0 (which is the
> current release).
>
> Shouldn't impact much. if things look off ping me.
>
>
Thanks Sean.

What's new in 0.4.0?

S


> On Wed, Mar 1, 2017 at 2:23 PM, Mikhail Antonov 
> wrote:
>
> > Ouch. Thanks Sean!
> >
> > I'm pretty sure at some point I was debugging 1.3-IT job and saw
> branch-1.3
> > getting checked out in the logs. Not sure how/when it went sideways
> though.
> >
> > Yeah, let's see how it goes.
> >
> > -Mikhail
> >
> > On Wed, Mar 1, 2017 at 5:50 AM, Sean Busbey  wrote:
> >
> > > Fun times.
> > >
> > > 1) Turns out our 1.3-IT jobs have been running against branch-1.2.
> > > Don't know how long, but as long as we have history.
> > >
> > > 2) I deleted the failing-since-august 1.2-IT job.
> > >
> > > 3) I renamed the passing 1.3-IT job that runs against branch-1.2 to be
> > > the 1.2-IT job
> > >
> > > 4) I copied the now renamed 1.2-IT job and made a 1.3-IT job that runs
> > > against branch-1.3
> > >
> > > I kicked off jobs after all this shuffling. We'll see how it goes.
> > >
> > > On Tue, Feb 21, 2017 at 5:49 PM, Sean Busbey 
> wrote:
> > > > FYI, I updated the 1.2-IT and 1.3-IT jobs today to use Appy's
> > > > suggested "custom child workspace" of "${SHORT_COMBINATION}", since
> > > > spaces in paths had caused them to fail for a v long time.
> > > >
> > > > On Fri, Oct 14, 2016 at 4:44 PM, Andrew Purtell  >
> > > wrote:
> > > >> Thanks Ted, that would be a nice contribution, thank you.
> > > >>
> > > >>
> > > >> On Fri, Oct 14, 2016 at 12:07 PM, Apekshit Sharma <
> [email protected]>
> > > wrote:
> > > >>
> > > >>> @Ted, here's the old jira, HBASE-14167. Use that.
> > > >>>
> > > >>> On Fri, Oct 14, 2016 at 12:02 PM, Ted Yu 
> > wrote:
> > > >>>
> > > >>> > I just ran the tests in hbase-spark module using 'mvn verify'.
> > > >>> >
> > > >>> > All passed.
> > > >>> >
> > > >>> > I am testing a patch locally where hbase-spark tests are run in
> > test
> > > >>> phase.
> > > >>> >
> > > >>> > If the tests pass, I will log a JIRA.
> > > >>> >
> > > >>> > Thanks
> > > >>> >
> > > >>> > > On Oct 14, 2016, at 11:41 AM, Andrew Purtell <
> > [email protected]>
> > > >>> > wrote:
> > > >>> > >
> > > >>> > > The hbase-spark integration tests run (and fail) for me locally
> > > >>> whenever
> > > >>> > I
> > > >>> > > build master with 'mvn clean install -DskipITs' .
> > > >>> > >
> > > >>> > > HBaseConnectionCacheSuite:
> > > >>> > > - all test cases *** FAILED ***
> > > >>> > >  2 did not equal 1 (HBaseConnectionCacheSuite.scala:92)
> > > >>> > >
> > > >>> > > Saw it but had to ignore/triage to get something else done.
> > > >>> > >
> > > >>> > > We have a weird situation where integration tests run when they
> > > >>> shouldn't
> > > >>> > > locally yet no tests run at all for patch process?
> > > >>> > >
> > > >>> > > I would like to see Spark behave like the other modules. I
> > remember
> > > >>> > filing
> > > >>> > > a JIRA asking that hbase-spark honor -DskipITs. It still
> doesn't.
> > > >>> > > Meanwhile, it does its own thing with '-DskipSparkTests', which
> > is
> > > not
> > > >>> > > appropriate given that none of the other modules have their own
> > > >>> distinct
> > > >>> > > control parameters. There also doesn't seem to be a distinction
> > > between
> > > >>> > > unit tests and integration tests. The 'test' target does
> nothing.
> > > >>> > > Everything happens during the 'integration-test' phase. Is
> this a
> > > Spark
> > > >>> > > limitation?
> > > >>> > >
> > > >>> > >
> > > >>> > >> On Fri, Oct 14, 2016 at 11:27 AM, Sean Busbey <
> > > [email protected]>
> > > >>> > wrote:
> > > >>> > >>
> > > >>> > >> Do the HBase Spark tests only run during the maven verify
> > command?
> > > >>> > >> We'll need to update our personality to say that that command
> > > should
> > > >>> > >> be used for unit tests when in the hbase spark module. ugh.
> > > >>> > >>
> > > >>> > >> On Thu, Oct 13, 2016 at 7:42 PM, Apekshit Sharma <
> > > [email protected]>
> > > >>> > >> wrote:
> > > >>> > >>> Our patch process isn't running hbase-spark tests. See this for example:
> > > >>> > >>>
> > > >>> > >>> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/
> > > >>> > >>> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/artifact/patchprocess/patch-unit-hbase-spark.txt/*view*/
> > > >>> > >>>
> > > >>> > >>> Found it when trying to debug cause of trunk failures. Part of the cause is
> > > >>> > >>> hbase-spark's HBaseConnectionCacheSuite test failure (
> > > >>> > >>> https://builds.apache.org/view/All/job/HBase-Trunk_matrix/jdk=JDK%201.8%20(latest),label=yahoo-not-h2/1776/consoleFull
> > > >>> > >>> )
> > > >>> > >>> w

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-06-27 Thread Sean Busbey
FYI, I've updated the precommit build to use Yetus 0.4.0 (which is the
current release).

Shouldn't impact much. If things look off, ping me.

On Wed, Mar 1, 2017 at 2:23 PM, Mikhail Antonov 
wrote:

> Ouch. Thanks Sean!
>
> I'm pretty sure at some point I was debugging 1.3-IT job and saw branch-1.3
> getting checked out in the logs. Not sure how/when it went sideways though.
>
> Yeah, let's see how it goes.
>
> -Mikhail
>
> On Wed, Mar 1, 2017 at 5:50 AM, Sean Busbey  wrote:
>
> > Fun times.
> >
> > 1) Turns out our 1.3-IT jobs have been running against branch-1.2.
> > Don't know how long, but as long as we have history.
> >
> > 2) I deleted the failing-since-august 1.2-IT job.
> >
> > 3) I renamed the passing 1.3-IT job that runs against branch-1.2 to be
> > the 1.2-IT job
> >
> > 4) I copied the now renamed 1.2-IT job and made a 1.3-IT job that runs
> > against branch-1.3
> >
> > I kicked off jobs after all this shuffling. We'll see how it goes.
> >
> > On Tue, Feb 21, 2017 at 5:49 PM, Sean Busbey  wrote:
> > > FYI, I updated the 1.2-IT and 1.3-IT jobs today to use Appy's
> > > suggested "custom child workspace" of "${SHORT_COMBINATION}", since
> > > spaces in paths had caused them to fail for a v long time.
> > >
> > > On Fri, Oct 14, 2016 at 4:44 PM, Andrew Purtell 
> > wrote:
> > >> Thanks Ted, that would be a nice contribution, thank you.
> > >>
> > >>
> > >> On Fri, Oct 14, 2016 at 12:07 PM, Apekshit Sharma 
> > wrote:
> > >>
> > >>> @Ted, here's the old jira, HBASE-14167. Use that.
> > >>>
> > >>> On Fri, Oct 14, 2016 at 12:02 PM, Ted Yu 
> wrote:
> > >>>
> > >>> > I just ran the tests in hbase-spark module using 'mvn verify'.
> > >>> >
> > >>> > All passed.
> > >>> >
> > >>> > I am testing a patch locally where hbase-spark tests are run in
> test
> > >>> phase.
> > >>> >
> > >>> > If the tests pass, I will log a JIRA.
> > >>> >
> > >>> > Thanks
> > >>> >
> > >>> > > On Oct 14, 2016, at 11:41 AM, Andrew Purtell <
> [email protected]>
> > >>> > wrote:
> > >>> > >
> > >>> > > The hbase-spark integration tests run (and fail) for me locally
> > >>> whenever
> > >>> > I
> > >>> > > build master with 'mvn clean install -DskipITs' .
> > >>> > >
> > >>> > > HBaseConnectionCacheSuite:
> > >>> > > - all test cases *** FAILED ***
> > >>> > >  2 did not equal 1 (HBaseConnectionCacheSuite.scala:92)
> > >>> > >
> > >>> > > Saw it but had to ignore/triage to get something else done.
> > >>> > >
> > >>> > > We have a weird situation where integration tests run when they
> > >>> shouldn't
> > >>> > > locally yet no tests run at all for patch process?
> > >>> > >
> > >>> > > I would like to see Spark behave like the other modules. I
> remember
> > >>> > filing
> > >>> > > a JIRA asking that hbase-spark honor -DskipITs. It still doesn't.
> > >>> > > Meanwhile, it does its own thing with '-DskipSparkTests', which
> is
> > not
> > >>> > > appropriate given that none of the other modules have their own
> > >>> distinct
> > >>> > > control parameters. There also doesn't seem to be a distinction
> > between
> > >>> > > unit tests and integration tests. The 'test' target does nothing.
> > >>> > > Everything happens during the 'integration-test' phase. Is this a
> > Spark
> > >>> > > limitation?
> > >>> > >
> > >>> > >
> > >>> > >> On Fri, Oct 14, 2016 at 11:27 AM, Sean Busbey <
> > [email protected]>
> > >>> > wrote:
> > >>> > >>
> > >>> > >> Do the HBase Spark tests only run during the maven verify
> command?
> > >>> > >> We'll need to update our personality to say that that command
> > should
> > >>> > >> be used for unit tests when in the hbase spark module. ugh.
> > >>> > >>
> > >>> > >> On Thu, Oct 13, 2016 at 7:42 PM, Apekshit Sharma <
> > [email protected]>
> > >>> > >> wrote:
> > >>> > >>> Our patch process isn't running hbase-spark tests. See this for example:
> > >>> > >>>
> > >>> > >>> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/
> > >>> > >>> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/artifact/patchprocess/patch-unit-hbase-spark.txt/*view*/
> > >>> > >>>
> > >>> > >>> Found it when trying to debug cause of trunk failures. Part of the cause is
> > >>> > >>> hbase-spark's HBaseConnectionCacheSuite test failure (
> > >>> > >>> https://builds.apache.org/view/All/job/HBase-Trunk_matrix/jdk=JDK%201.8%20(latest),label=yahoo-not-h2/1776/consoleFull
> > >>> > >>> )
> > >>> > >>> which was added in HBASE-16638. However, to be fair, QA was
> > green and
> > >>> > >>> reported passing hbase-spark tests for that jira.
> > >>> > >>>
> > >>> >  On Mon, Sep 19, 2016 at 12:57 PM, Stack 
> > wrote:
> > >>> > 
> > >>> >  childCustomWorkspace seems to be just the ticket. Nice find
> > Appy.
> > >>> >  St.Ack
> > >>> > 
> > >>> >  On Mon, Sep 19, 2016 at 10:0

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-03-01 Thread Mikhail Antonov
Ouch. Thanks Sean!

I'm pretty sure at some point I was debugging the 1.3-IT job and saw branch-1.3
getting checked out in the logs. Not sure how/when it went sideways though.

Yeah, let's see how it goes.

-Mikhail

On Wed, Mar 1, 2017 at 5:50 AM, Sean Busbey  wrote:

> Fun times.
>
> 1) Turns out our 1.3-IT jobs have been running against branch-1.2.
> Don't know how long, but as long as we have history.
>
> 2) I deleted the failing-since-august 1.2-IT job.
>
> 3) I renamed the passing 1.3-IT job that runs against branch-1.2 to be
> the 1.2-IT job
>
> 4) I copied the now renamed 1.2-IT job and made a 1.3-IT job that runs
> against branch-1.3
>
> I kicked off jobs after all this shuffling. We'll see how it goes.
>
> On Tue, Feb 21, 2017 at 5:49 PM, Sean Busbey  wrote:
> > FYI, I updated the 1.2-IT and 1.3-IT jobs today to use Appy's
> > suggested "custom child workspace" of "${SHORT_COMBINATION}", since
> > spaces in paths had caused them to fail for a v long time.
> >
> > On Fri, Oct 14, 2016 at 4:44 PM, Andrew Purtell 
> wrote:
> >> Thanks Ted, that would be a nice contribution, thank you.
> >>
> >>
> >> On Fri, Oct 14, 2016 at 12:07 PM, Apekshit Sharma 
> wrote:
> >>
> >>> @Ted, here's the old jira, HBASE-14167. Use that.
> >>>
> >>> On Fri, Oct 14, 2016 at 12:02 PM, Ted Yu  wrote:
> >>>
> >>> > I just ran the tests in hbase-spark module using 'mvn verify'.
> >>> >
> >>> > All passed.
> >>> >
> >>> > I am testing a patch locally where hbase-spark tests are run in test
> >>> phase.
> >>> >
> >>> > If the tests pass, I will log a JIRA.
> >>> >
> >>> > Thanks
> >>> >
> >>> > > On Oct 14, 2016, at 11:41 AM, Andrew Purtell 
> >>> > wrote:
> >>> > >
> >>> > > The hbase-spark integration tests run (and fail) for me locally
> >>> whenever
> >>> > I
> >>> > > build master with 'mvn clean install -DskipITs' .
> >>> > >
> >>> > > HBaseConnectionCacheSuite:
> >>> > > - all test cases *** FAILED ***
> >>> > >  2 did not equal 1 (HBaseConnectionCacheSuite.scala:92)
> >>> > >
> >>> > > Saw it but had to ignore/triage to get something else done.
> >>> > >
> >>> > > We have a weird situation where integration tests run when they
> >>> shouldn't
> >>> > > locally yet no tests run at all for patch process?
> >>> > >
> >>> > > I would like to see Spark behave like the other modules. I remember
> >>> > filing
> >>> > > a JIRA asking that hbase-spark honor -DskipITs. It still doesn't.
> >>> > > Meanwhile, it does its own thing with '-DskipSparkTests', which is
> not
> >>> > > appropriate given that none of the other modules have their own
> >>> distinct
> >>> > > control parameters. There also doesn't seem to be a distinction
> between
> >>> > > unit tests and integration tests. The 'test' target does nothing.
> >>> > > Everything happens during the 'integration-test' phase. Is this a
> Spark
> >>> > > limitation?
> >>> > >
> >>> > >
> >>> > >> On Fri, Oct 14, 2016 at 11:27 AM, Sean Busbey <
> [email protected]>
> >>> > wrote:
> >>> > >>
> >>> > >> Do the HBase Spark tests only run during the maven verify command?
> >>> > >> We'll need to update our personality to say that that command
> should
> >>> > >> be used for unit tests when in the hbase spark module. ugh.
> >>> > >>
> >>> > >> On Thu, Oct 13, 2016 at 7:42 PM, Apekshit Sharma <
> [email protected]>
> >>> > >> wrote:
> >>> > >>> Our patch process isn't running hbase-spark tests. See this for example:
> >>> > >>>
> >>> > >>> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/
> >>> > >>> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/artifact/patchprocess/patch-unit-hbase-spark.txt/*view*/
> >>> > >>>
> >>> > >>> Found it when trying to debug cause of trunk failures. Part of the cause is
> >>> > >>> hbase-spark's HBaseConnectionCacheSuite test failure (
> >>> > >>> https://builds.apache.org/view/All/job/HBase-Trunk_matrix/jdk=JDK%201.8%20(latest),label=yahoo-not-h2/1776/consoleFull
> >>> > >>> )
> >>> > >>> which was added in HBASE-16638. However, to be fair, QA was
> green and
> >>> > >>> reported passing hbase-spark tests for that jira.
> >>> > >>>
> >>> >  On Mon, Sep 19, 2016 at 12:57 PM, Stack 
> wrote:
> >>> > 
> >>> >  childCustomWorkspace seems to be just the ticket. Nice find
> Appy.
> >>> >  St.Ack
> >>> > 
> >>> >  On Mon, Sep 19, 2016 at 10:03 AM, Sean Busbey <
> [email protected]>
> >>> > >> wrote:
> >>> > 
> >>> > > Option 2c looks to be working really well. Thanks for tackling
> this
> >>> > >> Appy!
> >>> > >
> >>> > > We still have some failures on the master build, but it looks
> like
> >>> > > actual problems (or perhaps a flakey). There are several
> passing
> >>> > > builds.
> >>> > >
> >>> > > This should be pretty easy to replicate on the other jobs. I
> don't
> >>> > see
> >>

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-03-01 Thread Sean Busbey
Fun times.

1) Turns out our 1.3-IT jobs have been running against branch-1.2.
I don't know for how long, but at least as far back as our build history goes.

2) I deleted the failing-since-august 1.2-IT job.

3) I renamed the passing 1.3-IT job that runs against branch-1.2 to be
the 1.2-IT job

4) I copied the now renamed 1.2-IT job and made a 1.3-IT job that runs
against branch-1.3

I kicked off jobs after all this shuffling. We'll see how it goes.
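
A mix-up like the one in (1) could be caught by a simple consistency check between job names and the branches they build. The sketch below assumes the naming convention visible in this thread ("1.3-IT" should build "branch-1.3"); the input dictionary is a hypothetical stand-in for however the job configuration would actually be read, and the function names are made up for the example.

```python
import re


def expected_branch(job_name):
    """Derive the branch a job named like '1.3-IT' should build, or None."""
    match = re.match(r"^(\d+\.\d+)-IT$", job_name)
    return "branch-" + match.group(1) if match else None


def mismatched_jobs(configured):
    """Return (job, expected_branch, actual_branch) for every mismatch.

    `configured` maps a job name to the branch it is set up to check out.
    Jobs whose names do not follow the '<major>.<minor>-IT' pattern are skipped.
    """
    problems = []
    for job, actual in sorted(configured.items()):
        expected = expected_branch(job)
        if expected is not None and expected != actual:
            problems.append((job, expected, actual))
    return problems


if __name__ == "__main__":
    # The state described in (1): both jobs pointed at branch-1.2.
    before = {"1.2-IT": "branch-1.2", "1.3-IT": "branch-1.2"}
    for job, expected, actual in mismatched_jobs(before):
        print("%s builds %s but should build %s" % (job, actual, expected))
```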

On Tue, Feb 21, 2017 at 5:49 PM, Sean Busbey  wrote:
> FYI, I updated the 1.2-IT and 1.3-IT jobs today to use Appy's
> suggested "custom child workspace" of "${SHORT_COMBINATION}", since
> spaces in paths had caused them to fail for a v long time.
>
> On Fri, Oct 14, 2016 at 4:44 PM, Andrew Purtell  wrote:
>> Thanks Ted, that would be a nice contribution, thank you.
>>
>>
>> On Fri, Oct 14, 2016 at 12:07 PM, Apekshit Sharma  wrote:
>>
>>> @Ted, here's the old jira, HBASE-14167. Use that.
>>>
>>> On Fri, Oct 14, 2016 at 12:02 PM, Ted Yu  wrote:
>>>
>>> > I just ran the tests in hbase-spark module using 'mvn verify'.
>>> >
>>> > All passed.
>>> >
>>> > I am testing a patch locally where hbase-spark tests are run in test
>>> phase.
>>> >
>>> > If the tests pass, I will log a JIRA.
>>> >
>>> > Thanks
>>> >
>>> > > On Oct 14, 2016, at 11:41 AM, Andrew Purtell 
>>> > wrote:
>>> > >
>>> > > The hbase-spark integration tests run (and fail) for me locally
>>> whenever
>>> > I
>>> > > build master with 'mvn clean install -DskipITs' .
>>> > >
>>> > > HBaseConnectionCacheSuite:
>>> > > - all test cases *** FAILED ***
>>> > >  2 did not equal 1 (HBaseConnectionCacheSuite.scala:92)
>>> > >
>>> > > Saw it but had to ignore/triage to get something else done.
>>> > >
>>> > > We have a weird situation where integration tests run when they
>>> shouldn't
>>> > > locally yet no tests run at all for patch process?
>>> > >
>>> > > I would like to see Spark behave like the other modules. I remember
>>> > filing
>>> > > a JIRA asking that hbase-spark honor -DskipITs. It still doesn't.
>>> > > Meanwhile, it does its own thing with '-DskipSparkTests', which is not
>>> > > appropriate given that none of the other modules have their own
>>> distinct
>>> > > control parameters. There also doesn't seem to be a distinction between
>>> > > unit tests and integration tests. The 'test' target does nothing.
>>> > > Everything happens during the 'integration-test' phase. Is this a Spark
>>> > > limitation?
>>> > >
>>> > >
>>> > >> On Fri, Oct 14, 2016 at 11:27 AM, Sean Busbey 
>>> > wrote:
>>> > >>
>>> > >> Do the HBase Spark tests only run during the maven verify command?
>>> > >> We'll need to update our personality to say that that command should
>>> > >> be used for unit tests when in the hbase spark module. ugh.
>>> > >>
>>> > >> On Thu, Oct 13, 2016 at 7:42 PM, Apekshit Sharma 
>>> > >> wrote:
>>> > >>> Our patch process isn't running hbase-spark tests. See this for example:
>>> > >>>
>>> > >>> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/
>>> > >>> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/artifact/patchprocess/patch-unit-hbase-spark.txt/*view*/
>>> > >>>
>>> > >>> Found it when trying to debug cause of trunk failures. Part of the cause is
>>> > >>> hbase-spark's HBaseConnectionCacheSuite test failure (
>>> > >>> https://builds.apache.org/view/All/job/HBase-Trunk_matrix/jdk=JDK%201.8%20(latest),label=yahoo-not-h2/1776/consoleFull
>>> > >>> )
>>> > >>> which was added in HBASE-16638. However, to be fair, QA was green and
>>> > >>> reported passing hbase-spark tests for that jira.
>>> > >>>
>>> >  On Mon, Sep 19, 2016 at 12:57 PM, Stack  wrote:
>>> > 
>>> >  childCustomWorkspace seems to be just the ticket. Nice find Appy.
>>> >  St.Ack
>>> > 
>>> >  On Mon, Sep 19, 2016 at 10:03 AM, Sean Busbey 
>>> > >> wrote:
>>> > 
>>> > > Option 2c looks to be working really well. Thanks for tackling this
>>> > >> Appy!
>>> > >
>>> > > We still have some failures on the master build, but it looks like
>>> > > actual problems (or perhaps a flakey). There are several passing
>>> > > builds.
>>> > >
>>> > > This should be pretty easy to replicate on the other jobs. I don't
>>> > see
>>> > > a downside. Anyone else have concerns?
>>> > >
>>> > >
>>> > > On Fri, Sep 16, 2016 at 6:15 PM, Apekshit Sharma <
>>> [email protected]>
>>> > > wrote:
>>> > >> So this all started with spaces-in-path issue, right?  I think it
>>> > >> has
>>> > >> gobbled up a lot of time of a lot of people.
>>> > >> Let's discuss our options and try to fix it for good. Here are
>>> what
>>> > >> i
>>> >  can
>>> > >> think of, and my opinion about them.
>>> > >>
>>> > >> 1. Not use matrix build
>>> > >>  Temporary fix. Not preferred since no

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-02-21 Thread Sean Busbey
FYI, I updated the 1.2-IT and 1.3-IT jobs today to use Appy's
suggested "custom child workspace" of "${SHORT_COMBINATION}", since
spaces in paths had caused them to fail for a very long time.
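
For anyone wondering why spaces in the workspace path are so damaging, here is a minimal sketch. It assumes a build script that splices the workspace path into a command line without quoting it; both paths below are hypothetical and are not the actual directories on the build hosts.

```python
# Hypothetical workspace paths, for illustration only.
WORKSPACE_WITH_SPACES = "/home/jenkins/workspace/HBase-1.2-IT/jdk/JDK 1.7 (latest)"
WORKSPACE_SHORT = "/home/jenkins/workspace/HBase-1.2-IT/jdk7"


def build_command(workspace: str) -> str:
    """Build a command line the way a script does when it forgets to quote."""
    return "java -cp " + workspace + "/lib Main"


def split_unquoted(command: str) -> list[str]:
    """Split on whitespace, which is what an unquoted expansion amounts to."""
    return command.split()


if __name__ == "__main__":
    # The path with spaces is broken into several arguments, so the
    # classpath the tool receives is no longer the intended directory.
    print(split_unquoted(build_command(WORKSPACE_WITH_SPACES)))
    print(split_unquoted(build_command(WORKSPACE_SHORT)))
```

A workspace name without spaces sidesteps the problem for every tool at once, instead of relying on each script to quote correctly.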

On Fri, Oct 14, 2016 at 4:44 PM, Andrew Purtell  wrote:
> Thanks Ted, that would be a nice contribution, thank you.
>
>
> On Fri, Oct 14, 2016 at 12:07 PM, Apekshit Sharma  wrote:
>
>> @Ted, here's the old jira, HBASE-14167. Use that.
>>
>> On Fri, Oct 14, 2016 at 12:02 PM, Ted Yu  wrote:
>>
>> > I just ran the tests in hbase-spark module using 'mvn verify'.
>> >
>> > All passed.
>> >
>> > I am testing a patch locally where hbase-spark tests are run in test
>> phase.
>> >
>> > If the tests pass, I will log a JIRA.
>> >
>> > Thanks
>> >
>> > > On Oct 14, 2016, at 11:41 AM, Andrew Purtell 
>> > wrote:
>> > >
>> > > The hbase-spark integration tests run (and fail) for me locally
>> whenever
>> > I
>> > > build master with 'mvn clean install -DskipITs' .
>> > >
>> > > HBaseConnectionCacheSuite:
>> > > - all test cases *** FAILED ***
>> > >  2 did not equal 1 (HBaseConnectionCacheSuite.scala:92)
>> > >
>> > > Saw it but had to ignore/triage to get something else done.
>> > >
>> > > We have a weird situation where integration tests run when they
>> shouldn't
>> > > locally yet no tests run at all for patch process?
>> > >
>> > > I would like to see Spark behave like the other modules. I remember
>> > filing
>> > > a JIRA asking that hbase-spark honor -DskipITs. It still doesn't.
>> > > Meanwhile, it does its own thing with '-DskipSparkTests', which is not
>> > > appropriate given that none of the other modules have their own
>> distinct
>> > > control parameters. There also doesn't seem to be a distinction between
>> > > unit tests and integration tests. The 'test' target does nothing.
>> > > Everything happens during the 'integration-test' phase. Is this a Spark
>> > > limitation?
>> > >
>> > >
>> > >> On Fri, Oct 14, 2016 at 11:27 AM, Sean Busbey 
>> > wrote:
>> > >>
>> > >> Do the HBase Spark tests only run during the maven verify command?
>> > >> We'll need to update our personality to say that that command should
>> > >> be used for unit tests when in the hbase spark module. ugh.
>> > >>
>> > >> On Thu, Oct 13, 2016 at 7:42 PM, Apekshit Sharma 
>> > >> wrote:
>> > >>> Our patch process isn't running hbase-spark tests. See this for
>> > example:
>> > >>>
>> > >>> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/
>> > >>> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/
>> > >> artifact/patchprocess/patch-unit-hbase-spark.txt/*view*/
>> > >>>
>> > >>> Found it when trying to debug cause of trunk failures. Part of the
>> > cause
>> > >> is
>> > >>> hbase-spark's HBaseConnectionCacheSuite test failure (
>> > >>> https://builds.apache.org/view/All/job/HBase-Trunk_
>> > >> matrix/jdk=JDK%201.8%20(latest),label=yahoo-not-h2/1776/consoleFull
>> > >> )
>> > >>> which was added in HBASE-16638. However, to be fair, QA was green and
>> > >>> reported passing hbase-spark tests for that jira.
>> > >>>
>> >  On Mon, Sep 19, 2016 at 12:57 PM, Stack  wrote:
>> > 
>> >  childCustomWorkspace seems to be just the ticket. Nice find Appy.
>> >  St.Ack
>> > 
>> >  On Mon, Sep 19, 2016 at 10:03 AM, Sean Busbey 
>> > >> wrote:
>> > 
>> > > Option 2c looks to be working really well. Thanks for tackling this
>> > >> Appy!
>> > >
>> > > We still have some failures on the master build, but it looks like
>> > > actual problems (or perhaps a flakey). There are several passing
>> > > builds.
>> > >
>> > > This should be pretty easy to replicate on the other jobs. I don't
>> > see
>> > > a downside. Anyone else have concerns?
>> > >
>> > >
>> > > On Fri, Sep 16, 2016 at 6:15 PM, Apekshit Sharma <
>> [email protected]>
>> > > wrote:
>> > >> So this all started with spaces-in-path issue, right?  I think it
>> > >> has
>> > >> gobbled up a lot of time of a lot of people.
>> > >> Let's discuss our options and try to fix it for good. Here are
>> what
>> > >> i
>> >  can
>> > >> think of, and my opinion about them.
>> > >>
>> > >> 1. Not use matrix build
>> > >>  Temporary fix. Not preferred since not applicable to
>> other
>> > >> branches' builds.
>> > >>
>> > >> 2. Use matrix build
>> > >>
>> > >>  a. Use tool environment trick
>> > >>   I applied this few days ago. Seemed to work until we
>> > > discovered
>> > >> scalatest issue. While the solution looks legitimate, we can't
>> trust
>> >  that
>> > >> all tools will use JAVA_HOME instead of directly using java
>> command.
>> > >>
>> > >>  b. Use JDK axis
>> > >>  Doesn't work right now. I don't have much idea of what's
>> > >> the
>> > > cost
>> > >> for fixing 

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-10-14 Thread Andrew Purtell
Thanks Ted, that would be a nice contribution, thank you.


On Fri, Oct 14, 2016 at 12:07 PM, Apekshit Sharma  wrote:

> @Ted, here's the old jira, HBASE-14167. Use that.
>
> On Fri, Oct 14, 2016 at 12:02 PM, Ted Yu  wrote:
>
> > I just ran the tests in hbase-spark module using 'mvn verify'.
> >
> > All passed.
> >
> > I am testing a patch locally where hbase-spark tests are run in test
> phase.
> >
> > If the tests pass, I will log a JIRA.
> >
> > Thanks
> >
> > > On Oct 14, 2016, at 11:41 AM, Andrew Purtell 
> > wrote:
> > >
> > > The hbase-spark integration tests run (and fail) for me locally
> whenever
> > I
> > > build master with 'mvn clean install -DskipITs' .
> > >
> > > HBaseConnectionCacheSuite:
> > > - all test cases *** FAILED ***
> > >  2 did not equal 1 (HBaseConnectionCacheSuite.scala:92)
> > >
> > > Saw it but had to ignore/triage to get something else done.
> > >
> > > We have a weird situation where integration tests run when they
> shouldn't
> > > locally yet no tests run at all for patch process?
> > >
> > > I would like to see Spark behave like the other modules. I remember
> > filing
> > > a JIRA asking that hbase-spark honor -DskipITs. It still doesn't.
> > > Meanwhile, it does its own thing with '-DskipSparkTests', which is not
> > > appropriate given that none of the other modules have their own
> distinct
> > > control parameters. There also doesn't seem to be a distinction between
> > > unit tests and integration tests. The 'test' target does nothing.
> > > Everything happens during the 'integration-test' phase. Is this a Spark
> > > limitation?
> > >
> > >
> > >> On Fri, Oct 14, 2016 at 11:27 AM, Sean Busbey 
> > wrote:
> > >>
> > >> Do the HBase Spark tests only run during the maven verify command?
> > >> We'll need to update our personality to say that that command should
> > >> be used for unit tests when in the hbase spark module. ugh.
> > >>
> > >> On Thu, Oct 13, 2016 at 7:42 PM, Apekshit Sharma 
> > >> wrote:
> > >>> Our patch process isn't running hbase-spark tests. See this for
> > example:
> > >>>
> > >>> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/
> > >>> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/
> > >> artifact/patchprocess/patch-unit-hbase-spark.txt/*view*/
> > >>>
> > >>> Found it when trying to debug cause of trunk failures. Part of the
> > cause
> > >> is
> > >>> hbase-spark's HBaseConnectionCacheSuite test failure (
> > >>> https://builds.apache.org/view/All/job/HBase-Trunk_
> > >> matrix/jdk=JDK%201.8%20(latest),label=yahoo-not-h2/1776/consoleFull
> > >> )
> > >>> which was added in HBASE-16638. However, to be fair, QA was green and
> > >>> reported passing hbase-spark tests for that jira.
> > >>>
> >  On Mon, Sep 19, 2016 at 12:57 PM, Stack  wrote:
> > 
> >  childCustomWorkspace seems to be just the ticket. Nice find Appy.
> >  St.Ack
> > 
> >  On Mon, Sep 19, 2016 at 10:03 AM, Sean Busbey 
> > >> wrote:
> > 
> > > Option 2c looks to be working really well. Thanks for tackling this
> > >> Appy!
> > >
> > > We still have some failures on the master build, but it looks like
> > > actual problems (or perhaps a flakey). There are several passing
> > > builds.
> > >
> > > This should be pretty easy to replicate on the other jobs. I don't
> > see
> > > a downside. Anyone else have concerns?
> > >
> > >
> > > On Fri, Sep 16, 2016 at 6:15 PM, Apekshit Sharma <
> [email protected]>
> > > wrote:
> > >> So this all started with spaces-in-path issue, right?  I think it
> > >> has
> > >> gobbled up a lot of time of a lot of people.
> > >> Let's discuss our options and try to fix it for good. Here are
> what
> > >> i
> >  can
> > >> think of, and my opinion about them.
> > >>
> > >> 1. Not use matrix build
> > >>  Temporary fix. Not preferred since not applicable to
> other
> > >> branches' builds.
> > >>
> > >> 2. Use matrix build
> > >>
> > >>  a. Use tool environment trick
> > >>   I applied this few days ago. Seemed to work until we
> > > discovered
> > >> scalatest issue. While the solution looks legitimate, we can't
> trust
> >  that
> > >> all tools will use JAVA_HOME instead of directly using java
> command.
> > >>
> > >>  b. Use JDK axis
> > >>  Doesn't work right now. I don't have much idea of what's
> > >> the
> > > cost
> > >> for fixing it.
> > >>
> > >>  c. Use JDK axis with custom child workspace
> > >>
> > >> https://github.com/jenkinsci/matrix-project-plugin/blob/
> > > master/src/main/resources/hudson/matrix/MatrixProject/
> > > help-childCustomWorkspace.html
> > >>  Just found this one, and it might solve things for good.
> I
> >  have
> > >> updated the job t

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-10-14 Thread Apekshit Sharma
@Ted, here's the old jira, HBASE-14167. Use that.

On Fri, Oct 14, 2016 at 12:02 PM, Ted Yu  wrote:

> I just ran the tests in hbase-spark module using 'mvn verify'.
>
> All passed.
>
> I am testing a patch locally where hbase-spark tests are run in test phase.
>
> If the tests pass, I will log a JIRA.
>
> Thanks
>
> > On Oct 14, 2016, at 11:41 AM, Andrew Purtell 
> wrote:
> >
> > The hbase-spark integration tests run (and fail) for me locally whenever
> I
> > build master with 'mvn clean install -DskipITs' .
> >
> > HBaseConnectionCacheSuite:
> > - all test cases *** FAILED ***
> >  2 did not equal 1 (HBaseConnectionCacheSuite.scala:92)
> >
> > Saw it but had to ignore/triage to get something else done.
> >
> > We have a weird situation where integration tests run when they shouldn't
> > locally yet no tests run at all for patch process?
> >
> > I would like to see Spark behave like the other modules. I remember
> filing
> > a JIRA asking that hbase-spark honor -DskipITs. It still doesn't.
> > Meanwhile, it does its own thing with '-DskipSparkTests', which is not
> > appropriate given that none of the other modules have their own distinct
> > control parameters. There also doesn't seem to be a distinction between
> > unit tests and integration tests. The 'test' target does nothing.
> > Everything happens during the 'integration-test' phase. Is this a Spark
> > limitation?
> >
> >
> >> On Fri, Oct 14, 2016 at 11:27 AM, Sean Busbey 
> wrote:
> >>
> >> Do the HBase Spark tests only run during the maven verify command?
> >> We'll need to update our personality to say that that command should
> >> be used for unit tests when in the hbase spark module. ugh.
> >>
> >> On Thu, Oct 13, 2016 at 7:42 PM, Apekshit Sharma 
> >> wrote:
> >>> Our patch process isn't running hbase-spark tests. See this for
> example:
> >>>
> >>> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/
> >>> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/
> >> artifact/patchprocess/patch-unit-hbase-spark.txt/*view*/
> >>>
> >>> Found it when trying to debug cause of trunk failures. Part of the
> cause
> >> is
> >>> hbase-spark's HBaseConnectionCacheSuite test failure (
> >>> https://builds.apache.org/view/All/job/HBase-Trunk_
> >> matrix/jdk=JDK%201.8%20(latest),label=yahoo-not-h2/1776/consoleFull
> >> )
> >>> which was added in HBASE-16638. However, to be fair, QA was green and
> >>> reported passing hbase-spark tests for that jira.
> >>>
>  On Mon, Sep 19, 2016 at 12:57 PM, Stack  wrote:
> 
>  childCustomWorkspace seems to be just the ticket. Nice find Appy.
>  St.Ack
> 
>  On Mon, Sep 19, 2016 at 10:03 AM, Sean Busbey 
> >> wrote:
> 
> > Option 2c looks to be working really well. Thanks for tackling this
> >> Appy!
> >
> > We still have some failures on the master build, but it looks like
> > actual problems (or perhaps a flakey). There are several passing
> > builds.
> >
> > This should be pretty easy to replicate on the other jobs. I don't
> see
> > a downside. Anyone else have concerns?
> >
> >
> > On Fri, Sep 16, 2016 at 6:15 PM, Apekshit Sharma 
> > wrote:
> >> So this all started with spaces-in-path issue, right?  I think it
> >> has
> >> gobbled up a lot of time of a lot of people.
> >> Let's discuss our options and try to fix it for good. Here are what
> >> i
>  can
> >> think of, and my opinion about them.
> >>
> >> 1. Not use matrix build
> >>  Temporary fix. Not preferred since not applicable to other
> >> branches' builds.
> >>
> >> 2. Use matrix build
> >>
> >>  a. Use tool environment trick
> >>   I applied this few days ago. Seemed to work until we
> > discovered
> >> scalatest issue. While the solution looks legitimate, we can't trust
>  that
> >> all tools will use JAVA_HOME instead of directly using java command.
> >>
> >>  b. Use JDK axis
> >>  Doesn't work right now. I don't have much idea of what's
> >> the
> > cost
> >> for fixing it.
> >>
> >>  c. Use JDK axis with custom child workspace
> >>
> >> https://github.com/jenkinsci/matrix-project-plugin/blob/
> > master/src/main/resources/hudson/matrix/MatrixProject/
> > help-childCustomWorkspace.html
> >>  Just found this one, and it might solve things for good. I
>  have
> >> updated the job to use this. Let's see how it works.
> >>
> >> What do others think?
> >>
> >>> On Fri, Sep 16, 2016 at 3:31 PM, Stack  wrote:
> >>>
> >>> The profile (or define) skipSparkTests looks like it will skip
> >> spark
> > tests.
> >>> Setting skipIntegrationTests to true will skip it.
> >>> S
> >>>
> >>> On Fri, Sep 16, 2016 at 1:40 PM, Dima Spivak <
> >> dimaspi...@ap

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-10-14 Thread Ted Yu
I just ran the tests in hbase-spark module using 'mvn verify'. 

All passed. 

I am testing a patch locally where hbase-spark tests are run in the test phase.

If the tests pass, I will log a JIRA. 

Thanks

> On Oct 14, 2016, at 11:41 AM, Andrew Purtell  wrote:
> 
> The hbase-spark integration tests run (and fail) for me locally whenever I
> build master with 'mvn clean install -DskipITs' .
> 
> HBaseConnectionCacheSuite:
> - all test cases *** FAILED ***
>  2 did not equal 1 (HBaseConnectionCacheSuite.scala:92)
> 
> Saw it but had to ignore/triage to get something else done.
> 
> We have a weird situation where integration tests run when they shouldn't
> locally yet no tests run at all for patch process?
> 
> I would like to see Spark behave like the other modules. I remember filing
> a JIRA asking that hbase-spark honor -DskipITs. It still doesn't.
> Meanwhile, it does its own thing with '-DskipSparkTests', which is not
> appropriate given that none of the other modules have their own distinct
> control parameters. There also doesn't seem to be a distinction between
> unit tests and integration tests. The 'test' target does nothing.
> Everything happens during the 'integration-test' phase. Is this a Spark
> limitation?
> 
> 
>> On Fri, Oct 14, 2016 at 11:27 AM, Sean Busbey  wrote:
>> 
>> Do the HBase Spark tests only run during the maven verify command?
>> We'll need to update our personality to say that that command should
>> be used for unit tests when in the hbase spark module. ugh.
>> 
>> On Thu, Oct 13, 2016 at 7:42 PM, Apekshit Sharma 
>> wrote:
>>> Our patch process isn't running hbase-spark tests. See this for example:
>>> 
>>> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/
>>> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/
>> artifact/patchprocess/patch-unit-hbase-spark.txt/*view*/
>>> 
>>> Found it when trying to debug cause of trunk failures. Part of the cause
>> is
>>> hbase-spark's HBaseConnectionCacheSuite test failure (
>>> https://builds.apache.org/view/All/job/HBase-Trunk_
>> matrix/jdk=JDK%201.8%20(latest),label=yahoo-not-h2/1776/consoleFull
>> 
>> )
>>> which was added in HBASE-16638. However, to be fair, QA was green and
>>> reported passing hbase-spark tests for that jira.
>>> 
 On Mon, Sep 19, 2016 at 12:57 PM, Stack  wrote:
 
 childCustomWorkspace seems to be just the ticket. Nice find Appy.
 St.Ack
 
 On Mon, Sep 19, 2016 at 10:03 AM, Sean Busbey 
>> wrote:
 
> Option 2c looks to be working really well. Thanks for tackling this
>> Appy!
> 
> We still have some failures on the master build, but it looks like
> actual problems (or perhaps a flakey). There are several passing
> builds.
> 
> This should be pretty easy to replicate on the other jobs. I don't see
> a downside. Anyone else have concerns?
> 
> 
> On Fri, Sep 16, 2016 at 6:15 PM, Apekshit Sharma 
> wrote:
>> So this all started with spaces-in-path issue, right?  I think it
>> has
>> gobbled up a lot of time of a lot of people.
>> Let's discuss our options and try to fix it for good. Here are what
>> i
 can
>> think of, and my opinion about them.
>> 
>> 1. Not use matrix build
>>  Temporary fix. Not preferred since not applicable to other
>> branches' builds.
>> 
>> 2. Use matrix build
>> 
>>  a. Use tool environment trick
>>   I applied this few days ago. Seemed to work until we
> discovered
>> scalatest issue. While the solution looks legitimate, we can't trust
 that
>> all tools will use JAVA_HOME instead of directly using java command.
>> 
>>  b. Use JDK axis
>>  Doesn't work right now. I don't have much idea of what's
>> the
> cost
>> for fixing it.
>> 
>>  c. Use JDK axis with custom child workspace
>> 
>> https://github.com/jenkinsci/matrix-project-plugin/blob/
> master/src/main/resources/hudson/matrix/MatrixProject/
> help-childCustomWorkspace.html
>>  Just found this one, and it might solve things for good. I
 have
>> updated the job to use this. Let's see how it works.
>> 
>> What do others think?
>> 
>>> On Fri, Sep 16, 2016 at 3:31 PM, Stack  wrote:
>>> 
>>> The profile (or define) skipSparkTests looks like it will skip
>> spark
> tests.
>>> Setting skipIntegrationTests to true will skip it.
>>> S
>>> 
>>> On Fri, Sep 16, 2016 at 1:40 PM, Dima Spivak <
>> [email protected]>
>>> wrote:
>>> 
 Doesn't seem we need a matrix project for master anymore since
>> we're
> just
 doing JDK 8 now, right? Also, it looks like the hbase-spark
 integration-test phase is what's tripping up the build. Why not
>> just
 temporarily disable that to unblock testing?

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-10-14 Thread Andrew Purtell
The hbase-spark integration tests run (and fail) for me locally whenever I
build master with 'mvn clean install -DskipITs'.

HBaseConnectionCacheSuite:
- all test cases *** FAILED ***
  2 did not equal 1 (HBaseConnectionCacheSuite.scala:92)

Saw it but had to ignore/triage to get something else done.

We have a weird situation where integration tests run when they shouldn't
locally yet no tests run at all for patch process?

I would like to see Spark behave like the other modules. I remember filing
a JIRA asking that hbase-spark honor -DskipITs. It still doesn't.
Meanwhile, it does its own thing with '-DskipSparkTests', which is not
appropriate given that none of the other modules have their own distinct
control parameters. There also doesn't seem to be a distinction between
unit tests and integration tests. The 'test' target does nothing.
Everything happens during the 'integration-test' phase. Is this a Spark
limitation?
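
To make the phase ordering concrete, here is a minimal sketch of why an 'install' build reaches the hbase-spark tests while a 'test'-only run would not. It assumes, as described above, that the module does all of its testing in the 'integration-test' phase. The phase list is an abbreviated subset of Maven's default lifecycle, and skip flags such as -DskipITs are deliberately not modeled.

```python
# Selected phases of Maven's default lifecycle, in execution order.
# The real lifecycle has more phases; the ones left out do not change the point.
DEFAULT_LIFECYCLE = [
    "validate", "compile", "test-compile", "test", "package",
    "pre-integration-test", "integration-test", "post-integration-test",
    "verify", "install", "deploy",
]


def phases_run(target: str) -> list[str]:
    """Return every phase executed when Maven is asked for `target`."""
    return DEFAULT_LIFECYCLE[: DEFAULT_LIFECYCLE.index(target) + 1]


def reaches(target: str, phase: str) -> bool:
    """True if building up to `target` passes through `phase`."""
    return phase in phases_run(target)


if __name__ == "__main__":
    for target in ("test", "verify", "install"):
        print(target, reaches(target, "integration-test"))
```

Under that assumption, anything invoked as 'mvn test' never gets as far as the phase where the hbase-spark tests are bound, which would be consistent with the patch process running no tests for the module.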


On Fri, Oct 14, 2016 at 11:27 AM, Sean Busbey  wrote:

> Do the HBase Spark tests only run during the maven verify command?
> We'll need to update our personality to say that that command should
> be used for unit tests when in the hbase spark module. ugh.
>
> On Thu, Oct 13, 2016 at 7:42 PM, Apekshit Sharma 
> wrote:
> > Our patch process isn't running hbase-spark tests. See this for example:
> >
> > https://builds.apache.org/job/PreCommit-HBASE-Build/3842/
> > https://builds.apache.org/job/PreCommit-HBASE-Build/3842/
> artifact/patchprocess/patch-unit-hbase-spark.txt/*view*/
> >
> > Found it when trying to debug cause of trunk failures. Part of the cause
> is
> > hbase-spark's HBaseConnectionCacheSuite test failure (
> > https://builds.apache.org/view/All/job/HBase-Trunk_
> matrix/jdk=JDK%201.8%20(latest),label=yahoo-not-h2/1776/consoleFull
> 
> )
> > which was added in HBASE-16638. However, to be fair, QA was green and
> > reported passing hbase-spark tests for that jira.
> >
> > On Mon, Sep 19, 2016 at 12:57 PM, Stack  wrote:
> >
> >> childCustomWorkspace seems to be just the ticket. Nice find Appy.
> >> St.Ack
> >>
> >> On Mon, Sep 19, 2016 at 10:03 AM, Sean Busbey 
> wrote:
> >>
> >> > Option 2c looks to be working really well. Thanks for tackling this
> Appy!
> >> >
> >> > We still have some failures on the master build, but it looks like
> >> > actual problems (or perhaps a flakey). There are several passing
> >> > builds.
> >> >
> >> > This should be pretty easy to replicate on the other jobs. I don't see
> >> > a downside. Anyone else have concerns?
> >> >
> >> >
> >> > On Fri, Sep 16, 2016 at 6:15 PM, Apekshit Sharma 
> >> > wrote:
> >> > > So this all started with spaces-in-path issue, right?  I think it
> has
> >> > > gobbled up a lot of time of a lot of people.
> >> > > Let's discuss our options and try to fix it for good. Here are what
> i
> >> can
> >> > > think of, and my opinion about them.
> >> > >
> >> > > 1. Not use matrix build
> >> > >   Temporary fix. Not preferred since not applicable to other
> >> > > branches' builds.
> >> > >
> >> > > 2. Use matrix build
> >> > >
> >> > >   a. Use tool environment trick
> >> > >I applied this few days ago. Seemed to work until we
> >> > discovered
> >> > > scalatest issue. While the solution looks legitimate, we can't trust
> >> that
> >> > > all tools will use JAVA_HOME instead of directly using java command.
> >> > >
> >> > >   b. Use JDK axis
> >> > >   Doesn't work right now. I don't have much idea of what's
> the
> >> > cost
> >> > > for fixing it.
> >> > >
> >> > >   c. Use JDK axis with custom child workspace
> >> > >
> >> > > https://github.com/jenkinsci/matrix-project-plugin/blob/
> >> > master/src/main/resources/hudson/matrix/MatrixProject/
> >> > help-childCustomWorkspace.html
> >> > >   Just found this one, and it might solve things for good. I
> >> have
> >> > > updated the job to use this. Let's see how it works.
> >> > >
> >> > > What do others think?
> >> > >
> >> > > On Fri, Sep 16, 2016 at 3:31 PM, Stack  wrote:
> >> > >
> >> > >> The profile (or define) skipSparkTests looks like it will skip
> spark
> >> > tests.
> >> > >> Setting skipIntegrationTests to true will skip it.
> >> > >> S
> >> > >>
> >> > >> On Fri, Sep 16, 2016 at 1:40 PM, Dima Spivak <
> [email protected]>
> >> > >> wrote:
> >> > >>
> >> > >> > Doesn't seem we need a matrix project for master anymore since
> we're
> >> > just
> >> > >> > doing JDK 8 now, right? Also, it looks like the hbase-spark
> >> > >> > integration-test phase is what's tripping up the build. Why not
> just
> >> > >> > temporarily disable that to unblock testing?
> >> > >> >
> >> > >> > On Friday, September 16, 2016, Apekshit Sharma <
> [email protected]>
> >> > >> wrote:
> >> > >> >
> >> > >> > > So the issue is, we set JAVA_HOME to jdk8 based on matrix param
> >> and
> >> > >> using
> >> > >> > > tool enviro

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-10-14 Thread Sean Busbey
Do the HBase Spark tests only run during the maven verify command?
We'll need to update our personality to say that that command should
be used for unit tests when in the hbase spark module. ugh.
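
A rough sketch of the kind of per-module rule this would mean. The real personality is a Yetus shell script, so the Python below is only an illustration; the mapping and function names are hypothetical, and only the 'mvn -pl <module> <goal>' invocation is standard Maven.

```python
# Hypothetical per-module override; not the actual personality code.
DEFAULT_UNIT_TEST_GOAL = "test"
MODULE_UNIT_TEST_GOALS = {"hbase-spark": "verify"}


def unit_test_command(module: str) -> list[str]:
    """Return the Maven command used to run unit tests for one module."""
    goal = MODULE_UNIT_TEST_GOALS.get(module, DEFAULT_UNIT_TEST_GOAL)
    return ["mvn", "-pl", module, goal]


if __name__ == "__main__":
    print(unit_test_command("hbase-server"))
    print(unit_test_command("hbase-spark"))
```

Keeping the exception in one table makes it easy to delete once hbase-spark runs its tests in the 'test' phase like the other modules.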

On Thu, Oct 13, 2016 at 7:42 PM, Apekshit Sharma  wrote:
> Our patch process isn't running hbase-spark tests. See this for example:
>
> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/
> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/artifact/patchprocess/patch-unit-hbase-spark.txt/*view*/
>
> Found it when trying to debug cause of trunk failures. Part of the cause is
> hbase-spark's HBaseConnectionCacheSuite test failure (
> https://builds.apache.org/view/All/job/HBase-Trunk_matrix/jdk=JDK%201.8%20(latest),label=yahoo-not-h2/1776/consoleFull)
> which was added in HBASE-16638. However, to be fair, QA was green and
> reported passing hbase-spark tests for that jira.
>
> On Mon, Sep 19, 2016 at 12:57 PM, Stack  wrote:
>
>> childCustomWorkspace seems to be just the ticket. Nice find Appy.
>> St.Ack
>>
>> On Mon, Sep 19, 2016 at 10:03 AM, Sean Busbey  wrote:
>>
>> > Option 2c looks to be working really well. Thanks for tackling this Appy!
>> >
>> > We still have some failures on the master build, but it looks like
>> > actual problems (or perhaps a flakey). There are several passing
>> > builds.
>> >
>> > This should be pretty easy to replicate on the other jobs. I don't see
>> > a downside. Anyone else have concerns?
>> >
>> >
>> > On Fri, Sep 16, 2016 at 6:15 PM, Apekshit Sharma 
>> > wrote:
>> > > So this all started with spaces-in-path issue, right?  I think it has
>> > > gobbled up a lot of time of a lot of people.
>> > > Let's discuss our options and try to fix it for good. Here are what i
>> can
>> > > think of, and my opinion about them.
>> > >
>> > > 1. Not use matrix build
>> > >   Temporary fix. Not preferred since not applicable to other
>> > > branches' builds.
>> > >
>> > > 2. Use matrix build
>> > >
>> > >   a. Use tool environment trick
>> > >I applied this few days ago. Seemed to work until we
>> > discovered
>> > > scalatest issue. While the solution looks legitimate, we can't trust
>> that
>> > > all tools will use JAVA_HOME instead of directly using java command.
>> > >
>> > >   b. Use JDK axis
>> > >   Doesn't work right now. I don't have much idea of what's the
>> > cost
>> > > for fixing it.
>> > >
>> > >   c. Use JDK axis with custom child workspace
>> > >
>> > > https://github.com/jenkinsci/matrix-project-plugin/blob/
>> > master/src/main/resources/hudson/matrix/MatrixProject/
>> > help-childCustomWorkspace.html
>> > >   Just found this one, and it might solve things for good. I
>> have
>> > > updated the job to use this. Let's see how it works.
>> > >
>> > > What do others think?
>> > >
>> > > On Fri, Sep 16, 2016 at 3:31 PM, Stack  wrote:
>> > >
>> > >> The profile (or define) skipSparkTests looks like it will skip spark
>> > tests.
>> > >> Setting skipIntegrationTests to true will skip it.
>> > >> S
>> > >>
>> > >> On Fri, Sep 16, 2016 at 1:40 PM, Dima Spivak 
>> > >> wrote:
>> > >>
>> > >> > Doesn't seem we need a matrix project for master anymore since we're
>> > just
>> > >> > doing JDK 8 now, right? Also, it looks like the hbase-spark
>> > >> > integration-test phase is what's tripping up the build. Why not just
>> > >> > temporarily disable that to unblock testing?
>> > >> >
>> > >> > On Friday, September 16, 2016, Apekshit Sharma 
>> > >> wrote:
>> > >> >
>> > >> > > So the issue is, we set JAVA_HOME to jdk8 based on matrix param
>> and
>> > >> using
>> > >> > > tool environment. Since mvn uses the env variable, it compiles
>> with
>> > jdk
>> > >> > 8.
>> > >> > > But i suspect that scalatest isn't using the env variable, instead
>> > it
>> > >> > might
>> > >> > > be directly using 'java' cmd, which can be jdk 7 or 8, and can
>> vary
>> > by
>> > >> > > machine.
>> > >> > > The build succeeds if 'java' points to jdk 8, otherwise it fails.
>> > >> > > Note that we didn't have this issue earlier since we were using
>> > jenkins
>> > >> > > 'JDK' axis which would set the 'java' to the appropriate version.
>> > But
>> > >> > that
>> > >> > > method had a spaces-in-path issue, so I had to change it.
>> > >> > >
>> > >> > >
>> > >> > > On Fri, Sep 16, 2016 at 3:46 AM, aman poonia <
>> > [email protected]
>> > >> > > >
>> > >> > > wrote:
>> > >> > >
>> > >> > > > I am not sure if this will help. But it looks like it is because
>> > of
>> > >> > > version
>> > >> > > > mismatch, that is, it is expecting JDK1.7 and we are compiling
>> > with
>> > >> > > jdk1.8.
>> > >> > > > That means there is some library which has to be compiled with
>> > jdk8
>> > >> or
>> > >> > > > needs to be updated to a jdk8 compatible version.
>> > >> > > >
>> > >> > > >
>> > >> > > > --
>> > >> > > > *With Regards:-*
>> > >> > > > *Aman Poonia*
>> > >> > > >
>> > >> > > > On Fri, Sep 16, 2016 at 2:40 AM, Apekshit Sharma <
>> > [email protected]
>> > >> > > >
>>

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-10-13 Thread Apekshit Sharma
Our patch process isn't running hbase-spark tests. See this for example:

https://builds.apache.org/job/PreCommit-HBASE-Build/3842/
https://builds.apache.org/job/PreCommit-HBASE-Build/3842/artifact/patchprocess/patch-unit-hbase-spark.txt/*view*/

Found it while debugging the cause of the trunk failures. Part of the cause is
hbase-spark's HBaseConnectionCacheSuite test failure (
https://builds.apache.org/view/All/job/HBase-Trunk_matrix/jdk=JDK%201.8%20(latest),label=yahoo-not-h2/1776/consoleFull)
which was added in HBASE-16638. However, to be fair, QA was green and
reported passing hbase-spark tests for that jira.
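
One way to catch a "green but nothing ran" result is to check the unit-test log for evidence that tests actually executed. A minimal sketch follows, assuming the log contains surefire-style "Tests run: N, Failures: ..." summary lines. Scalatest output may be formatted differently, so the pattern is an assumption and would need adjusting for hbase-spark.

```python
import re

# Surefire-style summary line; the scalatest format may differ (assumption).
SUMMARY = re.compile(
    r"Tests run: (\d+), Failures: (\d+), Errors: (\d+), Skipped: (\d+)"
)


def ran_any_tests(log_text: str) -> bool:
    """True if at least one summary line reports a non-zero test count."""
    return any(int(m.group(1)) > 0 for m in SUMMARY.finditer(log_text))


def verdict(build_succeeded: bool, log_text: str) -> str:
    """Separate a real pass from a build that succeeded without running tests."""
    if not build_succeeded:
        return "failed"
    return "passed" if ran_any_tests(log_text) else "no tests ran"


if __name__ == "__main__":
    print(verdict(True, "BUILD SUCCESS"))
    print(verdict(True, "Tests run: 12, Failures: 0, Errors: 0, Skipped: 1"))
```

Reporting "no tests ran" as its own outcome would have flagged the hbase-spark case instead of showing it as a pass.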

On Mon, Sep 19, 2016 at 12:57 PM, Stack  wrote:

> childCustomWorkspace seems to be just the ticket. Nice find Appy.
> St.Ack
>
> On Mon, Sep 19, 2016 at 10:03 AM, Sean Busbey  wrote:
>
> > Option 2c looks to be working really well. Thanks for tackling this Appy!
> >
> > We still have some failures on the master build, but it looks like
> > actual problems (or perhaps a flakey). There are several passing
> > builds.
> >
> > This should be pretty easy to replicate on the other jobs. I don't see
> > a downside. Anyone else have concerns?
> >
> >
> > On Fri, Sep 16, 2016 at 6:15 PM, Apekshit Sharma 
> > wrote:
> > > So this all started with spaces-in-path issue, right?  I think it has
> > > gobbled up a lot of time of a lot of people.
> > > Let's discuss our options and try to fix it for good. Here are what i
> can
> > > think of, and my opinion about them.
> > >
> > > 1. Not use matrix build
> > >   Temporary fix. Not preferred since not applicable to other
> > > branches' builds.
> > >
> > > 2. Use matrix build
> > >
> > >   a. Use tool environment trick
> > >I applied this few days ago. Seemed to work until we
> > discovered
> > > scalatest issue. While the solution looks legitimate, we can't trust
> that
> > > all tools will use JAVA_HOME instead of directly using java command.
> > >
> > >   b. Use JDK axis
> > >   Doesn't work right now. I don't have much idea of what's the
> > cost
> > > for fixing it.
> > >
> > >   c. Use JDK axis with custom child workspace
> > >
> > > https://github.com/jenkinsci/matrix-project-plugin/blob/
> > master/src/main/resources/hudson/matrix/MatrixProject/
> > help-childCustomWorkspace.html
> > >   Just found this one, and it might solve things for good. I
> have
> > > updated the job to use this. Let's see how it works.
> > >
> > > What do others think?
> > >
> > > On Fri, Sep 16, 2016 at 3:31 PM, Stack  wrote:
> > >
> > >> The profile (or define) skipSparkTests looks like it will skip spark
> > tests.
> > >> Setting skipIntegrationTests to true will skip it.
> > >> S
> > >>
> > >> On Fri, Sep 16, 2016 at 1:40 PM, Dima Spivak 
> > >> wrote:
> > >>
> > >> > Doesn't seem we need a matrix project for master anymore since we're
> > just
> > >> > doing JDK 8 now, right? Also, it looks like the hbase-spark
> > >> > integration-test phase is what's tripping up the build. Why not just
> > >> > temporarily disable that to unblock testing?
> > >> >
> > >> > On Friday, September 16, 2016, Apekshit Sharma 
> > >> wrote:
> > >> >
> > >> > > So the issue is, we set JAVA_HOME to jdk8 based on matrix param
> and
> > >> using
> > >> > > tool environment. Since mvn uses the env variable, it compiles
> with
> > jdk
> > >> > 8.
> > >> > > But i suspect that scalatest isn't using the env variable, instead
> > it
> > >> > might
> > >> > > be directly using 'java' cmd, which can be jdk 7 or 8, and can
> vary
> > by
> > >> > > machine.
> > >> > > Build succeed if 'java' points to jdk 8, otherwise fails.
> > >> > > Note that we didn't have this issue earlier since we were using
> > jenkins
> > >> > > 'JDK' axis which would set the 'java' to the appropriate version.
> > But
> > >> > that
> > >> > > methods had spaces-in-path issue, so i had to change it.
> > >> > >
> > >> > >
> > >> > > On Fri, Sep 16, 2016 at 3:46 AM, aman poonia <
> > [email protected]
> > >> > > >
> > >> > > wrote:
> > >> > >
> > >> > > > I am not sure if this will help. But it looks like it is because
> > of
> > >> > > version
> > >> > > > mismatch, that is, it is expecting JDK1.7 and we are compiling
> > with
> > >> > > jdk1.8.
> > >> > > > That means there is some library which has to be compiled with
> > jdk8
> > >> or
> > >> > > > needs to be updated to a jdk8 compatible version.
> > >> > > >
> > >> > > >
> > >> > > > --
> > >> > > > *With Regards:-*
> > >> > > > *Aman Poonia*
> > >> > > >
> > >> > > > On Fri, Sep 16, 2016 at 2:40 AM, Apekshit Sharma <
> > [email protected]
> > >> > > >
> > >> > > > wrote:
> > >> > > >
> > >> > > > > Andeverything is back to red.
> > >> > > > > Because something is plaguing our builds again. :(
> > >> > > > >
> > >> > > > > If anyone knows what's problem in this case, please reply on
> > this
> > >> > > thread,
> > >> > > > > otherwise i'll try to fix it later sometime today.
> > >> > > > >
> > >> > > > > [INFO] *--- scalatest-maven-plugi

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-09-19 Thread Stack
childCustomWorkspace seems to be just the ticket. Nice find Appy.
St.Ack

On Mon, Sep 19, 2016 at 10:03 AM, Sean Busbey  wrote:

> Option 2c looks to be working really well. Thanks for tackling this Appy!
>
> We still have some failures on the master build, but it looks like
> actual problems (or perhaps a flakey). There are several passing
> builds.
>
> This should be pretty easy to replicate on the other jobs. I don't see
> a downside. Anyone else have concerns?
>
>
> On Fri, Sep 16, 2016 at 6:15 PM, Apekshit Sharma 
> wrote:
> > So this all started with spaces-in-path issue, right?  I think it has
> > gobbled up a lot of time of a lot of people.
> > Let's discuss our options and try to fix it for good. Here are what i can
> > think of, and my opinion about them.
> >
> > 1. Not use matrix build
> >   Temporary fix. Not preferred since not applicable to other
> > branches' builds.
> >
> > 2. Use matrix build
> >
> >   a. Use tool environment trick
> >I applied this few days ago. Seemed to work until we
> discovered
> > scalatest issue. While the solution looks legitimate, we can't trust that
> > all tools will use JAVA_HOME instead of directly using java command.
> >
> >   b. Use JDK axix
> >   Doesn't work right now. I don't have much idea of what's the
> cost
> > for fixing it.
> >
> >   c. Use JDK axis with custom child workspace
> >
> > https://github.com/jenkinsci/matrix-project-plugin/blob/
> master/src/main/resources/hudson/matrix/MatrixProject/
> help-childCustomWorkspace.html
> >   Just found this one, and it might solve things for good. I have
> > updated the job to use this. Let's see how it works.
> >
> > What do others think?
> >
> > On Fri, Sep 16, 2016 at 3:31 PM, Stack  wrote:
> >
> >> The profile (or define) skipSparkTests looks like it will skip spark
> tests.
> >> Setting skipIntegrationTests to true will skip it.
> >> S
> >>
> >> On Fri, Sep 16, 2016 at 1:40 PM, Dima Spivak 
> >> wrote:
> >>
> >> > Doesn't seem we need a matrix project for master anymore since we're
> just
> >> > doing JDK 8 now, right? Also, it looks like the hbase-spark
> >> > integration-test phase is what's tripping up the build. Why not just
> >> > temporarily disable that to unblock testing?
> >> >
> >> > On Friday, September 16, 2016, Apekshit Sharma 
> >> wrote:
> >> >
> >> > > So the issue is, we set JAVA_HOME to jdk8 based on matrix param and
> >> using
> >> > > tool environment. Since mvn uses the env variable, it compiles with
> jdk
> >> > 8.
> >> > > But i suspect that scalatest isn't using the env variable, instead
> it
> >> > might
> >> > > be directly using 'java' cmd, which can be jdk 7 or 8, and can vary
> by
> >> > > machine.
> >> > > Build succeed if 'java' points to jdk 8, otherwise fails.
> >> > > Note that we didn't have this issue earlier since we were using
> jenkins
> >> > > 'JDK' axis which would set the 'java' to the appropriate version.
> But
> >> > that
> >> > > methods had spaces-in-path issue, so i had to change it.
> >> > >
> >> > >
> >> > > On Fri, Sep 16, 2016 at 3:46 AM, aman poonia <
> [email protected]
> >> > > >
> >> > > wrote:
> >> > >
> >> > > > I am not sure if this will help. But it looks like it is because
> of
> >> > > version
> >> > > > mismatch, that is, it is expecting JDK1.7 and we are compiling
> with
> >> > > jdk1.8.
> >> > > > That means there is some library which has to be compiled with
> jdk8
> >> or
> >> > > > needs to be updated to a jdk8 compatible version.
> >> > > >
> >> > > >
> >> > > > --
> >> > > > *With Regards:-*
> >> > > > *Aman Poonia*
> >> > > >
> >> > > > On Fri, Sep 16, 2016 at 2:40 AM, Apekshit Sharma <
> [email protected]
> >> > > >
> >> > > > wrote:
> >> > > >
> >> > > > > Andeverything is back to red.
> >> > > > > Because something is plaguing our builds again. :(
> >> > > > >
> >> > > > > If anyone knows what's problem in this case, please reply on
> this
> >> > > thread,
> >> > > > > otherwise i'll try to fix it later sometime today.
> >> > > > >
> >> > > > > [INFO] --- scalatest-maven-plugin:1.0:test (integration-test) @ hbase-spark ---
> >> > > > > Discovery starting.
> >> > > > > *** RUN ABORTED ***
> >> > > > >   java.lang.UnsupportedClassVersionError: org/apache/hadoop/hbase/spark/example/hbasecontext/JavaHBaseDistributedScan : Unsupported major.minor version 52.0
> >> > > > >   at java.lang.ClassLoader.defineClass1(Native Method)
> >> > > > >   at java.lang.ClassLoader.defineClass(ClassLoader.java:803)
> >> > > > >   at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> >> > > > >   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
> >> > > > >   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
> >> > > > >   at java.net.URLClassLoader$1.run(

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-09-19 Thread Sean Busbey
Option 2c looks to be working really well. Thanks for tackling this Appy!

We still have some failures on the master build, but they look like
actual problems (or perhaps a flaky test). There are several passing
builds.

This should be pretty easy to replicate on the other jobs. I don't see
a downside. Anyone else have concerns?


On Fri, Sep 16, 2016 at 6:15 PM, Apekshit Sharma  wrote:
> So this all started with spaces-in-path issue, right?  I think it has
> gobbled up a lot of time of a lot of people.
> Let's discuss our options and try to fix it for good. Here are what i can
> think of, and my opinion about them.
>
> 1. Not use matrix build
>   Temporary fix. Not preferred since not applicable to other
> branches' builds.
>
> 2. Use matrix build
>
>   a. Use tool environment trick
>I applied this few days ago. Seemed to work until we discovered
> scalatest issue. While the solution looks legitimate, we can't trust that
> all tools will use JAVA_HOME instead of directly using java command.
>
>   b. Use JDK axix
>   Doesn't work right now. I don't have much idea of what's the cost
> for fixing it.
>
>   c. Use JDK axis with custom child workspace
>
> https://github.com/jenkinsci/matrix-project-plugin/blob/master/src/main/resources/hudson/matrix/MatrixProject/help-childCustomWorkspace.html
>   Just found this one, and it might solve things for good. I have
> updated the job to use this. Let's see how it works.
>
> What do others think?
>
> On Fri, Sep 16, 2016 at 3:31 PM, Stack  wrote:
>
>> The profile (or define) skipSparkTests looks like it will skip spark tests.
>> Setting skipIntegrationTests to true will skip it.
>> S
>>
>> On Fri, Sep 16, 2016 at 1:40 PM, Dima Spivak 
>> wrote:
>>
>> > Doesn't seem we need a matrix project for master anymore since we're just
>> > doing JDK 8 now, right? Also, it looks like the hbase-spark
>> > integration-test phase is what's tripping up the build. Why not just
>> > temporarily disable that to unblock testing?
>> >
>> > On Friday, September 16, 2016, Apekshit Sharma 
>> wrote:
>> >
>> > > So the issue is, we set JAVA_HOME to jdk8 based on matrix param and
>> using
>> > > tool environment. Since mvn uses the env variable, it compiles with jdk
>> > 8.
>> > > But i suspect that scalatest isn't using the env variable, instead it
>> > might
>> > > be directly using 'java' cmd, which can be jdk 7 or 8, and can vary by
>> > > machine.
>> > > Build succeed if 'java' points to jdk 8, otherwise fails.
>> > > Note that we didn't have this issue earlier since we were using jenkins
>> > > 'JDK' axis which would set the 'java' to the appropriate version. But
>> > that
>> > > methods had spaces-in-path issue, so i had to change it.
>> > >
>> > >
>> > > On Fri, Sep 16, 2016 at 3:46 AM, aman poonia > > > >
>> > > wrote:
>> > >
>> > > > I am not sure if this will help. But it looks like it is because of
>> > > version
>> > > > mismatch, that is, it is expecting JDK1.7 and we are compiling with
>> > > jdk1.8.
>> > > > That means there is some library which has to be compiled with jdk8
>> or
>> > > > needs to be updated to a jdk8 compatible version.
>> > > >
>> > > >
>> > > > --
>> > > > *With Regards:-*
>> > > > *Aman Poonia*
>> > > >
>> > > > On Fri, Sep 16, 2016 at 2:40 AM, Apekshit Sharma > > > >
>> > > > wrote:
>> > > >
>> > > > > Andeverything is back to red.
>> > > > > Because something is plaguing our builds again. :(
>> > > > >
>> > > > > If anyone knows what's problem in this case, please reply on this
>> > > thread,
>> > > > > otherwise i'll try to fix it later sometime today.
>> > > > >
>> > > > > [INFO] --- scalatest-maven-plugin:1.0:test (integration-test) @ hbase-spark ---
>> > > > > Discovery starting.
>> > > > > *** RUN ABORTED ***
>> > > > >   java.lang.UnsupportedClassVersionError: org/apache/hadoop/hbase/spark/example/hbasecontext/JavaHBaseDistributedScan : Unsupported major.minor version 52.0
>> > > > >   at java.lang.ClassLoader.defineClass1(Native Method)
>> > > > >   at java.lang.ClassLoader.defineClass(ClassLoader.java:803)
>> > > > >   at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>> > > > >   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
>> > > > >   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
>> > > > >   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>> > > > >   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>> > > > >   at java.security.AccessController.doPrivileged(Native Method)
>> > > > >   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>> > > > >   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Mon, Sep 12, 2016 at 5:0

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-09-16 Thread Apekshit Sharma
So this all started with the spaces-in-path issue, right? I think it has
eaten a lot of time for a lot of people.
Let's discuss our options and try to fix it for good. Here is what I can
think of, and my opinion of each.

1. Don't use a matrix build
   Temporary fix. Not preferred, since it doesn't apply to the other
   branches' builds.

2. Use a matrix build

   a. Use the tool environment trick
      I applied this a few days ago. It seemed to work until we discovered
      the scalatest issue. While the solution looks legitimate, we can't
      trust that all tools will use JAVA_HOME instead of invoking the java
      command directly.

   b. Use the JDK axis
      Doesn't work right now. I don't have a good idea of what it would
      cost to fix.

   c. Use the JDK axis with a custom child workspace
      https://github.com/jenkinsci/matrix-project-plugin/blob/master/src/main/resources/hudson/matrix/MatrixProject/help-childCustomWorkspace.html
      Just found this one, and it might solve things for good. I have
      updated the job to use it. Let's see how it works.

What do others think?
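
The thread does not spell out exactly how the spaces-in-path issue showed up, so the following is only a small illustration of the general failure mode, not a reproduction of the Jenkins problem. It assumes a hypothetical tool path with a space in it and uses Python's shlex module to show what happens when such a path is passed to a shell-style command line with and without quoting.

```python
import shlex

# Hypothetical tool path containing a space; not the actual path on the
# Apache build machines.
java_path = "/home/jenkins/tools/JDK 1.8/bin/java"

# Unquoted, a shell-style split breaks the path into two words, so the
# "command" becomes /home/jenkins/tools/JDK, which does not exist.
unquoted = shlex.split(java_path + " -version")

# Quoted, the path survives as a single argument.
quoted = shlex.split(shlex.quote(java_path) + " -version")

print(unquoted)
print(quoted)
```

Any script or plugin in the build chain that builds a command line from the workspace or JDK path without quoting would hit the first case, which is one reason a space-free custom child workspace is attractive.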

On Fri, Sep 16, 2016 at 3:31 PM, Stack  wrote:

> The profile (or define) skipSparkTests looks like it will skip spark tests.
> Setting skipIntegrationTests to true will skip it.
> S
>
> On Fri, Sep 16, 2016 at 1:40 PM, Dima Spivak 
> wrote:
>
> > Doesn't seem we need a matrix project for master anymore since we're just
> > doing JDK 8 now, right? Also, it looks like the hbase-spark
> > integration-test phase is what's tripping up the build. Why not just
> > temporarily disable that to unblock testing?
> >
> > On Friday, September 16, 2016, Apekshit Sharma 
> wrote:
> >
> > > So the issue is, we set JAVA_HOME to jdk8 based on matrix param and
> using
> > > tool environment. Since mvn uses the env variable, it compiles with jdk
> > 8.
> > > But i suspect that scalatest isn't using the env variable, instead it
> > might
> > > be directly using 'java' cmd, which can be jdk 7 or 8, and can vary by
> > > machine.
> > > Build succeed if 'java' points to jdk 8, otherwise fails.
> > > Note that we didn't have this issue earlier since we were using jenkins
> > > 'JDK' axis which would set the 'java' to the appropriate version. But
> > that
> > > methods had spaces-in-path issue, so i had to change it.
> > >
> > >
> > > On Fri, Sep 16, 2016 at 3:46 AM, aman poonia  > > >
> > > wrote:
> > >
> > > > I am not sure if this will help. But it looks like it is because of
> > > version
> > > > mismatch, that is, it is expecting JDK1.7 and we are compiling with
> > > jdk1.8.
> > > > That means there is some library which has to be compiled with jdk8
> or
> > > > needs to be updated to a jdk8 compatible version.
> > > >
> > > >
> > > > --
> > > > *With Regards:-*
> > > > *Aman Poonia*
> > > >
> > > > On Fri, Sep 16, 2016 at 2:40 AM, Apekshit Sharma  > > >
> > > > wrote:
> > > >
> > > > > Andeverything is back to red.
> > > > > Because something is plaguing our builds again. :(
> > > > >
> > > > > If anyone knows what's problem in this case, please reply on this
> > > thread,
> > > > > otherwise i'll try to fix it later sometime today.
> > > > >
> > > > > [INFO] --- scalatest-maven-plugin:1.0:test (integration-test) @ hbase-spark ---
> > > > > Discovery starting.
> > > > > *** RUN ABORTED ***
> > > > >   java.lang.UnsupportedClassVersionError: org/apache/hadoop/hbase/spark/example/hbasecontext/JavaHBaseDistributedScan : Unsupported major.minor version 52.0
> > > > >   at java.lang.ClassLoader.defineClass1(Native Method)
> > > > >   at java.lang.ClassLoader.defineClass(ClassLoader.java:803)
> > > > >   at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> > > > >   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
> > > > >   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
> > > > >   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
> > > > >   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> > > > >   at java.security.AccessController.doPrivileged(Native Method)
> > > > >   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> > > > >   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Sep 12, 2016 at 5:01 PM, Mikhail Antonov <
> > [email protected]
> > > >
> > > > > wrote:
> > > > >
> > > > > > Great work indeed!
> > > > > >
> > > > > > Agreed, occasional failed runs may not be that bad, but fairly
> > > regular
> > > > > > failed runs ruin the idea of CI. Especially for released or
> > otherwise
> > > > > > supposedly stable branches.
> > > > > >
> > > > > > -Mikhail
> > > > > >
> > > > > > On Mon, Sep 12, 2016 at 4:53 PM, Sean Busbey <
> [email protected]
> > > >
> > > > > wrote:
> > > > > >
> > > > > > > awesome wor

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-09-16 Thread Stack
The profile (or define) skipSparkTests looks like it will skip the Spark tests.
Setting skipIntegrationTests to true will skip the integration-test phase.
S
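
For reference, a rough sketch of how a build script might assemble such a Maven invocation. The property names skipIntegrationTests and skipSparkTests come from the message above; the helper name, the goals, and the assumption that skipSparkTests works as a bare -D define (the message only says it "looks like" it will) are illustrative and unverified.

```python
def mvn_command(goals, skip_integration_tests=False, skip_spark_tests=False):
    """Build a Maven command line as an argument list (hypothetical helper)."""
    cmd = ["mvn"] + list(goals)
    if skip_integration_tests:
        # Standard Maven syntax for setting a property on the command line.
        cmd.append("-DskipIntegrationTests=true")
    if skip_spark_tests:
        # Assumption: defining the property is enough to activate the skip.
        cmd.append("-DskipSparkTests")
    return cmd


print(" ".join(mvn_command(["clean", "install"], skip_integration_tests=True)))
```

Passing the list to subprocess.run rather than joining it into a string also sidesteps quoting problems when paths contain spaces.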

On Fri, Sep 16, 2016 at 1:40 PM, Dima Spivak  wrote:

> Doesn't seem we need a matrix project for master anymore since we're just
> doing JDK 8 now, right? Also, it looks like the hbase-spark
> integration-test phase is what's tripping up the build. Why not just
> temporarily disable that to unblock testing?
>
> On Friday, September 16, 2016, Apekshit Sharma  wrote:
>
> > So the issue is, we set JAVA_HOME to jdk8 based on matrix param and using
> > tool environment. Since mvn uses the env variable, it compiles with jdk
> 8.
> > But i suspect that scalatest isn't using the env variable, instead it
> might
> > be directly using 'java' cmd, which can be jdk 7 or 8, and can vary by
> > machine.
> > Build succeed if 'java' points to jdk 8, otherwise fails.
> > Note that we didn't have this issue earlier since we were using jenkins
> > 'JDK' axis which would set the 'java' to the appropriate version. But
> that
> > methods had spaces-in-path issue, so i had to change it.
> >
> >
> > On Fri, Sep 16, 2016 at 3:46 AM, aman poonia  > >
> > wrote:
> >
> > > I am not sure if this will help. But it looks like it is because of
> > version
> > > mismatch, that is, it is expecting JDK1.7 and we are compiling with
> > jdk1.8.
> > > That means there is some library which has to be compiled with jdk8 or
> > > needs to be updated to a jdk8 compatible version.
> > >
> > >
> > > --
> > > *With Regards:-*
> > > *Aman Poonia*
> > >
> > > On Fri, Sep 16, 2016 at 2:40 AM, Apekshit Sharma  > >
> > > wrote:
> > >
> > > > Andeverything is back to red.
> > > > Because something is plaguing our builds again. :(
> > > >
> > > > If anyone knows what's problem in this case, please reply on this
> > thread,
> > > > otherwise i'll try to fix it later sometime today.
> > > >
> > > > [INFO] --- scalatest-maven-plugin:1.0:test (integration-test) @ hbase-spark ---
> > > > Discovery starting.
> > > > *** RUN ABORTED ***
> > > >   java.lang.UnsupportedClassVersionError: org/apache/hadoop/hbase/spark/example/hbasecontext/JavaHBaseDistributedScan : Unsupported major.minor version 52.0
> > > >   at java.lang.ClassLoader.defineClass1(Native Method)
> > > >   at java.lang.ClassLoader.defineClass(ClassLoader.java:803)
> > > >   at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> > > >   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
> > > >   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
> > > >   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
> > > >   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> > > >   at java.security.AccessController.doPrivileged(Native Method)
> > > >   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> > > >   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> > > >
> > > >
> > > >
> > > > On Mon, Sep 12, 2016 at 5:01 PM, Mikhail Antonov <
> [email protected]
> > >
> > > > wrote:
> > > >
> > > > > Great work indeed!
> > > > >
> > > > > Agreed, occasional failed runs may not be that bad, but fairly
> > regular
> > > > > failed runs ruin the idea of CI. Especially for released or
> otherwise
> > > > > supposedly stable branches.
> > > > >
> > > > > -Mikhail
> > > > >
> > > > > On Mon, Sep 12, 2016 at 4:53 PM, Sean Busbey  > >
> > > > wrote:
> > > > >
> > > > > > awesome work Appy!
> > > > > >
> > > > > > That's certainly good news to hear.
> > > > > >
> > > > > > On Mon, Sep 12, 2016 at 2:14 PM, Apekshit Sharma <
> > [email protected] >
> > > > > > wrote:
> > > > > > > On a separate note:
> > > > > > > Trunk had 8 green runs in last 3 days! (
> > > > > > > https://builds.apache.org/job/HBase-Trunk_matrix/)
> > > > > > > This was due to fixing just the mass failures on trunk and no
> > > change
> > > > in
> > > > > > > flaky infra. Which made me to conclude two things:
> > > > > > > 1. Flaky infra works.
> > > > > > > 2. It relies heavily on the post-commit build's stability
> (which
> > > > every
> > > > > > > project should anyways strive for). If the build fails
> > > > catastrophically
> > > > > > > once in a while, we can just exclude that one run using a flag
> > and
> > > > > > > everything will work, but if it happens frequently, then it
> won't
> > > > work
> > > > > > > right.
> > > > > > >
> > > > > > > I have re-enabled Flaky tests job (
> > > > > > > https://builds.apache.org/view/H-L/view/HBase/job/HBASE-
> > > Flaky-Tests/
> > > > )
> > > > > > which
> > > > > > > was disabled for almost a month due to trunk being on fire.
> > > > > > > I will keep an eye on how things are going.
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Sep 12, 2016 at 2:02 PM, Apekshit Sharma <
> > > [email protected]

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-09-16 Thread Dima Spivak
Doesn't seem we need a matrix project for master anymore since we're just
doing JDK 8 now, right? Also, it looks like the hbase-spark
integration-test phase is what's tripping up the build. Why not just
temporarily disable that to unblock testing?

On Friday, September 16, 2016, Apekshit Sharma  wrote:

> So the issue is, we set JAVA_HOME to jdk8 based on matrix param and using
> tool environment. Since mvn uses the env variable, it compiles with jdk 8.
> But i suspect that scalatest isn't using the env variable, instead it might
> be directly using 'java' cmd, which can be jdk 7 or 8, and can vary by
> machine.
> Build succeed if 'java' points to jdk 8, otherwise fails.
> Note that we didn't have this issue earlier since we were using jenkins
> 'JDK' axis which would set the 'java' to the appropriate version. But that
> methods had spaces-in-path issue, so i had to change it.
>
>
> On Fri, Sep 16, 2016 at 3:46 AM, aman poonia  >
> wrote:
>
> > I am not sure if this will help. But it looks like it is because of
> version
> > mismatch, that is, it is expecting JDK1.7 and we are compiling with
> jdk1.8.
> > That means there is some library which has to be compiled with jdk8 or
> > needs to be updated to a jdk8 compatible version.
> >
> >
> > --
> > *With Regards:-*
> > *Aman Poonia*
> >
> > On Fri, Sep 16, 2016 at 2:40 AM, Apekshit Sharma  >
> > wrote:
> >
> > > Andeverything is back to red.
> > > Because something is plaguing our builds again. :(
> > >
> > > If anyone knows what's problem in this case, please reply on this
> thread,
> > > otherwise i'll try to fix it later sometime today.
> > >
> > > [INFO] --- scalatest-maven-plugin:1.0:test (integration-test) @ hbase-spark ---
> > > Discovery starting.
> > > *** RUN ABORTED ***
> > >   java.lang.UnsupportedClassVersionError: org/apache/hadoop/hbase/spark/example/hbasecontext/JavaHBaseDistributedScan : Unsupported major.minor version 52.0
> > >   at java.lang.ClassLoader.defineClass1(Native Method)
> > >   at java.lang.ClassLoader.defineClass(ClassLoader.java:803)
> > >   at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> > >   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
> > >   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
> > >   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
> > >   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> > >   at java.security.AccessController.doPrivileged(Native Method)
> > >   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> > >   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> > >
> > >
> > >
> > > On Mon, Sep 12, 2016 at 5:01 PM, Mikhail Antonov  >
> > > wrote:
> > >
> > > > Great work indeed!
> > > >
> > > > Agreed, occasional failed runs may not be that bad, but fairly
> regular
> > > > failed runs ruin the idea of CI. Especially for released or otherwise
> > > > supposedly stable branches.
> > > >
> > > > -Mikhail
> > > >
> > > > On Mon, Sep 12, 2016 at 4:53 PM, Sean Busbey  >
> > > wrote:
> > > >
> > > > > awesome work Appy!
> > > > >
> > > > > That's certainly good news to hear.
> > > > >
> > > > > On Mon, Sep 12, 2016 at 2:14 PM, Apekshit Sharma <
> [email protected] >
> > > > > wrote:
> > > > > > On a separate note:
> > > > > > Trunk had 8 green runs in last 3 days! (
> > > > > > https://builds.apache.org/job/HBase-Trunk_matrix/)
> > > > > > This was due to fixing just the mass failures on trunk and no
> > change
> > > in
> > > > > > flaky infra. Which made me to conclude two things:
> > > > > > 1. Flaky infra works.
> > > > > > 2. It relies heavily on the post-commit build's stability (which
> > > every
> > > > > > project should anyways strive for). If the build fails
> > > catastrophically
> > > > > > once in a while, we can just exclude that one run using a flag
> and
> > > > > > everything will work, but if it happens frequently, then it won't
> > > work
> > > > > > right.
> > > > > >
> > > > > > I have re-enabled Flaky tests job (
> > > > > > https://builds.apache.org/view/H-L/view/HBase/job/HBASE-
> > Flaky-Tests/
> > > )
> > > > > which
> > > > > > was disabled for almost a month due to trunk being on fire.
> > > > > > I will keep an eye on how things are going.
> > > > > >
> > > > > >
> > > > > > On Mon, Sep 12, 2016 at 2:02 PM, Apekshit Sharma <
> > [email protected] >
> > > > > wrote:
> > > > > >
> > > > > >> @Sean, Mikhail: I found the alternate solution. Using user
> defined
> > > > axis,
> > > > > >> tool environment and env variable injection.
> > > > > >> See latest diff to https://builds.apache.org/job/
> > > HBase-Trunk_matrix/
> > > > > job
> > > > > >> for reference.
> > > > > >>
> > > > > >>
> > > > > >> On Tue, Aug 30, 2016 at 7:39 PM, Mikhail Antonov <
> > > > [email protected] >
> > > > > >> wrote:
> > > > > >>
> 

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-09-16 Thread Apekshit Sharma
So the issue is: we set JAVA_HOME to jdk8 based on the matrix param, using
the tool environment. Since mvn uses that env variable, it compiles with jdk 8.
But I suspect that scalatest isn't using the env variable; instead it might
be invoking the 'java' cmd directly, which can be jdk 7 or 8 and can vary by
machine.
The build succeeds if 'java' points to jdk 8; otherwise it fails.
Note that we didn't have this issue earlier, since we were using the jenkins
'JDK' axis, which would set 'java' to the appropriate version. But that
method had the spaces-in-path issue, so I had to change it.
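
A minimal sketch of how one might check for this on a build machine, assuming Python 3 is available there. The function names are hypothetical and not part of any HBase or Jenkins tooling; the script only compares the java binary under JAVA_HOME with whatever 'java' resolves to on PATH.

```python
import os
import shutil


def java_from_home(java_home):
    """Path of the java binary a JAVA_HOME-aware tool (such as mvn) would use."""
    return os.path.join(java_home, "bin", "java")


def same_java(java_home, java_on_path):
    """True if JAVA_HOME's java and the java found on PATH are the same file.

    Symlinks are resolved first, so a /usr/bin/java link that points into
    the JDK directory still counts as a match.
    """
    if java_on_path is None:
        return False
    return os.path.realpath(java_from_home(java_home)) == os.path.realpath(java_on_path)


if __name__ == "__main__":
    home = os.environ.get("JAVA_HOME", "")
    on_path = shutil.which("java")
    print("JAVA_HOME java:", java_from_home(home) if home else "(JAVA_HOME not set)")
    print("PATH java:     ", on_path or "(none found)")
    print("match:         ", bool(home) and same_java(home, on_path))
```

If the two differ on a given executor, a tool that shells out to plain 'java' would run on a different JDK than the one mvn compiled with, which matches the behaviour suspected above.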


On Fri, Sep 16, 2016 at 3:46 AM, aman poonia 
wrote:

> I am not sure if this will help. But it looks like it is because of version
> mismatch, that is, it is expecting JDK1.7 and we are compiling with jdk1.8.
> That means there is some library which has to be compiled with jdk8 or
> needs to be updated to a jdk8 compatible version.
>
>
> --
> *With Regards:-*
> *Aman Poonia*
>
> On Fri, Sep 16, 2016 at 2:40 AM, Apekshit Sharma 
> wrote:
>
> > Andeverything is back to red.
> > Because something is plaguing our builds again. :(
> >
> > If anyone knows what's problem in this case, please reply on this thread,
> > otherwise i'll try to fix it later sometime today.
> >
> > [INFO] --- scalatest-maven-plugin:1.0:test (integration-test) @ hbase-spark ---
> > Discovery starting.
> > *** RUN ABORTED ***
> >   java.lang.UnsupportedClassVersionError: org/apache/hadoop/hbase/spark/example/hbasecontext/JavaHBaseDistributedScan : Unsupported major.minor version 52.0
> >   at java.lang.ClassLoader.defineClass1(Native Method)
> >   at java.lang.ClassLoader.defineClass(ClassLoader.java:803)
> >   at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> >   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
> >   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
> >   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
> >   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> >   at java.security.AccessController.doPrivileged(Native Method)
> >   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> >   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> >
> >
> >
> > On Mon, Sep 12, 2016 at 5:01 PM, Mikhail Antonov 
> > wrote:
> >
> > > Great work indeed!
> > >
> > > Agreed, occasional failed runs may not be that bad, but fairly regular
> > > failed runs ruin the idea of CI. Especially for released or otherwise
> > > supposedly stable branches.
> > >
> > > -Mikhail
> > >
> > > On Mon, Sep 12, 2016 at 4:53 PM, Sean Busbey 
> > wrote:
> > >
> > > > awesome work Appy!
> > > >
> > > > That's certainly good news to hear.
> > > >
> > > > On Mon, Sep 12, 2016 at 2:14 PM, Apekshit Sharma 
> > > > wrote:
> > > > > On a separate note:
> > > > > Trunk had 8 green runs in last 3 days! (
> > > > > https://builds.apache.org/job/HBase-Trunk_matrix/)
> > > > > This was due to fixing just the mass failures on trunk and no
> change
> > in
> > > > > flaky infra. Which made me to conclude two things:
> > > > > 1. Flaky infra works.
> > > > > 2. It relies heavily on the post-commit build's stability (which
> > every
> > > > > project should anyways strive for). If the build fails
> > catastrophically
> > > > > once in a while, we can just exclude that one run using a flag and
> > > > > everything will work, but if it happens frequently, then it won't
> > work
> > > > > right.
> > > > >
> > > > > I have re-enabled Flaky tests job (
> > > > > https://builds.apache.org/view/H-L/view/HBase/job/HBASE-
> Flaky-Tests/
> > )
> > > > which
> > > > > was disabled for almost a month due to trunk being on fire.
> > > > > I will keep an eye on how things are going.
> > > > >
> > > > >
> > > > > On Mon, Sep 12, 2016 at 2:02 PM, Apekshit Sharma <
> [email protected]>
> > > > wrote:
> > > > >
> > > > >> @Sean, Mikhail: I found the alternate solution. Using user defined
> > > axis,
> > > > >> tool environment and env variable injection.
> > > > >> See latest diff to https://builds.apache.org/job/
> > HBase-Trunk_matrix/
> > > > job
> > > > >> for reference.
> > > > >>
> > > > >>
> > > > >> On Tue, Aug 30, 2016 at 7:39 PM, Mikhail Antonov <
> > > [email protected]>
> > > > >> wrote:
> > > > >>
> > > > >>> FYI, I did the same for branch-1.3 builds.  I've disabled
> hbase-1.3
> > > and
> > > > >>> hbase-1.3-IT jobs and instead created
> > > > >>>
> > > > >>> https://builds.apache.org/job/HBase-1.3-JDK8 and
> > > > >>> https://builds.apache.org/job/HBase-1.3-JDK7
> > > > >>>
> > > > >>> This should work for now until we figure out how to move forward.
> > > > >>>
> > > > >>> Thanks,
> > > > >>> Mikhail
> > > > >>>
> > > > >>> On Wed, Aug 17, 2016 at 1:41 PM, Sean Busbey <
> [email protected]>
> > > > wrote:
> > > > >>>
> > > > >>> > /me smacks forehead
> > > > >>> >
> > > > >>> 

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-09-16 Thread aman poonia
I am not sure if this will help, but it looks like a version mismatch:
something is expecting JDK 1.7 and we are compiling with JDK 1.8.
That means there is some library which has to be compiled with JDK 8 or
needs to be updated to a JDK 8 compatible version.


-- 
*With Regards:-*
*Aman Poonia*

On Fri, Sep 16, 2016 at 2:40 AM, Apekshit Sharma  wrote:

> And everything is back to red.
> Because something is plaguing our builds again. :(
>
> If anyone knows what the problem is in this case, please reply on this thread;
> otherwise I'll try to fix it sometime later today.
>
> [INFO] --- scalatest-maven-plugin:1.0:test (integration-test) @ hbase-spark ---
> Discovery starting.
> *** RUN ABORTED ***
>   java.lang.UnsupportedClassVersionError: org/apache/hadoop/hbase/spark/example/hbasecontext/JavaHBaseDistributedScan : Unsupported major.minor version 52.0
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:803)
>   at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>
>
>
> On Mon, Sep 12, 2016 at 5:01 PM, Mikhail Antonov 
> wrote:
>
> > Great work indeed!
> >
> > Agreed, occasional failed runs may not be that bad, but fairly regular
> > failed runs ruin the idea of CI. Especially for released or otherwise
> > supposedly stable branches.
> >
> > -Mikhail
> >
> > On Mon, Sep 12, 2016 at 4:53 PM, Sean Busbey 
> wrote:
> >
> > > awesome work Appy!
> > >
> > > That's certainly good news to hear.
> > >
> > > On Mon, Sep 12, 2016 at 2:14 PM, Apekshit Sharma 
> > > wrote:
> > > > On a separate note:
> > > > Trunk had 8 green runs in last 3 days! (
> > > > https://builds.apache.org/job/HBase-Trunk_matrix/)
> > > > This was due to fixing just the mass failures on trunk and no change
> in
> > > > flaky infra. Which made me to conclude two things:
> > > > 1. Flaky infra works.
> > > > 2. It relies heavily on the post-commit build's stability (which
> every
> > > > project should anyways strive for). If the build fails
> catastrophically
> > > > once in a while, we can just exclude that one run using a flag and
> > > > everything will work, but if it happens frequently, then it won't
> work
> > > > right.
> > > >
> > > > I have re-enabled Flaky tests job (
> > > > https://builds.apache.org/view/H-L/view/HBase/job/HBASE-Flaky-Tests/
> )
> > > which
> > > > was disabled for almost a month due to trunk being on fire.
> > > > I will keep an eye on how things are going.
> > > >
> > > >
> > > > On Mon, Sep 12, 2016 at 2:02 PM, Apekshit Sharma 
> > > wrote:
> > > >
> > > >> @Sean, Mikhail: I found the alternate solution. Using user defined
> > axis,
> > > >> tool environment and env variable injection.
> > > >> See latest diff to https://builds.apache.org/job/
> HBase-Trunk_matrix/
> > > job
> > > >> for reference.
> > > >>
> > > >>
> > > >> On Tue, Aug 30, 2016 at 7:39 PM, Mikhail Antonov <
> > [email protected]>
> > > >> wrote:
> > > >>
> > > >>> FYI, I did the same for branch-1.3 builds.  I've disabled hbase-1.3
> > and
> > > >>> hbase-1.3-IT jobs and instead created
> > > >>>
> > > >>> https://builds.apache.org/job/HBase-1.3-JDK8 and
> > > >>> https://builds.apache.org/job/HBase-1.3-JDK7
> > > >>>
> > > >>> This should work for now until we figure out how to move forward.
> > > >>>
> > > >>> Thanks,
> > > >>> Mikhail
> > > >>>
> > > >>> On Wed, Aug 17, 2016 at 1:41 PM, Sean Busbey 
> > > wrote:
> > > >>>
> > > >>> > /me smacks forehead
> > > >>> >
> > > >>> > these replacement jobs, of course, also have special characters
> in
> > > >>> > their names which then show up in the working path.
> > > >>> >
> > > >>> > renaming them to skip spaces and parens.
> > > >>> >
> > > >>> > On Wed, Aug 17, 2016 at 1:34 PM, Sean Busbey <
> > [email protected]>
> > > >>> > wrote:
> > > >>> > > FYI, it looks like essentially our entire CI suite is red,
> > probably
> > > >>> due
> > > >>> > to
> > > >>> > > parts of our codebase not tolerating spaces or other special
> > > >>> characters
> > > >>> > in
> > > >>> > > the working directory.
> > > >>> > >
> > > >>> > > I've made a stop-gap non-multi-configuration set of jobs for
> > > running
> > > >>> unit
> > > >>> > > tests for the 1.2 branch against JDK 7 and JDK 8:
> > > >>> > >
> > > >>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
> > > >>> > 201.2%20(JDK%201.7)/
> > > >>> > >
> > > >>> > > https

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-09-15 Thread Apekshit Sharma
Hmm, could it be because scalatest tries to use a different Java than the one
specified by JAVA_HOME (which is used to compile)?
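
One way to check this guess is a tiny diagnostic run with whichever `java`
the failing step launches. This is only a sketch: the class name `JvmCheck`
and its helpers are made up for illustration and are not part of HBase or
the scalatest plugin. It prints which JVM is actually running and whether
that JVM can load class files with major version 52.

```java
// Hypothetical diagnostic, not part of HBase or its build.
class JvmCheck {

    /** Major class file version supported by a JVM, given its java.class.version value (e.g. "51.0"). */
    static int supportedMajor(String javaClassVersion) {
        return (int) Double.parseDouble(javaClassVersion);
    }

    /** A JVM can load a class only if the class's major version is not newer than its own. */
    static boolean canLoad(int classMajor, String javaClassVersion) {
        return classMajor <= supportedMajor(javaClassVersion);
    }

    public static void main(String[] args) {
        String running = System.getProperty("java.class.version");
        // JAVA_HOME is what the build was told to use; java.home is the JVM really running.
        System.out.println("JAVA_HOME env : " + System.getenv("JAVA_HOME"));
        System.out.println("java.home     : " + System.getProperty("java.home"));
        System.out.println("java.version  : " + System.getProperty("java.version"));
        System.out.println("class version : " + running);
        System.out.println("can load 52.0 : " + canLoad(52, running));
    }
}
```

If `java.home` does not sit under `JAVA_HOME`, the test JVM is not the one
that compiled the code, which would fit the error in the trace.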

On Thu, Sep 15, 2016 at 2:10 PM, Apekshit Sharma  wrote:

> And everything is back to red.
> Because something is plaguing our builds again. :(
>
> If anyone knows what the problem is in this case, please reply on this thread;
> otherwise I'll try to fix it sometime later today.
>
> [INFO] --- scalatest-maven-plugin:1.0:test (integration-test) @ hbase-spark ---
> Discovery starting.
> *** RUN ABORTED ***
>   java.lang.UnsupportedClassVersionError: org/apache/hadoop/hbase/spark/example/hbasecontext/JavaHBaseDistributedScan : Unsupported major.minor version 52.0
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:803)
>   at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>
>
>
> On Mon, Sep 12, 2016 at 5:01 PM, Mikhail Antonov 
> wrote:
>
>> Great work indeed!
>>
>> Agreed, occasional failed runs may not be that bad, but fairly regular
>> failed runs ruin the idea of CI. Especially for released or otherwise
>> supposedly stable branches.
>>
>> -Mikhail
>>
>> On Mon, Sep 12, 2016 at 4:53 PM, Sean Busbey  wrote:
>>
>> > awesome work Appy!
>> >
>> > That's certainly good news to hear.
>> >
>> > On Mon, Sep 12, 2016 at 2:14 PM, Apekshit Sharma 
>> > wrote:
>> > > On a separate note:
>> > > Trunk had 8 green runs in last 3 days! (
>> > > https://builds.apache.org/job/HBase-Trunk_matrix/)
>> > > This was due to fixing just the mass failures on trunk and no change
>> in
>> > > flaky infra. Which made me to conclude two things:
>> > > 1. Flaky infra works.
>> > > 2. It relies heavily on the post-commit build's stability (which every
>> > > project should anyways strive for). If the build fails
>> catastrophically
>> > > once in a while, we can just exclude that one run using a flag and
>> > > everything will work, but if it happens frequently, then it won't work
>> > > right.
>> > >
>> > > I have re-enabled Flaky tests job (
>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBASE-Flaky-Tests/)
>> > which
>> > > was disabled for almost a month due to trunk being on fire.
>> > > I will keep an eye on how things are going.
>> > >
>> > >
>> > > On Mon, Sep 12, 2016 at 2:02 PM, Apekshit Sharma 
>> > wrote:
>> > >
>> > >> @Sean, Mikhail: I found the alternate solution. Using user defined
>> axis,
>> > >> tool environment and env variable injection.
>> > >> See latest diff to https://builds.apache.org/job/HBase-Trunk_matrix/
>> > job
>> > >> for reference.
>> > >>
>> > >>
>> > >> On Tue, Aug 30, 2016 at 7:39 PM, Mikhail Antonov <
>> [email protected]>
>> > >> wrote:
>> > >>
>> > >>> FYI, I did the same for branch-1.3 builds.  I've disabled hbase-1.3
>> and
>> > >>> hbase-1.3-IT jobs and instead created
>> > >>>
>> > >>> https://builds.apache.org/job/HBase-1.3-JDK8 and
>> > >>> https://builds.apache.org/job/HBase-1.3-JDK7
>> > >>>
>> > >>> This should work for now until we figure out how to move forward.
>> > >>>
>> > >>> Thanks,
>> > >>> Mikhail
>> > >>>
>> > >>> On Wed, Aug 17, 2016 at 1:41 PM, Sean Busbey 
>> > wrote:
>> > >>>
>> > >>> > /me smacks forehead
>> > >>> >
>> > >>> > these replacement jobs, of course, also have special characters in
>> > >>> > their names which then show up in the working path.
>> > >>> >
>> > >>> > renaming them to skip spaces and parens.
>> > >>> >
>> > >>> > On Wed, Aug 17, 2016 at 1:34 PM, Sean Busbey <
>> [email protected]>
>> > >>> > wrote:
>> > >>> > > FYI, it looks like essentially our entire CI suite is red,
>> probably
>> > >>> due
>> > >>> > to
>> > >>> > > parts of our codebase not tolerating spaces or other special
>> > >>> characters
>> > >>> > in
>> > >>> > > the working directory.
>> > >>> > >
>> > >>> > > I've made a stop-gap non-multi-configuration set of jobs for
>> > running
>> > >>> unit
>> > >>> > > tests for the 1.2 branch against JDK 7 and JDK 8:
>> > >>> > >
>> > >>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
>> > >>> > 201.2%20(JDK%201.7)/
>> > >>> > >
>> > >>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
>> > >>> > 201.2%20(JDK%201.8)/
>> > >>> > >
>> > >>> > > Due to the lack of response from infra@ I suspect our only
>> options
>> > >>> for
>> > >>> > > continuing on ASF infra is to fix whatever part of our build
>> > doesn't
>> > >>> > > t

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-09-15 Thread Dima Spivak
52.0 is Java 8. Sounds like the code was compiled to target a later version
than is being used at runtime. Are we accidentally using JDK 7 to run
dependencies built and deployed with JDK 8?
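
For reference, the major version sits in the first eight bytes of every
class file, so it is easy to check what a given jar was built for. A
minimal sketch (the class `ClassFileVersion` and its method names are made
up for illustration):

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of HBase or its build.
class ClassFileVersion {

    /** Reads the major version from the start of a class file: u4 magic, u2 minor, u2 major. */
    static int readMajor(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default, like class files
        if (buf.remaining() < 8 || buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        buf.getShort();                 // minor version, ignored here
        return buf.getShort() & 0xFFFF; // major version as an unsigned value
    }

    /** Major 45 corresponds to JDK 1.1 and each release adds one, so 51 is Java 7 and 52 is Java 8. */
    static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) {
        // Header of a class compiled for Java 8: magic, minor 0, major 52.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        int major = readMajor(header);
        System.out.println("major " + major + " -> Java " + javaRelease(major));
    }
}
```

Feeding it the bytes of the `JavaHBaseDistributedScan` class from the
failing workspace would show whether that class was really built for
Java 8.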

-Dima

On Thu, Sep 15, 2016 at 2:10 PM, Apekshit Sharma  wrote:

> And everything is back to red.
> Because something is plaguing our builds again. :(
>
> If anyone knows what the problem is in this case, please reply on this thread;
> otherwise I'll try to fix it sometime later today.
>
> [INFO] --- scalatest-maven-plugin:1.0:test (integration-test) @ hbase-spark ---
> Discovery starting.
> *** RUN ABORTED ***
>   java.lang.UnsupportedClassVersionError: org/apache/hadoop/hbase/spark/example/hbasecontext/JavaHBaseDistributedScan : Unsupported major.minor version 52.0
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:803)
>   at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>
>
>
> On Mon, Sep 12, 2016 at 5:01 PM, Mikhail Antonov 
> wrote:
>
> > Great work indeed!
> >
> > Agreed, occasional failed runs may not be that bad, but fairly regular
> > failed runs ruin the idea of CI. Especially for released or otherwise
> > supposedly stable branches.
> >
> > -Mikhail
> >
> > On Mon, Sep 12, 2016 at 4:53 PM, Sean Busbey 
> wrote:
> >
> > > awesome work Appy!
> > >
> > > That's certainly good news to hear.
> > >
> > > On Mon, Sep 12, 2016 at 2:14 PM, Apekshit Sharma 
> > > wrote:
> > > > On a separate note:
> > > > Trunk had 8 green runs in last 3 days! (
> > > > https://builds.apache.org/job/HBase-Trunk_matrix/)
> > > > This was due to fixing just the mass failures on trunk and no change
> in
> > > > flaky infra. Which made me to conclude two things:
> > > > 1. Flaky infra works.
> > > > 2. It relies heavily on the post-commit build's stability (which
> every
> > > > project should anyways strive for). If the build fails
> catastrophically
> > > > once in a while, we can just exclude that one run using a flag and
> > > > everything will work, but if it happens frequently, then it won't
> work
> > > > right.
> > > >
> > > > I have re-enabled Flaky tests job (
> > > > https://builds.apache.org/view/H-L/view/HBase/job/HBASE-Flaky-Tests/
> )
> > > which
> > > > was disabled for almost a month due to trunk being on fire.
> > > > I will keep an eye on how things are going.
> > > >
> > > >
> > > > On Mon, Sep 12, 2016 at 2:02 PM, Apekshit Sharma 
> > > wrote:
> > > >
> > > >> @Sean, Mikhail: I found the alternate solution. Using user defined
> > axis,
> > > >> tool environment and env variable injection.
> > > >> See latest diff to https://builds.apache.org/job/
> HBase-Trunk_matrix/
> > > job
> > > >> for reference.
> > > >>
> > > >>
> > > >> On Tue, Aug 30, 2016 at 7:39 PM, Mikhail Antonov <
> > [email protected]>
> > > >> wrote:
> > > >>
> > > >>> FYI, I did the same for branch-1.3 builds.  I've disabled hbase-1.3
> > and
> > > >>> hbase-1.3-IT jobs and instead created
> > > >>>
> > > >>> https://builds.apache.org/job/HBase-1.3-JDK8 and
> > > >>> https://builds.apache.org/job/HBase-1.3-JDK7
> > > >>>
> > > >>> This should work for now until we figure out how to move forward.
> > > >>>
> > > >>> Thanks,
> > > >>> Mikhail
> > > >>>
> > > >>> On Wed, Aug 17, 2016 at 1:41 PM, Sean Busbey 
> > > wrote:
> > > >>>
> > > >>> > /me smacks forehead
> > > >>> >
> > > >>> > these replacement jobs, of course, also have special characters
> in
> > > >>> > their names which then show up in the working path.
> > > >>> >
> > > >>> > renaming them to skip spaces and parens.
> > > >>> >
> > > >>> > On Wed, Aug 17, 2016 at 1:34 PM, Sean Busbey <
> > [email protected]>
> > > >>> > wrote:
> > > >>> > > FYI, it looks like essentially our entire CI suite is red,
> > probably
> > > >>> due
> > > >>> > to
> > > >>> > > parts of our codebase not tolerating spaces or other special
> > > >>> characters
> > > >>> > in
> > > >>> > > the working directory.
> > > >>> > >
> > > >>> > > I've made a stop-gap non-multi-configuration set of jobs for
> > > running
> > > >>> unit
> > > >>> > > tests for the 1.2 branch against JDK 7 and JDK 8:
> > > >>> > >
> > > >>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
> > > >>> > 201.2%20(JDK%201.7)/
> > > >>> > >
> > > >>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
> > > >>> > 201.2%20(JDK%201.8)/
> > > >>> > >
> > > >>> > > 

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-09-15 Thread Apekshit Sharma
And everything is back to red.
Because something is plaguing our builds again. :(

If anyone knows what the problem is in this case, please reply on this thread;
otherwise I'll try to fix it sometime later today.

[INFO] --- scalatest-maven-plugin:1.0:test (integration-test) @ hbase-spark ---
Discovery starting.
*** RUN ABORTED ***
  java.lang.UnsupportedClassVersionError: org/apache/hadoop/hbase/spark/example/hbasecontext/JavaHBaseDistributedScan : Unsupported major.minor version 52.0
  at java.lang.ClassLoader.defineClass1(Native Method)
  at java.lang.ClassLoader.defineClass(ClassLoader.java:803)
  at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
  at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
  at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:425)



On Mon, Sep 12, 2016 at 5:01 PM, Mikhail Antonov 
wrote:

> Great work indeed!
>
> Agreed, occasional failed runs may not be that bad, but fairly regular
> failed runs ruin the idea of CI. Especially for released or otherwise
> supposedly stable branches.
>
> -Mikhail
>
> On Mon, Sep 12, 2016 at 4:53 PM, Sean Busbey  wrote:
>
> > awesome work Appy!
> >
> > That's certainly good news to hear.
> >
> > On Mon, Sep 12, 2016 at 2:14 PM, Apekshit Sharma 
> > wrote:
> > > On a separate note:
> > > Trunk had 8 green runs in last 3 days! (
> > > https://builds.apache.org/job/HBase-Trunk_matrix/)
> > > This was due to fixing just the mass failures on trunk and no change in
> > > flaky infra. Which made me to conclude two things:
> > > 1. Flaky infra works.
> > > 2. It relies heavily on the post-commit build's stability (which every
> > > project should anyways strive for). If the build fails catastrophically
> > > once in a while, we can just exclude that one run using a flag and
> > > everything will work, but if it happens frequently, then it won't work
> > > right.
> > >
> > > I have re-enabled Flaky tests job (
> > > https://builds.apache.org/view/H-L/view/HBase/job/HBASE-Flaky-Tests/)
> > which
> > > was disabled for almost a month due to trunk being on fire.
> > > I will keep an eye on how things are going.
> > >
> > >
> > > On Mon, Sep 12, 2016 at 2:02 PM, Apekshit Sharma 
> > wrote:
> > >
> > >> @Sean, Mikhail: I found the alternate solution. Using user defined
> axis,
> > >> tool environment and env variable injection.
> > >> See latest diff to https://builds.apache.org/job/HBase-Trunk_matrix/
> > job
> > >> for reference.
> > >>
> > >>
> > >> On Tue, Aug 30, 2016 at 7:39 PM, Mikhail Antonov <
> [email protected]>
> > >> wrote:
> > >>
> > >>> FYI, I did the same for branch-1.3 builds.  I've disabled hbase-1.3
> and
> > >>> hbase-1.3-IT jobs and instead created
> > >>>
> > >>> https://builds.apache.org/job/HBase-1.3-JDK8 and
> > >>> https://builds.apache.org/job/HBase-1.3-JDK7
> > >>>
> > >>> This should work for now until we figure out how to move forward.
> > >>>
> > >>> Thanks,
> > >>> Mikhail
> > >>>
> > >>> On Wed, Aug 17, 2016 at 1:41 PM, Sean Busbey 
> > wrote:
> > >>>
> > >>> > /me smacks forehead
> > >>> >
> > >>> > these replacement jobs, of course, also have special characters in
> > >>> > their names which then show up in the working path.
> > >>> >
> > >>> > renaming them to skip spaces and parens.
> > >>> >
> > >>> > On Wed, Aug 17, 2016 at 1:34 PM, Sean Busbey <
> [email protected]>
> > >>> > wrote:
> > >>> > > FYI, it looks like essentially our entire CI suite is red,
> probably
> > >>> due
> > >>> > to
> > >>> > > parts of our codebase not tolerating spaces or other special
> > >>> characters
> > >>> > in
> > >>> > > the working directory.
> > >>> > >
> > >>> > > I've made a stop-gap non-multi-configuration set of jobs for
> > running
> > >>> unit
> > >>> > > tests for the 1.2 branch against JDK 7 and JDK 8:
> > >>> > >
> > >>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
> > >>> > 201.2%20(JDK%201.7)/
> > >>> > >
> > >>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
> > >>> > 201.2%20(JDK%201.8)/
> > >>> > >
> > >>> > > Due to the lack of response from infra@ I suspect our only
> options
> > >>> for
> > >>> > > continuing on ASF infra is to fix whatever part of our build
> > doesn't
> > >>> > > tolerate the new paths, or stop using multiconfiguration
> > deployments.
> > >>> I
> > >>> > am
> > >>> > > obviously less than thrilled at the idea of having several
> > multiples
> > >>> of
> > >>> > > current jobs.
> > >>> > >
> > >>> > >
> > >>> > > On Wed, Aug 10, 2016 at 6:28 PM, Sean Busbey <
> [email protected]>
> > >>> > wrote:
> > >

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-09-12 Thread Mikhail Antonov
Great work indeed!

Agreed, occasional failed runs may not be that bad, but fairly regular
failed runs ruin the idea of CI. Especially for released or otherwise
supposedly stable branches.

-Mikhail

On Mon, Sep 12, 2016 at 4:53 PM, Sean Busbey  wrote:

> awesome work Appy!
>
> That's certainly good news to hear.
>
> On Mon, Sep 12, 2016 at 2:14 PM, Apekshit Sharma 
> wrote:
> > On a separate note:
> > Trunk had 8 green runs in last 3 days! (
> > https://builds.apache.org/job/HBase-Trunk_matrix/)
> > This was due to fixing just the mass failures on trunk and no change in
> > flaky infra. Which made me to conclude two things:
> > 1. Flaky infra works.
> > 2. It relies heavily on the post-commit build's stability (which every
> > project should anyways strive for). If the build fails catastrophically
> > once in a while, we can just exclude that one run using a flag and
> > everything will work, but if it happens frequently, then it won't work
> > right.
> >
> > I have re-enabled Flaky tests job (
> > https://builds.apache.org/view/H-L/view/HBase/job/HBASE-Flaky-Tests/)
> which
> > was disabled for almost a month due to trunk being on fire.
> > I will keep an eye on how things are going.
> >
> >
> > On Mon, Sep 12, 2016 at 2:02 PM, Apekshit Sharma 
> wrote:
> >
> >> @Sean, Mikhail: I found the alternate solution. Using user defined axis,
> >> tool environment and env variable injection.
> >> See latest diff to https://builds.apache.org/job/HBase-Trunk_matrix/
> job
> >> for reference.
> >>
> >>
> >> On Tue, Aug 30, 2016 at 7:39 PM, Mikhail Antonov 
> >> wrote:
> >>
> >>> FYI, I did the same for branch-1.3 builds.  I've disabled hbase-1.3 and
> >>> hbase-1.3-IT jobs and instead created
> >>>
> >>> https://builds.apache.org/job/HBase-1.3-JDK8 and
> >>> https://builds.apache.org/job/HBase-1.3-JDK7
> >>>
> >>> This should work for now until we figure out how to move forward.
> >>>
> >>> Thanks,
> >>> Mikhail
> >>>
> >>> On Wed, Aug 17, 2016 at 1:41 PM, Sean Busbey 
> wrote:
> >>>
> >>> > /me smacks forehead
> >>> >
> >>> > these replacement jobs, of course, also have special characters in
> >>> > their names which then show up in the working path.
> >>> >
> >>> > renaming them to skip spaces and parens.
> >>> >
> >>> > On Wed, Aug 17, 2016 at 1:34 PM, Sean Busbey 
> >>> > wrote:
> >>> > > FYI, it looks like essentially our entire CI suite is red, probably
> >>> due
> >>> > to
> >>> > > parts of our codebase not tolerating spaces or other special
> >>> characters
> >>> > in
> >>> > > the working directory.
> >>> > >
> >>> > > I've made a stop-gap non-multi-configuration set of jobs for
> running
> >>> unit
> >>> > > tests for the 1.2 branch against JDK 7 and JDK 8:
> >>> > >
> >>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
> >>> > 201.2%20(JDK%201.7)/
> >>> > >
> >>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
> >>> > 201.2%20(JDK%201.8)/
> >>> > >
> >>> > > Due to the lack of response from infra@ I suspect our only options
> >>> for
> >>> > > continuing on ASF infra is to fix whatever part of our build
> doesn't
> >>> > > tolerate the new paths, or stop using multiconfiguration
> deployments.
> >>> I
> >>> > am
> >>> > > obviously less than thrilled at the idea of having several
> multiples
> >>> of
> >>> > > current jobs.
> >>> > >
> >>> > >
> >>> > > On Wed, Aug 10, 2016 at 6:28 PM, Sean Busbey 
> >>> > wrote:
> >>> > >
> >>> > >> Ugh.
> >>> > >>
> >>> > >> I sent a reply to Gav on builds@ about maybe getting names that
> >>> don't
> >>> > >> have spaces in them:
> >>> > >>
> >>> > >> https://lists.apache.org/thread.html/
> 8ac03dc62f9d6862d4f3d5eb37119c
> >>> > >> 9c73b4059aaa3ebba52fc63bb6@%3Cbuilds.apache.org%3E
> >>> > >>
> >>> > >> In the mean time, is this an issue we need file with Hadoop or
> >>> > >> something we need to fix in our own code?
> >>> > >>
> >>> > >> On Wed, Aug 10, 2016 at 6:04 PM, Matteo Bertozzi
> >>> > >>  wrote:
> >>> > >> > There are a bunch of builds that have most of the test failing.
> >>> > >> >
> >>> > >> > Example:
> >>> > >> > https://builds.apache.org/job/HBase-Trunk_matrix/1392/jdk=
> >>> > >> JDK%201.7%20(latest),label=yahoo-not-h2/testReport/junit/
> >>> > >> org.apache.hadoop.hbase/TestLocalHBaseCluster/
> testLocalHBaseCluster/
> >>> > >> >
> >>> > >> > from the stack trace looks like the problem is with the jdk name
> >>> that
> >>> > has
> >>> > >> > spaces:
> >>> > >> > the hadoop FsVolumeImpl calls setNameFormat(... +
> >>> fileName.toString()
> >>> > +
> >>> > >> ...)
> >>> > >> > and this seems to not be escaped
> >>> > >> > so we end up with JDK%25201.7%2520(latest) in the string format
> >>> and we
> >>> > >> get
> >>> > >> > a IllegalFormatPrecisionException: 7
> >>> > >> >
> >>> > >> > 2016-08-10 22:07:46,108 WARN  [DataNode:
> >>> > >> > [[[DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-
> >>> > >> Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-
> >>> > >> h2/hbase-server/target/test-data/e7099624-ecfa-4674

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-09-12 Thread Sean Busbey
awesome work Appy!

That's certainly good news to hear.

On Mon, Sep 12, 2016 at 2:14 PM, Apekshit Sharma  wrote:
> On a separate note:
> Trunk had 8 green runs in last 3 days! (
> https://builds.apache.org/job/HBase-Trunk_matrix/)
> This was due to fixing just the mass failures on trunk and no change in
> flaky infra. Which made me to conclude two things:
> 1. Flaky infra works.
> 2. It relies heavily on the post-commit build's stability (which every
> project should anyways strive for). If the build fails catastrophically
> once in a while, we can just exclude that one run using a flag and
> everything will work, but if it happens frequently, then it won't work
> right.
>
> I have re-enabled Flaky tests job (
> https://builds.apache.org/view/H-L/view/HBase/job/HBASE-Flaky-Tests/) which
> was disabled for almost a month due to trunk being on fire.
> I will keep an eye on how things are going.
>
>
> On Mon, Sep 12, 2016 at 2:02 PM, Apekshit Sharma  wrote:
>
>> @Sean, Mikhail: I found the alternate solution. Using user defined axis,
>> tool environment and env variable injection.
>> See latest diff to https://builds.apache.org/job/HBase-Trunk_matrix/ job
>> for reference.
>>
>>
>> On Tue, Aug 30, 2016 at 7:39 PM, Mikhail Antonov 
>> wrote:
>>
>>> FYI, I did the same for branch-1.3 builds.  I've disabled hbase-1.3 and
>>> hbase-1.3-IT jobs and instead created
>>>
>>> https://builds.apache.org/job/HBase-1.3-JDK8 and
>>> https://builds.apache.org/job/HBase-1.3-JDK7
>>>
>>> This should work for now until we figure out how to move forward.
>>>
>>> Thanks,
>>> Mikhail
>>>
>>> On Wed, Aug 17, 2016 at 1:41 PM, Sean Busbey  wrote:
>>>
>>> > /me smacks forehead
>>> >
>>> > these replacement jobs, of course, also have special characters in
>>> > their names which then show up in the working path.
>>> >
>>> > renaming them to skip spaces and parens.
>>> >
>>> > On Wed, Aug 17, 2016 at 1:34 PM, Sean Busbey 
>>> > wrote:
>>> > > FYI, it looks like essentially our entire CI suite is red, probably
>>> due
>>> > to
>>> > > parts of our codebase not tolerating spaces or other special
>>> characters
>>> > in
>>> > > the working directory.
>>> > >
>>> > > I've made a stop-gap non-multi-configuration set of jobs for running
>>> unit
>>> > > tests for the 1.2 branch against JDK 7 and JDK 8:
>>> > >
>>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
>>> > 201.2%20(JDK%201.7)/
>>> > >
>>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
>>> > 201.2%20(JDK%201.8)/
>>> > >
>>> > > Due to the lack of response from infra@ I suspect our only options
>>> for
>>> > > continuing on ASF infra is to fix whatever part of our build doesn't
>>> > > tolerate the new paths, or stop using multiconfiguration deployments.
>>> I
>>> > am
>>> > > obviously less than thrilled at the idea of having several multiples
>>> of
>>> > > current jobs.
>>> > >
>>> > >
>>> > > On Wed, Aug 10, 2016 at 6:28 PM, Sean Busbey 
>>> > wrote:
>>> > >
>>> > >> Ugh.
>>> > >>
>>> > >> I sent a reply to Gav on builds@ about maybe getting names that
>>> don't
>>> > >> have spaces in them:
>>> > >>
>>> > >> https://lists.apache.org/thread.html/8ac03dc62f9d6862d4f3d5eb37119c
>>> > >> 9c73b4059aaa3ebba52fc63bb6@%3Cbuilds.apache.org%3E
>>> > >>
>>> > >> In the mean time, is this an issue we need file with Hadoop or
>>> > >> something we need to fix in our own code?
>>> > >>
>>> > >> On Wed, Aug 10, 2016 at 6:04 PM, Matteo Bertozzi
>>> > >>  wrote:
>>> > >> > There are a bunch of builds that have most of the test failing.
>>> > >> >
>>> > >> > Example:
>>> > >> > https://builds.apache.org/job/HBase-Trunk_matrix/1392/jdk=
>>> > >> JDK%201.7%20(latest),label=yahoo-not-h2/testReport/junit/
>>> > >> org.apache.hadoop.hbase/TestLocalHBaseCluster/testLocalHBaseCluster/
>>> > >> >
>>> > >> > from the stack trace looks like the problem is with the jdk name
>>> that
>>> > has
>>> > >> > spaces:
>>> > >> > the hadoop FsVolumeImpl calls setNameFormat(... +
>>> fileName.toString()
>>> > +
>>> > >> ...)
>>> > >> > and this seems to not be escaped
>>> > >> > so we end up with JDK%25201.7%2520(latest) in the string format
>>> and we
>>> > >> get
>>> > >> > a IllegalFormatPrecisionException: 7
>>> > >> >
>>> > >> > 2016-08-10 22:07:46,108 WARN  [DataNode:
>>> > >> > [[[DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-
>>> > >> Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-
>>> > >> h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-
>>> > >> a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-
>>> > >> 9c88f385e6f1/dfs/data/data1/,
>>> > >> > [DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-
>>> > >> Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-
>>> > >> h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-
>>> > >> a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-
>>> > >> 9c88f385e6f1/dfs/data/data2/]]
>>> > >> >  heartbeating to localhost/127.0.0.1:34629]
>>> > >> > datanode.BPServiceActor(831): Unexpected exception in block 
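
The IllegalFormatPrecisionException quoted above can be reproduced with
plain `String.format`, assuming (as the quoted report describes) that the
directory name ends up inside a format pattern unescaped. In
`JDK%25201.7%2520(latest)` the run `%25201.7%` parses as width 25201,
precision 7, conversion `%`, and the `%` conversion does not accept a
precision. The sketch below is not the Hadoop code: the class
`NameFormatEscape`, its methods, and the `worker-...-%d` pattern are made up
for illustration.

```java
import java.util.IllegalFormatException;

// Hypothetical reproduction, not the Hadoop FsVolumeImpl code.
class NameFormatEscape {

    /** Doubles every '%' so the text is treated as a literal inside a format pattern. */
    static String escapeForFormat(String text) {
        return text.replace("%", "%%");
    }

    /** Formats the pattern, or returns the exception name and message if the pattern is rejected. */
    static String tryFormat(String pattern, int n) {
        try {
            return String.format(pattern, n);
        } catch (IllegalFormatException e) {
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Path fragment taken from the quoted log line.
        String dir = "jdk/JDK%25201.7%2520(latest)/label";
        // Unescaped: "%25201.7%" is read as a format specifier and rejected.
        System.out.println(tryFormat("worker-" + dir + "-%d", 1));
        // Escaped: the path is kept verbatim and only "%d" is substituted.
        System.out.println(tryFormat("worker-" + escapeForFormat(dir) + "-%d", 1));
    }
}
```

Escaping at the call site is one possible fix; getting workspace names
without spaces or parens avoids the double URL-encoding in the first place.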

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-09-12 Thread Stack
On Mon, Sep 12, 2016 at 2:14 PM, Apekshit Sharma  wrote:

> On a separate note:
> Trunk had 8 green runs in last 3 days! (
> https://builds.apache.org/job/HBase-Trunk_matrix/)
>


Woah!



> This was due to fixing just the mass failures on trunk and no change in
> flaky infra. Which made me to conclude two things:
> 1. Flaky infra works.
> 2. It relies heavily on the post-commit build's stability (which every
> project should anyways strive for). If the build fails catastrophically
> once in a while, we can just exclude that one run using a flag and
> everything will work, but if it happens frequently, then it won't work
> right.
>
> I have re-enabled Flaky tests job (
> https://builds.apache.org/view/H-L/view/HBase/job/HBASE-Flaky-Tests/)
> which
> was disabled for almost a month due to trunk being on fire.
> I will keep an eye on how things are going.
>
>
Thanks Appy.
St.Ack


> On Mon, Sep 12, 2016 at 2:02 PM, Apekshit Sharma 
> wrote:
>
> > @Sean, Mikhail: I found an alternate solution: using a user-defined axis,
> > tool environment, and env variable injection.
> > See the latest diff to the https://builds.apache.org/job/HBase-Trunk_matrix/
> > job for reference.
> >
> >
> > On Tue, Aug 30, 2016 at 7:39 PM, Mikhail Antonov 
> > wrote:
> >
> >> FYI, I did the same for branch-1.3 builds.  I've disabled hbase-1.3 and
> >> hbase-1.3-IT jobs and instead created
> >>
> >> https://builds.apache.org/job/HBase-1.3-JDK8 and
> >> https://builds.apache.org/job/HBase-1.3-JDK7
> >>
> >> This should work for now until we figure out how to move forward.
> >>
> >> Thanks,
> >> Mikhail
> >>
> >> On Wed, Aug 17, 2016 at 1:41 PM, Sean Busbey 
> wrote:
> >>
> >> > /me smacks forehead
> >> >
> >> > these replacement jobs, of course, also have special characters in
> >> > their names which then show up in the working path.
> >> >
> >> > renaming them to skip spaces and parens.
> >> >
> >> > On Wed, Aug 17, 2016 at 1:34 PM, Sean Busbey 
> >> > wrote:
> >> > > FYI, it looks like essentially our entire CI suite is red, probably
> >> due
> >> > to
> >> > > parts of our codebase not tolerating spaces or other special
> >> characters
> >> > in
> >> > > the working directory.
> >> > >
> >> > > I've made a stop-gap non-multi-configuration set of jobs for running
> >> unit
> >> > > tests for the 1.2 branch against JDK 7 and JDK 8:
> >> > >
> >> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
> >> > 201.2%20(JDK%201.7)/
> >> > >
> >> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
> >> > 201.2%20(JDK%201.8)/
> >> > >
> >> > > Due to the lack of response from infra@ I suspect our only options
> >> for
> >> > > continuing on ASF infra are to fix whatever part of our build doesn't
> >> > > tolerate the new paths, or stop using multiconfiguration
> deployments.
> >> I
> >> > am
> >> > > obviously less than thrilled at the idea of having several multiples
> >> of
> >> > > current jobs.
> >> > >
> >> > >
> >> > > On Wed, Aug 10, 2016 at 6:28 PM, Sean Busbey 
> >> > wrote:
> >> > >
> >> > >> Ugh.
> >> > >>
> >> > >> I sent a reply to Gav on builds@ about maybe getting names that
> >> don't
> >> > >> have spaces in them:
> >> > >>
> >> > >> https://lists.apache.org/thread.html/
> 8ac03dc62f9d6862d4f3d5eb37119c
> >> > >> 9c73b4059aaa3ebba52fc63bb6@%3Cbuilds.apache.org%3E
> >> > >>
> >> > >> In the meantime, is this an issue we need to file with Hadoop or
> >> > >> something we need to fix in our own code?
> >> > >>
> >> > >> On Wed, Aug 10, 2016 at 6:04 PM, Matteo Bertozzi
> >> > >>  wrote:
> >> > >> > There are a bunch of builds that have most of the tests failing.
> >> > >> >
> >> > >> > Example:
> >> > >> > https://builds.apache.org/job/HBase-Trunk_matrix/1392/jdk=
> >> > >> JDK%201.7%20(latest),label=yahoo-not-h2/testReport/junit/
> >> > >> org.apache.hadoop.hbase/TestLocalHBaseCluster/
> testLocalHBaseCluster/
> >> > >> >
> >> > >> > From the stack trace, it looks like the problem is with the JDK name
> >> > >> > that has spaces:
> >> > >> > the Hadoop FsVolumeImpl calls
> >> > >> > setNameFormat(... + fileName.toString() + ...)
> >> > >> > and the file name does not seem to be escaped,
> >> > >> > so we end up with JDK%25201.7%2520(latest) in the format string and
> >> > >> > we get an IllegalFormatPrecisionException: 7
> >> > >> >
> >> > >> > 2016-08-10 22:07:46,108 WARN  [DataNode:
> >> > >> > [[[DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-
> >> > >> Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-
> >> > >> h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-
> >> > >> a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-
> >> > >> 9c88f385e6f1/dfs/data/data1/,
> >> > >> > [DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-
> >> > >> Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-
> >> > >> h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-
> >> > >> a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-
> >> > >> 9c88f385e6f1/dfs/data/data2/]]
> >> > >> >  heartbeatin

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-09-12 Thread Apekshit Sharma
On a separate note:
Trunk had 8 green runs in the last 3 days! (
https://builds.apache.org/job/HBase-Trunk_matrix/)
This was due to fixing just the mass failures on trunk, with no change in
flaky infra, which led me to conclude two things:
1. Flaky infra works.
2. It relies heavily on the post-commit build's stability (which every
project should strive for anyway). If the build fails catastrophically
once in a while, we can just exclude that one run using a flag and
everything will work, but if it happens frequently, then it won't work
right.

I have re-enabled the Flaky tests job (
https://builds.apache.org/view/H-L/view/HBase/job/HBASE-Flaky-Tests/) which
was disabled for almost a month due to trunk being on fire.
I will keep an eye on how things are going.


On Mon, Sep 12, 2016 at 2:02 PM, Apekshit Sharma  wrote:

> @Sean, Mikhail: I found an alternate solution: using a user-defined axis,
> tool environment, and env variable injection.
> See the latest diff to the https://builds.apache.org/job/HBase-Trunk_matrix/
> job for reference.
>
>
> On Tue, Aug 30, 2016 at 7:39 PM, Mikhail Antonov 
> wrote:
>
>> FYI, I did the same for branch-1.3 builds.  I've disabled hbase-1.3 and
>> hbase-1.3-IT jobs and instead created
>>
>> https://builds.apache.org/job/HBase-1.3-JDK8 and
>> https://builds.apache.org/job/HBase-1.3-JDK7
>>
>> This should work for now until we figure out how to move forward.
>>
>> Thanks,
>> Mikhail
>>
>> On Wed, Aug 17, 2016 at 1:41 PM, Sean Busbey  wrote:
>>
>> > /me smacks forehead
>> >
>> > these replacement jobs, of course, also have special characters in
>> > their names which then show up in the working path.
>> >
>> > renaming them to skip spaces and parens.
>> >
>> > On Wed, Aug 17, 2016 at 1:34 PM, Sean Busbey 
>> > wrote:
>> > > FYI, it looks like essentially our entire CI suite is red, probably
>> due
>> > to
>> > > parts of our codebase not tolerating spaces or other special
>> characters
>> > in
>> > > the working directory.
>> > >
>> > > I've made a stop-gap non-multi-configuration set of jobs for running
>> unit
>> > > tests for the 1.2 branch against JDK 7 and JDK 8:
>> > >
>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
>> > 201.2%20(JDK%201.7)/
>> > >
>> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
>> > 201.2%20(JDK%201.8)/
>> > >
>> > > Due to the lack of response from infra@ I suspect our only options
>> for
>> > > continuing on ASF infra are to fix whatever part of our build doesn't
>> > > tolerate the new paths, or stop using multiconfiguration deployments.
>> I
>> > am
>> > > obviously less than thrilled at the idea of having several multiples
>> of
>> > > current jobs.
>> > >
>> > >
>> > > On Wed, Aug 10, 2016 at 6:28 PM, Sean Busbey 
>> > wrote:
>> > >
>> > >> Ugh.
>> > >>
>> > >> I sent a reply to Gav on builds@ about maybe getting names that
>> don't
>> > >> have spaces in them:
>> > >>
>> > >> https://lists.apache.org/thread.html/8ac03dc62f9d6862d4f3d5eb37119c
>> > >> 9c73b4059aaa3ebba52fc63bb6@%3Cbuilds.apache.org%3E
>> > >>
>> > >> In the meantime, is this an issue we need to file with Hadoop or
>> > >> something we need to fix in our own code?
>> > >>
>> > >> On Wed, Aug 10, 2016 at 6:04 PM, Matteo Bertozzi
>> > >>  wrote:
>> > >> > There are a bunch of builds that have most of the tests failing.
>> > >> >
>> > >> > Example:
>> > >> > https://builds.apache.org/job/HBase-Trunk_matrix/1392/jdk=
>> > >> JDK%201.7%20(latest),label=yahoo-not-h2/testReport/junit/
>> > >> org.apache.hadoop.hbase/TestLocalHBaseCluster/testLocalHBaseCluster/
>> > >> >
>> > >> > From the stack trace, it looks like the problem is with the JDK name
>> > >> > that has spaces:
>> > >> > the Hadoop FsVolumeImpl calls
>> > >> > setNameFormat(... + fileName.toString() + ...)
>> > >> > and the file name does not seem to be escaped,
>> > >> > so we end up with JDK%25201.7%2520(latest) in the format string and
>> > >> > we get an IllegalFormatPrecisionException: 7
>> > >> >
>> > >> > 2016-08-10 22:07:46,108 WARN  [DataNode:
>> > >> > [[[DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-
>> > >> Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-
>> > >> h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-
>> > >> a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-
>> > >> 9c88f385e6f1/dfs/data/data1/,
>> > >> > [DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-
>> > >> Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-
>> > >> h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-
>> > >> a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-
>> > >> 9c88f385e6f1/dfs/data/data2/]]
>> > >> >  heartbeating to localhost/127.0.0.1:34629]
>> > >> > datanode.BPServiceActor(831): Unexpected exception in block pool
>> Block
>> > >> > pool  (Datanode Uuid unassigned) service to
>> > >> > localhost/127.0.0.1:34629
>> > >> > java.util.IllegalFormatPrecisionException: 7
>> > >> > at java.util.Formatter$FormatSpecifier.checkText(
>> > >> Formatter.java:2984)
>> > 

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-09-12 Thread Apekshit Sharma
@Sean, Mikhail: I found an alternate solution: using a user-defined axis,
tool environment, and env variable injection.
See the latest diff to the https://builds.apache.org/job/HBase-Trunk_matrix/
job for reference.


On Tue, Aug 30, 2016 at 7:39 PM, Mikhail Antonov 
wrote:

> FYI, I did the same for branch-1.3 builds.  I've disabled hbase-1.3 and
> hbase-1.3-IT jobs and instead created
>
> https://builds.apache.org/job/HBase-1.3-JDK8 and
> https://builds.apache.org/job/HBase-1.3-JDK7
>
> This should work for now until we figure out how to move forward.
>
> Thanks,
> Mikhail
>
> On Wed, Aug 17, 2016 at 1:41 PM, Sean Busbey  wrote:
>
> > /me smacks forehead
> >
> > these replacement jobs, of course, also have special characters in
> > their names which then show up in the working path.
> >
> > renaming them to skip spaces and parens.
> >
> > On Wed, Aug 17, 2016 at 1:34 PM, Sean Busbey 
> > wrote:
> > > FYI, it looks like essentially our entire CI suite is red, probably due
> > to
> > > parts of our codebase not tolerating spaces or other special characters
> > in
> > > the working directory.
> > >
> > > I've made a stop-gap non-multi-configuration set of jobs for running
> unit
> > > tests for the 1.2 branch against JDK 7 and JDK 8:
> > >
> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
> > 201.2%20(JDK%201.7)/
> > >
> > > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
> > 201.2%20(JDK%201.8)/
> > >
> > > Due to the lack of response from infra@ I suspect our only options for
> > > continuing on ASF infra are to fix whatever part of our build doesn't
> > > tolerate the new paths, or stop using multiconfiguration deployments. I
> > am
> > > obviously less than thrilled at the idea of having several multiples of
> > > current jobs.
> > >
> > >
> > > On Wed, Aug 10, 2016 at 6:28 PM, Sean Busbey 
> > wrote:
> > >
> > >> Ugh.
> > >>
> > >> I sent a reply to Gav on builds@ about maybe getting names that don't
> > >> have spaces in them:
> > >>
> > >> https://lists.apache.org/thread.html/8ac03dc62f9d6862d4f3d5eb37119c
> > >> 9c73b4059aaa3ebba52fc63bb6@%3Cbuilds.apache.org%3E
> > >>
> > >> In the meantime, is this an issue we need to file with Hadoop or
> > >> something we need to fix in our own code?
> > >>
> > >> On Wed, Aug 10, 2016 at 6:04 PM, Matteo Bertozzi
> > >>  wrote:
> > >> > There are a bunch of builds that have most of the tests failing.
> > >> >
> > >> > Example:
> > >> > https://builds.apache.org/job/HBase-Trunk_matrix/1392/jdk=
> > >> JDK%201.7%20(latest),label=yahoo-not-h2/testReport/junit/
> > >> org.apache.hadoop.hbase/TestLocalHBaseCluster/testLocalHBaseCluster/
> > >> >
> > >> > From the stack trace, it looks like the problem is with the JDK name
> > >> > that has spaces:
> > >> > the Hadoop FsVolumeImpl calls
> > >> > setNameFormat(... + fileName.toString() + ...)
> > >> > and the file name does not seem to be escaped,
> > >> > so we end up with JDK%25201.7%2520(latest) in the format string and
> > >> > we get an IllegalFormatPrecisionException: 7
> > >> >
> > >> > 2016-08-10 22:07:46,108 WARN  [DataNode:
> > >> > [[[DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-
> > >> Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-
> > >> h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-
> > >> a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-
> > >> 9c88f385e6f1/dfs/data/data1/,
> > >> > [DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-
> > >> Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-
> > >> h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-
> > >> a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-
> > >> 9c88f385e6f1/dfs/data/data2/]]
> > >> >  heartbeating to localhost/127.0.0.1:34629]
> > >> > datanode.BPServiceActor(831): Unexpected exception in block pool
> Block
> > >> > pool  (Datanode Uuid unassigned) service to
> > >> > localhost/127.0.0.1:34629
> > >> > java.util.IllegalFormatPrecisionException: 7
> > >> > at java.util.Formatter$FormatSpecifier.checkText(
> > >> Formatter.java:2984)
> > >> > at java.util.Formatter$FormatSpecifier.<init>(
> > >> Formatter.java:2688)
> > >> > at java.util.Formatter.parse(Formatter.java:2528)
> > >> > at java.util.Formatter.format(Formatter.java:2469)
> > >> > at java.util.Formatter.format(Formatter.java:2423)
> > >> > at java.lang.String.format(String.java:2792)
> > >> > at com.google.common.util.concurrent.ThreadFactoryBuilder.
> > >> setNameFormat(ThreadFactoryBuilder.java:68)
> > >> > at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.
> > >> FsVolumeImpl.initializeCacheExecutor(FsVolumeImpl.java:140)
> > >> >
> > >> >
> > >> >
> > >> > Matteo
> > >> >
> > >> >
> > >> > On Tue, Aug 9, 2016 at 9:55 AM, Stack  wrote:
> > >> >
> > >> >> Good on you Sean.
> > >> >> S
> > >> >>
> > >> >> On Mon, Aug 8, 2016 at 9:43 PM, Sean Busbey 
> > wrote:
> > >> >>
> > >> >> > I updated all of our jobs to use the updated JDK versions from
> > infra.
> > >

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-08-30 Thread Mikhail Antonov
FYI, I did the same for branch-1.3 builds.  I've disabled hbase-1.3 and
hbase-1.3-IT jobs and instead created

https://builds.apache.org/job/HBase-1.3-JDK8 and
https://builds.apache.org/job/HBase-1.3-JDK7

This should work for now until we figure out how to move forward.

Thanks,
Mikhail

On Wed, Aug 17, 2016 at 1:41 PM, Sean Busbey  wrote:

> /me smacks forehead
>
> these replacement jobs, of course, also have special characters in
> their names which then show up in the working path.
>
> renaming them to skip spaces and parens.
>
> On Wed, Aug 17, 2016 at 1:34 PM, Sean Busbey 
> wrote:
> > FYI, it looks like essentially our entire CI suite is red, probably due
> to
> > parts of our codebase not tolerating spaces or other special characters
> in
> > the working directory.
> >
> > I've made a stop-gap non-multi-configuration set of jobs for running unit
> > tests for the 1.2 branch against JDK 7 and JDK 8:
> >
> > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
> 201.2%20(JDK%201.7)/
> >
> > https://builds.apache.org/view/H-L/view/HBase/job/HBase%
> 201.2%20(JDK%201.8)/
> >
> > Due to the lack of response from infra@ I suspect our only options for
> > continuing on ASF infra are to fix whatever part of our build doesn't
> > tolerate the new paths, or stop using multiconfiguration deployments. I
> am
> > obviously less than thrilled at the idea of having several multiples of
> > current jobs.
> >
> >
> > On Wed, Aug 10, 2016 at 6:28 PM, Sean Busbey 
> wrote:
> >
> >> Ugh.
> >>
> >> I sent a reply to Gav on builds@ about maybe getting names that don't
> >> have spaces in them:
> >>
> >> https://lists.apache.org/thread.html/8ac03dc62f9d6862d4f3d5eb37119c
> >> 9c73b4059aaa3ebba52fc63bb6@%3Cbuilds.apache.org%3E
> >>
> >> In the meantime, is this an issue we need to file with Hadoop or
> >> something we need to fix in our own code?
> >>
> >> On Wed, Aug 10, 2016 at 6:04 PM, Matteo Bertozzi
> >>  wrote:
> >> > There are a bunch of builds that have most of the tests failing.
> >> >
> >> > Example:
> >> > https://builds.apache.org/job/HBase-Trunk_matrix/1392/jdk=
> >> JDK%201.7%20(latest),label=yahoo-not-h2/testReport/junit/
> >> org.apache.hadoop.hbase/TestLocalHBaseCluster/testLocalHBaseCluster/
> >> >
> >> > From the stack trace, it looks like the problem is with the JDK name
> >> > that has spaces:
> >> > the Hadoop FsVolumeImpl calls
> >> > setNameFormat(... + fileName.toString() + ...)
> >> > and the file name does not seem to be escaped,
> >> > so we end up with JDK%25201.7%2520(latest) in the format string and
> >> > we get an IllegalFormatPrecisionException: 7
> >> >
> >> > 2016-08-10 22:07:46,108 WARN  [DataNode:
> >> > [[[DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-
> >> Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-
> >> h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-
> >> a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-
> >> 9c88f385e6f1/dfs/data/data1/,
> >> > [DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-
> >> Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-
> >> h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-
> >> a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-
> >> 9c88f385e6f1/dfs/data/data2/]]
> >> >  heartbeating to localhost/127.0.0.1:34629]
> >> > datanode.BPServiceActor(831): Unexpected exception in block pool Block
> >> > pool  (Datanode Uuid unassigned) service to
> >> > localhost/127.0.0.1:34629
> >> > java.util.IllegalFormatPrecisionException: 7
> >> > at java.util.Formatter$FormatSpecifier.checkText(
> >> Formatter.java:2984)
> >> > at java.util.Formatter$FormatSpecifier.<init>(
> >> Formatter.java:2688)
> >> > at java.util.Formatter.parse(Formatter.java:2528)
> >> > at java.util.Formatter.format(Formatter.java:2469)
> >> > at java.util.Formatter.format(Formatter.java:2423)
> >> > at java.lang.String.format(String.java:2792)
> >> > at com.google.common.util.concurrent.ThreadFactoryBuilder.
> >> setNameFormat(ThreadFactoryBuilder.java:68)
> >> > at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.
> >> FsVolumeImpl.initializeCacheExecutor(FsVolumeImpl.java:140)
> >> >
> >> >
> >> >
> >> > Matteo
> >> >
> >> >
> >> > On Tue, Aug 9, 2016 at 9:55 AM, Stack  wrote:
> >> >
> >> >> Good on you Sean.
> >> >> S
> >> >>
> >> >> On Mon, Aug 8, 2016 at 9:43 PM, Sean Busbey 
> wrote:
> >> >>
> >> >> > I updated all of our jobs to use the updated JDK versions from
> infra.
> >> >> > These have spaces in the names, and those names end up in our
> >> >> > workspace path, so try to keep an eye out.
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Mon, Aug 8, 2016 at 10:42 AM, Sean Busbey 
> >> >> wrote:
> >> >> > > running in docker is the default now. relying on the default
> docker
> >> >> > > image that comes with Yetus means that our protoc checks are
> >> >> > > failing[1].
> >> >> > >
> >> >> > >
> >> >> > > [1]: https://issues.apache.org/jira/browse/HBASE-16373
> >> >> > >
> >> >> > > On

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-08-17 Thread Sean Busbey
/me smacks forehead

these replacement jobs, of course, also have special characters in
their names which then show up in the working path.

renaming them to skip spaces and parens.
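
A minimal sketch of that kind of renaming, for anyone scripting it rather than doing it by hand. SafeJobName and sanitize() are hypothetical names (not part of Jenkins or HBase), and the set of "safe" characters is an assumption: every run of characters outside letters, digits, '.', '_' and '-' collapses to a single '-'.

```java
// Hypothetical helper, not part of Jenkins or HBase: derives a job name that
// is safe to embed in a workspace path.
final class SafeJobName {
    private SafeJobName() {}

    /** Collapses each run of characters outside [A-Za-z0-9._-] into one '-'. */
    static String sanitize(String jobName) {
        return jobName
                .replaceAll("[^A-Za-z0-9._-]+", "-") // spaces, parens, '%', ...
                .replaceAll("^-+|-+$", "");          // trim leading/trailing '-'
    }

    public static void main(String[] args) {
        // One of the stop-gap job names from this thread.
        System.out.println(sanitize("HBase 1.2 (JDK 1.7)"));
    }
}
```

With this rule "HBase 1.2 (JDK 1.7)" becomes "HBase-1.2-JDK-1.7", and names that are already clean are left untouched.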

On Wed, Aug 17, 2016 at 1:34 PM, Sean Busbey  wrote:
> FYI, it looks like essentially our entire CI suite is red, probably due to
> parts of our codebase not tolerating spaces or other special characters in
> the working directory.
>
> I've made a stop-gap non-multi-configuration set of jobs for running unit
> tests for the 1.2 branch against JDK 7 and JDK 8:
>
> https://builds.apache.org/view/H-L/view/HBase/job/HBase%201.2%20(JDK%201.7)/
>
> https://builds.apache.org/view/H-L/view/HBase/job/HBase%201.2%20(JDK%201.8)/
>
> Due to the lack of response from infra@ I suspect our only options for
> continuing on ASF infra are to fix whatever part of our build doesn't
> tolerate the new paths, or stop using multiconfiguration deployments. I am
> obviously less than thrilled at the idea of having several multiples of
> current jobs.
>
>
> On Wed, Aug 10, 2016 at 6:28 PM, Sean Busbey  wrote:
>
>> Ugh.
>>
>> I sent a reply to Gav on builds@ about maybe getting names that don't
>> have spaces in them:
>>
>> https://lists.apache.org/thread.html/8ac03dc62f9d6862d4f3d5eb37119c
>> 9c73b4059aaa3ebba52fc63bb6@%3Cbuilds.apache.org%3E
>>
>> In the meantime, is this an issue we need to file with Hadoop or
>> something we need to fix in our own code?
>>
>> On Wed, Aug 10, 2016 at 6:04 PM, Matteo Bertozzi
>>  wrote:
>> > There are a bunch of builds that have most of the tests failing.
>> >
>> > Example:
>> > https://builds.apache.org/job/HBase-Trunk_matrix/1392/jdk=
>> JDK%201.7%20(latest),label=yahoo-not-h2/testReport/junit/
>> org.apache.hadoop.hbase/TestLocalHBaseCluster/testLocalHBaseCluster/
>> >
>> > From the stack trace, it looks like the problem is with the JDK name
>> > that has spaces:
>> > the Hadoop FsVolumeImpl calls
>> > setNameFormat(... + fileName.toString() + ...)
>> > and the file name does not seem to be escaped,
>> > so we end up with JDK%25201.7%2520(latest) in the format string and
>> > we get an IllegalFormatPrecisionException: 7
>> >
>> > 2016-08-10 22:07:46,108 WARN  [DataNode:
>> > [[[DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-
>> Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-
>> h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-
>> a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-
>> 9c88f385e6f1/dfs/data/data1/,
>> > [DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-
>> Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-
>> h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-
>> a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-
>> 9c88f385e6f1/dfs/data/data2/]]
>> >  heartbeating to localhost/127.0.0.1:34629]
>> > datanode.BPServiceActor(831): Unexpected exception in block pool Block
>> > pool  (Datanode Uuid unassigned) service to
>> > localhost/127.0.0.1:34629
>> > java.util.IllegalFormatPrecisionException: 7
>> > at java.util.Formatter$FormatSpecifier.checkText(
>> Formatter.java:2984)
>> > at java.util.Formatter$FormatSpecifier.<init>(
>> Formatter.java:2688)
>> > at java.util.Formatter.parse(Formatter.java:2528)
>> > at java.util.Formatter.format(Formatter.java:2469)
>> > at java.util.Formatter.format(Formatter.java:2423)
>> > at java.lang.String.format(String.java:2792)
>> > at com.google.common.util.concurrent.ThreadFactoryBuilder.
>> setNameFormat(ThreadFactoryBuilder.java:68)
>> > at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.
>> FsVolumeImpl.initializeCacheExecutor(FsVolumeImpl.java:140)
>> >
>> >
>> >
>> > Matteo
>> >
>> >
>> > On Tue, Aug 9, 2016 at 9:55 AM, Stack  wrote:
>> >
>> >> Good on you Sean.
>> >> S
>> >>
>> >> On Mon, Aug 8, 2016 at 9:43 PM, Sean Busbey  wrote:
>> >>
>> >> > I updated all of our jobs to use the updated JDK versions from infra.
>> >> > These have spaces in the names, and those names end up in our
>> >> > workspace path, so try to keep an eye out.
>> >> >
>> >> >
>> >> >
>> >> > On Mon, Aug 8, 2016 at 10:42 AM, Sean Busbey 
>> >> wrote:
>> >> > > running in docker is the default now. relying on the default docker
>> >> > > image that comes with Yetus means that our protoc checks are
>> >> > > failing[1].
>> >> > >
>> >> > >
>> >> > > [1]: https://issues.apache.org/jira/browse/HBASE-16373
>> >> > >
>> >> > > On Sat, Aug 6, 2016 at 5:03 PM, Sean Busbey 
>> wrote:
>> >> > >> Hi folks!
>> >> > >>
>> >> > >> this morning I merged the patch that updates us to Yetus 0.3.0[1]
>> and
>> >> > updated the precommit job appropriately. I also changed it to use one
>> of
>> >> > the Java versions post the puppet changes to asf build.
>> >> > >>
>> >> > >> The last three builds look normal (#2975 - #2977). I'm gonna try
>> >> > running things in docker next. I'll email again when I make it the
>> >> default.
>> >> > >>
>> >> > >> [1]: https://issues.apache.org/jira/browse/HBASE-15882
>> >> > >>
>> >> > >> On

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-08-17 Thread Sean Busbey
FYI,

I also disabled the following jobs that are failing:

* HBase 1.2 IT
* HBase-0.94
* HBase-0.94-JDK7
* HBase-0.94-on-Hadoop-2
* HBase-0.94-security
* HBase-0.94.28

Stack has graciously volunteered to run the first one locally for the RC.
The rest are slated for removal in HBASE-16380.

On Wed, Aug 17, 2016 at 1:34 PM, Sean Busbey  wrote:

> FYI, it looks like essentially our entire CI suite is red, probably due to
> parts of our codebase not tolerating spaces or other special characters in
> the working directory.
>
> I've made a stop-gap non-multi-configuration set of jobs for running unit
> tests for the 1.2 branch against JDK 7 and JDK 8:
>
> https://builds.apache.org/view/H-L/view/HBase/job/HBase%
> 201.2%20(JDK%201.7)/
>
> https://builds.apache.org/view/H-L/view/HBase/job/HBase%
> 201.2%20(JDK%201.8)/
>
> Due to the lack of response from infra@ I suspect our only options for
> continuing on ASF infra are to fix whatever part of our build doesn't
> tolerate the new paths, or stop using multiconfiguration deployments. I am
> obviously less than thrilled at the idea of having several multiples of
> current jobs.
>
>
> On Wed, Aug 10, 2016 at 6:28 PM, Sean Busbey  wrote:
>
>> Ugh.
>>
>> I sent a reply to Gav on builds@ about maybe getting names that don't
>> have spaces in them:
>>
>> https://lists.apache.org/thread.html/8ac03dc62f9d6862d4f3d5e
>> b37119c9c73b4059aaa3ebba52fc63bb6@%3Cbuilds.apache.org%3E
>>
>> In the meantime, is this an issue we need to file with Hadoop or
>> something we need to fix in our own code?
>>
>> On Wed, Aug 10, 2016 at 6:04 PM, Matteo Bertozzi
>>  wrote:
>> > There are a bunch of builds that have most of the tests failing.
>> >
>> > Example:
>> > https://builds.apache.org/job/HBase-Trunk_matrix/1392/jdk=JD
>> K%201.7%20(latest),label=yahoo-not-h2/testReport/junit/org.
>> apache.hadoop.hbase/TestLocalHBaseCluster/testLocalHBaseCluster/
>> >
>> > From the stack trace, it looks like the problem is with the JDK name
>> > that has spaces:
>> > the Hadoop FsVolumeImpl calls
>> > setNameFormat(... + fileName.toString() + ...)
>> > and the file name does not seem to be escaped,
>> > so we end up with JDK%25201.7%2520(latest) in the format string and
>> > we get an IllegalFormatPrecisionException: 7
>> >
>> > 2016-08-10 22:07:46,108 WARN  [DataNode:
>> > [[[DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-Tru
>> nk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-h2/
>> hbase-server/target/test-data/e7099624-ecfa-4674-87de-a8733d
>> 13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-9c88f385e6f1/dfs/data/data1/,
>> > [DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-Trunk
>> _matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-h2/
>> hbase-server/target/test-data/e7099624-ecfa-4674-87de-a8733d
>> 13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-9c88f385e6f1/dfs/data/data2/]]
>> >  heartbeating to localhost/127.0.0.1:34629]
>> > datanode.BPServiceActor(831): Unexpected exception in block pool Block
>> > pool  (Datanode Uuid unassigned) service to
>> > localhost/127.0.0.1:34629
>> > java.util.IllegalFormatPrecisionException: 7
>> > at java.util.Formatter$FormatSpecifier.checkText(Formatter.
>> java:2984)
>> > at java.util.Formatter$FormatSpecifier.<init>(Formatter.java:
>> 2688)
>> > at java.util.Formatter.parse(Formatter.java:2528)
>> > at java.util.Formatter.format(Formatter.java:2469)
>> > at java.util.Formatter.format(Formatter.java:2423)
>> > at java.lang.String.format(String.java:2792)
>> > at com.google.common.util.concurrent.ThreadFactoryBuilder.setNa
>> meFormat(ThreadFactoryBuilder.java:68)
>> > at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolu
>> meImpl.initializeCacheExecutor(FsVolumeImpl.java:140)
>> >
>> >
>> >
>> > Matteo
>> >
>> >
>> > On Tue, Aug 9, 2016 at 9:55 AM, Stack  wrote:
>> >
>> >> Good on you Sean.
>> >> S
>> >>
>> >> On Mon, Aug 8, 2016 at 9:43 PM, Sean Busbey  wrote:
>> >>
>> >> > I updated all of our jobs to use the updated JDK versions from infra.
>> >> > These have spaces in the names, and those names end up in our
>> >> > workspace path, so try to keep an eye out.
>> >> >
>> >> >
>> >> >
>> >> > On Mon, Aug 8, 2016 at 10:42 AM, Sean Busbey 
>> >> wrote:
>> >> > > running in docker is the default now. relying on the default docker
>> >> > > image that comes with Yetus means that our protoc checks are
>> >> > > failing[1].
>> >> > >
>> >> > >
>> >> > > [1]: https://issues.apache.org/jira/browse/HBASE-16373
>> >> > >
>> >> > > On Sat, Aug 6, 2016 at 5:03 PM, Sean Busbey 
>> wrote:
>> >> > >> Hi folks!
>> >> > >>
>> >> > >> this morning I merged the patch that updates us to Yetus 0.3.0[1]
>> and
>> >> > updated the precommit job appropriately. I also changed it to use
>> one of
>> >> > the Java versions post the puppet changes to asf build.
>> >> > >>
>> >> > >> The last three builds look normal (#2975 - #2977). I'm gonna try
>> >> > running things in docker next. I'll email again when I make it the
>> >

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-08-17 Thread Sean Busbey
FYI, it looks like essentially our entire CI suite is red, probably due to
parts of our codebase not tolerating spaces or other special characters in
the working directory.

I've made a stop-gap non-multi-configuration set of jobs for running unit
tests for the 1.2 branch against JDK 7 and JDK 8:

https://builds.apache.org/view/H-L/view/HBase/job/HBase%201.2%20(JDK%201.7)/

https://builds.apache.org/view/H-L/view/HBase/job/HBase%201.2%20(JDK%201.8)/

Due to the lack of response from infra@ I suspect our only options for
continuing on ASF infra are to fix whatever part of our build doesn't
tolerate the new paths, or stop using multiconfiguration deployments. I am
obviously less than thrilled at the idea of having several multiples of
current jobs.
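
For reference, the failure Matteo reports in the quoted message below can be reproduced outside HDFS. This is a minimal sketch, assuming only that the workspace path ends up concatenated into a java.util.Formatter format string, as the quoted stack trace through ThreadFactoryBuilder.setNameFormat suggests. FormatPathDemo, unsafeName and escapedName are illustrative names, not Hadoop's actual code, and doubling '%' is one possible workaround rather than necessarily the fix Hadoop adopted.

```java
import java.util.IllegalFormatPrecisionException;

// Illustrative only: mimics a caller that concatenates a path into a format
// string before handing it to String.format.
final class FormatPathDemo {
    private FormatPathDemo() {}

    /** Unsafe: any '%' in the path is parsed as the start of a format specifier. */
    static String unsafeName(String path, int id) {
        return String.format("worker-" + path + "-%d", id);
    }

    /** Safer: double every '%' so the formatter treats it as a literal. */
    static String escapedName(String path, int id) {
        return String.format("worker-" + path.replace("%", "%%") + "-%d", id);
    }

    public static void main(String[] args) {
        String path = "jdk/JDK%25201.7%2520(latest)/label";
        try {
            unsafeName(path, 0);
        } catch (IllegalFormatPrecisionException e) {
            // "%25201.7%" parses as width 25201, precision 7, conversion '%',
            // and the '%' conversion does not accept a precision.
            System.out.println("precision rejected: " + e.getPrecision());
        }
        System.out.println(escapedName(path, 0));
    }
}
```

The rejected precision is the same 7 that shows up in the exception message in the trace, and the escaped variant returns the path with its percent signs intact.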


On Wed, Aug 10, 2016 at 6:28 PM, Sean Busbey  wrote:

> Ugh.
>
> I sent a reply to Gav on builds@ about maybe getting names that don't
> have spaces in them:
>
> https://lists.apache.org/thread.html/8ac03dc62f9d6862d4f3d5eb37119c
> 9c73b4059aaa3ebba52fc63bb6@%3Cbuilds.apache.org%3E
>
> In the meantime, is this an issue we need to file with Hadoop or
> something we need to fix in our own code?
>
> On Wed, Aug 10, 2016 at 6:04 PM, Matteo Bertozzi
>  wrote:
> > There are a bunch of builds that have most of the tests failing.
> >
> > Example:
> > https://builds.apache.org/job/HBase-Trunk_matrix/1392/jdk=
> JDK%201.7%20(latest),label=yahoo-not-h2/testReport/junit/
> org.apache.hadoop.hbase/TestLocalHBaseCluster/testLocalHBaseCluster/
> >
> > From the stack trace, it looks like the problem is with the JDK name
> > that has spaces:
> > the Hadoop FsVolumeImpl calls
> > setNameFormat(... + fileName.toString() + ...)
> > and the file name does not seem to be escaped,
> > so we end up with JDK%25201.7%2520(latest) in the format string and
> > we get an IllegalFormatPrecisionException: 7
> >
> > 2016-08-10 22:07:46,108 WARN  [DataNode:
> > [[[DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-9c88f385e6f1/dfs/data/data1/,
> > [DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-9c88f385e6f1/dfs/data/data2/]]
> >  heartbeating to localhost/127.0.0.1:34629]
> > datanode.BPServiceActor(831): Unexpected exception in block pool Block
> > pool <registering> (Datanode Uuid unassigned) service to
> > localhost/127.0.0.1:34629
> > java.util.IllegalFormatPrecisionException: 7
> > at java.util.Formatter$FormatSpecifier.checkText(Formatter.java:2984)
> > at java.util.Formatter$FormatSpecifier.<init>(Formatter.java:2688)
> > at java.util.Formatter.parse(Formatter.java:2528)
> > at java.util.Formatter.format(Formatter.java:2469)
> > at java.util.Formatter.format(Formatter.java:2423)
> > at java.lang.String.format(String.java:2792)
> > at com.google.common.util.concurrent.ThreadFactoryBuilder.setNameFormat(ThreadFactoryBuilder.java:68)
> > at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.initializeCacheExecutor(FsVolumeImpl.java:140)
> >
> >
> >
> > Matteo
> >
> >
> > On Tue, Aug 9, 2016 at 9:55 AM, Stack  wrote:
> >
> >> Good on you Sean.
> >> S
> >>
> >> On Mon, Aug 8, 2016 at 9:43 PM, Sean Busbey  wrote:
> >>
> >> > I updated all of our jobs to use the updated JDK versions from infra.
> >> > These have spaces in the names, and those names end up in our
> >> > workspace path, so try to keep an eye out.
> >> >
> >> >
> >> >
> >> > On Mon, Aug 8, 2016 at 10:42 AM, Sean Busbey 
> >> wrote:
> >> > > running in docker is the default now. relying on the default docker
> >> > > image that comes with Yetus means that our protoc checks are
> >> > > failing[1].
> >> > >
> >> > >
> >> > > [1]: https://issues.apache.org/jira/browse/HBASE-16373
> >> > >
> >> > > On Sat, Aug 6, 2016 at 5:03 PM, Sean Busbey 
> wrote:
> >> > >> Hi folks!
> >> > >>
> >> > >> this morning I merged the patch that updates us to Yetus 0.3.0[1]
> and
> >> > updated the precommit job appropriately. I also changed it to use one
> of
> >> > the Java versions post the puppet changes to asf build.
> >> > >>
> >> > >> The last three builds look normal (#2975 - #2977). I'm gonna try
> >> > running things in docker next. I'll email again when I make it the
> >> default.
> >> > >>
> >> > >> [1]: https://issues.apache.org/jira/browse/HBASE-15882
> >> > >>
> >> > >> On 2016-06-16 10:43 (-0500), Sean Busbey 
> wrote:
> >> > >>> FYI, today our precommit jobs started failing because our chosen
> jdk
> >> > >>> (1.7.0.79) disappeared (mentioned on HBASE-16032).
> >> > >>>
> >> > >>> Initially we were doing something wrong, namely directly
> referencing
> >> > >>> the jenkins build tools area without telling jenkins to give us an
> >> en

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-08-10 Thread Sean Busbey
Ugh.

I sent a reply to Gav on builds@ about maybe getting names that don't
have spaces in them:

https://lists.apache.org/thread.html/8ac03dc62f9d6862d4f3d5eb37119c9c73b4059aaa3ebba52fc63bb6@%3Cbuilds.apache.org%3E

In the meantime, is this an issue we need to file with Hadoop or
something we need to fix in our own code?

On Wed, Aug 10, 2016 at 6:04 PM, Matteo Bertozzi
 wrote:
> There are a bunch of builds that have most of the tests failing.
>
> Example:
> https://builds.apache.org/job/HBase-Trunk_matrix/1392/jdk=JDK%201.7%20(latest),label=yahoo-not-h2/testReport/junit/org.apache.hadoop.hbase/TestLocalHBaseCluster/testLocalHBaseCluster/
>
> From the stack trace, it looks like the problem is with the JDK name that
> has spaces:
> the Hadoop FsVolumeImpl calls setNameFormat(... + fileName.toString() + ...)
> and this seems not to be escaped,
> so we end up with JDK%25201.7%2520(latest) in the format string and we get
> an IllegalFormatPrecisionException: 7
>
> 2016-08-10 22:07:46,108 WARN  [DataNode:
> [[[DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-9c88f385e6f1/dfs/data/data1/,
> [DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-9c88f385e6f1/dfs/data/data2/]]
>  heartbeating to localhost/127.0.0.1:34629]
> datanode.BPServiceActor(831): Unexpected exception in block pool Block
> pool <registering> (Datanode Uuid unassigned) service to
> localhost/127.0.0.1:34629
> java.util.IllegalFormatPrecisionException: 7
> at java.util.Formatter$FormatSpecifier.checkText(Formatter.java:2984)
> at java.util.Formatter$FormatSpecifier.<init>(Formatter.java:2688)
> at java.util.Formatter.parse(Formatter.java:2528)
> at java.util.Formatter.format(Formatter.java:2469)
> at java.util.Formatter.format(Formatter.java:2423)
> at java.lang.String.format(String.java:2792)
> at com.google.common.util.concurrent.ThreadFactoryBuilder.setNameFormat(ThreadFactoryBuilder.java:68)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.initializeCacheExecutor(FsVolumeImpl.java:140)
>
>
>
> Matteo
>
>
> On Tue, Aug 9, 2016 at 9:55 AM, Stack  wrote:
>
>> Good on you Sean.
>> S
>>
>> On Mon, Aug 8, 2016 at 9:43 PM, Sean Busbey  wrote:
>>
>> > I updated all of our jobs to use the updated JDK versions from infra.
>> > These have spaces in the names, and those names end up in our
>> > workspace path, so try to keep an eye out.
>> >
>> >
>> >
>> > On Mon, Aug 8, 2016 at 10:42 AM, Sean Busbey 
>> wrote:
>> > > running in docker is the default now. relying on the default docker
>> > > image that comes with Yetus means that our protoc checks are
>> > > failing[1].
>> > >
>> > >
>> > > [1]: https://issues.apache.org/jira/browse/HBASE-16373
>> > >
>> > > On Sat, Aug 6, 2016 at 5:03 PM, Sean Busbey  wrote:
>> > >> Hi folks!
>> > >>
>> > >> this morning I merged the patch that updates us to Yetus 0.3.0[1] and
>> > updated the precommit job appropriately. I also changed it to use one of
>> > the Java versions post the puppet changes to asf build.
>> > >>
>> > >> The last three builds look normal (#2975 - #2977). I'm gonna try
>> > running things in docker next. I'll email again when I make it the
>> default.
>> > >>
>> > >> [1]: https://issues.apache.org/jira/browse/HBASE-15882
>> > >>
>> > >> On 2016-06-16 10:43 (-0500), Sean Busbey  wrote:
>> > >>> FYI, today our precommit jobs started failing because our chosen jdk
>> > >>> (1.7.0.79) disappeared (mentioned on HBASE-16032).
>> > >>>
>> > >>> Initially we were doing something wrong, namely directly referencing
>> > >>> the jenkins build tools area without telling jenkins to give us an
>> env
>> > >>> variable that stated where the jdk is located. However, after
>> > >>> attempting to switch to the appropriate tooling variable for jdk
>> > >>> 1.7.0.79, I found that it didn't point to a place that worked.
>> > >>>
>> > >>> I've now updated the job to rely on the latest 1.7 jdk, which is
>> > >>> currently 1.7.0.80. I don't know how often "latest" updates.
>> > >>>
>> > >>> Personally, I think this is a sign that we need to prioritize
>> > >>> HBASE-15882 so that we can switch back to using Docker. I won't have
>> > >>> time this week, so if anyone else does please pick up the ticket.
>> > >>>
>> > >>> On Thu, Mar 17, 2016 at 5:19 PM, Stack  wrote:
>> > >>> > Thanks Sean.
>> > >>> > St.Ack
>> > >>> >
>> > >>> > On Wed, Mar 16, 2016 at 12:04 PM, Sean Busbey > >
>> > wrote:
>> > >>> >
>> > >>> >> FYI, I updated the precommit job today to specify that only
>> compile
>> > time
>> > >>> >> checks should be done against jdks other than the primary jdk7
>> > instance.
>> > >>> >>
>> > >>> >> On Mon, Mar 7, 2016 at 8:

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-08-10 Thread Matteo Bertozzi
There are a bunch of builds that have most of the tests failing.

Example:
https://builds.apache.org/job/HBase-Trunk_matrix/1392/jdk=JDK%201.7%20(latest),label=yahoo-not-h2/testReport/junit/org.apache.hadoop.hbase/TestLocalHBaseCluster/testLocalHBaseCluster/

From the stack trace, it looks like the problem is with the JDK name that
has spaces:
the Hadoop FsVolumeImpl calls setNameFormat(... + fileName.toString() + ...)
and this seems not to be escaped,
so we end up with JDK%25201.7%2520(latest) in the format string and we get
an IllegalFormatPrecisionException: 7

2016-08-10 22:07:46,108 WARN  [DataNode:
[[[DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-9c88f385e6f1/dfs/data/data1/,
[DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-9c88f385e6f1/dfs/data/data2/]]
 heartbeating to localhost/127.0.0.1:34629]
datanode.BPServiceActor(831): Unexpected exception in block pool Block
pool <registering> (Datanode Uuid unassigned) service to
localhost/127.0.0.1:34629
java.util.IllegalFormatPrecisionException: 7
at java.util.Formatter$FormatSpecifier.checkText(Formatter.java:2984)
at java.util.Formatter$FormatSpecifier.<init>(Formatter.java:2688)
at java.util.Formatter.parse(Formatter.java:2528)
at java.util.Formatter.format(Formatter.java:2469)
at java.util.Formatter.format(Formatter.java:2423)
at java.lang.String.format(String.java:2792)
at com.google.common.util.concurrent.ThreadFactoryBuilder.setNameFormat(ThreadFactoryBuilder.java:68)
at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.initializeCacheExecutor(FsVolumeImpl.java:140)
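
A minimal sketch of the failure mode described above, assuming only what the
trace shows: a path containing JDK%25201.7%2520(latest) is concatenated into a
format string that then reaches String.format. The class name, the shortened
path, and the "cache-" prefix are hypothetical and are not taken from Hadoop or
Guava. Escaping each % as %% is one common way to make the path literal; the
thread does not say how (or whether) this was fixed upstream.

```java
import java.util.IllegalFormatException;

public class FormatNameDemo {
    // Hypothetical, shortened stand-in for a workspace path with a doubly encoded space.
    static final String PATH = "/workspace/jdk/JDK%25201.7%2520(latest)/data1";

    /** Builds a name format by concatenating the raw path, so its '%' is parsed as a specifier. */
    static String unsafeFormat(String path) {
        return "cache-" + path + "-%d";
    }

    /** Doubles every '%' so String.format treats the path as literal text. */
    static String safeFormat(String path) {
        return "cache-" + path.replace("%", "%%") + "-%d";
    }

    /** Returns true if String.format rejects the given format string. */
    static boolean throwsOnFormat(String format) {
        try {
            String.format(format, 0);
            return false;
        } catch (IllegalFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // "%25201.7%" is read as width 25201, precision 7, conversion '%'.
        System.out.println("unsafe format rejected: " + throwsOnFormat(unsafeFormat(PATH)));
        System.out.println("safe name: " + String.format(safeFormat(PATH), 0));
    }
}
```

The unsafe variant is rejected because the '%' conversion does not accept a
precision, which is consistent with the "7" reported in the exception above.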



Matteo


On Tue, Aug 9, 2016 at 9:55 AM, Stack  wrote:

> Good on you Sean.
> S
>
> On Mon, Aug 8, 2016 at 9:43 PM, Sean Busbey  wrote:
>
> > I updated all of our jobs to use the updated JDK versions from infra.
> > These have spaces in the names, and those names end up in our
> > workspace path, so try to keep an eye out.
> >
> >
> >
> > On Mon, Aug 8, 2016 at 10:42 AM, Sean Busbey 
> wrote:
> > > running in docker is the default now. relying on the default docker
> > > image that comes with Yetus means that our protoc checks are
> > > failing[1].
> > >
> > >
> > > [1]: https://issues.apache.org/jira/browse/HBASE-16373
> > >
> > > On Sat, Aug 6, 2016 at 5:03 PM, Sean Busbey  wrote:
> > >> Hi folks!
> > >>
> > >> this morning I merged the patch that updates us to Yetus 0.3.0[1] and
> > updated the precommit job appropriately. I also changed it to use one of
> > the Java versions post the puppet changes to asf build.
> > >>
> > >> The last three builds look normal (#2975 - #2977). I'm gonna try
> > running things in docker next. I'll email again when I make it the
> default.
> > >>
> > >> [1]: https://issues.apache.org/jira/browse/HBASE-15882
> > >>
> > >> On 2016-06-16 10:43 (-0500), Sean Busbey  wrote:
> > >>> FYI, today our precommit jobs started failing because our chosen jdk
> > >>> (1.7.0.79) disappeared (mentioned on HBASE-16032).
> > >>>
> > >>> Initially we were doing something wrong, namely directly referencing
> > >>> the jenkins build tools area without telling jenkins to give us an
> env
> > >>> variable that stated where the jdk is located. However, after
> > >>> attempting to switch to the appropriate tooling variable for jdk
> > >>> 1.7.0.79, I found that it didn't point to a place that worked.
> > >>>
> > >>> I've now updated the job to rely on the latest 1.7 jdk, which is
> > >>> currently 1.7.0.80. I don't know how often "latest" updates.
> > >>>
> > >>> Personally, I think this is a sign that we need to prioritize
> > >>> HBASE-15882 so that we can switch back to using Docker. I won't have
> > >>> time this week, so if anyone else does please pick up the ticket.
> > >>>
> > >>> On Thu, Mar 17, 2016 at 5:19 PM, Stack  wrote:
> > >>> > Thanks Sean.
> > >>> > St.Ack
> > >>> >
> > >>> > On Wed, Mar 16, 2016 at 12:04 PM, Sean Busbey  >
> > wrote:
> > >>> >
> > >>> >> FYI, I updated the precommit job today to specify that only
> compile
> > time
> > >>> >> checks should be done against jdks other than the primary jdk7
> > instance.
> > >>> >>
> > >>> >> On Mon, Mar 7, 2016 at 8:43 PM, Sean Busbey 
> > wrote:
> > >>> >>
> > >>> >> > I tested things out, and while YETUS-297[1] is present the
> > default runs
> > >>> >> > all plugins that can do multiple jdks against those available
> > (jdk7 and
> > >>> >> > jdk8 in our case).
> > >>> >> >
> > >>> >> > We can configure things to only do a single run of unit tests.
> > They'll be
> > >>> >> > against jdk7, since that is our default jdk. That fine by
> > everyone? It'll
> > >>> >> > save ~1.5 hours on any build that hits hbase-server.

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-08-09 Thread Stack
Good on you Sean.
S

On Mon, Aug 8, 2016 at 9:43 PM, Sean Busbey  wrote:

> I updated all of our jobs to use the updated JDK versions from infra.
> These have spaces in the names, and those names end up in our
> workspace path, so try to keep an eye out.
>
>
>
> On Mon, Aug 8, 2016 at 10:42 AM, Sean Busbey  wrote:
> > running in docker is the default now. relying on the default docker
> > image that comes with Yetus means that our protoc checks are
> > failing[1].
> >
> >
> > [1]: https://issues.apache.org/jira/browse/HBASE-16373
> >
> > On Sat, Aug 6, 2016 at 5:03 PM, Sean Busbey  wrote:
> >> Hi folks!
> >>
> >> this morning I merged the patch that updates us to Yetus 0.3.0[1] and
> updated the precommit job appropriately. I also changed it to use one of
> the Java versions post the puppet changes to asf build.
> >>
> >> The last three builds look normal (#2975 - #2977). I'm gonna try
> running things in docker next. I'll email again when I make it the default.
> >>
> >> [1]: https://issues.apache.org/jira/browse/HBASE-15882
> >>
> >> On 2016-06-16 10:43 (-0500), Sean Busbey  wrote:
> >>> FYI, today our precommit jobs started failing because our chosen jdk
> >>> (1.7.0.79) disappeared (mentioned on HBASE-16032).
> >>>
> >>> Initially we were doing something wrong, namely directly referencing
> >>> the jenkins build tools area without telling jenkins to give us an env
> >>> variable that stated where the jdk is located. However, after
> >>> attempting to switch to the appropriate tooling variable for jdk
> >>> 1.7.0.79, I found that it didn't point to a place that worked.
> >>>
> >>> I've now updated the job to rely on the latest 1.7 jdk, which is
> >>> currently 1.7.0.80. I don't know how often "latest" updates.
> >>>
> >>> Personally, I think this is a sign that we need to prioritize
> >>> HBASE-15882 so that we can switch back to using Docker. I won't have
> >>> time this week, so if anyone else does please pick up the ticket.
> >>>
> >>> On Thu, Mar 17, 2016 at 5:19 PM, Stack  wrote:
> >>> > Thanks Sean.
> >>> > St.Ack
> >>> >
> >>> > On Wed, Mar 16, 2016 at 12:04 PM, Sean Busbey 
> wrote:
> >>> >
> >>> >> FYI, I updated the precommit job today to specify that only compile
> time
> >>> >> checks should be done against jdks other than the primary jdk7
> instance.
> >>> >>
> >>> >> On Mon, Mar 7, 2016 at 8:43 PM, Sean Busbey 
> wrote:
> >>> >>
> >>> >> > I tested things out, and while YETUS-297[1] is present the
> default runs
> >>> >> > all plugins that can do multiple jdks against those available
> (jdk7 and
> >>> >> > jdk8 in our case).
> >>> >> >
> >>> >> > We can configure things to only do a single run of unit tests.
> They'll be
> >>> >> > against jdk7, since that is our default jdk. That fine by
> everyone? It'll
> >>> >> > save ~1.5 hours on any build that hits hbase-server.
> >>> >> >
> >>> >> > On Mon, Mar 7, 2016 at 1:22 PM, Stack  wrote:
> >>> >> >
> >>> >> >> Hurray!
> >>> >> >>
> >>> >> >> It looks like YETUS-96 is in there and we are only running on
> jdk build
> >>> >> >> now, the default (but testing compile against both) Will
> keep an
> >>> >> eye.
> >>> >> >>
> >>> >> >> St.Ack
> >>> >> >>
> >>> >> >>
> >>> >> >> On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey <
> [email protected]>
> >>> >> wrote:
> >>> >> >>
> >>> >> >> > FYI, I've just updated our precommit jobs to use the 0.2.0
> release of
> >>> >> >> Yetus
> >>> >> >> > that came out today.
> >>> >> >> >
> >>> >> >> > After keeping an eye out for strangeness today I'll turn
> docker mode
> >>> >> >> back
> >>> >> >> > on by default tonight.
> >>> >> >> >
> >>> >> >> > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey <
> [email protected]>
> >>> >> >> wrote:
> >>> >> >> >
> >>> >> >> > > FYI, I added a new parameter to the precommit job:
> >>> >> >> > >
> >>> >> >> > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the
> >>> >> apache/yetus
> >>> >> >> > > repo rather than our chosen release
> >>> >> >> > >
> >>> >> >> > > It defaults to inactive, but can be used in
> manually-triggered runs
> >>> >> to
> >>> >> >> > > test a solution to a problem in the yetus library. At the
> moment,
> >>> >> I'm
> >>> >> >> > > using it to test a solution to default module ordering  as
> seen in
> >>> >> >> > > HBASE-15075.
> >>> >> >> > >
> >>> >> >> > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey <
> [email protected]>
> >>> >> >> wrote:
> >>> >> >> > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for
> >>> >> precommit
> >>> >> >> > > tests)
> >>> >> >> > > > and updated our jenkins precommit build to use it.
> >>> >> >> > > >
> >>> >> >> > > > Jenkins job has some explanation:
> >>> >> >> > > >
> >>> >> >> > >
> >>> >> >> >
> >>> >> >>
> >>> >> https://builds.apache.org/view/PreCommit%20Builds/job/
> PreCommit-HBASE-Build/
> >>> >> >> > > >
> >>> >> >> > > > Release note from HBASE-13525 does as well.
> >>> >> >> > > >
> >>> >> >> > > > The old job will stick around here for a couple of weeks,
> in case
> >>> >> we
> 

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-08-08 Thread Sean Busbey
I updated all of our jobs to use the updated JDK versions from infra.
These have spaces in the names, and those names end up in our
workspace path, so try to keep an eye out.
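
As a rough illustration of why a space in a tool name can be troublesome once
the name lands in a path: percent-encoding a space yields %20, and encoding the
result a second time yields %2520. The sketch below uses the multi-argument
java.net.URI constructor, which quotes illegal characters and always quotes
'%'. The class name and the shortened workspace path are hypothetical, and
nothing here claims to be what Jenkins or Hadoop actually does internally.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class WorkspacePathEncoding {
    /** Builds a file: URI string from a path, quoting characters that are illegal in a URI. */
    static String toFileUri(String path) {
        try {
            // The multi-argument constructor quotes illegal characters and always quotes '%'.
            return new URI("file", null, path, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "/workspace/jdk/JDK 1.7 (latest)";
        String once = toFileUri(raw);
        // Feeding the already-encoded path back in treats "%20" as literal text.
        String twice = toFileUri(URI.create(once).getRawPath());
        System.out.println(once);
        System.out.println(twice);
    }
}
```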



On Mon, Aug 8, 2016 at 10:42 AM, Sean Busbey  wrote:
> running in docker is the default now. relying on the default docker
> image that comes with Yetus means that our protoc checks are
> failing[1].
>
>
> [1]: https://issues.apache.org/jira/browse/HBASE-16373
>
> On Sat, Aug 6, 2016 at 5:03 PM, Sean Busbey  wrote:
>> Hi folks!
>>
>> this morning I merged the patch that updates us to Yetus 0.3.0[1] and 
>> updated the precommit job appropriately. I also changed it to use one of the 
>> Java versions post the puppet changes to asf build.
>>
>> The last three builds look normal (#2975 - #2977). I'm gonna try running 
>> things in docker next. I'll email again when I make it the default.
>>
>> [1]: https://issues.apache.org/jira/browse/HBASE-15882
>>
>> On 2016-06-16 10:43 (-0500), Sean Busbey  wrote:
>>> FYI, today our precommit jobs started failing because our chosen jdk
>>> (1.7.0.79) disappeared (mentioned on HBASE-16032).
>>>
>>> Initially we were doing something wrong, namely directly referencing
>>> the jenkins build tools area without telling jenkins to give us an env
>>> variable that stated where the jdk is located. However, after
>>> attempting to switch to the appropriate tooling variable for jdk
>>> 1.7.0.79, I found that it didn't point to a place that worked.
>>>
>>> I've now updated the job to rely on the latest 1.7 jdk, which is
>>> currently 1.7.0.80. I don't know how often "latest" updates.
>>>
>>> Personally, I think this is a sign that we need to prioritize
>>> HBASE-15882 so that we can switch back to using Docker. I won't have
>>> time this week, so if anyone else does please pick up the ticket.
>>>
>>> On Thu, Mar 17, 2016 at 5:19 PM, Stack  wrote:
>>> > Thanks Sean.
>>> > St.Ack
>>> >
>>> > On Wed, Mar 16, 2016 at 12:04 PM, Sean Busbey  wrote:
>>> >
>>> >> FYI, I updated the precommit job today to specify that only compile time
>>> >> checks should be done against jdks other than the primary jdk7 instance.
>>> >>
>>> >> On Mon, Mar 7, 2016 at 8:43 PM, Sean Busbey  wrote:
>>> >>
>>> >> > I tested things out, and while YETUS-297[1] is present the default runs
>>> >> > all plugins that can do multiple jdks against those available (jdk7 and
>>> >> > jdk8 in our case).
>>> >> >
>>> >> > We can configure things to only do a single run of unit tests. They'll 
>>> >> > be
>>> >> > against jdk7, since that is our default jdk. That fine by everyone? 
>>> >> > It'll
>>> >> > save ~1.5 hours on any build that hits hbase-server.
>>> >> >
>>> >> > On Mon, Mar 7, 2016 at 1:22 PM, Stack  wrote:
>>> >> >
>>> >> >> Hurray!
>>> >> >>
>>> >> >> It looks like YETUS-96 is in there and we are only running on jdk 
>>> >> >> build
>>> >> >> now, the default (but testing compile against both) Will keep an
>>> >> eye.
>>> >> >>
>>> >> >> St.Ack
>>> >> >>
>>> >> >>
>>> >> >> On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey 
>>> >> wrote:
>>> >> >>
>>> >> >> > FYI, I've just updated our precommit jobs to use the 0.2.0 release 
>>> >> >> > of
>>> >> >> Yetus
>>> >> >> > that came out today.
>>> >> >> >
>>> >> >> > After keeping an eye out for strangeness today I'll turn docker mode
>>> >> >> back
>>> >> >> > on by default tonight.
>>> >> >> >
>>> >> >> > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey 
>>> >> >> wrote:
>>> >> >> >
>>> >> >> > > FYI, I added a new parameter to the precommit job:
>>> >> >> > >
>>> >> >> > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the
>>> >> apache/yetus
>>> >> >> > > repo rather than our chosen release
>>> >> >> > >
>>> >> >> > > It defaults to inactive, but can be used in manually-triggered 
>>> >> >> > > runs
>>> >> to
>>> >> >> > > test a solution to a problem in the yetus library. At the moment,
>>> >> I'm
>>> >> >> > > using it to test a solution to default module ordering  as seen in
>>> >> >> > > HBASE-15075.
>>> >> >> > >
>>> >> >> > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey 
>>> >> >> wrote:
>>> >> >> > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for
>>> >> precommit
>>> >> >> > > tests)
>>> >> >> > > > and updated our jenkins precommit build to use it.
>>> >> >> > > >
>>> >> >> > > > Jenkins job has some explanation:
>>> >> >> > > >
>>> >> >> > >
>>> >> >> >
>>> >> >>
>>> >> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
>>> >> >> > > >
>>> >> >> > > > Release note from HBASE-13525 does as well.
>>> >> >> > > >
>>> >> >> > > > The old job will stick around here for a couple of weeks, in 
>>> >> >> > > > case
>>> >> we
>>> >> >> > need
>>> >> >> > > > to refer back to it:
>>> >> >> > > >
>>> >> >> > > >
>>> >> >> > >
>>> >> >> >
>>> >> >>
>>> >> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
>>> >> >> > > >
>>> >> >> > > > If something looks awry, please drop a note on HBASE-13525 while
>>> >> it
>>

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-08-08 Thread Sean Busbey
Running in docker is the default now. Relying on the default docker
image that comes with Yetus means that our protoc checks are
failing[1].


[1]: https://issues.apache.org/jira/browse/HBASE-16373

On Sat, Aug 6, 2016 at 5:03 PM, Sean Busbey  wrote:
> Hi folks!
>
> this morning I merged the patch that updates us to Yetus 0.3.0[1] and updated 
> the precommit job appropriately. I also changed it to use one of the Java 
> versions post the puppet changes to asf build.
>
> The last three builds look normal (#2975 - #2977). I'm gonna try running 
> things in docker next. I'll email again when I make it the default.
>
> [1]: https://issues.apache.org/jira/browse/HBASE-15882
>
> On 2016-06-16 10:43 (-0500), Sean Busbey  wrote:
>> FYI, today our precommit jobs started failing because our chosen jdk
>> (1.7.0.79) disappeared (mentioned on HBASE-16032).
>>
>> Initially we were doing something wrong, namely directly referencing
>> the jenkins build tools area without telling jenkins to give us an env
>> variable that stated where the jdk is located. However, after
>> attempting to switch to the appropriate tooling variable for jdk
>> 1.7.0.79, I found that it didn't point to a place that worked.
>>
>> I've now updated the job to rely on the latest 1.7 jdk, which is
>> currently 1.7.0.80. I don't know how often "latest" updates.
>>
>> Personally, I think this is a sign that we need to prioritize
>> HBASE-15882 so that we can switch back to using Docker. I won't have
>> time this week, so if anyone else does please pick up the ticket.
>>
>> On Thu, Mar 17, 2016 at 5:19 PM, Stack  wrote:
>> > Thanks Sean.
>> > St.Ack
>> >
>> > On Wed, Mar 16, 2016 at 12:04 PM, Sean Busbey  wrote:
>> >
>> >> FYI, I updated the precommit job today to specify that only compile time
>> >> checks should be done against jdks other than the primary jdk7 instance.
>> >>
>> >> On Mon, Mar 7, 2016 at 8:43 PM, Sean Busbey  wrote:
>> >>
>> >> > I tested things out, and while YETUS-297[1] is present the default runs
>> >> > all plugins that can do multiple jdks against those available (jdk7 and
>> >> > jdk8 in our case).
>> >> >
>> >> > We can configure things to only do a single run of unit tests. They'll 
>> >> > be
>> >> > against jdk7, since that is our default jdk. That fine by everyone? 
>> >> > It'll
>> >> > save ~1.5 hours on any build that hits hbase-server.
>> >> >
>> >> > On Mon, Mar 7, 2016 at 1:22 PM, Stack  wrote:
>> >> >
>> >> >> Hurray!
>> >> >>
>> >> >> It looks like YETUS-96 is in there and we are only running on jdk build
>> >> >> now, the default (but testing compile against both) Will keep an
>> >> eye.
>> >> >>
>> >> >> St.Ack
>> >> >>
>> >> >>
>> >> >> On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey 
>> >> wrote:
>> >> >>
>> >> >> > FYI, I've just updated our precommit jobs to use the 0.2.0 release of
>> >> >> Yetus
>> >> >> > that came out today.
>> >> >> >
>> >> >> > After keeping an eye out for strangeness today I'll turn docker mode
>> >> >> back
>> >> >> > on by default tonight.
>> >> >> >
>> >> >> > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey 
>> >> >> wrote:
>> >> >> >
>> >> >> > > FYI, I added a new parameter to the precommit job:
>> >> >> > >
>> >> >> > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the
>> >> apache/yetus
>> >> >> > > repo rather than our chosen release
>> >> >> > >
>> >> >> > > It defaults to inactive, but can be used in manually-triggered runs
>> >> to
>> >> >> > > test a solution to a problem in the yetus library. At the moment,
>> >> I'm
>> >> >> > > using it to test a solution to default module ordering  as seen in
>> >> >> > > HBASE-15075.
>> >> >> > >
>> >> >> > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey 
>> >> >> wrote:
>> >> >> > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for
>> >> precommit
>> >> >> > > tests)
>> >> >> > > > and updated our jenkins precommit build to use it.
>> >> >> > > >
>> >> >> > > > Jenkins job has some explanation:
>> >> >> > > >
>> >> >> > >
>> >> >> >
>> >> >>
>> >> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
>> >> >> > > >
>> >> >> > > > Release note from HBASE-13525 does as well.
>> >> >> > > >
>> >> >> > > > The old job will stick around here for a couple of weeks, in case
>> >> we
>> >> >> > need
>> >> >> > > > to refer back to it:
>> >> >> > > >
>> >> >> > > >
>> >> >> > >
>> >> >> >
>> >> >>
>> >> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
>> >> >> > > >
>> >> >> > > > If something looks awry, please drop a note on HBASE-13525 while
>> >> it
>> >> >> > > remains
>> >> >> > > > open (and make a new issue after).
>> >> >> > > >
>> >> >> > > >
>> >> >> > > > On Wed, Dec 2, 2015 at 3:22 PM, Stack  wrote:
>> >> >> > > >
>> >> >> > > >> As part of my continuing advocacy of builds.apache.org and that
>> >> >> their
>> >> >> > > >> results are now worthy of our trust and nurture, here are some
>> >> >> > > highlights
>> >> >> > > >> from the last few days of b

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-08-06 Thread Sean Busbey
Hi folks!

This morning I merged the patch that updates us to Yetus 0.3.0[1] and updated
the precommit job accordingly. I also changed it to use one of the Java
versions from after the puppet changes to the ASF build infrastructure.

The last three builds look normal (#2975 - #2977). I'm gonna try running things 
in docker next. I'll email again when I make it the default.

[1]: https://issues.apache.org/jira/browse/HBASE-15882

On 2016-06-16 10:43 (-0500), Sean Busbey  wrote: 
> FYI, today our precommit jobs started failing because our chosen jdk
> (1.7.0.79) disappeared (mentioned on HBASE-16032).
> 
> Initially we were doing something wrong, namely directly referencing
> the jenkins build tools area without telling jenkins to give us an env
> variable that stated where the jdk is located. However, after
> attempting to switch to the appropriate tooling variable for jdk
> 1.7.0.79, I found that it didn't point to a place that worked.
> 
> I've now updated the job to rely on the latest 1.7 jdk, which is
> currently 1.7.0.80. I don't know how often "latest" updates.
> 
> Personally, I think this is a sign that we need to prioritize
> HBASE-15882 so that we can switch back to using Docker. I won't have
> time this week, so if anyone else does please pick up the ticket.
> 
> On Thu, Mar 17, 2016 at 5:19 PM, Stack  wrote:
> > Thanks Sean.
> > St.Ack
> >
> > On Wed, Mar 16, 2016 at 12:04 PM, Sean Busbey  wrote:
> >
> >> FYI, I updated the precommit job today to specify that only compile time
> >> checks should be done against jdks other than the primary jdk7 instance.
> >>
> >> On Mon, Mar 7, 2016 at 8:43 PM, Sean Busbey  wrote:
> >>
> >> > I tested things out, and while YETUS-297[1] is present the default runs
> >> > all plugins that can do multiple jdks against those available (jdk7 and
> >> > jdk8 in our case).
> >> >
> >> > We can configure things to only do a single run of unit tests. They'll be
> >> > against jdk7, since that is our default jdk. That fine by everyone? It'll
> >> > save ~1.5 hours on any build that hits hbase-server.
> >> >
> >> > On Mon, Mar 7, 2016 at 1:22 PM, Stack  wrote:
> >> >
> >> >> Hurray!
> >> >>
> >> >> It looks like YETUS-96 is in there and we are only running on jdk build
> >> >> now, the default (but testing compile against both) Will keep an
> >> eye.
> >> >>
> >> >> St.Ack
> >> >>
> >> >>
> >> >> On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey 
> >> wrote:
> >> >>
> >> >> > FYI, I've just updated our precommit jobs to use the 0.2.0 release of
> >> >> Yetus
> >> >> > that came out today.
> >> >> >
> >> >> > After keeping an eye out for strangeness today I'll turn docker mode
> >> >> back
> >> >> > on by default tonight.
> >> >> >
> >> >> > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey 
> >> >> wrote:
> >> >> >
> >> >> > > FYI, I added a new parameter to the precommit job:
> >> >> > >
> >> >> > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the
> >> apache/yetus
> >> >> > > repo rather than our chosen release
> >> >> > >
> >> >> > > It defaults to inactive, but can be used in manually-triggered runs
> >> to
> >> >> > > test a solution to a problem in the yetus library. At the moment,
> >> I'm
> >> >> > > using it to test a solution to default module ordering  as seen in
> >> >> > > HBASE-15075.
> >> >> > >
> >> >> > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey 
> >> >> wrote:
> >> >> > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for
> >> precommit
> >> >> > > tests)
> >> >> > > > and updated our jenkins precommit build to use it.
> >> >> > > >
> >> >> > > > Jenkins job has some explanation:
> >> >> > > >
> >> >> > >
> >> >> >
> >> >>
> >> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
> >> >> > > >
> >> >> > > > Release note from HBASE-13525 does as well.
> >> >> > > >
> >> >> > > > The old job will stick around here for a couple of weeks, in case
> >> we
> >> >> > need
> >> >> > > > to refer back to it:
> >> >> > > >
> >> >> > > >
> >> >> > >
> >> >> >
> >> >>
> >> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
> >> >> > > >
> >> >> > > > If something looks awry, please drop a note on HBASE-13525 while
> >> it
> >> >> > > remains
> >> >> > > > open (and make a new issue after).
> >> >> > > >
> >> >> > > >
> >> >> > > > On Wed, Dec 2, 2015 at 3:22 PM, Stack  wrote:
> >> >> > > >
> >> >> > > >> As part of my continuing advocacy of builds.apache.org and that
> >> >> their
> >> >> > > >> results are now worthy of our trust and nurture, here are some
> >> >> > > highlights
> >> >> > > >> from the last few days of builds:
> >> >> > > >>
> >> >> > > >> + hadoopqa is now finding zombies before the patch is committed.
> >> >> > > >> HBASE-14888 showed "-1 core tests. The patch failed these unit
> >> >> tests:"
> >> >> > > but
> >> >> > > >> didn't have any failed tests listed (I'm trying to see if I can
> >> do
> >> >> > > anything
> >> >> > > >> about this...). Running our little
> >> ./dev-tools/findHangin

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-06-16 Thread Sean Busbey
FYI, today our precommit jobs started failing because our chosen jdk
(1.7.0.79) disappeared (mentioned on HBASE-16032).

Initially we were doing something wrong, namely directly referencing
the jenkins build tools area without telling jenkins to give us an env
variable that stated where the jdk is located. However, after
attempting to switch to the appropriate tooling variable for jdk
1.7.0.79, I found that it didn't point to a place that worked.

I've now updated the job to rely on the latest 1.7 jdk, which is
currently 1.7.0.80. I don't know how often "latest" updates.

Personally, I think this is a sign that we need to prioritize
HBASE-15882 so that we can switch back to using Docker. I won't have
time this week, so if anyone else does please pick up the ticket.
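
For anyone picking this up, the idea of reading the jdk location from a variable the CI exports, and checking that it points somewhere usable, can be sketched roughly as below. This is only an illustration under assumptions: the function name and the default variable name `JAVA_HOME` are hypothetical, and the variable an actual Jenkins job exports depends on how the jdk tool is configured there.

```python
from pathlib import Path


def resolve_jdk_home(env, var_name="JAVA_HOME"):
    """Return the JDK home named by an environment variable, or raise.

    `var_name` is a hypothetical name; use whatever variable the CI
    system is configured to export for the chosen jdk.
    """
    value = env.get(var_name)
    if not value:
        # First failure mode: nothing told us where the jdk lives.
        raise RuntimeError(f"{var_name} is not set")
    java = Path(value) / "bin" / "java"
    # Second failure mode: the variable is set but points at a place
    # that does not hold a working jdk layout.
    if not java.is_file():
        raise RuntimeError(f"{var_name}={value} has no bin/java")
    return Path(value)
```

A job script could call `resolve_jdk_home(os.environ)` before starting the build, so a missing jdk fails fast with a clear message rather than partway through a test run.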

On Thu, Mar 17, 2016 at 5:19 PM, Stack  wrote:
> Thanks Sean.
> St.Ack
>
> On Wed, Mar 16, 2016 at 12:04 PM, Sean Busbey  wrote:
>
>> FYI, I updated the precommit job today to specify that only compile time
>> checks should be done against jdks other than the primary jdk7 instance.
>>
>> On Mon, Mar 7, 2016 at 8:43 PM, Sean Busbey  wrote:
>>
>> > I tested things out, and while YETUS-297[1] is present the default runs
>> > all plugins that can do multiple jdks against those available (jdk7 and
>> > jdk8 in our case).
>> >
>> > We can configure things to only do a single run of unit tests. They'll be
>> > against jdk7, since that is our default jdk. That fine by everyone? It'll
>> > save ~1.5 hours on any build that hits hbase-server.
>> >
>> > On Mon, Mar 7, 2016 at 1:22 PM, Stack  wrote:
>> >
>> >> Hurray!
>> >>
>> >> It looks like YETUS-96 is in there and we are only running one jdk build
>> >> now, the default (but testing compile against both). Will keep an eye out.
>> >>
>> >> St.Ack
>> >>
>> >>
>> >> On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey 
>> wrote:
>> >>
>> >> > FYI, I've just updated our precommit jobs to use the 0.2.0 release of
>> >> Yetus
>> >> > that came out today.
>> >> >
>> >> > After keeping an eye out for strangeness today I'll turn docker mode
>> >> back
>> >> > on by default tonight.
>> >> >
>> >> > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey 
>> >> wrote:
>> >> >
>> >> > > FYI, I added a new parameter to the precommit job:
>> >> > >
>> >> > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the
>> apache/yetus
>> >> > > repo rather than our chosen release
>> >> > >
>> >> > > It defaults to inactive, but can be used in manually-triggered runs
>> to
>> >> > > test a solution to a problem in the yetus library. At the moment,
>> I'm
>> >> > > using it to test a solution to default module ordering  as seen in
>> >> > > HBASE-15075.
>> >> > >
>> >> > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey 
>> >> wrote:
>> >> > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for
>> precommit
>> >> > > tests)
>> >> > > > and updated our jenkins precommit build to use it.
>> >> > > >
>> >> > > > Jenkins job has some explanation:
>> >> > > >
>> >> > >
>> >> >
>> >>
>> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
>> >> > > >
>> >> > > > Release note from HBASE-13525 does as well.
>> >> > > >
>> >> > > > The old job will stick around here for a couple of weeks, in case
>> we
>> >> > need
>> >> > > > to refer back to it:
>> >> > > >
>> >> > > >
>> >> > >
>> >> >
>> >>
>> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
>> >> > > >
>> >> > > > If something looks awry, please drop a note on HBASE-13525 while
>> it
>> >> > > remains
>> >> > > > open (and make a new issue after).
>> >> > > >
>> >> > > >
>> >> > > > On Wed, Dec 2, 2015 at 3:22 PM, Stack  wrote:
>> >> > > >
>> >> > > >> As part of my continuing advocacy of builds.apache.org and that
>> >> their
>> >> > > >> results are now worthy of our trust and nurture, here are some
>> >> > > highlights
>> >> > > >> from the last few days of builds:
>> >> > > >>
>> >> > > >> + hadoopqa is now finding zombies before the patch is committed.
>> >> > > >> HBASE-14888 showed "-1 core tests. The patch failed these unit
>> >> tests:"
>> >> > > but
>> >> > > >> didn't have any failed tests listed (I'm trying to see if I can
>> do
>> >> > > anything
>> >> > > >> about this...). Running our little
>> ./dev-tools/findHangingTests.py
>> >> > > against
>> >> > > >> the consoleText, it showed a hanging test. Running locally, I see
>> >> same
>> >> > > >> hang. This is before the patch landed.
>> >> > > >> + Our branch runs are now near totally zombie and flakey free --
>> >> still
>> >> > > some
>> >> > > >> work to do -- but a recent patch that seemed harmless was
>> causing a
>> >> > > >> reliable flake fail in the backport to branch-1* confirmed by
>> local
>> >> > > runs.
>> >> > > >> The flakeyness was plain to see up in builds.apache.org.
>> >> > > >> + In the last few days I've committed a patch that included
>> javadoc
>> >> > > >> warnings even though hadoopqa said the patch introduced javadoc
>> >> issues
>> >>

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-03-19 Thread Sean Busbey
FYI, I updated the precommit job today to specify that only compile-time
checks should be done against jdks other than the primary jdk7 instance.
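
As a rough illustration of the resulting split (not the actual job configuration), the mapping from jdk to checks might look like the sketch below. The function name and the check labels `compile`, `javadoc`, and `unit` are hypothetical stand-ins, not Yetus plugin names.

```python
def plan_checks(jdks, primary,
                all_checks=("compile", "javadoc", "unit"),
                compile_only=("compile",)):
    """Map each jdk to the checks it should run.

    The primary jdk gets every check; every other jdk gets only the
    compile-time ones. Check names here are illustrative labels.
    """
    if primary not in jdks:
        raise ValueError(f"primary jdk {primary!r} is not in {jdks!r}")
    return {jdk: list(all_checks if jdk == primary else compile_only)
            for jdk in jdks}


plan = plan_checks(["jdk7", "jdk8"], primary="jdk7")
# Number of jdks on which the unit tests are scheduled.
unit_runs = sum("unit" in checks for checks in plan.values())
```

With jdk7 as the primary, the unit tests are scheduled on a single jdk, while the compile check still runs on both.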

On Mon, Mar 7, 2016 at 8:43 PM, Sean Busbey  wrote:

> I tested things out, and while YETUS-297[1] is present the default runs
> all plugins that can do multiple jdks against those available (jdk7 and
> jdk8 in our case).
>
> We can configure things to only do a single run of unit tests. They'll be
> against jdk7, since that is our default jdk. That fine by everyone? It'll
> save ~1.5 hours on any build that hits hbase-server.
>
> On Mon, Mar 7, 2016 at 1:22 PM, Stack  wrote:
>
>> Hurray!
>>
>> It looks like YETUS-96 is in there and we are only running one jdk build
>> now, the default (but testing compile against both). Will keep an eye out.
>>
>> St.Ack
>>
>>
>> On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey  wrote:
>>
>> > FYI, I've just updated our precommit jobs to use the 0.2.0 release of
>> Yetus
>> > that came out today.
>> >
>> > After keeping an eye out for strangeness today I'll turn docker mode
>> back
>> > on by default tonight.
>> >
>> > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey 
>> wrote:
>> >
>> > > FYI, I added a new parameter to the precommit job:
>> > >
>> > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the apache/yetus
>> > > repo rather than our chosen release
>> > >
>> > > It defaults to inactive, but can be used in manually-triggered runs to
>> > > test a solution to a problem in the yetus library. At the moment, I'm
>> > > using it to test a solution to default module ordering  as seen in
>> > > HBASE-15075.
>> > >
>> > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey 
>> wrote:
>> > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for precommit
>> > > tests)
>> > > > and updated our jenkins precommit build to use it.
>> > > >
>> > > > Jenkins job has some explanation:
>> > > >
>> > >
>> >
>> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
>> > > >
>> > > > Release note from HBASE-13525 does as well.
>> > > >
>> > > > The old job will stick around here for a couple of weeks, in case we
>> > need
>> > > > to refer back to it:
>> > > >
>> > > >
>> > >
>> >
>> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
>> > > >
>> > > > If something looks awry, please drop a note on HBASE-13525 while it
>> > > remains
>> > > > open (and make a new issue after).
>> > > >
>> > > >
>> > > > On Wed, Dec 2, 2015 at 3:22 PM, Stack  wrote:
>> > > >
>> > > >> As part of my continuing advocacy of builds.apache.org and that their
>> > > >> results are now worthy of our trust and nurture, here are some
>> > > >> highlights from the last few days of builds:
>> > > >>
>> > > >> + hadoopqa is now finding zombies before the patch is committed.
>> > > >> HBASE-14888 showed "-1 core tests. The patch failed these unit tests:"
>> > > >> but didn't have any failed tests listed (I'm trying to see if I can do
>> > > >> anything about this...). Running our little
>> > > >> ./dev-tools/findHangingTests.py against the consoleText, it showed a
>> > > >> hanging test. Running locally, I see the same hang. This is before the
>> > > >> patch landed.
>> > > >> + Our branch runs are now near totally zombie and flakey free -- still
>> > > >> some work to do -- but a recent patch that seemed harmless was causing a
>> > > >> reliable flake fail in the backport to branch-1* confirmed by local
>> > > >> runs. The flakeyness was plain to see up in builds.apache.org.
>> > > >> + In the last few days I've committed a patch that included javadoc
>> > > >> warnings even though hadoopqa said the patch introduced javadoc issues
>> > > >> (I missed it). This messed up life for folks subsequently as their
>> > > >> patches now reported javadoc issues.
>> > > >>
>> > > >> In short, I suggest that builds.apache.org is worth keeping an eye on,
>> > > >> make sure you get a clean build out of hadoopqa before committing
>> > > >> anything, and let's all work together to try and keep our builds blue:
>> > > >> it'll save us all work in the long run.
>> > > >>
>> > > >> St.Ack
>> > > >>
>> > > >>
>> > > >> On Tue, Nov 4, 2014 at 9:38 AM, Stack  wrote:
>> > > >>
>> > > >> > Branch-1 and master have stabilized and now run mostly blue (give or
>> > > >> > take the odd failure) [1][2]. Having a mostly blue branch-1 has helped
>> > > >> > us identify at least one destabilizing commit in the last few days,
>> > > >> > maybe two; this is as it should be (smile).
>> > > >> >
>> > > >> > Let's keep our builds blue. If you commit a patch, make sure subsequent
>> > > >> > builds stay blue. You can subscribe to [email protected] to get
>> > > >> > notice of failures if not already subscribed.
>> > > >> >
>> > > >> > Thanks,
>> > > >> > St.Ack
>> > > >> >
>> > > >> > 1. https://builds.apach

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-03-19 Thread Stack
Thanks Sean.
St.Ack

On Wed, Mar 16, 2016 at 12:04 PM, Sean Busbey  wrote:

> FYI, I updated the precommit job today to specify that only compile time
> checks should be done against jdks other than the primary jdk7 instance.
>
> On Mon, Mar 7, 2016 at 8:43 PM, Sean Busbey  wrote:
>
> > I tested things out, and while YETUS-297[1] is present the default runs
> > all plugins that can do multiple jdks against those available (jdk7 and
> > jdk8 in our case).
> >
> > We can configure things to only do a single run of unit tests. They'll be
> > against jdk7, since that is our default jdk. That fine by everyone? It'll
> > save ~1.5 hours on any build that hits hbase-server.
> >
> > On Mon, Mar 7, 2016 at 1:22 PM, Stack  wrote:
> >
> >> Hurray!
> >>
> >> It looks like YETUS-96 is in there and we are only running one jdk build
> >> now, the default (but testing compile against both). Will keep an eye out.
> >>
> >> St.Ack
> >>
> >>
> >> On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey 
> wrote:
> >>
> >> > FYI, I've just updated our precommit jobs to use the 0.2.0 release of
> >> Yetus
> >> > that came out today.
> >> >
> >> > After keeping an eye out for strangeness today I'll turn docker mode
> >> back
> >> > on by default tonight.
> >> >
> >> > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey 
> >> wrote:
> >> >
> >> > > FYI, I added a new parameter to the precommit job:
> >> > >
> >> > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the
> apache/yetus
> >> > > repo rather than our chosen release
> >> > >
> >> > > It defaults to inactive, but can be used in manually-triggered runs
> to
> >> > > test a solution to a problem in the yetus library. At the moment,
> I'm
> >> > > using it to test a solution to default module ordering  as seen in
> >> > > HBASE-15075.
> >> > >
> >> > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey 
> >> wrote:
> >> > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for
> precommit
> >> > > tests)
> >> > > > and updated our jenkins precommit build to use it.
> >> > > >
> >> > > > Jenkins job has some explanation:
> >> > > >
> >> > >
> >> >
> >>
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
> >> > > >
> >> > > > Release note from HBASE-13525 does as well.
> >> > > >
> >> > > > The old job will stick around here for a couple of weeks, in case
> we
> >> > need
> >> > > > to refer back to it:
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
> >> > > >
> >> > > > If something looks awry, please drop a note on HBASE-13525 while
> it
> >> > > remains
> >> > > > open (and make a new issue after).
> >> > > >
> >> > > >
> >> > > > On Wed, Dec 2, 2015 at 3:22 PM, Stack  wrote:
> >> > > >
> >> > > >> As part of my continuing advocacy of builds.apache.org and that
> >> their
> >> > > >> results are now worthy of our trust and nurture, here are some
> >> > > highlights
> >> > > >> from the last few days of builds:
> >> > > >>
> >> > > >> + hadoopqa is now finding zombies before the patch is committed.
> >> > > >> HBASE-14888 showed "-1 core tests. The patch failed these unit
> >> tests:"
> >> > > but
> >> > > >> didn't have any failed tests listed (I'm trying to see if I can
> do
> >> > > anything
> >> > > >> about this...). Running our little
> ./dev-tools/findHangingTests.py
> >> > > against
> >> > > >> the consoleText, it showed a hanging test. Running locally, I see
> >> same
> >> > > >> hang. This is before the patch landed.
> >> > > >> + Our branch runs are now near totally zombie and flakey free --
> >> still
> >> > > some
> >> > > >> work to do -- but a recent patch that seemed harmless was
> causing a
> >> > > >> reliable flake fail in the backport to branch-1* confirmed by
> local
> >> > > runs.
> >> > > >> The flakeyness was plain to see up in builds.apache.org.
> >> > > >> + In the last few days I've committed a patch that included
> javadoc
> >> > > >> warnings even though hadoopqa said the patch introduced javadoc
> >> issues
> >> > > (I
> >> > > >> missed it). This messed up life for folks subsequently as their
> >> > patches
> >> > > now
> >> > > >> reported javadoc issues
> >> > > >>
> >> > > >> In short, I suggest that builds.apache.org is worth keeping an
> eye
> >> > on,
> >> > > >> make
> >> > > >> sure you get a clean build out of hadoopqa before committing
> >> anything,
> >> > > and
> >> > > >> lets all work together to try and keep our builds blue: it'll
> save
> >> us
> >> > > all
> >> > > >> work in the long run.
> >> > > >>
> >> > > >> St.Ack
> >> > > >>
> >> > > >>
> >> > > >> On Tue, Nov 4, 2014 at 9:38 AM, Stack  wrote:
> >> > > >>
> >> > > >> > Branch-1 and master have stabilized and now run mostly blue
> >> (give or
> >> > > take
> >> > > >> > the odd failure) [1][2]. Having a mostly blue branch-1 has
> >> helped us
> >> > > >> > identify at least one destabilizing commit in the last few
> days,
> >> > maybe
> >> > > >> two;
> >> > > >> > t

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-03-15 Thread Phil Yang
https://issues.apache.org/jira/browse/YETUS-334

2016-03-15 21:48 GMT+08:00 Sean Busbey :

> No, that definitely looks like a bug. Could you please open an issue on the
> YETUS jira with a link to the relevant builds and HBASE jiras?
>
> On Tue, Mar 15, 2016 at 5:44 AM, Phil Yang  wrote:
>
> > Hi all,
> >
> > Recently, pre-commit builds seem to run some commands twice. For example,
> > in the console of
> > https://builds.apache.org/job/PreCommit-HBASE-Build/975/console
> > or https://builds.apache.org/job/PreCommit-HBASE-Build/978/console , we
> > run "Patch findbugs detection", "Patch javadoc verification", and
> > "Running unit tests" twice for each task. HadoopQA also posts repeated
> > results with different runtimes in JIRA.
> >
> > We will run the hbase-server tests four times, twice on jdk7 and twice
> > on jdk8, which will be very slow...
> >
> > Is this expected? Thanks.
> >
> >
> > 2016-03-15 4:39 GMT+08:00 Stack :
> >
> > > https://issues.apache.org/jira/browse/HBASE-15462
> > >
> > > Thanks Sean.
> > >
> > > Looks like a version parse error?
> > >
> > > St.Ack
> > >
> > >
> > >
> > > On Mon, Mar 14, 2016 at 12:55 PM, Sean Busbey 
> > wrote:
> > >
> > > > HBASE please, I'll refile to INFRA or wherever if I can figure out
> the
> > > > source.
> > > >
> > > > On Mon, Mar 14, 2016 at 12:44 PM, Stack  wrote:
> > > >
> > > > > On Mon, Mar 14, 2016 at 12:23 PM, Sean Busbey  >
> > > > wrote:
> > > > >
> > > > > > is there a jira I can track for the docker failures?
> > > > > >
> > > > > >
> > > > > No. All recent hadoopqas fail. Want an INFRA or HBASE issue?
> > > > > Thanks,
> > > > > St.Ack
> > > > >
> > > > >
> > > > >
> > > > > > On Mon, Mar 14, 2016 at 11:08 AM, Stack 
> wrote:
> > > > > >
> > > > > > > Thanks for making the job configuration all nice and tidy BTW
> > Sean.
> > > > > > >
> > > > > > > I unchecked RUN_IN_DOCKER just now to try and get us over
> current
> > > > bout
> > > > > of
> > > > > > > docker build failures.
> > > > > > >
> > > > > > > St.Ack
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey <
> > [email protected]>
> > > > > > wrote:
> > > > > > >
> > > > > > > > FYI, I've just updated our precommit jobs to use the 0.2.0
> > > release
> > > > of
> > > > > > > Yetus
> > > > > > > > that came out today.
> > > > > > > >
> > > > > > > > After keeping an eye out for strangeness today I'll turn
> docker
> > > > mode
> > > > > > back
> > > > > > > > on by default tonight.
> > > > > > > >
> > > > > > > > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey <
> > [email protected]
> > > >
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > FYI, I added a new parameter to the precommit job:
> > > > > > > > >
> > > > > > > > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the
> > > > > > apache/yetus
> > > > > > > > > repo rather than our chosen release
> > > > > > > > >
> > > > > > > > > It defaults to inactive, but can be used in
> > manually-triggered
> > > > runs
> > > > > > to
> > > > > > > > > test a solution to a problem in the yetus library. At the
> > > moment,
> > > > > I'm
> > > > > > > > > using it to test a solution to default module ordering  as
> > seen
> > > > in
> > > > > > > > > HBASE-15075.
> > > > > > > > >
> > > > > > > > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey <
> > > [email protected]
> > > > >
> > > > > > > wrote:
> > > > > > > > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus
> for
> > > > > > precommit
> > > > > > > > > tests)
> > > > > > > > > > and updated our jenkins precommit build to use it.
> > > > > > > > > >
> > > > > > > > > > Jenkins job has some explanation:
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
> > > > > > > > > >
> > > > > > > > > > Release note from HBASE-13525 does as well.
> > > > > > > > > >
> > > > > > > > > > The old job will stick around here for a couple of weeks,
> > in
> > > > case
> > > > > > we
> > > > > > > > need
> > > > > > > > > > to refer back to it:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
> > > > > > > > > >
> > > > > > > > > > If something looks awry, please drop a note on
> HBASE-13525
> > > > while
> > > > > it
> > > > > > > > > remains
> > > > > > > > > > open (and make a new issue after).
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Dec 2, 2015 at 3:22 PM, Stack 
> > > > wrote:
> > > > > > > > > >
> > > > > > > > > >> As part of my continuing advocacy of builds.apache.org
> > and
> > > > that
> > > > > > > their
> > > > > > > > > >> results are now worthy of our trust and nurture, here
> are
> > > some
> > > > > > > > > highlights
> > > > > > > > > >> from the last few days of builds:
> > > > > > > > > >>
> > >

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-03-15 Thread Sean Busbey
No, that definitely looks like a bug. Could you please open an issue on the
YETUS jira with a link to the relevant builds and HBASE jiras?

On Tue, Mar 15, 2016 at 5:44 AM, Phil Yang  wrote:

> Hi all,
>
> Recently, pre-commit builds seem to run some commands twice. For example, in
> the console of https://builds.apache.org/job/PreCommit-HBASE-Build/975/console
> or https://builds.apache.org/job/PreCommit-HBASE-Build/978/console , we
> run "Patch findbugs detection", "Patch javadoc verification", and "Running
> unit tests" twice for each task. HadoopQA also posts repeated results with
> different runtimes in JIRA.
>
> We will run the hbase-server tests four times, twice on jdk7 and twice
> on jdk8, which will be very slow...
>
> Is this expected? Thanks.
>
>
> 2016-03-15 4:39 GMT+08:00 Stack :
>
> > https://issues.apache.org/jira/browse/HBASE-15462
> >
> > Thanks Sean.
> >
> > Looks like a version parse error?
> >
> > St.Ack
> >
> >
> >
> > On Mon, Mar 14, 2016 at 12:55 PM, Sean Busbey 
> wrote:
> >
> > > HBASE please, I'll refile to INFRA or wherever if I can figure out the
> > > source.
> > >
> > > On Mon, Mar 14, 2016 at 12:44 PM, Stack  wrote:
> > >
> > > > On Mon, Mar 14, 2016 at 12:23 PM, Sean Busbey 
> > > wrote:
> > > >
> > > > > is there a jira I can track for the docker failures?
> > > > >
> > > > >
> > > > No. All recent hadoopqas fail. Want an INFRA or HBASE issue?
> > > > Thanks,
> > > > St.Ack
> > > >
> > > >
> > > >
> > > > > On Mon, Mar 14, 2016 at 11:08 AM, Stack  wrote:
> > > > >
> > > > > > Thanks for making the job configuration all nice and tidy BTW
> Sean.
> > > > > >
> > > > > > I unchecked RUN_IN_DOCKER just now to try and get us over current
> > > bout
> > > > of
> > > > > > docker build failures.
> > > > > >
> > > > > > St.Ack
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey <
> [email protected]>
> > > > > wrote:
> > > > > >
> > > > > > > FYI, I've just updated our precommit jobs to use the 0.2.0
> > release
> > > of
> > > > > > Yetus
> > > > > > > that came out today.
> > > > > > >
> > > > > > > After keeping an eye out for strangeness today I'll turn docker
> > > mode
> > > > > back
> > > > > > > on by default tonight.
> > > > > > >
> > > > > > > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey <
> [email protected]
> > >
> > > > > wrote:
> > > > > > >
> > > > > > > > FYI, I added a new parameter to the precommit job:
> > > > > > > >
> > > > > > > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the
> > > > > apache/yetus
> > > > > > > > repo rather than our chosen release
> > > > > > > >
> > > > > > > > It defaults to inactive, but can be used in
> manually-triggered
> > > runs
> > > > > to
> > > > > > > > test a solution to a problem in the yetus library. At the
> > moment,
> > > > I'm
> > > > > > > > using it to test a solution to default module ordering  as
> seen
> > > in
> > > > > > > > HBASE-15075.
> > > > > > > >
> > > > > > > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey <
> > [email protected]
> > > >
> > > > > > wrote:
> > > > > > > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for
> > > > > precommit
> > > > > > > > tests)
> > > > > > > > > and updated our jenkins precommit build to use it.
> > > > > > > > >
> > > > > > > > > Jenkins job has some explanation:
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
> > > > > > > > >
> > > > > > > > > Release note from HBASE-13525 does as well.
> > > > > > > > >
> > > > > > > > > The old job will stick around here for a couple of weeks,
> in
> > > case
> > > > > we
> > > > > > > need
> > > > > > > > > to refer back to it:
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
> > > > > > > > >
> > > > > > > > > If something looks awry, please drop a note on HBASE-13525
> > > while
> > > > it
> > > > > > > > remains
> > > > > > > > > open (and make a new issue after).
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, Dec 2, 2015 at 3:22 PM, Stack 
> > > wrote:
> > > > > > > > >
> > > > > > > > >> As part of my continuing advocacy of builds.apache.org
> and
> > > that
> > > > > > their
> > > > > > > > >> results are now worthy of our trust and nurture, here are
> > some
> > > > > > > > highlights
> > > > > > > > >> from the last few days of builds:
> > > > > > > > >>
> > > > > > > > >> + hadoopqa is now finding zombies before the patch is
> > > committed.
> > > > > > > > >> HBASE-14888 showed "-1 core tests. The patch failed these
> > unit
> > > > > > tests:"
> > > > > > > > but
> > > > > > > > >> didn't have any failed tests listed (I'm trying to see if
> I
> > > can
> > > > do
> > > > > > > > anything
> > > > > > > > >> about this...). Running our little
> > > > ./dev-tools/findHangingTests.py
> > > 

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-03-15 Thread Phil Yang
Hi all,

Recently, pre-commit builds seem to run some commands twice. For example, in
the console of https://builds.apache.org/job/PreCommit-HBASE-Build/975/console
or https://builds.apache.org/job/PreCommit-HBASE-Build/978/console , we run
"Patch findbugs detection", "Patch javadoc verification", and "Running unit
tests" twice for each task. HadoopQA also posts repeated results with
different runtimes in JIRA.

We will run the hbase-server tests four times, twice on jdk7 and twice
on jdk8, which will be very slow...

Is this expected? Thanks.
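
A quick way to confirm the duplication from a saved console log might be a small script along these lines. It is a minimal sketch that assumes each step name appears on one console line per run; the real log format is not reproduced here, and the function names are made up for illustration.

```python
from collections import Counter

# Step names quoted in the report above.
STEPS = (
    "Patch findbugs detection",
    "Patch javadoc verification",
    "Running unit tests",
)


def count_steps(console_text, steps=STEPS):
    """Count how many console lines mention each step name."""
    counts = Counter()
    for line in console_text.splitlines():
        for step in steps:
            if step in line:
                counts[step] += 1
    return counts


def repeated_steps(console_text, expected=1, steps=STEPS):
    """Return the steps that show up more often than `expected`."""
    counts = count_steps(console_text, steps)
    return {step: n for step, n in counts.items() if n > expected}
```

If two jdks are configured, passing `expected=2` would flag only the steps that run more often than once per jdk.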


2016-03-15 4:39 GMT+08:00 Stack :

> https://issues.apache.org/jira/browse/HBASE-15462
>
> Thanks Sean.
>
> Looks like a version parse error?
>
> St.Ack
>
>
>
> On Mon, Mar 14, 2016 at 12:55 PM, Sean Busbey  wrote:
>
> > HBASE please, I'll refile to INFRA or wherever if I can figure out the
> > source.
> >
> > On Mon, Mar 14, 2016 at 12:44 PM, Stack  wrote:
> >
> > > On Mon, Mar 14, 2016 at 12:23 PM, Sean Busbey 
> > wrote:
> > >
> > > > is there a jira I can track for the docker failures?
> > > >
> > > >
> > > No. All recent hadoopqas fail. Want an INFRA or HBASE issue?
> > > Thanks,
> > > St.Ack
> > >
> > >
> > >
> > > > On Mon, Mar 14, 2016 at 11:08 AM, Stack  wrote:
> > > >
> > > > > Thanks for making the job configuration all nice and tidy BTW Sean.
> > > > >
> > > > > I unchecked RUN_IN_DOCKER just now to try and get us over current
> > bout
> > > of
> > > > > docker build failures.
> > > > >
> > > > > St.Ack
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey 
> > > > wrote:
> > > > >
> > > > > > FYI, I've just updated our precommit jobs to use the 0.2.0
> release
> > of
> > > > > Yetus
> > > > > > that came out today.
> > > > > >
> > > > > > After keeping an eye out for strangeness today I'll turn docker
> > mode
> > > > back
> > > > > > on by default tonight.
> > > > > >
> > > > > > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey  >
> > > > wrote:
> > > > > >
> > > > > > > FYI, I added a new parameter to the precommit job:
> > > > > > >
> > > > > > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the
> > > > apache/yetus
> > > > > > > repo rather than our chosen release
> > > > > > >
> > > > > > > It defaults to inactive, but can be used in manually-triggered
> > runs
> > > > to
> > > > > > > test a solution to a problem in the yetus library. At the
> moment,
> > > I'm
> > > > > > > using it to test a solution to default module ordering  as seen
> > in
> > > > > > > HBASE-15075.
> > > > > > >
> > > > > > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey <
> [email protected]
> > >
> > > > > wrote:
> > > > > > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for
> > > > precommit
> > > > > > > tests)
> > > > > > > > and updated our jenkins precommit build to use it.
> > > > > > > >
> > > > > > > > Jenkins job has some explanation:
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
> > > > > > > >
> > > > > > > > Release note from HBASE-13525 does as well.
> > > > > > > >
> > > > > > > > The old job will stick around here for a couple of weeks, in
> > case
> > > > we
> > > > > > need
> > > > > > > > to refer back to it:
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
> > > > > > > >
> > > > > > > > If something looks awry, please drop a note on HBASE-13525
> > while
> > > it
> > > > > > > remains
> > > > > > > > open (and make a new issue after).
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Dec 2, 2015 at 3:22 PM, Stack 
> > wrote:
> > > > > > > >
> > > > > > > >> As part of my continuing advocacy of builds.apache.org and
> > that
> > > > > their
> > > > > > > >> results are now worthy of our trust and nurture, here are
> some
> > > > > > > highlights
> > > > > > > >> from the last few days of builds:
> > > > > > > >>
> > > > > > > >> + hadoopqa is now finding zombies before the patch is
> > committed.
> > > > > > > >> HBASE-14888 showed "-1 core tests. The patch failed these
> unit
> > > > > tests:"
> > > > > > > but
> > > > > > > >> didn't have any failed tests listed (I'm trying to see if I
> > can
> > > do
> > > > > > > anything
> > > > > > > >> about this...). Running our little
> > > ./dev-tools/findHangingTests.py
> > > > > > > against
> > > > > > > >> the consoleText, it showed a hanging test. Running locally,
> I
> > > see
> > > > > same
> > > > > > > >> hang. This is before the patch landed.
> > > > > > > >> + Our branch runs are now near totally zombie and flakey
> free
> > --
> > > > > still
> > > > > > > some
> > > > > > > >> work to do -- but a recent patch that seemed harmless was
> > > causing
> > > > a
> > > > > > > >> reliable flake fail in the backport to branch-1* confirmed
> by
> > > > local
> > > > > > > runs.
> > > > > > > >> The flakeyness was plain to see up in b

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-03-14 Thread Stack
https://issues.apache.org/jira/browse/HBASE-15462

Thanks Sean.

Looks like a version parse error?

St.Ack



On Mon, Mar 14, 2016 at 12:55 PM, Sean Busbey  wrote:

> HBASE please, I'll refile to INFRA or wherever if I can figure out the
> source.
>
> On Mon, Mar 14, 2016 at 12:44 PM, Stack  wrote:
>
> > On Mon, Mar 14, 2016 at 12:23 PM, Sean Busbey 
> wrote:
> >
> > > is there a jira I can track for the docker failures?
> > >
> > >
> > No. All recent hadoopqas fail. Want an INFRA or HBASE issue?
> > Thanks,
> > St.Ack
> >
> >
> >
> > > On Mon, Mar 14, 2016 at 11:08 AM, Stack  wrote:
> > >
> > > > Thanks for making the job configuration all nice and tidy BTW Sean.
> > > >
> > > > I unchecked RUN_IN_DOCKER just now to try and get us over current
> bout
> > of
> > > > docker build failures.
> > > >
> > > > St.Ack
> > > >
> > > >
> > > >
> > > > On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey 
> > > wrote:
> > > >
> > > > > FYI, I've just updated our precommit jobs to use the 0.2.0 release
> of
> > > > Yetus
> > > > > that came out today.
> > > > >
> > > > > After keeping an eye out for strangeness today I'll turn docker
> mode
> > > back
> > > > > on by default tonight.
> > > > >
> > > > > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey 
> > > wrote:
> > > > >
> > > > > > FYI, I added a new parameter to the precommit job:
> > > > > >
> > > > > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the
> > > apache/yetus
> > > > > > repo rather than our chosen release
> > > > > >
> > > > > > It defaults to inactive, but can be used in manually-triggered
> runs
> > > to
> > > > > > test a solution to a problem in the yetus library. At the moment,
> > I'm
> > > > > > using it to test a solution to default module ordering  as seen
> in
> > > > > > HBASE-15075.
> > > > > >
> > > > > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey  >
> > > > wrote:
> > > > > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for
> > > precommit
> > > > > > tests)
> > > > > > > and updated our jenkins precommit build to use it.
> > > > > > >
> > > > > > > Jenkins job has some explanation:
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
> > > > > > >
> > > > > > > Release note from HBASE-13525 does as well.
> > > > > > >
> > > > > > > The old job will stick around here for a couple of weeks, in
> case
> > > we
> > > > > need
> > > > > > > to refer back to it:
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
> > > > > > >
> > > > > > > If something looks awry, please drop a note on HBASE-13525
> while
> > it
> > > > > > remains
> > > > > > > open (and make a new issue after).
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Dec 2, 2015 at 3:22 PM, Stack 
> wrote:
> > > > > > >
> > > > > > >> As part of my continuing advocacy of builds.apache.org and
> that
> > > > their
> > > > > > >> results are now worthy of our trust and nurture, here are some
> > > > > > highlights
> > > > > > >> from the last few days of builds:
> > > > > > >>
> > > > > > >> + hadoopqa is now finding zombies before the patch is
> committed.
> > > > > > >> HBASE-14888 showed "-1 core tests. The patch failed these unit
> > > > tests:"
> > > > > > but
> > > > > > >> didn't have any failed tests listed (I'm trying to see if I
> can
> > do
> > > > > > anything
> > > > > > >> about this...). Running our little
> > ./dev-tools/findHangingTests.py
> > > > > > against
> > > > > > >> the consoleText, it showed a hanging test. Running locally, I
> > see
> > > > same
> > > > > > >> hang. This is before the patch landed.
> > > > > > >> + Our branch runs are now near totally zombie and flakey free
> --
> > > > still
> > > > > > some
> > > > > > >> work to do -- but a recent patch that seemed harmless was
> > causing
> > > a
> > > > > > >> reliable flake fail in the backport to branch-1* confirmed by
> > > local
> > > > > > runs.
> > > > > > >> The flakeyness was plain to see up in builds.apache.org.
> > > > > > >> + In the last few days I've committed a patch that included
> > > javadoc
> > > > > > >> warnings even though hadoopqa said the patch introduced
> javadoc
> > > > issues
> > > > > > (I
> > > > > > >> missed it). This messed up life for folks subsequently as
> their
> > > > > patches
> > > > > > now
> > > > > > >> reported javadoc issues
> > > > > > >>
> > > > > > >> In short, I suggest that builds.apache.org is worth keeping
> an
> > > eye
> > > > > on,
> > > > > > >> make
> > > > > > >> sure you get a clean build out of hadoopqa before committing
> > > > anything,
> > > > > > and
> > > > > > >> lets all work together to try and keep our builds blue: it'll
> > save
> > > > us
> > > > > > all
> > > > > > >> work in the long run.
> > > > > > >>
> > > > > > >> St.Ack
> > > > > > >>
> > > > > > >>
> > > > > > >> On Tue, Nov 4, 2014 at 9:38 AM, Stack 
> wrote:
> > > > > > >

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-03-14 Thread Sean Busbey
HBASE please, I'll refile to INFRA or wherever if I can figure out the
source.

On Mon, Mar 14, 2016 at 12:44 PM, Stack  wrote:

> On Mon, Mar 14, 2016 at 12:23 PM, Sean Busbey  wrote:
>
> > is there a jira I can track for the docker failures?
> >
> >
> No. All recent hadoopqas fail. Want an INFRA or HBASE issue?
> Thanks,
> St.Ack
>
>
>
> > On Mon, Mar 14, 2016 at 11:08 AM, Stack  wrote:
> >
> > > Thanks for making the job configuration all nice and tidy BTW Sean.
> > >
> > > I unchecked RUN_IN_DOCKER just now to try and get us over current bout
> of
> > > docker build failures.
> > >
> > > St.Ack
> > >
> > >
> > >
> > > On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey 
> > wrote:
> > >
> > > > FYI, I've just updated our precommit jobs to use the 0.2.0 release of
> > > Yetus
> > > > that came out today.
> > > >
> > > > After keeping an eye out for strangeness today I'll turn docker mode
> > back
> > > > on by default tonight.
> > > >
> > > > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey 
> > wrote:
> > > >
> > > > > FYI, I added a new parameter to the precommit job:
> > > > >
> > > > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the
> > apache/yetus
> > > > > repo rather than our chosen release
> > > > >
> > > > > It defaults to inactive, but can be used in manually-triggered runs
> > to
> > > > > test a solution to a problem in the yetus library. At the moment,
> I'm
> > > > > using it to test a solution to default module ordering  as seen in
> > > > > HBASE-15075.
> > > > >
> > > > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey 
> > > wrote:
> > > > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for
> > precommit
> > > > > tests)
> > > > > > and updated our jenkins precommit build to use it.
> > > > > >
> > > > > > Jenkins job has some explanation:
> > > > > >
> > > > >
> > > >
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
> > > > > >
> > > > > > Release note from HBASE-13525 does as well.
> > > > > >
> > > > > > The old job will stick around here for a couple of weeks, in case
> > we
> > > > need
> > > > > > to refer back to it:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
> > > > > >
> > > > > > If something looks awry, please drop a note on HBASE-13525 while
> it
> > > > > remains
> > > > > > open (and make a new issue after).
> > > > > >
> > > > > >
> > > > > > On Wed, Dec 2, 2015 at 3:22 PM, Stack  wrote:
> > > > > >
> > > > > >> As part of my continuing advocacy of builds.apache.org and that
> > > their
> > > > > >> results are now worthy of our trust and nurture, here are some
> > > > > highlights
> > > > > >> from the last few days of builds:
> > > > > >>
> > > > > >> + hadoopqa is now finding zombies before the patch is committed.
> > > > > >> HBASE-14888 showed "-1 core tests. The patch failed these unit
> > > tests:"
> > > > > but
> > > > > >> didn't have any failed tests listed (I'm trying to see if I can
> do
> > > > > anything
> > > > > >> about this...). Running our little
> ./dev-tools/findHangingTests.py
> > > > > against
> > > > > >> the consoleText, it showed a hanging test. Running locally, I
> see
> > > same
> > > > > >> hang. This is before the patch landed.
> > > > > >> + Our branch runs are now near totally zombie and flakey free --
> > > still
> > > > > some
> > > > > >> work to do -- but a recent patch that seemed harmless was
> causing
> > a
> > > > > >> reliable flake fail in the backport to branch-1* confirmed by
> > local
> > > > > runs.
> > > > > >> The flakeyness was plain to see up in builds.apache.org.
> > > > > >> + In the last few days I've committed a patch that included
> > javadoc
> > > > > >> warnings even though hadoopqa said the patch introduced javadoc
> > > issues
> > > > > (I
> > > > > >> missed it). This messed up life for folks subsequently as their
> > > > patches
> > > > > now
> > > > > >> reported javadoc issues
> > > > > >>
> > > > > >> In short, I suggest that builds.apache.org is worth keeping an
> > eye
> > > > on,
> > > > > >> make
> > > > > >> sure you get a clean build out of hadoopqa before committing
> > > anything,
> > > > > and
> > > > > >> lets all work together to try and keep our builds blue: it'll
> save
> > > us
> > > > > all
> > > > > >> work in the long run.
> > > > > >>
> > > > > >> St.Ack
> > > > > >>
> > > > > >>
> > > > > >> On Tue, Nov 4, 2014 at 9:38 AM, Stack  wrote:
> > > > > >>
> > > > > >> > Branch-1 and master have stabilized and now run mostly blue
> > (give
> > > or
> > > > > take
> > > > > >> > the odd failure) [1][2]. Having a mostly blue branch-1 has
> > helped
> > > us
> > > > > >> > identify at least one destabilizing commit in the last few
> days,
> > > > maybe
> > > > > >> two;
> > > > > >> > this is as it should be (smile).
> > > > > >> >
> > > > > >> > Lets keep our builds blue. If you commit a patch, make sure
> > > > subsequent
> > > > > >> > builds stay 

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-03-14 Thread Stack
On Mon, Mar 14, 2016 at 12:23 PM, Sean Busbey  wrote:

> is there a jira I can track for the docker failures?
>
>
No. All recent hadoopqas fail. Want an INFRA or HBASE issue?
Thanks,
St.Ack



> On Mon, Mar 14, 2016 at 11:08 AM, Stack  wrote:
>
> > Thanks for making the job configuration all nice and tidy BTW Sean.
> >
> > I unchecked RUN_IN_DOCKER just now to try and get us over current bout of
> > docker build failures.
> >
> > St.Ack
> >
> >
> >
> > On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey 
> wrote:
> >
> > > FYI, I've just updated our precommit jobs to use the 0.2.0 release of
> > Yetus
> > > that came out today.
> > >
> > > After keeping an eye out for strangeness today I'll turn docker mode
> back
> > > on by default tonight.
> > >
> > > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey 
> wrote:
> > >
> > > > FYI, I added a new parameter to the precommit job:
> > > >
> > > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the
> apache/yetus
> > > > repo rather than our chosen release
> > > >
> > > > It defaults to inactive, but can be used in manually-triggered runs
> to
> > > > test a solution to a problem in the yetus library. At the moment, I'm
> > > > using it to test a solution to default module ordering  as seen in
> > > > HBASE-15075.
> > > >
> > > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey 
> > wrote:
> > > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for
> precommit
> > > > tests)
> > > > > and updated our jenkins precommit build to use it.
> > > > >
> > > > > Jenkins job has some explanation:
> > > > >
> > > >
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
> > > > >
> > > > > Release note from HBASE-13525 does as well.
> > > > >
> > > > > The old job will stick around here for a couple of weeks, in case
> we
> > > need
> > > > > to refer back to it:
> > > > >
> > > > >
> > > >
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
> > > > >
> > > > > If something looks awry, please drop a note on HBASE-13525 while it
> > > > remains
> > > > > open (and make a new issue after).
> > > > >
> > > > >
> > > > > On Wed, Dec 2, 2015 at 3:22 PM, Stack  wrote:
> > > > >
> > > > >> As part of my continuing advocacy of builds.apache.org and that
> > their
> > > > >> results are now worthy of our trust and nurture, here are some
> > > > highlights
> > > > >> from the last few days of builds:
> > > > >>
> > > > >> + hadoopqa is now finding zombies before the patch is committed.
> > > > >> HBASE-14888 showed "-1 core tests. The patch failed these unit
> > tests:"
> > > > but
> > > > >> didn't have any failed tests listed (I'm trying to see if I can do
> > > > anything
> > > > >> about this...). Running our little ./dev-tools/findHangingTests.py
> > > > against
> > > > >> the consoleText, it showed a hanging test. Running locally, I see
> > same
> > > > >> hang. This is before the patch landed.
> > > > >> + Our branch runs are now near totally zombie and flakey free --
> > still
> > > > some
> > > > >> work to do -- but a recent patch that seemed harmless was causing
> a
> > > > >> reliable flake fail in the backport to branch-1* confirmed by
> local
> > > > runs.
> > > > >> The flakeyness was plain to see up in builds.apache.org.
> > > > >> + In the last few days I've committed a patch that included
> javadoc
> > > > >> warnings even though hadoopqa said the patch introduced javadoc
> > issues
> > > > (I
> > > > >> missed it). This messed up life for folks subsequently as their
> > > patches
> > > > now
> > > > >> reported javadoc issues
> > > > >>
> > > > >> In short, I suggest that builds.apache.org is worth keeping an
> eye
> > > on,
> > > > >> make
> > > > >> sure you get a clean build out of hadoopqa before committing
> > anything,
> > > > and
> > > > >> lets all work together to try and keep our builds blue: it'll save
> > us
> > > > all
> > > > >> work in the long run.
> > > > >>
> > > > >> St.Ack
> > > > >>
> > > > >>
> > > > >> On Tue, Nov 4, 2014 at 9:38 AM, Stack  wrote:
> > > > >>
> > > > >> > Branch-1 and master have stabilized and now run mostly blue
> (give
> > or
> > > > take
> > > > >> > the odd failure) [1][2]. Having a mostly blue branch-1 has
> helped
> > us
> > > > >> > identify at least one destabilizing commit in the last few days,
> > > maybe
> > > > >> two;
> > > > >> > this is as it should be (smile).
> > > > >> >
> > > > >> > Lets keep our builds blue. If you commit a patch, make sure
> > > subsequent
> > > > >> > builds stay blue. You can subscribe to [email protected]
> to
> > > get
> > > > >> > notice of failures if not already subscribed.
> > > > >> >
> > > > >> > Thanks,
> > > > >> > St.Ack
> > > > >> >
> > > > >> > 1. https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/
> > > > >> > 2.
> https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/
> > > > >> >
> > > > >> >
> > > > >> > On Mon, Oct 13, 2014 at 4:41 PM, Stack 
> wrote:
> > > > >

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-03-14 Thread Sean Busbey
is there a jira I can track for the docker failures?

On Mon, Mar 14, 2016 at 11:08 AM, Stack  wrote:

> Thanks for making the job configuration all nice and tidy BTW Sean.
>
> I unchecked RUN_IN_DOCKER just now to try and get us over current bout of
> docker build failures.
>
> St.Ack
>
>
>
> On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey  wrote:
>
> > FYI, I've just updated our precommit jobs to use the 0.2.0 release of
> Yetus
> > that came out today.
> >
> > After keeping an eye out for strangeness today I'll turn docker mode back
> > on by default tonight.
> >
> > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey  wrote:
> >
> > > FYI, I added a new parameter to the precommit job:
> > >
> > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the apache/yetus
> > > repo rather than our chosen release
> > >
> > > It defaults to inactive, but can be used in manually-triggered runs to
> > > test a solution to a problem in the yetus library. At the moment, I'm
> > > using it to test a solution to default module ordering  as seen in
> > > HBASE-15075.
> > >
> > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey 
> wrote:
> > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for precommit
> > > tests)
> > > > and updated our jenkins precommit build to use it.
> > > >
> > > > Jenkins job has some explanation:
> > > >
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
> > > >
> > > > Release note from HBASE-13525 does as well.
> > > >
> > > > The old job will stick around here for a couple of weeks, in case we
> > need
> > > > to refer back to it:
> > > >
> > > >
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
> > > >
> > > > If something looks awry, please drop a note on HBASE-13525 while it
> > > remains
> > > > open (and make a new issue after).
> > > >
> > > >
> > > > On Wed, Dec 2, 2015 at 3:22 PM, Stack  wrote:
> > > >
> > > >> As part of my continuing advocacy of builds.apache.org and that
> their
> > > >> results are now worthy of our trust and nurture, here are some
> > > highlights
> > > >> from the last few days of builds:
> > > >>
> > > >> + hadoopqa is now finding zombies before the patch is committed.
> > > >> HBASE-14888 showed "-1 core tests. The patch failed these unit
> tests:"
> > > but
> > > >> didn't have any failed tests listed (I'm trying to see if I can do
> > > anything
> > > >> about this...). Running our little ./dev-tools/findHangingTests.py
> > > against
> > > >> the consoleText, it showed a hanging test. Running locally, I see
> same
> > > >> hang. This is before the patch landed.
> > > >> + Our branch runs are now near totally zombie and flakey free --
> still
> > > some
> > > >> work to do -- but a recent patch that seemed harmless was causing a
> > > >> reliable flake fail in the backport to branch-1* confirmed by local
> > > runs.
> > > >> The flakeyness was plain to see up in builds.apache.org.
> > > >> + In the last few days I've committed a patch that included javadoc
> > > >> warnings even though hadoopqa said the patch introduced javadoc
> issues
> > > (I
> > > >> missed it). This messed up life for folks subsequently as their
> > patches
> > > now
> > > >> reported javadoc issues
> > > >>
> > > >> In short, I suggest that builds.apache.org is worth keeping an eye
> > on,
> > > >> make
> > > >> sure you get a clean build out of hadoopqa before committing
> anything,
> > > and
> > > >> lets all work together to try and keep our builds blue: it'll save
> us
> > > all
> > > >> work in the long run.
> > > >>
> > > >> St.Ack
> > > >>
> > > >>
> > > >> On Tue, Nov 4, 2014 at 9:38 AM, Stack  wrote:
> > > >>
> > > >> > Branch-1 and master have stabilized and now run mostly blue (give
> or
> > > take
> > > >> > the odd failure) [1][2]. Having a mostly blue branch-1 has helped
> us
> > > >> > identify at least one destabilizing commit in the last few days,
> > maybe
> > > >> two;
> > > >> > this is as it should be (smile).
> > > >> >
> > > >> > Lets keep our builds blue. If you commit a patch, make sure
> > subsequent
> > > >> > builds stay blue. You can subscribe to [email protected] to
> > get
> > > >> > notice of failures if not already subscribed.
> > > >> >
> > > >> > Thanks,
> > > >> > St.Ack
> > > >> >
> > > >> > 1. https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/
> > > >> > 2. https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/
> > > >> >
> > > >> >
> > > >> > On Mon, Oct 13, 2014 at 4:41 PM, Stack  wrote:
> > > >> >
> > > >> >> A few notes on testing.
> > > >> >>
> > > >> >> Too long to read, infra is more capable now and after some work,
> we
> > > are
> > > >> >> seeing branch-1 and trunk mostly running blue. Lets try and keep
> it
> > > this
> > > >> >> way going forward.
> > > >> >>
> > > >> >> Apache Infra has new, more capable hardware.
> > > >> >>
> > > >> >> A recent spurt of test fixing combined with more capable hardware
> > > seems
> > > >

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-03-14 Thread Stack
Thanks for making the job configuration all nice and tidy BTW Sean.

I unchecked RUN_IN_DOCKER just now to try and get us over the current bout of
docker build failures.

St.Ack



On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey  wrote:

> FYI, I've just updated our precommit jobs to use the 0.2.0 release of Yetus
> that came out today.
>
> After keeping an eye out for strangeness today I'll turn docker mode back
> on by default tonight.
>
> On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey  wrote:
>
> > FYI, I added a new parameter to the precommit job:
> >
> > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the apache/yetus
> > repo rather than our chosen release
> >
> > It defaults to inactive, but can be used in manually-triggered runs to
> > test a solution to a problem in the yetus library. At the moment, I'm
> > using it to test a solution to default module ordering  as seen in
> > HBASE-15075.
> >
> > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey  wrote:
> > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for precommit
> > tests)
> > > and updated our jenkins precommit build to use it.
> > >
> > > Jenkins job has some explanation:
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
> > >
> > > Release note from HBASE-13525 does as well.
> > >
> > > The old job will stick around here for a couple of weeks, in case we
> need
> > > to refer back to it:
> > >
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
> > >
> > > If something looks awry, please drop a note on HBASE-13525 while it
> > remains
> > > open (and make a new issue after).
> > >
> > >
> > > On Wed, Dec 2, 2015 at 3:22 PM, Stack  wrote:
> > >
> > >> As part of my continuing advocacy of builds.apache.org and that their
> > >> results are now worthy of our trust and nurture, here are some
> > highlights
> > >> from the last few days of builds:
> > >>
> > >> + hadoopqa is now finding zombies before the patch is committed.
> > >> HBASE-14888 showed "-1 core tests. The patch failed these unit tests:"
> > but
> > >> didn't have any failed tests listed (I'm trying to see if I can do
> > anything
> > >> about this...). Running our little ./dev-tools/findHangingTests.py
> > against
> > >> the consoleText, it showed a hanging test. Running locally, I see same
> > >> hang. This is before the patch landed.
> > >> + Our branch runs are now near totally zombie and flakey free -- still
> > some
> > >> work to do -- but a recent patch that seemed harmless was causing a
> > >> reliable flake fail in the backport to branch-1* confirmed by local
> > runs.
> > >> The flakeyness was plain to see up in builds.apache.org.
> > >> + In the last few days I've committed a patch that included javadoc
> > >> warnings even though hadoopqa said the patch introduced javadoc issues
> > (I
> > >> missed it). This messed up life for folks subsequently as their
> patches
> > now
> > >> reported javadoc issues
> > >>
> > >> In short, I suggest that builds.apache.org is worth keeping an eye
> on,
> > >> make
> > >> sure you get a clean build out of hadoopqa before committing anything,
> > and
> > >> lets all work together to try and keep our builds blue: it'll save us
> > all
> > >> work in the long run.
> > >>
> > >> St.Ack
> > >>
> > >>
> > >> On Tue, Nov 4, 2014 at 9:38 AM, Stack  wrote:
> > >>
> > >> > Branch-1 and master have stabilized and now run mostly blue (give or
> > take
> > >> > the odd failure) [1][2]. Having a mostly blue branch-1 has helped us
> > >> > identify at least one destabilizing commit in the last few days,
> maybe
> > >> two;
> > >> > this is as it should be (smile).
> > >> >
> > >> > Lets keep our builds blue. If you commit a patch, make sure
> subsequent
> > >> > builds stay blue. You can subscribe to [email protected] to
> get
> > >> > notice of failures if not already subscribed.
> > >> >
> > >> > Thanks,
> > >> > St.Ack
> > >> >
> > >> > 1. https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/
> > >> > 2. https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/
> > >> >
> > >> >
> > >> > On Mon, Oct 13, 2014 at 4:41 PM, Stack  wrote:
> > >> >
> > >> >> A few notes on testing.
> > >> >>
> > >> >> Too long to read, infra is more capable now and after some work, we
> > are
> > >> >> seeing branch-1 and trunk mostly running blue. Lets try and keep it
> > this
> > >> >> way going forward.
> > >> >>
> > >> >> Apache Infra has new, more capable hardware.
> > >> >>
> > >> >> A recent spurt of test fixing combined with more capable hardware
> > seems
> > >> >> to have gotten us to a new place; tests are mostly passing now on
> > >> branch-1
> > >> >> and master.  Lets try and keep it this way and start to trust our
> > test
> > >> runs
> > >> >> again.  Just a few flakies remain.  Lets try and nail them.
> > >> >>
> > >> >> Our tests now run in parallel with other test suites where previous
> > we
> > >> >> ran alone. You can see this sometime

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-03-08 Thread Sean Busbey
On Mon, Mar 7, 2016 at 7:42 PM, Mikhail Antonov wrote:

> Cutting 1.5 hours off pre-commit build's time would be great. Would
> post-commit builds also only run on jdk7 or both?
>
>
The post-commit builds are matrix builds that do the different JDKs in
parallel. The JDKs picked are based on what we claim to support in the ref
guide, so I wouldn't change them.

-- 
busbey
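
A minimal sketch of the matrix idea described above, assuming nothing about the actual Jenkins configuration: one build cell per supported JDK, all run concurrently, with results collected per JDK. The JDK labels and the `run_build` and `run_matrix` helpers are hypothetical stand-ins, not part of any HBase or Jenkins tooling.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical labels; the real list would follow what the ref guide claims to support.
SUPPORTED_JDKS = ["jdk7", "jdk8"]


def run_build(jdk):
    """Stand-in for one matrix cell; a real cell would run the full build and tests."""
    return jdk, "SUCCESS"


def run_matrix(jdks):
    """Run one build per JDK concurrently and collect the results keyed by JDK."""
    with ThreadPoolExecutor(max_workers=len(jdks)) as pool:
        return dict(pool.map(run_build, jdks))


if __name__ == "__main__":
    print(run_matrix(SUPPORTED_JDKS))
```

Because the cells run side by side, the wall-clock cost of the matrix is roughly that of the slowest cell rather than the sum, which is why dropping a JDK matters less post-commit than in the serial pre-commit run.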


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-03-07 Thread Mikhail Antonov
Cutting 1.5 hours off the pre-commit build's time would be great. Would
post-commit builds also run only on jdk7, or on both JDKs?

Mikhail

On Mon, Mar 7, 2016 at 7:37 PM, Ted Yu  wrote:

> Running against jdk 7 only is fine by me.
>
> > On Mar 7, 2016, at 6:43 PM, Sean Busbey  wrote:
> >
> > I tested things out, and while YETUS-297[1] is present the default runs
> all
> > plugins that can do multiple jdks against those available (jdk7 and jdk8
> in
> > our case).
> >
> > We can configure things to only do a single run of unit tests. They'll be
> > against jdk7, since that is our default jdk. That fine by everyone? It'll
> > save ~1.5 hours on any build that hits hbase-server.
> >
> >> On Mon, Mar 7, 2016 at 1:22 PM, Stack  wrote:
> >>
> >> Hurray!
> >>
> >> It looks like YETUS-96 is in there and we are only running on one jdk build
> >> now, the default (but testing compile against both). Will keep an eye on it.
> >>
> >> St.Ack
> >>
> >>
> >>> On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey 
> wrote:
> >>>
> >>> FYI, I've just updated our precommit jobs to use the 0.2.0 release of
> >> Yetus
> >>> that came out today.
> >>>
> >>> After keeping an eye out for strangeness today I'll turn docker mode
> back
> >>> on by default tonight.
> >>>
> >>> On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey 
> wrote:
> >>>
> >>>> FYI, I added a new parameter to the precommit job:
> >>>>
> >>>> * USE_YETUS_PRERELEASE - causes us to use the HEAD of the apache/yetus
> >>>> repo rather than our chosen release
> >>>>
> >>>> It defaults to inactive, but can be used in manually-triggered runs to
> >>>> test a solution to a problem in the yetus library. At the moment, I'm
> >>>> using it to test a solution to default module ordering as seen in
> >>>> HBASE-15075.
> >>>>
> >>>> On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey wrote:
> >>>>> FYI, I just pushed HBASE-13525 (switch to Apache Yetus for precommit tests)
> >>>>> and updated our jenkins precommit build to use it.
> >>>>>
> >>>>> Jenkins job has some explanation:
> >>>>> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
> >>>>>
> >>>>> Release note from HBASE-13525 does as well.
> >>>>>
> >>>>> The old job will stick around here for a couple of weeks, in case we need
> >>>>> to refer back to it:
> >>>>> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
> >>>>>
> >>>>> If something looks awry, please drop a note on HBASE-13525 while it remains
> >>>>> open (and make a new issue after).
> >>>>>
> >>>>>
> >> On Wed, Dec 2, 2015 at 3:22 PM, Stack  wrote:
> >>
> >> As part of my continuing advocacy of builds.apache.org and that
> >> their
> >> results are now worthy of our trust and nurture, here are some
>  highlights
> >> from the last few days of builds:
> >>
> >> + hadoopqa is now finding zombies before the patch is committed.
> >> HBASE-14888 showed "-1 core tests. The patch failed these unit
> >> tests:"
>  but
> >> didn't have any failed tests listed (I'm trying to see if I can do
>  anything
> >> about this...). Running our little ./dev-tools/findHangingTests.py
>  against
> >> the consoleText, it showed a hanging test. Running locally, I see
> >> same
> >> hang. This is before the patch landed.
> >> + Our branch runs are now near totally zombie and flakey free --
> >> still
>  some
> >> work to do -- but a recent patch that seemed harmless was causing a
> >> reliable flake fail in the backport to branch-1* confirmed by local
>  runs.
> >> The flakeyness was plain to see up in builds.apache.org.
> >> + In the last few days I've committed a patch that included javadoc
> >> warnings even though hadoopqa said the patch introduced javadoc
> >> issues
>  (I
> >> missed it). This messed up life for folks subsequently as their
> >>> patches
>  now
> >> reported javadoc issues
> >>
> >> In short, I suggest that builds.apache.org is worth keeping an eye
> >>> on,
> >> make
> >> sure you get a clean build out of hadoopqa before committing
> >> anything,
>  and
> >> lets all work together to try and keep our builds blue: it'll save
> >> us
>  all
> >> work in the long run.
> >>
> >> St.Ack
> >>
> >>
> >>> On Tue, Nov 4, 2014 at 9:38 AM, Stack  wrote:
> >>>
> >>> Branch-1 and master have stabilized and now run mostly blue (give
> >> or
>  take
> >>> the odd failure) [1][2]. Having a mostly blue branch-1 has helped
> >> us
> >>> identify at least one destabilizing commit in the last few days,
> >>> maybe
> >> two;
> >>> this is as it should be (smile).
> >>>
> >>> Lets keep our builds blue. If you commit a patch, make sure
> >>> subsequent
> >>> builds stay blue. You can subscribe to [email protected] to
> >>> get
> >>> notice of failures if not already subscribed.
> >>>
> >>> Thanks,
> >>> S

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-03-07 Thread Ted Yu
Running against jdk 7 only is fine by me. 

> On Mar 7, 2016, at 6:43 PM, Sean Busbey  wrote:
> 
> I tested things out, and while YETUS-297[1] is present the default runs all
> plugins that can do multiple jdks against those available (jdk7 and jdk8 in
> our case).
> 
> We can configure things to only do a single run of unit tests. They'll be
> against jdk7, since that is our default jdk. That fine by everyone? It'll
> save ~1.5 hours on any build that hits hbase-server.
> 
>> On Mon, Mar 7, 2016 at 1:22 PM, Stack  wrote:
>> 
>> Hurray!
>> 
>> It looks like YETUS-96 is in there and we are only running on one jdk build
>> now, the default (but testing compile against both). Will keep an eye on it.
>> 
>> St.Ack
>> 
>> 
>>> On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey  wrote:
>>> 
>>> FYI, I've just updated our precommit jobs to use the 0.2.0 release of
>> Yetus
>>> that came out today.
>>> 
>>> After keeping an eye out for strangeness today I'll turn docker mode back
>>> on by default tonight.
>>> 
>>> On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey  wrote:
>>> 
>>>> FYI, I added a new parameter to the precommit job:
>>>>
>>>> * USE_YETUS_PRERELEASE - causes us to use the HEAD of the apache/yetus
>>>> repo rather than our chosen release
>>>>
>>>> It defaults to inactive, but can be used in manually-triggered runs to
>>>> test a solution to a problem in the yetus library. At the moment, I'm
>>>> using it to test a solution to default module ordering as seen in
>>>> HBASE-15075.
>>>>
>>>> On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey wrote:
>>>>> FYI, I just pushed HBASE-13525 (switch to Apache Yetus for precommit tests)
>>>>> and updated our jenkins precommit build to use it.
>>>>>
>>>>> Jenkins job has some explanation:
>>>>> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
>>>>>
>>>>> Release note from HBASE-13525 does as well.
>>>>>
>>>>> The old job will stick around here for a couple of weeks, in case we need
>>>>> to refer back to it:
>>>>> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
>>>>>
>>>>> If something looks awry, please drop a note on HBASE-13525 while it remains
>>>>> open (and make a new issue after).
>>>>>
>>>>>
>> On Wed, Dec 2, 2015 at 3:22 PM, Stack  wrote:
>> 
>> As part of my continuing advocacy of builds.apache.org and that
>> their
>> results are now worthy of our trust and nurture, here are some
 highlights
>> from the last few days of builds:
>> 
>> + hadoopqa is now finding zombies before the patch is committed.
>> HBASE-14888 showed "-1 core tests. The patch failed these unit
>> tests:"
 but
>> didn't have any failed tests listed (I'm trying to see if I can do
 anything
>> about this...). Running our little ./dev-tools/findHangingTests.py
 against
>> the consoleText, it showed a hanging test. Running locally, I see
>> same
>> hang. This is before the patch landed.
>> + Our branch runs are now near totally zombie and flakey free --
>> still
 some
>> work to do -- but a recent patch that seemed harmless was causing a
>> reliable flake fail in the backport to branch-1* confirmed by local
 runs.
>> The flakeyness was plain to see up in builds.apache.org.
>> + In the last few days I've committed a patch that included javadoc
>> warnings even though hadoopqa said the patch introduced javadoc
>> issues
 (I
>> missed it). This messed up life for folks subsequently as their
>>> patches
 now
>> reported javadoc issues
>> 
>> In short, I suggest that builds.apache.org is worth keeping an eye
>>> on,
>> make
>> sure you get a clean build out of hadoopqa before committing
>> anything,
 and
>> lets all work together to try and keep our builds blue: it'll save
>> us
 all
>> work in the long run.
>> 
>> St.Ack
>> 
>> 
>>> On Tue, Nov 4, 2014 at 9:38 AM, Stack  wrote:
>>> 
>>> Branch-1 and master have stabilized and now run mostly blue (give
>> or
 take
>>> the odd failure) [1][2]. Having a mostly blue branch-1 has helped
>> us
>>> identify at least one destabilizing commit in the last few days,
>>> maybe
>> two;
>>> this is as it should be (smile).
>>> 
>>> Lets keep our builds blue. If you commit a patch, make sure
>>> subsequent
>>> builds stay blue. You can subscribe to [email protected] to
>>> get
>>> notice of failures if not already subscribed.
>>> 
>>> Thanks,
>>> St.Ack
>>> 
>>> 1. https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/
>>> 2. https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/
>>> 
>>> 
 On Mon, Oct 13, 2014 at 4:41 PM, Stack  wrote:
 
 A few notes on testing.
 
 Too long to read, infra is more capable now and after some work,
>> we
 are
 seeing branch-1 and trunk mostly running

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-03-07 Thread Sean Busbey
I tested things out, and while YETUS-297[1] is present the default runs all
plugins that can do multiple jdks against those available (jdk7 and jdk8 in
our case).

We can configure things to only do a single run of unit tests. They'll be
against jdk7, since that is our default jdk. That fine by everyone? It'll
save ~1.5 hours on any build that hits hbase-server.
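
A minimal sketch of what that configuration might look like, assuming
test-patch's --multijdkdirs and --multijdktests options behave as the Yetus
precommit docs describe. The helper name and the JDK paths are hypothetical
and are not taken from our job config.

```shell
#!/bin/sh
# Sketch only: build the multi-JDK arguments for a test-patch run.
# yetus_multijdk_args is a hypothetical helper, not part of Yetus.
yetus_multijdk_args() {
  # $1, $2: JDK home directories available on the build host
  # Listing only compile and javadoc keeps unit tests out of the multi-JDK
  # pass, so (by assumption) they run once, against the default jdk.
  echo "--multijdkdirs=$1,$2 --multijdktests=compile,javadoc"
}

# Hypothetical paths; substitute whatever the build hosts provide.
yetus_multijdk_args "/usr/lib/jvm/java-7" "/usr/lib/jvm/java-8"
```

The resulting string would be appended to the existing test-patch
invocation in the precommit job.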

On Mon, Mar 7, 2016 at 1:22 PM, Stack  wrote:

> Hurray!
>
> It looks like YETUS-96 is in there and we are only running one jdk build
> now, the default (but testing compile against both). Will keep an eye.
>
> St.Ack
>
>
> On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey  wrote:
>
> > FYI, I've just updated our precommit jobs to use the 0.2.0 release of
> Yetus
> > that came out today.
> >
> > After keeping an eye out for strangeness today I'll turn docker mode back
> > on by default tonight.
> >
> > On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey  wrote:
> >
> > > FYI, I added a new parameter to the precommit job:
> > >
> > > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the apache/yetus
> > > repo rather than our chosen release
> > >
> > > It defaults to inactive, but can be used in manually-triggered runs to
> > > test a solution to a problem in the yetus library. At the moment, I'm
> > > using it to test a solution to default module ordering  as seen in
> > > HBASE-15075.
> > >
> > > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey 
> wrote:
> > > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for precommit
> > > tests)
> > > > and updated our jenkins precommit build to use it.
> > > >
> > > > Jenkins job has some explanation:
> > > >
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
> > > >
> > > > Release note from HBASE-13525 does as well.
> > > >
> > > > The old job will stick around here for a couple of weeks, in case we
> > need
> > > > to refer back to it:
> > > >
> > > >
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
> > > >
> > > > If something looks awry, please drop a note on HBASE-13525 while it
> > > remains
> > > > open (and make a new issue after).
> > > >
> > > >
> > > > On Wed, Dec 2, 2015 at 3:22 PM, Stack  wrote:
> > > >
> > > >> As part of my continuing advocacy of builds.apache.org and that
> their
> > > >> results are now worthy of our trust and nurture, here are some
> > > highlights
> > > >> from the last few days of builds:
> > > >>
> > > >> + hadoopqa is now finding zombies before the patch is committed.
> > > >> HBASE-14888 showed "-1 core tests. The patch failed these unit
> tests:"
> > > but
> > > >> didn't have any failed tests listed (I'm trying to see if I can do
> > > anything
> > > >> about this...). Running our little ./dev-tools/findHangingTests.py
> > > against
> > > >> the consoleText, it showed a hanging test. Running locally, I see
> same
> > > >> hang. This is before the patch landed.
> > > >> + Our branch runs are now near totally zombie and flakey free --
> still
> > > some
> > > >> work to do -- but a recent patch that seemed harmless was causing a
> > > >> reliable flake fail in the backport to branch-1* confirmed by local
> > > runs.
> > > >> The flakeyness was plain to see up in builds.apache.org.
> > > >> + In the last few days I've committed a patch that included javadoc
> > > >> warnings even though hadoopqa said the patch introduced javadoc
> issues
> > > (I
> > > >> missed it). This messed up life for folks subsequently as their
> > patches
> > > now
> > > >> reported javadoc issues
> > > >>
> > > >> In short, I suggest that builds.apache.org is worth keeping an eye
> > on,
> > > >> make
> > > >> sure you get a clean build out of hadoopqa before committing
> anything,
> > > and
> > > >> lets all work together to try and keep our builds blue: it'll save
> us
> > > all
> > > >> work in the long run.
> > > >>
> > > >> St.Ack
> > > >>
> > > >>
> > > >> On Tue, Nov 4, 2014 at 9:38 AM, Stack  wrote:
> > > >>
> > > >> > Branch-1 and master have stabilized and now run mostly blue (give
> or
> > > take
> > > >> > the odd failure) [1][2]. Having a mostly blue branch-1 has helped
> us
> > > >> > identify at least one destabilizing commit in the last few days,
> > maybe
> > > >> two;
> > > >> > this is as it should be (smile).
> > > >> >
> > > >> > Lets keep our builds blue. If you commit a patch, make sure
> > subsequent
> > > >> > builds stay blue. You can subscribe to [email protected] to
> > get
> > > >> > notice of failures if not already subscribed.
> > > >> >
> > > >> > Thanks,
> > > >> > St.Ack
> > > >> >
> > > >> > 1. https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/
> > > >> > 2. https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/
> > > >> >
> > > >> >
> > > >> > On Mon, Oct 13, 2014 at 4:41 PM, Stack  wrote:
> > > >> >
> > > >> >> A few notes on testing.
> > > >> >>
> > > >> >> Too long to read, infra is more capable now and after some work,
> we
> > > are
> > >

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-03-07 Thread Stack
Hurray!

It looks like YETUS-96 is in there and we are only running one jdk build
now, the default (but testing compile against both). Will keep an eye.

St.Ack


On Mon, Mar 7, 2016 at 10:27 AM, Sean Busbey  wrote:

> FYI, I've just updated our precommit jobs to use the 0.2.0 release of Yetus
> that came out today.
>
> After keeping an eye out for strangeness today I'll turn docker mode back
> on by default tonight.
>
> On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey  wrote:
>
> > FYI, I added a new parameter to the precommit job:
> >
> > * USE_YETUS_PRERELEASE - causes us to use the HEAD of the apache/yetus
> > repo rather than our chosen release
> >
> > It defaults to inactive, but can be used in manually-triggered runs to
> > test a solution to a problem in the yetus library. At the moment, I'm
> > using it to test a solution to default module ordering  as seen in
> > HBASE-15075.
> >
> > On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey  wrote:
> > > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for precommit
> > tests)
> > > and updated our jenkins precommit build to use it.
> > >
> > > Jenkins job has some explanation:
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
> > >
> > > Release note from HBASE-13525 does as well.
> > >
> > > The old job will stick around here for a couple of weeks, in case we
> need
> > > to refer back to it:
> > >
> > >
> >
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
> > >
> > > If something looks awry, please drop a note on HBASE-13525 while it
> > remains
> > > open (and make a new issue after).
> > >
> > >
> > > On Wed, Dec 2, 2015 at 3:22 PM, Stack  wrote:
> > >
> > >> As part of my continuing advocacy of builds.apache.org and that their
> > >> results are now worthy of our trust and nurture, here are some
> > highlights
> > >> from the last few days of builds:
> > >>
> > >> + hadoopqa is now finding zombies before the patch is committed.
> > >> HBASE-14888 showed "-1 core tests. The patch failed these unit tests:"
> > but
> > >> didn't have any failed tests listed (I'm trying to see if I can do
> > anything
> > >> about this...). Running our little ./dev-tools/findHangingTests.py
> > against
> > >> the consoleText, it showed a hanging test. Running locally, I see same
> > >> hang. This is before the patch landed.
> > >> + Our branch runs are now near totally zombie and flakey free -- still
> > some
> > >> work to do -- but a recent patch that seemed harmless was causing a
> > >> reliable flake fail in the backport to branch-1* confirmed by local
> > runs.
> > >> The flakeyness was plain to see up in builds.apache.org.
> > >> + In the last few days I've committed a patch that included javadoc
> > >> warnings even though hadoopqa said the patch introduced javadoc issues
> > (I
> > >> missed it). This messed up life for folks subsequently as their
> patches
> > now
> > >> reported javadoc issues
> > >>
> > >> In short, I suggest that builds.apache.org is worth keeping an eye
> on,
> > >> make
> > >> sure you get a clean build out of hadoopqa before committing anything,
> > and
> > >> lets all work together to try and keep our builds blue: it'll save us
> > all
> > >> work in the long run.
> > >>
> > >> St.Ack
> > >>
> > >>
> > >> On Tue, Nov 4, 2014 at 9:38 AM, Stack  wrote:
> > >>
> > >> > Branch-1 and master have stabilized and now run mostly blue (give or
> > take
> > >> > the odd failure) [1][2]. Having a mostly blue branch-1 has helped us
> > >> > identify at least one destabilizing commit in the last few days,
> maybe
> > >> two;
> > >> > this is as it should be (smile).
> > >> >
> > >> > Lets keep our builds blue. If you commit a patch, make sure
> subsequent
> > >> > builds stay blue. You can subscribe to [email protected] to
> get
> > >> > notice of failures if not already subscribed.
> > >> >
> > >> > Thanks,
> > >> > St.Ack
> > >> >
> > >> > 1. https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/
> > >> > 2. https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/
> > >> >
> > >> >
> > >> > On Mon, Oct 13, 2014 at 4:41 PM, Stack  wrote:
> > >> >
> > >> >> A few notes on testing.
> > >> >>
> > >> >> Too long to read, infra is more capable now and after some work, we
> > are
> > >> >> seeing branch-1 and trunk mostly running blue. Lets try and keep it
> > this
> > >> >> way going forward.
> > >> >>
> > >> >> Apache Infra has new, more capable hardware.
> > >> >>
> > >> >> A recent spurt of test fixing combined with more capable hardware
> > seems
> > >> >> to have gotten us to a new place; tests are mostly passing now on
> > >> branch-1
> > >> >> and master.  Lets try and keep it this way and start to trust our
> > test
> > >> runs
> > >> >> again.  Just a few flakies remain.  Lets try and nail them.
> > >> >>
> > >> >> Our tests now run in parallel with other test suites where previous
> > we
> > >> >> ran alone. You can see this sometimes when our z

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-03-07 Thread Sean Busbey
FYI, I've just updated our precommit jobs to use the 0.2.0 release of Yetus
that came out today.

After keeping an eye out for strangeness today I'll turn docker mode back
on by default tonight.

On Wed, Jan 13, 2016 at 10:14 AM, Sean Busbey  wrote:

> FYI, I added a new parameter to the precommit job:
>
> * USE_YETUS_PRERELEASE - causes us to use the HEAD of the apache/yetus
> repo rather than our chosen release
>
> It defaults to inactive, but can be used in manually-triggered runs to
> test a solution to a problem in the yetus library. At the moment, I'm
> using it to test a solution to default module ordering  as seen in
> HBASE-15075.
>
> On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey  wrote:
> > FYI, I just pushed HBASE-13525 (switch to Apache Yetus for precommit
> tests)
> > and updated our jenkins precommit build to use it.
> >
> > Jenkins job has some explanation:
> >
> > > https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
> >
> > Release note from HBASE-13525 does as well.
> >
> > The old job will stick around here for a couple of weeks, in case we need
> > to refer back to it:
> >
> >
> > > https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
> >
> > If something looks awry, please drop a note on HBASE-13525 while it
> remains
> > open (and make a new issue after).
> >
> >
> > On Wed, Dec 2, 2015 at 3:22 PM, Stack  wrote:
> >
> >> As part of my continuing advocacy of builds.apache.org and that their
> >> results are now worthy of our trust and nurture, here are some
> highlights
> >> from the last few days of builds:
> >>
> >> + hadoopqa is now finding zombies before the patch is committed.
> >> HBASE-14888 showed "-1 core tests. The patch failed these unit tests:"
> but
> >> didn't have any failed tests listed (I'm trying to see if I can do
> anything
> >> about this...). Running our little ./dev-tools/findHangingTests.py
> against
> >> the consoleText, it showed a hanging test. Running locally, I see same
> >> hang. This is before the patch landed.
> >> + Our branch runs are now near totally zombie and flakey free -- still
> some
> >> work to do -- but a recent patch that seemed harmless was causing a
> >> reliable flake fail in the backport to branch-1* confirmed by local
> runs.
> >> The flakeyness was plain to see up in builds.apache.org.
> >> + In the last few days I've committed a patch that included javadoc
> >> warnings even though hadoopqa said the patch introduced javadoc issues
> (I
> >> missed it). This messed up life for folks subsequently as their patches
> now
> >> reported javadoc issues
> >>
> >> In short, I suggest that builds.apache.org is worth keeping an eye on,
> >> make
> >> sure you get a clean build out of hadoopqa before committing anything,
> and
> >> lets all work together to try and keep our builds blue: it'll save us
> all
> >> work in the long run.
> >>
> >> St.Ack
> >>
> >>
> >> On Tue, Nov 4, 2014 at 9:38 AM, Stack  wrote:
> >>
> >> > Branch-1 and master have stabilized and now run mostly blue (give or
> take
> >> > the odd failure) [1][2]. Having a mostly blue branch-1 has helped us
> >> > identify at least one destabilizing commit in the last few days, maybe
> >> two;
> >> > this is as it should be (smile).
> >> >
> >> > Lets keep our builds blue. If you commit a patch, make sure subsequent
> >> > builds stay blue. You can subscribe to [email protected] to get
> >> > notice of failures if not already subscribed.
> >> >
> >> > Thanks,
> >> > St.Ack
> >> >
> >> > 1. https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/
> >> > 2. https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/
> >> >
> >> >
> >> > On Mon, Oct 13, 2014 at 4:41 PM, Stack  wrote:
> >> >
> >> >> A few notes on testing.
> >> >>
> >> >> Too long to read, infra is more capable now and after some work, we
> are
> >> >> seeing branch-1 and trunk mostly running blue. Lets try and keep it
> this
> >> >> way going forward.
> >> >>
> >> >> Apache Infra has new, more capable hardware.
> >> >>
> >> >> A recent spurt of test fixing combined with more capable hardware
> seems
> >> >> to have gotten us to a new place; tests are mostly passing now on
> >> branch-1
> >> >> and master.  Lets try and keep it this way and start to trust our
> test
> >> runs
> >> >> again.  Just a few flakies remain.  Lets try and nail them.
> >> >>
> >> >> Our tests now run in parallel with other test suites where previous
> we
> >> >> ran alone. You can see this sometimes when our zombie detector
> reports
> >> >> tests from another project altogether as lingerers (To be fixed).
> Some
> >> of
> >> >> our tests are failing because a concurrent hbase run is undoing
> classes
> >> and
> >> >> data from under it. Also, lets fix.
> >> >>
> >> >> Our tests are brittle. It takes 75minutes for them to complete.  Many
> >> are
> >> >> heavy-duty integration tests starting up multiple clusters and
> mapreduce
> >> >> all in the one JVM. It is a miracle they pass

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-01-22 Thread stack
Thank you Sean (and Andrew)
St.Ack
On Jan 22, 2016 10:12 PM, "Sean Busbey"  wrote:

> Andrew in infra made us a label that covers all the hosts save H2.
> I've updated all the nightly builds to use it.
>
> (specifying a label expression as we do in precommit doesn't work on
> matrix builds because the & and | from the expression end up in the
> filesystem path)
>
> On Fri, Jan 22, 2016 at 3:08 PM, Sean Busbey  wrote:
> > we should probably ensure the earlier branch builds also exclude H2.
> >
> > I'll leave myself a note to look at it this evening. If anyone gets to it
> > before then, please update here.
> >
> > On Fri, Jan 22, 2016 at 1:33 PM, Stack  wrote:
> >
> >> Related to the below, I just changed the trunk matrix build job to
> exclude
> >> H2 from the build roster (with Sean's help); it seems to be responsible
> for
> >> this failure type -- *Caused by: java.lang.IndexOutOfBoundsException:
> >> Index: 0, Size: 0* -- in particular. Here is recent example:
> >>
> >>
> >> https://builds.apache.org/view/H-L/view/HBase/job/HBase-Trunk_matrix/652/jdk=latest1.7,label=Hadoop/console
> >>
> >> Lets see if it helps.
> >>
> >> While I have your attention, see the nice checkstyle and findbug graph
> >> trajectories here:
> >> https://builds.apache.org/view/H-L/view/HBase/job/HBase-Trunk_matrix/
> >>
> >> St.Ack
> >>
> >>
> >> On Tue, Jan 19, 2016 at 7:02 PM, Sean Busbey 
> wrote:
> >>
> >> > On Tue, Jan 19, 2016 at 11:48 AM, Stack  wrote:
> >> >
> >> > > On Tue, Jan 19, 2016 at 5:46 AM, Sean Busbey 
> >> wrote:
> >> > >
> >> > > > We could start forcing a clean repository on every build (though
> this
> >> > > > seems heavy handed).
> >> > > >
> >> > > > IIRC, we ran into this ages ago and it was one particular
> dependency.
> >> > > > Presuming we can track down what that was, we could add some
> >> pre-build
> >> > > > work that verifies a known good copy of that dependency is
> present.
> >> > > >
> >> > > > For now, I've blacklisted the H2 build host again, since that's
> the
> >> > > > only host this has been happening on. We added it back last week
> so
> >> > > > that Jon could test a fix from infra.
> >> > > >
> >> > > >
> >> > > Thanks Sean.
> >> > >
> >> > > Yeah, clean repo each time would be OTT.
> >> > >
> >> > > If you have pointers on how to find the bad dependency, I'll dig.
> >> > >
> >> > > Of course, all builds fine locally.
> >> > >
> >> > > St.Ack
> >> > >
> >> > >
> >> > >
> >> > Relevant bits from our old precommit build (lucky we kept it! ;) )
> >> >
> >> >
> >> > 
> >> > ...
> >> > # holding place for local repo
> >> > if [ -d "${WORKSPACE}/maven_repo" ]; then
> >> >   echo "removing hbase artifacts from prior runs."
> >> >   rm -rf "${WORKSPACE}/maven_repo/org/apache/hbase"
> >> >   echo "removing javax.inject in case we got a bad dependency"
> >> >   rm -rf "${WORKSPACE}/maven_repo/javax/inject"
> >> > # uncomment to list entire contents of repo
> >> > #  find "${WORKSPACE}/maven_repo"
> >> > else
> >> >   mkdir "${WORKSPACE}/maven_repo"
> >> > fi
> >> >
> >> > ...
> >> >
> >> > if [ "$?" -ne "0" ]; then
> >> >   echo "test patch failed, checking javax.inject dependency"
> >> >   find "${WORKSPACE}/maven_repo/javax/inject"
> >> >   cat "${WORKSPACE}/maven_repo/javax/inject/javax.inject/1/javax.inject-1.pom"
> >> >   exit 1
> >> > fi
> >> > ---
> >> >
> >> > So it looks like javax:inject was the culprit before. A first step
> would
> >> be
> >> > to remove it from our local repos before running; that'll require
> looking
> >> > up how Yetus names the branch-specific maven repos. A better step
> would
> >> be
> >> > to manually install it from a known good location so we can avoid
> >> whatever
> >> > nonsense is happening on H2.
> >> >
> >>
> >
> >
> >
> > --
> > Sean
>


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-01-22 Thread Sean Busbey
Andrew in infra made us a label that covers all the hosts save H2.
I've updated all the nightly builds to use it.

(specifying a label expression as we do in precommit doesn't work on
matrix builds because the & and | from the expression end up in the
filesystem path)

On Fri, Jan 22, 2016 at 3:08 PM, Sean Busbey  wrote:
> we should probably ensure the earlier branch builds also exclude H2.
>
> I'll leave myself a note to look at it this evening. If anyone gets to it
> before then, please update here.
>
> On Fri, Jan 22, 2016 at 1:33 PM, Stack  wrote:
>
>> Related to the below, I just changed the trunk matrix build job to exclude
>> H2 from the build roster (with Sean's help); it seems to be responsible for
>> this failure type -- *Caused by: java.lang.IndexOutOfBoundsException:
>> Index: 0, Size: 0* -- in particular. Here is recent example:
>>
>> https://builds.apache.org/view/H-L/view/HBase/job/HBase-Trunk_matrix/652/jdk=latest1.7,label=Hadoop/console
>>
>> Lets see if it helps.
>>
>> While I have your attention, see the nice checkstyle and findbug graph
>> trajectories here:
>> https://builds.apache.org/view/H-L/view/HBase/job/HBase-Trunk_matrix/
>>
>> St.Ack
>>
>>
>> On Tue, Jan 19, 2016 at 7:02 PM, Sean Busbey  wrote:
>>
>> > On Tue, Jan 19, 2016 at 11:48 AM, Stack  wrote:
>> >
>> > > On Tue, Jan 19, 2016 at 5:46 AM, Sean Busbey 
>> wrote:
>> > >
>> > > > We could start forcing a clean repository on every build (though this
>> > > > seems heavy handed).
>> > > >
>> > > > IIRC, we ran into this ages ago and it was one particular dependency.
>> > > > Presuming we can track down what that was, we could add some
>> pre-build
>> > > > work that verifies a known good copy of that dependency is present.
>> > > >
>> > > > For now, I've blacklisted the H2 build host again, since that's the
>> > > > only host this has been happening on. We added it back last week so
>> > > > that Jon could test a fix from infra.
>> > > >
>> > > >
>> > > Thanks Sean.
>> > >
>> > > Yeah, clean repo each time would be OTT.
>> > >
>> > > If you have pointers on how to find the bad dependency, I'll dig.
>> > >
>> > > Of course, all builds fine locally.
>> > >
>> > > St.Ack
>> > >
>> > >
>> > >
>> > Relevant bits from our old precommit build (lucky we kept it! ;) )
>> >
>> >
>> > 
>> > ...
>> > # holding place for local repo
>> > if [ -d "${WORKSPACE}/maven_repo" ]; then
>> >   echo "removing hbase artifacts from prior runs."
>> >   rm -rf "${WORKSPACE}/maven_repo/org/apache/hbase"
>> >   echo "removing javax.inject in case we got a bad dependency"
>> >   rm -rf "${WORKSPACE}/maven_repo/javax/inject"
>> > # uncomment to list entire contents of repo
>> > #  find "${WORKSPACE}/maven_repo"
>> > else
>> >   mkdir "${WORKSPACE}/maven_repo"
>> > fi
>> >
>> > ...
>> >
>> > if [ "$?" -ne "0" ]; then
>> >   echo "test patch failed, checking javax.inject dependency"
>> >   find "${WORKSPACE}/maven_repo/javax/inject"
>> >   cat "${WORKSPACE}/maven_repo/javax/inject/javax.inject/1/javax.inject-1.pom"
>> >   exit 1
>> > fi
>> > ---
>> >
>> > So it looks like javax:inject was the culprit before. A first step would
>> be
>> > to remove it from our local repos before running; that'll require looking
>> > up how Yetus names the branch-specific maven repos. A better step would
>> be
>> > to manually install it from a known good location so we can avoid
>> whatever
>> > nonsense is happening on H2.
>> >
>>
>
>
>
> --
> Sean


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-01-22 Thread Sean Busbey
We should probably ensure the earlier branch builds also exclude H2.

I'll leave myself a note to look at it this evening. If anyone gets to it
before then, please update here.

On Fri, Jan 22, 2016 at 1:33 PM, Stack  wrote:

> Related to the below, I just changed the trunk matrix build job to exclude
> H2 from the build roster (with Sean's help); it seems to be responsible for
> this failure type -- *Caused by: java.lang.IndexOutOfBoundsException:
> Index: 0, Size: 0* -- in particular. Here is recent example:
>
> https://builds.apache.org/view/H-L/view/HBase/job/HBase-Trunk_matrix/652/jdk=latest1.7,label=Hadoop/console
>
> Lets see if it helps.
>
> While I have your attention, see the nice checkstyle and findbug graph
> trajectories here:
> https://builds.apache.org/view/H-L/view/HBase/job/HBase-Trunk_matrix/
>
> St.Ack
>
>
> On Tue, Jan 19, 2016 at 7:02 PM, Sean Busbey  wrote:
>
> > On Tue, Jan 19, 2016 at 11:48 AM, Stack  wrote:
> >
> > > On Tue, Jan 19, 2016 at 5:46 AM, Sean Busbey 
> wrote:
> > >
> > > > We could start forcing a clean repository on every build (though this
> > > > seems heavy handed).
> > > >
> > > > IIRC, we ran into this ages ago and it was one particular dependency.
> > > > Presuming we can track down what that was, we could add some
> pre-build
> > > > work that verifies a known good copy of that dependency is present.
> > > >
> > > > For now, I've blacklisted the H2 build host again, since that's the
> > > > only host this has been happening on. We added it back last week so
> > > > that Jon could test a fix from infra.
> > > >
> > > >
> > > Thanks Sean.
> > >
> > > Yeah, clean repo each time would be OTT.
> > >
> > > If you have pointers on how to find the bad dependency, I'll dig.
> > >
> > > Of course, all builds fine locally.
> > >
> > > St.Ack
> > >
> > >
> > >
> > Relevant bits from our old precommit build (lucky we kept it! ;) )
> >
> >
> > 
> > ...
> > # holding place for local repo
> > if [ -d "${WORKSPACE}/maven_repo" ]; then
> >   echo "removing hbase artifacts from prior runs."
> >   rm -rf "${WORKSPACE}/maven_repo/org/apache/hbase"
> >   echo "removing javax.inject in case we got a bad dependency"
> >   rm -rf "${WORKSPACE}/maven_repo/javax/inject"
> > # uncomment to list entire contents of repo
> > #  find "${WORKSPACE}/maven_repo"
> > else
> >   mkdir "${WORKSPACE}/maven_repo"
> > fi
> >
> > ...
> >
> > if [ "$?" -ne "0" ]; then
> >   echo "test patch failed, checking javax.inject dependency"
> >   find "${WORKSPACE}/maven_repo/javax/inject"
> >   cat "${WORKSPACE}/maven_repo/javax/inject/javax.inject/1/javax.inject-1.pom"
> >   exit 1
> > fi
> > ---
> >
> > So it looks like javax:inject was the culprit before. A first step would
> be
> > to remove it from our local repos before running; that'll require looking
> > up how Yetus names the branch-specific maven repos. A better step would
> be
> > to manually install it from a known good location so we can avoid
> whatever
> > nonsense is happening on H2.
> >
>



-- 
Sean


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-01-22 Thread Stack
Related to the below, I just changed the trunk matrix build job to exclude
H2 from the build roster (with Sean's help); it seems to be responsible for
this failure type -- *Caused by: java.lang.IndexOutOfBoundsException:
Index: 0, Size: 0* -- in particular. Here is a recent example:
https://builds.apache.org/view/H-L/view/HBase/job/HBase-Trunk_matrix/652/jdk=latest1.7,label=Hadoop/console

Let's see if it helps.

While I have your attention, see the nice checkstyle and findbugs graph
trajectories here:
https://builds.apache.org/view/H-L/view/HBase/job/HBase-Trunk_matrix/

St.Ack


On Tue, Jan 19, 2016 at 7:02 PM, Sean Busbey  wrote:

> On Tue, Jan 19, 2016 at 11:48 AM, Stack  wrote:
>
> > On Tue, Jan 19, 2016 at 5:46 AM, Sean Busbey  wrote:
> >
> > > We could start forcing a clean repository on every build (though this
> > > seems heavy handed).
> > >
> > > IIRC, we ran into this ages ago and it was one particular dependency.
> > > Presuming we can track down what that was, we could add some pre-build
> > > work that verifies a known good copy of that dependency is present.
> > >
> > > For now, I've blacklisted the H2 build host again, since that's the
> > > only host this has been happening on. We added it back last week so
> > > that Jon could test a fix from infra.
> > >
> > >
> > Thanks Sean.
> >
> > Yeah, clean repo each time would be OTT.
> >
> > If you have pointers on how to find the bad dependency, I'll dig.
> >
> > Of course, all builds fine locally.
> >
> > St.Ack
> >
> >
> >
> Relevant bits from our old precommit build (lucky we kept it! ;) )
>
>
> 
> ...
> # holding place for local repo
> if [ -d "${WORKSPACE}/maven_repo" ]; then
>   echo "removing hbase artifacts from prior runs."
>   rm -rf "${WORKSPACE}/maven_repo/org/apache/hbase"
>   echo "removing javax.inject in case we got a bad dependency"
>   rm -rf "${WORKSPACE}/maven_repo/javax/inject"
> # uncomment to list entire contents of repo
> #  find "${WORKSPACE}/maven_repo"
> else
>   mkdir "${WORKSPACE}/maven_repo"
> fi
>
> ...
>
> if [ "$?" -ne "0" ]; then
>   echo "test patch failed, checking javax.inject dependency"
>   find "${WORKSPACE}/maven_repo/javax/inject"
>   cat "${WORKSPACE}/maven_repo/javax/inject/javax.inject/1/javax.inject-1.pom"
>   exit 1
> fi
> ---
>
> So it looks like javax:inject was the culprit before. A first step would be
> to remove it from our local repos before running; that'll require looking
> up how Yetus names the branch-specific maven repos. A better step would be
> to manually install it from a known good location so we can avoid whatever
> nonsense is happening on H2.
>


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-01-19 Thread Sean Busbey
On Tue, Jan 19, 2016 at 11:48 AM, Stack  wrote:

> On Tue, Jan 19, 2016 at 5:46 AM, Sean Busbey  wrote:
>
> > We could start forcing a clean repository on every build (though this
> > seems heavy handed).
> >
> > IIRC, we ran into this ages ago and it was one particular dependency.
> > Presuming we can track down what that was, we could add some pre-build
> > work that verifies a known good copy of that dependency is present.
> >
> > For now, I've blacklisted the H2 build host again, since that's the
> > only host this has been happening on. We added it back last week so
> > that Jon could test a fix from infra.
> >
> >
> Thanks Sean.
>
> Yeah, clean repo each time would be OTT.
>
> If you have pointers on how to find the bad dependency, I'll dig.
>
> Of course, all builds fine locally.
>
> St.Ack
>
>
>
Relevant bits from our old precommit build (lucky we kept it! ;) )



...
# holding place for local repo
if [ -d "${WORKSPACE}/maven_repo" ]; then
  echo "removing hbase artifacts from prior runs."
  rm -rf "${WORKSPACE}/maven_repo/org/apache/hbase"
  echo "removing javax.inject in case we got a bad dependency"
  rm -rf "${WORKSPACE}/maven_repo/javax/inject"
# uncomment to list entire contents of repo
#  find "${WORKSPACE}/maven_repo"
else
  mkdir "${WORKSPACE}/maven_repo"
fi

...

if [ "$?" -ne "0" ]; then
  echo "test patch failed, checking javax.inject dependency"
  find "${WORKSPACE}/maven_repo/javax/inject"
  cat "${WORKSPACE}/maven_repo/javax/inject/javax.inject/1/javax.inject-1.pom"
  exit 1
fi
---

So it looks like javax.inject was the culprit before. A first step would be
to remove it from our local repos before running; that'll require looking
up how Yetus names the branch-specific maven repos. A better step would be
to manually install it from a known good location so we can avoid whatever
nonsense is happening on H2.
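
The "first step" could look roughly like the sketch below. It is only a
sketch: it assumes the branch-specific repos sit as immediate
subdirectories of one parent directory, which is a guess until someone
looks up how Yetus actually names them. The function name and arguments
are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: drop a possibly-bad dependency from every local repo under a root.
# ASSUMPTION: each repo is an immediate subdirectory of repo_root; the real
# Yetus naming scheme for branch-specific repos still needs to be looked up.
purge_dependency() {
  local repo_root="$1"   # hypothetical parent directory holding the repos
  local dep_path="$2"    # path inside a repo, e.g. javax/inject
  local repo
  for repo in "${repo_root}"/*/; do
    # Skip repos that never downloaded the dependency (also covers the
    # case where the glob matched nothing and stayed literal).
    [ -d "${repo}${dep_path}" ] || continue
    echo "removing ${dep_path} from ${repo}"
    rm -rf "${repo}${dep_path}"
  done
}
```

Something like `purge_dependency "${WORKSPACE}/repos" javax/inject` before
the build would then force Maven to fetch the dependency again, at the cost
of one extra download per repo.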


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-01-19 Thread Stack
On Tue, Jan 19, 2016 at 5:46 AM, Sean Busbey  wrote:

> We could start forcing a clean repository on every build (though this
> seems heavy handed).
>
> IIRC, we ran into this ages ago and it was one particular dependency.
> Presuming we can track down what that was, we could add some pre-build
> work that verifies a known good copy of that dependency is present.
>
> For now, I've blacklisted the H2 build host again, since that's the
> only host this has been happening on. We added it back last week so
> that Jon could test a fix from infra.
>
>
Thanks Sean.

Yeah, clean repo each time would be OTT.

If you have pointers on how to find the bad dependency, I'll dig.

Of course, all builds fine locally.

St.Ack



> -Sean
>
> On Mon, Jan 18, 2016 at 10:55 PM, Stack  wrote:
> > Anyone know what the refresh timeout is for the below? We seem to be in a
> > phase where we have a bad pom and the hadoop test builds are failing. Can
> > we force refresh of the local repository by doing something like a custom
> > build run?
> >
> > Thanks,
> > St.Ack
> > P.S. Here is what I am talking about:
> >
> https://builds.apache.org/job/PreCommit-HBASE-Build/178/artifact/patchprocess/patch-javac-2.4.0.txt
> > which is from this run today:
> > https://issues.apache.org/jira/browse/HBASE-15086  Thanks
> >
> >
> > On Thu, Nov 5, 2015 at 10:42 AM, Sean Busbey 
> wrote:
> >
> >> If Maven has trouble grabbing a pom but not an artifact, it'll
> >> substitute in a placeholder pom that doesn't have e.g. license
> >> information. That can result in a local repo that fails this way until
> >> the refresh timeout hits for grabbing a pom again.
> >>
> >> On Wed, Nov 4, 2015 at 5:33 PM, Stack  wrote:
> >> > Thanks Andrew. Weird is that it is sporadic. Will keep an eye on it.
> >> > St.Ack
> >> >
> >> > On Wed, Nov 4, 2015 at 8:14 AM, Andrew Purtell wrote:
> >> >
> >> >> > [ERROR] Error invoking method 'get(java.lang.Integer)' in
> >> >> > java.util.ArrayList at META-INF/LICENSE.vm[line 1627, column 22]
> >> >>
> >> >> This means a Velocity macro for building LICENSE info about a component
> >> >> has failed because necessary information is missing in the Maven model.
> >> >> When I have seen this it has been due to a change in dependencies without
> >> >> necessary updates to supplemental-models.xml. Running the LICENSE and
> >> >> NOTICE aggregations in debug mode will print out clues as to what
> >> >> specifically is missing. See defines in the POMs for doing so. (I'm not at
> >> >> the computer yet so can't pull up the specifics.)
> >> >>
> >> >>
> >> >> > On Nov 4, 2015, at 6:23 AM, Stack  wrote:
> >> >> >
> >> >> > [ERROR] Error invoking method 'get(java.lang.Integer)' in
> >> >> > java.util.ArrayList at META-INF/LICENSE.vm[line 1627, column 22]
> >> >>
> >>
> >>
> >>
> >> --
> >> Sean
> >>
>


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-01-19 Thread Sean Busbey
We could start forcing a clean repository on every build (though this
seems heavy handed).

IIRC, we ran into this ages ago and it was one particular dependency.
Presuming we can track down what that was, we could add some pre-build
work that verifies a known good copy of that dependency is present.
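
As a rough idea of what that pre-build check might be, here is a sketch
that compares the file in the local repo against a trusted copy and
restores it when they differ. Where the trusted copy lives is an
assumption, and the function name is invented for this example.

```shell
#!/usr/bin/env bash
# Sketch: make sure a file in the local maven repo matches a trusted copy.
# ASSUMPTION: a known good copy of the pom is kept somewhere we control.
ensure_known_good() {
  local good="$1"     # trusted reference file (hypothetical location)
  local target="$2"   # file inside the local maven repo
  if [ -f "${target}" ] && cmp -s "${good}" "${target}"; then
    echo "ok: ${target}"
    return 0
  fi
  echo "restoring ${target} from ${good}"
  mkdir -p "$(dirname "${target}")"
  cp "${good}" "${target}"
}
```

Maven also keeps bookkeeping files next to each artifact (for example
_remote.repositories), so copying the pom alone may not be the whole story;
treat this as a starting point rather than a tested fix.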

For now, I've blacklisted the H2 build host again, since that's the
only host this has been happening on. We added it back last week so
that Jon could test a fix from infra.

-Sean

On Mon, Jan 18, 2016 at 10:55 PM, Stack  wrote:
> Anyone know what the refresh timeout is for the below? We seem to be in a
> phase where we have a bad pom and the hadoop test builds are failing. Can
> we force refresh of the local repository by doing something like a custom
> build run?
>
> Thanks,
> St.Ack
> P.S. Here is what I am talking about:
> https://builds.apache.org/job/PreCommit-HBASE-Build/178/artifact/patchprocess/patch-javac-2.4.0.txt
> which is from this run today:
> https://issues.apache.org/jira/browse/HBASE-15086  Thanks
>
>
> On Thu, Nov 5, 2015 at 10:42 AM, Sean Busbey  wrote:
>
>> If Maven has trouble grabbing a pom but not an artifact, it'll
>> substitute in a placeholder pom that doesn't have e.g. license
>> information. That can result in a local repo that fails this way until
>> the refresh timeout hits for grabbing a pom again.
>>
>> On Wed, Nov 4, 2015 at 5:33 PM, Stack  wrote:
>> > Thanks Andrew. Weird is that it is sporadic. Will keep an eye on it.
>> > St.Ack
>> >
>> > On Wed, Nov 4, 2015 at 8:14 AM, Andrew Purtell wrote:
>> >
>> >> > [ERROR] Error invoking method 'get(java.lang.Integer)' in
>> >> > java.util.ArrayList at META-INF/LICENSE.vm[line 1627, column 22]
>> >>
>> >> This means a Velocity macro for building LICENSE info about a component
>> >> has failed because necessary information is missing in the Maven model.
>> >> When I have seen this it has been due to a change in dependencies without
>> >> necessary updates to supplemental-models.xml. Running the LICENSE and
>> >> NOTICE aggregations in debug mode will print out clues as to what
>> >> specifically is missing. See defines in the POMs for doing so. (I'm not at
>> >> the computer yet so can't pull up the specifics.)
>> >>
>> >>
>> >> > On Nov 4, 2015, at 6:23 AM, Stack  wrote:
>> >> >
>> >> > [ERROR] Error invoking method 'get(java.lang.Integer)' in
>> >> > java.util.ArrayList at META-INF/LICENSE.vm[line 1627, column 22]
>> >>
>>
>>
>>
>> --
>> Sean
>>


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-01-18 Thread Stack
Anyone know what the refresh timeout is for the below? We seem to be in a
phase where we have a bad pom and the hadoop test builds are failing. Can
we force refresh of the local repository by doing something like a custom
build run?
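
One way to force a retry without wiping the whole repo is sketched below.
It assumes the stale state is recorded in Maven's *.lastUpdated marker
files, which Maven writes when a download attempt fails; whether that is
what bit this particular build is not established here. The function name
is made up.

```shell
#!/usr/bin/env bash
# Sketch: remove Maven's failed-download markers so the next build retries.
# ASSUMPTION: the bad state lives in *.lastUpdated files under the repo.
clear_failed_markers() {
  local repo="$1"   # path to a local maven repo
  # Print each marker as it is removed so the console log shows what changed.
  find "${repo}" -type f -name '*.lastUpdated' -print -exec rm -f {} +
}
# Alternatively, running the build with 'mvn -U' asks Maven to re-check the
# remote repositories on that run instead of waiting for the timeout.
```

A manually-triggered run could call `clear_failed_markers
"${WORKSPACE}/maven_repo"` first and leave the regular job untouched.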

Thanks,
St.Ack
P.S. Here is what I am talking about:
https://builds.apache.org/job/PreCommit-HBASE-Build/178/artifact/patchprocess/patch-javac-2.4.0.txt
which is from this run today:
https://issues.apache.org/jira/browse/HBASE-15086  Thanks


On Thu, Nov 5, 2015 at 10:42 AM, Sean Busbey  wrote:

> If Maven has trouble grabbing a pom but not an artifact, it'll
> substitute in a placeholder pom that doesn't have e.g. license
> information. That can result in a local repo that fails this way until
> the refresh timeout hits for grabbing a pom again.
>
> On Wed, Nov 4, 2015 at 5:33 PM, Stack  wrote:
> > Thanks Andrew. Weird is that it is sporadic. Will keep an eye on it.
> > St.Ack
> >
> > On Wed, Nov 4, 2015 at 8:14 AM, Andrew Purtell wrote:
> >
> >> > [ERROR] Error invoking method 'get(java.lang.Integer)' in
> >> > java.util.ArrayList at META-INF/LICENSE.vm[line 1627, column 22]
> >>
> >> This means a Velocity macro for building LICENSE info about a component
> >> has failed because necessary information is missing in the Maven model.
> >> When I have seen this it has been due to a change in dependencies without
> >> necessary updates to supplemental-models.xml. Running the LICENSE and
> >> NOTICE aggregations in debug mode will print out clues as to what
> >> specifically is missing. See defines in the POMs for doing so. (I'm not at
> >> the computer yet so can't pull up the specifics.)
> >>
> >>
> >> > On Nov 4, 2015, at 6:23 AM, Stack  wrote:
> >> >
> >> > [ERROR] Error invoking method 'get(java.lang.Integer)' in
> >> > java.util.ArrayList at META-INF/LICENSE.vm[line 1627, column 22]
> >>
>
>
>
> --
> Sean
>


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-01-13 Thread Sean Busbey
FYI, I added a new parameter to the precommit job:

* USE_YETUS_PRERELEASE - causes us to use the HEAD of the apache/yetus
repo rather than our chosen release

It defaults to inactive, but can be used in manually-triggered runs to
test a solution to a problem in the yetus library. At the moment, I'm
using it to test a solution to default module ordering as seen in
HBASE-15075.
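
For anyone curious how a job script might act on the parameter, a minimal
sketch follows. It is not the actual job configuration: the function name
and the release ref argument are hypothetical, and it assumes Jenkins hands
the boolean parameter to the script as the string "true" or "false".

```shell
#!/usr/bin/env bash
# Sketch: pick which yetus ref to use based on the job parameter.
# ASSUMPTION: USE_YETUS_PRERELEASE arrives as "true"/"false" in the env.
choose_yetus_ref() {
  local release_ref="$1"   # hypothetical name of our chosen release
  if [ "${USE_YETUS_PRERELEASE:-false}" = "true" ]; then
    echo "HEAD"
  else
    echo "${release_ref}"
  fi
}
```

Defaulting the variable to false keeps the pinned release in use whenever
the parameter is not set at all, which matches "defaults to inactive".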

On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey  wrote:
> FYI, I just pushed HBASE-13525 (switch to Apache Yetus for precommit tests)
> and updated our jenkins precommit build to use it.
>
> Jenkins job has some explanation:
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
>
> Release note from HBASE-13525 does as well.
>
> The old job will stick around here for a couple of weeks, in case we need
> to refer back to it:
>
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
>
> If something looks awry, please drop a note on HBASE-13525 while it remains
> open (and make a new issue after).
>
>
> On Wed, Dec 2, 2015 at 3:22 PM, Stack  wrote:
>
>> As part of my continuing advocacy of builds.apache.org and that their
>> results are now worthy of our trust and nurture, here are some highlights
>> from the last few days of builds:
>>
>> + hadoopqa is now finding zombies before the patch is committed.
>> HBASE-14888 showed "-1 core tests. The patch failed these unit tests:" but
>> didn't have any failed tests listed (I'm trying to see if I can do anything
>> about this...). Running our little ./dev-tools/findHangingTests.py against
>> the consoleText, it showed a hanging test. Running locally, I see same
>> hang. This is before the patch landed.
>> + Our branch runs are now near totally zombie and flakey free -- still some
>> work to do -- but a recent patch that seemed harmless was causing a
>> reliable flake fail in the backport to branch-1* confirmed by local runs.
>> The flakeyness was plain to see up in builds.apache.org.
>> + In the last few days I've committed a patch that included javadoc
>> warnings even though hadoopqa said the patch introduced javadoc issues (I
>> missed it). This messed up life for folks subsequently as their patches now
>> reported javadoc issues
>>
>> In short, I suggest that builds.apache.org is worth keeping an eye on,
>> make
>> sure you get a clean build out of hadoopqa before committing anything, and
>> lets all work together to try and keep our builds blue: it'll save us all
>> work in the long run.
>>
>> St.Ack
>>
>>
>> On Tue, Nov 4, 2014 at 9:38 AM, Stack  wrote:
>>
>> > Branch-1 and master have stabilized and now run mostly blue (give or take
>> > the odd failure) [1][2]. Having a mostly blue branch-1 has helped us
>> > identify at least one destabilizing commit in the last few days, maybe
>> two;
>> > this is as it should be (smile).
>> >
>> > Lets keep our builds blue. If you commit a patch, make sure subsequent
>> > builds stay blue. You can subscribe to [email protected] to get
>> > notice of failures if not already subscribed.
>> >
>> > Thanks,
>> > St.Ack
>> >
>> > 1. https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/
>> > 2. https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/
>> >
>> >
>> > On Mon, Oct 13, 2014 at 4:41 PM, Stack  wrote:
>> >
>> >> A few notes on testing.
>> >>
>> >> Too long to read, infra is more capable now and after some work, we are
>> >> seeing branch-1 and trunk mostly running blue. Lets try and keep it this
>> >> way going forward.
>> >>
>> >> Apache Infra has new, more capable hardware.
>> >>
>> >> A recent spurt of test fixing combined with more capable hardware seems
>> >> to have gotten us to a new place; tests are mostly passing now on
>> branch-1
>> >> and master.  Lets try and keep it this way and start to trust our test
>> runs
>> >> again.  Just a few flakies remain.  Lets try and nail them.
>> >>
>> >> Our tests now run in parallel with other test suites where previous we
>> >> ran alone. You can see this sometimes when our zombie detector reports
>> >> tests from another project altogether as lingerers (To be fixed).  Some
>> of
>> >> our tests are failing because a concurrent hbase run is undoing classes
>> and
>> >> data from under it. Also, lets fix.
>> >>
>> >> Our tests are brittle. It takes 75minutes for them to complete.  Many
>> are
>> >> heavy-duty integration tests starting up multiple clusters and mapreduce
>> >> all in the one JVM. It is a miracle they pass at all.  Usually
>> integration
>> >> tests have been cast as unit tests because there was no where else for
>> them
>> >> to get an airing.  We have the hbase-it suite now which would be a more
>> apt
>> >> place but until these are run on a regular basis in public for all to
>> see,
>> >> the fat integration tests disguised as unit tests will remain.  A
>> review of
>> >> our current unit tests weeding the old cruft and the no longer relevant
>> or
>> >> duplicates would be a nice undertaking if som

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-01-11 Thread Sean Busbey
Found the problem (not setting the path to commands in the case where
there is a cached install :/ ); have now turned off debug by default.
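
For the record, the shape of the fix is roughly the sketch below: the path
gets set after the install check rather than inside it, so a cached install
takes the same route as a fresh one. Every name here is hypothetical, and
the install step is a stub rather than the real download.

```shell
#!/usr/bin/env bash
# Sketch only: all names are hypothetical, install_yetus is a stand-in.
install_yetus() {
  # Stub for the real download/unpack; writes a tiny executable instead.
  mkdir -p "$1/bin"
  printf '#!/bin/sh\necho "stub test-patch"\n' > "$1/bin/test-patch"
  chmod +x "$1/bin/test-patch"
}

prepare_yetus() {
  local cache_dir="$1"
  if [ ! -d "${cache_dir}" ]; then
    install_yetus "${cache_dir}"
  fi
  # Set the command path outside the 'if' so the cached case gets it too.
  YETUS_TEST_PATCH="${cache_dir}/bin/test-patch"
  if [ ! -x "${YETUS_TEST_PATCH}" ]; then
    echo "cached install present but not executable: ${YETUS_TEST_PATCH}" >&2
    return 1
  fi
}
```

The final check makes the "present but not executable" case fail early
with a clear message instead of further down in the run.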

On Mon, Jan 11, 2016 at 9:18 AM, Sean Busbey  wrote:
> We've had a few precommit jobs fail because the cache for our yetus
> install was present but not executable.
>
> I've turned on debugging so we can try to figure out what's going on
> the next time one happens.
>
> On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey  wrote:
>> FYI, I just pushed HBASE-13525 (switch to Apache Yetus for precommit tests)
>> and updated our jenkins precommit build to use it.
>>
>> Jenkins job has some explanation:
>> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
>>
>> Release note from HBASE-13525 does as well.
>>
>> The old job will stick around here for a couple of weeks, in case we need
>> to refer back to it:
>>
>> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
>>
>> If something looks awry, please drop a note on HBASE-13525 while it remains
>> open (and make a new issue after).
>>
>>
>> On Wed, Dec 2, 2015 at 3:22 PM, Stack  wrote:
>>
>>> As part of my continuing advocacy of builds.apache.org and that their
>>> results are now worthy of our trust and nurture, here are some highlights
>>> from the last few days of builds:
>>>
>>> + hadoopqa is now finding zombies before the patch is committed.
>>> HBASE-14888 showed "-1 core tests. The patch failed these unit tests:" but
>>> didn't have any failed tests listed (I'm trying to see if I can do anything
>>> about this...). Running our little ./dev-tools/findHangingTests.py against
>>> the consoleText, it showed a hanging test. Running locally, I see same
>>> hang. This is before the patch landed.
>>> + Our branch runs are now near totally zombie and flakey free -- still some
>>> work to do -- but a recent patch that seemed harmless was causing a
>>> reliable flake fail in the backport to branch-1* confirmed by local runs.
>>> The flakeyness was plain to see up in builds.apache.org.
>>> + In the last few days I've committed a patch that included javadoc
>>> warnings even though hadoopqa said the patch introduced javadoc issues (I
>>> missed it). This messed up life for folks subsequently as their patches now
>>> reported javadoc issues
>>>
>>> In short, I suggest that builds.apache.org is worth keeping an eye on,
>>> make
>>> sure you get a clean build out of hadoopqa before committing anything, and
>>> lets all work together to try and keep our builds blue: it'll save us all
>>> work in the long run.
>>>
>>> St.Ack
>>>
>>>
>>> On Tue, Nov 4, 2014 at 9:38 AM, Stack  wrote:
>>>
>>> > Branch-1 and master have stabilized and now run mostly blue (give or take
>>> > the odd failure) [1][2]. Having a mostly blue branch-1 has helped us
>>> > identify at least one destabilizing commit in the last few days, maybe
>>> two;
>>> > this is as it should be (smile).
>>> >
>>> > Lets keep our builds blue. If you commit a patch, make sure subsequent
>>> > builds stay blue. You can subscribe to [email protected] to get
>>> > notice of failures if not already subscribed.
>>> >
>>> > Thanks,
>>> > St.Ack
>>> >
>>> > 1. https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/
>>> > 2. https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/
>>> >
>>> >
>>> > On Mon, Oct 13, 2014 at 4:41 PM, Stack  wrote:
>>> >
>>> >> A few notes on testing.
>>> >>
>>> >> Too long to read, infra is more capable now and after some work, we are
>>> >> seeing branch-1 and trunk mostly running blue. Lets try and keep it this
>>> >> way going forward.
>>> >>
>>> >> Apache Infra has new, more capable hardware.
>>> >>
>>> >> A recent spurt of test fixing combined with more capable hardware seems
>>> >> to have gotten us to a new place; tests are mostly passing now on
>>> branch-1
>>> >> and master.  Lets try and keep it this way and start to trust our test
>>> runs
>>> >> again.  Just a few flakies remain.  Lets try and nail them.
>>> >>
>>> >> Our tests now run in parallel with other test suites where previous we
>>> >> ran alone. You can see this sometimes when our zombie detector reports
>>> >> tests from another project altogether as lingerers (To be fixed).  Some
>>> of
>>> >> our tests are failing because a concurrent hbase run is undoing classes
>>> and
>>> >> data from under it. Also, lets fix.
>>> >>
>>> >> Our tests are brittle. It takes 75minutes for them to complete.  Many
>>> are
>>> >> heavy-duty integration tests starting up multiple clusters and mapreduce
>>> >> all in the one JVM. It is a miracle they pass at all.  Usually
>>> integration
>>> >> tests have been cast as unit tests because there was no where else for
>>> them
>>> >> to get an airing.  We have the hbase-it suite now which would be a more
>>> apt
>>> >> place but until these are run on a regular basis in public for all to
>>> see,
>>> >> the fat integration tests disguised as unit tests will remain.  A
>>> review of
>

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-01-11 Thread Sean Busbey
We've had a few precommit jobs fail because the cache for our yetus
install was present but not executable.

I've turned on debugging so we can try to figure out what's going on
the next time one happens.

On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey  wrote:
> FYI, I just pushed HBASE-13525 (switch to Apache Yetus for precommit tests)
> and updated our jenkins precommit build to use it.
>
> Jenkins job has some explanation:
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
>
> Release note from HBASE-13525 does as well.
>
> The old job will stick around here for a couple of weeks, in case we need
> to refer back to it:
>
> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
>
> If something looks awry, please drop a note on HBASE-13525 while it remains
> open (and make a new issue after).
>
>
> On Wed, Dec 2, 2015 at 3:22 PM, Stack  wrote:
>
>> As part of my continuing advocacy of builds.apache.org and that their
>> results are now worthy of our trust and nurture, here are some highlights
>> from the last few days of builds:
>>
>> + hadoopqa is now finding zombies before the patch is committed.
>> HBASE-14888 showed "-1 core tests. The patch failed these unit tests:" but
>> didn't have any failed tests listed (I'm trying to see if I can do anything
>> about this...). Running our little ./dev-tools/findHangingTests.py against
>> the consoleText, it showed a hanging test. Running locally, I see same
>> hang. This is before the patch landed.
>> + Our branch runs are now near totally zombie and flakey free -- still some
>> work to do -- but a recent patch that seemed harmless was causing a
>> reliable flake fail in the backport to branch-1* confirmed by local runs.
>> The flakeyness was plain to see up in builds.apache.org.
>> + In the last few days I've committed a patch that included javadoc
>> warnings even though hadoopqa said the patch introduced javadoc issues (I
>> missed it). This messed up life for folks subsequently as their patches now
>> reported javadoc issues
>>
>> In short, I suggest that builds.apache.org is worth keeping an eye on,
>> make
>> sure you get a clean build out of hadoopqa before committing anything, and
>> lets all work together to try and keep our builds blue: it'll save us all
>> work in the long run.
>>
>> St.Ack
>>
>>
>> On Tue, Nov 4, 2014 at 9:38 AM, Stack  wrote:
>>
>> > Branch-1 and master have stabilized and now run mostly blue (give or take
>> > the odd failure) [1][2]. Having a mostly blue branch-1 has helped us
>> > identify at least one destabilizing commit in the last few days, maybe
>> two;
>> > this is as it should be (smile).
>> >
>> > Lets keep our builds blue. If you commit a patch, make sure subsequent
>> > builds stay blue. You can subscribe to [email protected] to get
>> > notice of failures if not already subscribed.
>> >
>> > Thanks,
>> > St.Ack
>> >
>> > 1. https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/
>> > 2. https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/
>> >
>> >
>> > On Mon, Oct 13, 2014 at 4:41 PM, Stack  wrote:
>> >
>> >> A few notes on testing.
>> >>
>> >> Too long to read, infra is more capable now and after some work, we are
>> >> seeing branch-1 and trunk mostly running blue. Lets try and keep it this
>> >> way going forward.
>> >>
>> >> Apache Infra has new, more capable hardware.
>> >>
>> >> A recent spurt of test fixing combined with more capable hardware seems
>> >> to have gotten us to a new place; tests are mostly passing now on
>> branch-1
>> >> and master.  Lets try and keep it this way and start to trust our test
>> runs
>> >> again.  Just a few flakies remain.  Lets try and nail them.
>> >>
>> >> Our tests now run in parallel with other test suites where previous we
>> >> ran alone. You can see this sometimes when our zombie detector reports
>> >> tests from another project altogether as lingerers (To be fixed).  Some
>> of
>> >> our tests are failing because a concurrent hbase run is undoing classes
>> and
>> >> data from under it. Also, lets fix.
>> >>
>> >> Our tests are brittle. It takes 75minutes for them to complete.  Many
>> are
>> >> heavy-duty integration tests starting up multiple clusters and mapreduce
>> >> all in the one JVM. It is a miracle they pass at all.  Usually
>> integration
>> >> tests have been cast as unit tests because there was no where else for
>> them
>> >> to get an airing.  We have the hbase-it suite now which would be a more
>> apt
>> >> place but until these are run on a regular basis in public for all to
>> see,
>> >> the fat integration tests disguised as unit tests will remain.  A
>> review of
>> >> our current unit tests weeding the old cruft and the no longer relevant
>> or
>> >> duplicates would be a nice undertaking if someone is looking to
>> contribute.
>> >>
>> >> Alex Newman has been working on making our tests work up on travis and
>> >> circle-ci.  That'll be sweet when it goes end-to-end.

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-01-08 Thread Sean Busbey
FYI, I just pushed HBASE-13525 (switch to Apache Yetus for precommit tests)
and updated our jenkins precommit build to use it.

Jenkins job has some explanation:
https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/

Release note from HBASE-13525 does as well.

The old job will stick around here for a couple of weeks, in case we need
to refer back to it:

https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/

If something looks awry, please drop a note on HBASE-13525 while it remains
open (and make a new issue after).


On Wed, Dec 2, 2015 at 3:22 PM, Stack  wrote:

> As part of my continuing advocacy of builds.apache.org and that their
> results are now worthy of our trust and nurture, here are some highlights
> from the last few days of builds:
>
> + hadoopqa is now finding zombies before the patch is committed.
> HBASE-14888 showed "-1 core tests. The patch failed these unit tests:" but
> didn't have any failed tests listed (I'm trying to see if I can do anything
> about this...). Running our little ./dev-tools/findHangingTests.py against
> the consoleText, it showed a hanging test. Running locally, I see same
> hang. This is before the patch landed.
> + Our branch runs are now nearly zombie- and flaky-free -- still some
> work to do -- but a recent patch that seemed harmless was causing a
> reliable flaky failure in the backport to branch-1*, confirmed by local runs.
> The flakiness was plain to see up in builds.apache.org.
> + In the last few days I've committed a patch that included javadoc
> warnings even though hadoopqa said the patch introduced javadoc issues (I
> missed it). This messed up life for folks subsequently, as their patches now
> reported javadoc issues.
>
> In short, I suggest that builds.apache.org is worth keeping an eye on; make
> sure you get a clean build out of hadoopqa before committing anything, and
> let's all work together to try and keep our builds blue: it'll save us all
> work in the long run.
>
> St.Ack
>
>
> On Tue, Nov 4, 2014 at 9:38 AM, Stack  wrote:
>
> > Branch-1 and master have stabilized and now run mostly blue (give or take
> > the odd failure) [1][2]. Having a mostly blue branch-1 has helped us
> > identify at least one destabilizing commit in the last few days, maybe
> > two; this is as it should be (smile).
> >
> > Let's keep our builds blue. If you commit a patch, make sure subsequent
> > builds stay blue. You can subscribe to [email protected] to get
> > notice of failures if not already subscribed.
> >
> > Thanks,
> > St.Ack
> >
> > 1. https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/
> > 2. https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/
> >
> >
> > On Mon, Oct 13, 2014 at 4:41 PM, Stack  wrote:
> >
> >> A few notes on testing.
> >>
> >> Too long to read: infra is more capable now and, after some work, we are
> >> seeing branch-1 and trunk mostly running blue. Let's try and keep it this
> >> way going forward.
> >>
> >> Apache Infra has new, more capable hardware.
> >>
> >> A recent spurt of test fixing combined with more capable hardware seems
> >> to have gotten us to a new place; tests are mostly passing now on branch-1
> >> and master. Let's try and keep it this way and start to trust our test
> >> runs again. Just a few flakies remain. Let's try and nail them.
> >>
> >> Our tests now run in parallel with other test suites where previously we
> >> ran alone. You can see this sometimes when our zombie detector reports
> >> tests from another project altogether as lingerers (to be fixed). Some of
> >> our tests are failing because a concurrent hbase run is undoing classes
> >> and data from under them. Let's fix that too.
> >>
> >> Our tests are brittle. It takes 75 minutes for them to complete. Many are
> >> heavy-duty integration tests starting up multiple clusters and mapreduce
> >> all in the one JVM. It is a miracle they pass at all. Usually integration
> >> tests have been cast as unit tests because there was nowhere else for
> >> them to get an airing. We have the hbase-it suite now, which would be a
> >> more apt place, but until these are run on a regular basis in public for
> >> all to see, the fat integration tests disguised as unit tests will remain.
> >> A review of our current unit tests, weeding out the old cruft and the no
> >> longer relevant or duplicate tests, would be a nice undertaking if someone
> >> is looking to contribute.
> >>
> >> Alex Newman has been working on making our tests work up on travis and
> >> circle-ci. That'll be sweet when it goes end-to-end. He also added in
> >> some "type" categorizations -- client, filter, mapreduce -- alongside our
> >> old "sizing" categorizations of small/medium/large. His thinking is that
> >> we can run these categorizations in parallel so we could run the total
> >> suite in about the time of the longest test, say 20-30 minutes? We could
> >> even change Apache to run them this way.
