Re: [VOTE] The 1st HBase 0.98.22 release candidate (RC0) is available

Dima Spivak Wed, 07 Sep 2016 17:42:28 -0700

+1

- Started up a 5-node clusterdock cluster (Hadoop 2.2.0, Oracle JDK 7u79)
from binary tarballs.
- Verified that the web UI works and that the HBase Version attribute
matches the expected Git hash.
- Ran ITBLL with 1 billion nodes and the serverKilling monkey (`clusterdock_ssh
node-3.cluster hbase
org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList -m serverKilling
loop 1 16 62500000 ${RANDOM} 16`), which passed.
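
For anyone reusing this command, here is a rough sketch of the arithmetic behind the arguments. It assumes the `loop` arguments are iterations, mappers, nodes per mapper, output directory, and reducers, in that order, and it takes the 25,000,000 wrap figure from the discussion quoted below. The function names are made up for illustration and are not part of HBase.

```python
# Sketch only: sanity-check ITBLL "loop" sizing before a long run.
# Assumption: arguments are <iterations> <mappers> <nodes per mapper>
# <output dir> <reducers>; WRAP_SIZE is the figure quoted in this thread.

WRAP_SIZE = 25_000_000


def total_nodes(num_mappers: int, nodes_per_mapper: int) -> int:
    """Total list nodes generated in one iteration."""
    return num_mappers * nodes_per_mapper


def wraps_cleanly(nodes_per_mapper: int, wrap_size: int = WRAP_SIZE) -> bool:
    """True when the per-mapper node count is a whole multiple of the wrap size."""
    return nodes_per_mapper % wrap_size == 0


def next_clean_count(nodes_per_mapper: int, wrap_size: int = WRAP_SIZE) -> int:
    """Smallest multiple of wrap_size that is >= nodes_per_mapper."""
    return -(-nodes_per_mapper // wrap_size) * wrap_size


if __name__ == "__main__":
    print(total_nodes(16, 62_500_000))    # 16 mappers x 62.5M nodes each
    print(wraps_cleanly(62_500_000))      # the value used in the command above
    print(next_clean_count(62_500_000))   # nearest larger multiple of WRAP_SIZE
```

With 4 mappers and 250000000 nodes per mapper (the other configuration mentioned in the thread), the total is also 1 billion and the per-mapper count is a multiple of 25,000,000.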


-Dima

On Tue, Sep 6, 2016 at 11:58 AM, Andrew Purtell <apurt...@apache.org> wrote:

> Thanks for the +1, Heng.
>
> > TestThriftServer.beforeClass:97 » IO Shutting down
>
> Looks like the minicluster failed to launch. Port binding problem, perhaps?
> It passes when rerun manually, probably because no other test is executing
> concurrently. By default our build runs unit tests with some parallelism.
> FWIW this can be disabled with '-Dsurefire.firstPartForkCount=1
> -Dsurefire.secondPartForkCount=1'.
>
> Also, I use '-Dsurefire.rerunFailingTestsCount=2' to help distinguish
> between failures and flakes.
>
>
> On Tue, Sep 6, 2016 at 1:57 AM, Heng Chen <heng.chen.1...@gmail.com>
> wrote:
>
> > +1
> >
> > - Unpacked source and binary tarballs: layout looks good
> >
> > - Started up a 3-node cluster (Hadoop 2.7.2, Oracle JDK 8u20, 2 master, 3
> > rs) from binary tarballs.
> >
> > - Verified that the web UI works and shell works
> >
> > - Built from source and ran the unit tests (JDK 8u20): passed. (Some Thrift
> > server tests failed but passed when rerun manually; the failed tests are
> > listed below.)
> >
> >     TestThriftServer.beforeClass:97 » IO Shutting down
> >
> >     TestThriftServerCmdLine.setUpBeforeClass:119 » IO Shutting down
> >
> >     TestThriftHBaseServiceHandler.beforeClass:135 » IO Shutting down
> >
> >     TestThriftHBaseServiceHandlerWithLabels.beforeClass:135 » IO Shutting down
> >
> > - Ran LTT with 1M rows (100 writers, 30 readers (100%), 10 updaters
> > (20%)): all keys verified, no warnings, no errors, no failures, latencies
> > lgtm
> >
> > - Ran ITBLL with 2M rows (slowDeterministic), passed.
> >
> > - Ran ITBLL with 2.5M rows (serverKilling), passed.
> >
> > Some notes: because 0.98 is compiled against Hadoop 2.2.0, ITBLL failed when
> > I ran it on Hadoop 2.7.2 due to a compatibility issue (see HBASE-16564). I
> > replaced the hadoop-2.2.0 jar with hadoop 2.5.1, and ITBLL then passed. Still
> > giving +1 because it is a MapReduce issue, not an HBase one.
> >
> >
> >
> >
> > 2016-09-05 13:41 GMT+08:00 Dima Spivak <dimaspi...@apache.org>:
> >
> > > Ugh, sorry guys, I'm dumb. I was running 1 mapper per RS before, but
> > > switched to a d2.4xlarge instance today and, after noticing cores sitting
> > > idly, decided to try setting the number of mappers and reducers to the
> > > number of cores to speed testing up (RAM is still grossly underutilized
> > > with less than 16 GB/122 GB in use at any one time). This definitely made
> > > runs go faster (generation took less than 3 hours, verification took about
> > > 1 hour), but I just realized that the number of nodes I picked (62500000)
> > > isn't a multiple of 25,000,000 and so the list won't wrap properly. I'll
> > > rerun and confirm, but I'm guessing this is a false alarm.
> > >
> > > Sorry again. :(
> > >
> > > -Dima
> > >
> > > On Sun, Sep 4, 2016 at 9:56 PM, Andrew Purtell <andrew.purt...@gmail.com>
> > > wrote:
> > >
> > > > I will also try your incantation (and JRE version) on this RC and 0.98.21
> > > > next week to answer those same questions.
> > > >
> > > > Looks like you are using a multiple of RSes (16) as numMappers? Is that
> > > > 4x? On what kind of instance type? I am (also, I think) using a 5 node
> > > > "cluster" with 4 RS nodes but numMappers 4 and numNodes 250000000. Since
> > > > with clusterdock everything is contending for one instance's resources I
> > > > didn't want to overdo and so have started at 1 mapper per RS. Since you
> > > > appear to be using a higher value, I'm curious if you've found that you
> > > > will get stable results with that, if more mappers in this configuration
> > > > does a better job finding problems in your experience, and what instance
> > > > type are you using? I've been using a d2.4xlarge.
> > > >
> > > > > > On Sep 4, 2016, at 9:04 PM, Andrew Purtell <andrew.purt...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > I've been running 1B tests with slowDeterministic. 0.98.21 and this
> > > > > > 0.98.22 RC. I get 1B referenced, all ok.
> > > > > >
> > > > > > Did you run serverKilling with 0.98.21? And did it pass? Or does 0.98.21
> > > > > > pass for you now? If so then we have a regression. If not then it's
> > > > > > something to look at for 0.98.23 I'd say.
> > > > >
> > > > > >> On Sep 4, 2016, at 8:44 PM, Dima Spivak <dimaspi...@apache.org> wrote:
> > > > > >>
> > > > > >> Anyone else running ITBLL seeing issues? I just ran a 5-node clusterdock
> > > > > >> cluster with JDK 7u79 of this RC and tried out ITBLL with 1 billion rows
> > > > > >> and the serverKilling monkey (`hbase
> > > > > >> org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList -m serverKilling
> > > > > >> loop 1 16 62500000 ${RANDOM} 16`). This failed for me because of
> > > > > >> unreferenced list nodes:
> > > > > >>
> > > > > >> org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Verify$Counts
> > > > > >> REFERENCED=732006926
> > > > > >> UNREFERENCED=12003580
> > > > > >>
> > > > > >> Perhaps this is similar to what Mikhail saw a while back with later
> > > > > >> releases?
> > > > >>
> > > > >> -Dima
> > > > >>
> > > > > >>> On Sat, Sep 3, 2016 at 8:34 AM, Andrew Purtell <apurt...@apache.org> wrote:
> > > > > >>>
> > > > > >>> The 1st HBase 0.98.22 release candidate (RC0) is available for download at
> > > > > >>> https://dist.apache.org/repos/dist/dev/hbase/hbase-0.98.22RC0 and Maven
> > > > > >>> artifacts are also available in the temporary repository
> > > > > >>> https://repository.apache.org/content/repositories/orgapachehbase-1151 .
> > > > > >>>
> > > > > >>> The detailed source and binary compatibility report for this release with
> > > > > >>> respect to the previous is available for your review at
> > > > > >>> https://dist.apache.org/repos/dist/dev/hbase/hbase-0.98.22RC0/0.98.21_0.98.22RC0_compat_report.html
> > > > > >>> . There are no reported compatibility issues.
> > > > > >>>
> > > > > >>> The 25 issues resolved in this release can be found at
> > > > > >>> https://s.apache.org/C7SV .
> > > > >>>
> > > > > >>> I have made the following assessments of this candidate:
> > > > > >>> - Release audit check: pass
> > > > > >>> - Unit test suite: pass 10/10 (7u79)
> > > > > >>> - Loaded 1M keys with LTT (10 readers, 10 writers, 10 updaters (20%)):
> > > > > >>> all keys verified, no unusual messages or errors, latencies in the
> > > > > >>> ballpark
> > > > > >>> - IntegrationTestBigLinkedList 1B rows: 100% referenced, no errors (8u91)
> > > > > >>> - Built head of Apache Phoenix 4.x-HBase-0.98 branch: no errors (7u79)
> > > > >>>
> > > > >>> Signed with my code signing key D5365CCD.
> > > > >>>
> > > > > >>> Please try out the candidate and vote +1/0/-1. This vote will be open for
> > > > > >>> at least 72 hours. Unless there are objections, I will try to close it
> > > > > >>> Friday September 9, 2016 if we have sufficient votes.
> > > > >>>
> > > > >>> --
> > > > >>> Best regards,
> > > > >>>
> > > > >>>  - Andy
> > > > >>>
> > > > > >>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > > > > >>> (via Tom White)
> > > > >>>
> > > >
> > >
> >
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
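
The REFERENCED/UNREFERENCED counters quoted earlier in the thread can be summarized with a few lines of Python. This is only a sketch: it assumes the counters appear as `NAME=value` lines (as pasted above, possibly behind mail quote markers), and the helper names are invented for illustration rather than taken from HBase.

```python
# Sketch only: summarize ITBLL Verify counters pasted from job output.
# Assumption: each counter is on its own line in NAME=value form.


def parse_counters(text: str) -> dict:
    """Collect NAME=value pairs, ignoring leading mail quote markers."""
    counts = {}
    for line in text.splitlines():
        line = line.lstrip("> ").strip()
        name, sep, value = line.partition("=")
        if sep and value.strip().isdigit():
            counts[name.strip()] = int(value)
    return counts


def unreferenced_fraction(counts: dict) -> float:
    """Share of counted nodes that were reported as unreferenced."""
    referenced = counts.get("REFERENCED", 0)
    unreferenced = counts.get("UNREFERENCED", 0)
    total = referenced + unreferenced
    return unreferenced / total if total else 0.0


if __name__ == "__main__":
    pasted = """
    > >> REFERENCED=732006926
    > >> UNREFERENCED=12003580
    """
    counts = parse_counters(pasted)
    print(counts)
    print(f"{unreferenced_fraction(counts):.4%} unreferenced")
```

For the failed run, the two counters sum to 744,010,506 nodes, of which roughly 1.6% were unreferenced.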
