Is an extension here a reasonable ask? Putting the vote up right before
what is a long New Year weekend for many folks doesn't give a lot of
opportunity for thorough review.

Mike

On Mon, Jan 1, 2018 at 1:30 PM, stack <[email protected]> wrote:

> This is great stuff jms.  Thank you.  Away from computer at mo but will dig
> in.
>
> Is it possible there are old files left over, written by an old hbase with
> an old hfile version? Can you see on source?  They should have been updated
> by a compaction if a long time idle, I agree.
>
> Yeah. If region assign fails, and goes into assignable state, we need
> intervention. We've been shutting down all the ways in which this could
> happen but you seem to have stumbled on a new one. I will take a look at
> your logs.
>
> What are you going to vote?  Does it basically work?
>
> Thanks again for the try out.
> S
>
> On Dec 31, 2017 12:43 PM, "Jean-Marc Spaggiari" <[email protected]>
> wrote:
>
> Sorry to spam the list :(
>
> Another interesting thing.
>
> Now most of my tables are online. For a few I'm getting this:
> Caused by: java.lang.IllegalArgumentException: Invalid HFile version: major=2, minor=1: expected at least major=2 and minor=3
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.checkFileVersion(HFileReaderImpl.java:332)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.<init>(HFileReaderImpl.java:199)
>         at org.apache.hadoop.hbase.io.hfile.HFile.openReader(HFile.java:538)
>         ... 13 more
>
> What is interesting is that I'm not doing anything on the source cluster for
> weeks/months. So all tables are all major compacted the same way. I will
> major compact them all under HFiles v3 format and retry.
>
> 2017-12-31 13:33 GMT-05:00 Jean-Marc Spaggiari <[email protected]>:
>
> > Ok. With a brand new DistCp from source cluster, regions are getting
> > assigned correctly. So it sounds like if they get stuck initially for any
> > reason, then even if the reason is fixed they cannot get assigned again.
> > Will keep playing.
> >
> > I kept the previous /hbase just in case we need something from it.
> >
> > Thanks,
> >
> > JMS
> >
> > 2017-12-31 10:23 GMT-05:00 Jean-Marc Spaggiari <[email protected]>:
> >
> >> Nothing bad that I can see. Here is a region server log:
> >> https://pastebin.com/0r76Y6ap
> >>
> >> Disabling the table makes the regions leave the transition mode. I'm
> >> trying to disable all tables one by one (because it gets stuck after each
> >> disable) and will see if re-enabling them helps...
> >>
> >> On the master side, I now have errors all over:
> >> 2017-12-31 10:06:26,511 WARN  [ProcExecWrkr-89] assignment.RegionTransitionProcedure: Retryable error trying to transition: pid=511, ppid=398, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=work_proposed, region=d0a58b76ad9376b12b3e763660049d3d, server=node3.com,16020,1514693337210; rit=OPENING, location=node3.com,16020,1514693337210
> >> org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but current state=OPENING
> >> at org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:155)
> >> at org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1530)
> >> at org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:179)
> >> at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:309)
> >> at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:85)
> >> at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845)
> >> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1456)
> >> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1225)
> >> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
> >> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1735)
> >>
> >> Non-stop showing on the logs. Probably because I disabled the table.
> >> Restarting HBase to see if it clears that a bit...
> >>
> >> After restart there isn't any
> >> org.apache.hadoop.hbase.exceptions.UnexpectedStateException on the logs.
> >> Only INFO level. And nothing bad. But still, regions are stuck in
> >> transition even for the disabled tables.
> >>
> >> Master logs are here. I removed some sections because it always says the
> >> same thing, for each and every single region:
> >> https://pastebin.com/K6SQ7DXP
> >>
> >> JMS
> >>
> >> 2017-12-31 9:58 GMT-05:00 stack <[email protected]>:
> >>
> >>> There is nothing further up in the master log from regionservers or on
> >>> regionservers side on open?
> >>>
> >>> Thanks,
> >>> S
> >>>
> >>> On Dec 31, 2017 8:37 AM, "stack" <[email protected]> wrote:
> >>>
> >>> > Good questions.  If you disable snappy does it work?  If you start over
> >>> > fresh does it work?  It should be picking up native libs.  Make an issue
> >>> > please jms.  Thanks for giving it a go.
> >>> >
> >>> > S
> >>> >
> >>> > On Dec 30, 2017 11:49 PM, "Jean-Marc Spaggiari" <[email protected]> wrote:
> >>> >
> >>> >> Hi Stack,
> >>> >>
> >>> >> I just tried to give it a try... Wiped out all HDFS content and code, all
> >>> >> HBase content and code, and all ZK. Rebuilt a brand new cluster with 7
> >>> >> physical worker nodes. I'm able to get HBase to start, however I'm not
> >>> >> able to get my regions online.
> >>> >>
> >>> >> 2017-12-31 00:42:03,187 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node8.16020,1514693333206, table=pageMini,
> >>> >> region=a778eb67898dfd378e426f2e7700faea
> >>> >> 2017-12-31 00:42:03,187 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node6.16020,1514693336563, table=work_proposed,
> >>> >> region=4a1d86197ace3f4c8b1c8de28dbe1d34
> >>> >> 2017-12-31 00:42:03,187 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node1.16020,1514693336898, table=page_crc,
> >>> >> region=86b3912a09a5676b6851636ed22c2abc
> >>> >> 2017-12-31 00:42:03,187 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node7.16020,1514693337406, table=pageAvro,
> >>> >> region=391784c43c87bdea6df05f96accad0ff
> >>> >> 2017-12-31 00:42:03,187 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node8.16020,1514693333206, table=page,
> >>> >> region=5850d782a3beea18872769bf8fd70fc7
> >>> >> 2017-12-31 00:42:03,187 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node5.16020,1514693330961, table=work_proposed,
> >>> >> region=1d892c9b54b66f802b82c2f9fe847f1f
> >>> >> 2017-12-31 00:42:03,187 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node5.16020,1514693330961, table=pageAvro,
> >>> >> region=e9de2c68cc01883e959d7953a4251687
> >>> >> 2017-12-31 00:42:03,187 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node3.16020,1514693337210, table=page,
> >>> >> region=e2e5fc1c262273893f10e92f24817d1b
> >>> >> 2017-12-31 00:42:03,187 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node3.16020,1514693337210, table=page,
> >>> >> region=89c443c09f10bd1584b1bb86a637e1a8
> >>> >> 2017-12-31 00:42:03,188 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node5.16020,1514693330961, table=page,
> >>> >> region=8ca93e9285233ca7b31992f194056bc1
> >>> >> 2017-12-31 00:42:03,188 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node4.16020,1514693339685, table=work_proposed,
> >>> >> region=9afcf06c4d0d21d7e04b0223edcfc40a
> >>> >> 2017-12-31 00:42:03,188 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node6.16020,1514693336563, table=page,
> >>> >> region=3457b3237c576eecd550eccee3f584cd
> >>> >> 2017-12-31 00:42:03,188 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node1.16020,1514693336898, table=page,
> >>> >> region=dd5fb1dbd41945a9ccbc110b8d4a51b5
> >>> >> 2017-12-31 00:42:03,188 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node7.16020,1514693337406, table=work_proposed,
> >>> >> region=480bb37af54d9fa57c727da9e8a33578
> >>> >> 2017-12-31 00:42:03,188 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node8.16020,1514693333206, table=page_crc,
> >>> >> region=56b18d470a569c5474ea084f0d995726
> >>> >> 2017-12-31 00:42:03,188 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node6.16020,1514693336563, table=page_duplicate,
> >>> >> region=e744a9af161de965c70c7d1a08b07660
> >>> >> 2017-12-31 00:42:03,188 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node1.16020,1514693336898, table=page_proposed,
> >>> >> region=1c75e53308acac6313db4be63c2b48fe
> >>> >> 2017-12-31 00:42:03,188 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node8.16020,1514693333206, table=work_proposed,
> >>> >> region=45a25ba85f6341a177db7b15554259f9
> >>> >> 2017-12-31 00:42:03,188 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node3.16020,1514693337210, table=work_proposed,
> >>> >> region=d0a58b76ad9376b12b3e763660049d3d
> >>> >> 2017-12-31 00:42:03,188 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node3.16020,1514693337210, table=page,
> >>> >> region=599a4b7b21b1d93fa232ebbbef37a31b
> >>> >> 2017-12-31 00:42:03,188 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node1.16020,1514693336898, table=page_proposed,
> >>> >> region=55c07269cc907b8e8875c2a1c4ec27d5
> >>> >> 2017-12-31 00:42:03,188 WARN  [ProcExecTimeout]
> >>> >> assignment.AssignmentManager: TODO Handle stuck in transition:
> >>> >> rit=OPENING,
> >>> >> location=node5.,16020,1514693330961, table=page_crc,
> >>> >> region=fa3a3d7ebc64ce2a5494cae01477d8d8
> >>> >>
> >>> >> I'm 99% confident this is because of SNAPPY. I'm fighting to get it
> >>> >> working but it's such a pain! My concern here is I don't see any
> >>> >> exception anywhere on any logs. Nothing on the RS side, nothing on the
> >>> >> master side (except the extract above).
> >>> >>
> >>> >> I suspect it's snappy because of this:
> >>> >>
> >>> >> hbase@node2:~/hbase-2.0.0-beta-1$ bin/hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://node2/tmp/snappy snappy
> >>> >> 2017-12-31 00:45:31,006 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> >>> >> 2017-12-31 00:45:33,283 INFO  [main] metrics.MetricRegistries: Loaded MetricRegistries class org.apache.hadoop.hbase.metrics.impl.MetricRegistriesImpl
> >>> >> 2017-12-31 00:45:33,366 INFO  [main] hfile.CacheConfig: Created cacheConfig: CacheConfig:disabled
> >>> >> Exception in thread "main" java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.
> >>> >>         at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65)
> >>> >>         at org.apache.hadoop.io.compress.SnappyCodec.getCompressorType(SnappyCodec.java:134)
> >>> >>         at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:150)
> >>> >>         at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:168)
> >>> >>         at org.apache.hadoop.hbase.io.compress.Compression$Algorithm.getCompressor(Compression.java:355)
> >>> >>         at org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultEncodingContext.<init>(HFileBlockDefaultEncodingContext.java:90)
> >>> >>         at org.apache.hadoop.hbase.io.hfile.NoOpDataBlockEncoder.newDataBlockEncodingContext(NoOpDataBlockEncoder.java:85)
> >>> >>         at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.<init>(HFileBlock.java:923)
> >>> >>         at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.finishInit(HFileWriterImpl.java:296)
> >>> >>         at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.<init>(HFileWriterImpl.java:186)
> >>> >>         at org.apache.hadoop.hbase.io.hfile.HFile$WriterFactory.create(HFile.java:339)
> >>> >>         at org.apache.hadoop.hbase.util.CompressionTest.doSmokeTest(CompressionTest.java:129)
> >>> >>         at org.apache.hadoop.hbase.util.CompressionTest.main(CompressionTest.java:167)
> >>> >>
> >>> >> But I think my installation is fine:
> >>> >> hbase@node2:~/hbase-2.0.0-beta-1$ ll native-build/
> >>> >> total 308
> >>> >> lrwxrwxrwx 1 hbase hbase     24 déc 31 00:29 libhadoopsnappy.so -> libhadoopsnappy.so.0.0.1
> >>> >> lrwxrwxrwx 1 hbase hbase     24 déc 31 00:29 libhadoopsnappy.so.0 -> libhadoopsnappy.so.0.0.1
> >>> >> -rwxr-xr-x 1 hbase hbase 120144 déc 31 00:29 libhadoopsnappy.so.0.0.1
> >>> >> lrwxrwxrwx 1 hbase hbase     18 déc  1  2012 libsnappy.so -> libsnappy.so.1.1.3
> >>> >> lrwxrwxrwx 1 hbase hbase     18 déc  1  2012 libsnappy.so.1 -> libsnappy.so.1.1.3
> >>> >> -rwxr-xr-x 1 hbase hbase 178210 déc  1  2012 libsnappy.so.1.1.3
> >>> >> drwxr-xr-x 3 hbase hbase   4096 déc 30 15:44 python2.6
> >>> >> drwxr-xr-x 4 hbase hbase   4096 déc 30 23:35 python2.7
> >>> >> drwxr-xr-x 3 hbase hbase   4096 déc 30 23:29 python3.5
> >>> >>
> >>> >> and in hbase-env.sh:
> >>> >> export JAVA_HOME=/usr/local/jdk1.8.0_151
> >>> >> export HBASE_LIBRARY_PATH=/home/hbase/hbase-2.0.0-beta-1/native-build
> >>> >>
> >>> >>
> >>> >> So there are 2 things here.
> >>> >> 1) Why are the region servers not reporting any error when they are not
> >>> >> able to open a region because of the compression codec not being loaded?
> >>> >> 2) Why is HBase not picking up the Snappy codec?
> >>> >>
> >>> >> Thanks,
> >>> >>
> >>> >> JMS
> >>> >>
> >>> >>
> >>> >> 2017-12-29 13:15 GMT-05:00 Stack <[email protected]>:
> >>> >>
> >>> >> > The first release candidate for HBase 2.0.0-beta-1 is up at:
> >>> >> >
> >>> >> >  https://dist.apache.org/repos/dist/dev/hbase/hbase-2.0.0-beta-1-RC0/
> >>> >> >
> >>> >> > Maven artifacts are available from a staging directory here:
> >>> >> >
> >>> >> >  https://repository.apache.org/content/repositories/orgapachehbase-1188
> >>> >> >
> >>> >> > All was signed with my key at 8ACC93D2 [1]
> >>> >> >
> >>> >> > I tagged the RC as 2.0.0-beta-1-RC0
> >>> >> > (0907563eb72697b394b8b960fe54887d6ff304fd)
> >>> >> >
> >>> >> > hbase-2.0.0-beta-1 is our first beta release. It includes all that was
> >>> >> > in previous alphas (new assignment manager, offheap read/write path,
> >>> >> > in-memory compactions, etc.). The APIs and feature-set are sealed.
> >>> >> >
> >>> >> > hbase-2.0.0-beta-1 is a not-for-production preview of hbase-2.0.0. It
> >>> >> > is meant for devs and downstreamers to test drive and flag us if we
> >>> >> > messed up on anything ahead of our rolling GAs. We are particularly
> >>> >> > interested in hearing from Coprocessor developers.
> >>> >> >
> >>> >> > The list of features addressed in 2.0.0 so far can be found here [3].
> >>> >> > There are thousands. The list of ~2k+ fixes in 2.0.0 exclusively can be
> >>> >> > found here [4] (My JIRA JQL foo is a bit dodgy -- forgive me if
> >>> >> > mistakes).
> >>> >> >
> >>> >> > I've updated our overview doc. on the state of 2.0.0 [6]. We'll do one
> >>> >> > more beta before we put up our first 2.0.0 Release Candidate by the end
> >>> >> > of January, 2.0.0-beta-2. Its focus will be making it so users can do a
> >>> >> > rolling upgrade on to hbase-2.x from hbase-1.x (and any bug fixes found
> >>> >> > running beta-1). Here is the list of what we have targeted so far for
> >>> >> > beta-2 [5]. Check it out.
> >>> >> >
> >>> >> > One known issue is that the User API has not been properly filtered so
> >>> >> > it shows more than just InterfaceAudience Public content (HBASE-19663,
> >>> >> > to be fixed by beta-2).
> >>> >> >
> >>> >> > Please take this beta for a spin. Please vote on whether it is ok to
> >>> >> > put out this RC as our first beta (Note CHANGES has not yet been
> >>> >> > updated). Let the VOTE be open for 72 hours (Monday)
> >>> >> >
> >>> >> > Thanks,
> >>> >> > Your 2.0.0 Release Manager
> >>> >> >
> >>> >> > 1. http://pgp.mit.edu/pks/lookup?op=get&search=0x9816C7FC8ACC93D2
> >>> >> > 3. https://goo.gl/scYjJr
> >>> >> > 4. https://goo.gl/dFFT8b
> >>> >> > 5. https://issues.apache.org/jira/projects/HBASE/versions/12340862
> >>> >> > 6. https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ktczrlKHK8N4SZzs/
> >>> >> >
> >>> >>
> >>> >
> >>>
> >>
> >>
> >
>
