Hi, Ming,

This time it happened after I ran 'update statistics for table xxxx on
every column', which caused the HBase regionserver to crash with an
out-of-memory error. After I restarted the regionservers, the problem
appeared. During HBase's recovery step, errors like this

"ERROR [RS_OPEN_REGION-hadoop2slave7:60020-2] regionserver.HRegion: Found
decreasing SeqId. PreId=1517
key=TRAFODION._MD_.OBJECTS/836a95deb60624ab3e620e44ae54fe15/1419;
edit=[#edits: 1 = <TRAFODION SEABASE SB_HISTOGRAM_INTERVALS
BT/#1:\x01/1474598326967/Put/vlen=9/seqid=0; >]
2016-09-23 10:42:33,508 INFO  [RS_OPEN_REGION-hadoop2slave7:60020-2]
transactional.TrxRegionObserver: Trafodion Recovery Region Observer CP:
preWALRestore coprocessor is invoked ... in table TRAFODION._MD_.OBJECTS"

come up, each time for a Trafodion metadata table (TRAFODION._MD_.*).
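
To see which regions report this during recovery, the messages can be pulled out of the Region Server log with a short script. This is only a rough sketch: it assumes the message layout seen in the excerpt above (PreId=<n> key=<table>/<encoded region>/<seqid>;), and the function name and regular expression are illustrative, not part of HBase or Trafodion.

```python
import re

# Layout assumed from the log excerpt in this thread, not from HBase docs.
# \s+ is used between tokens because the mail client wrapped the message.
DECREASING_SEQID = re.compile(
    r"Found\s+decreasing\s+SeqId\.\s+PreId=(?P<pre_id>\d+)\s+"
    r"key=(?P<table>[^/\s]+)/(?P<region>[0-9a-f]+)/(?P<seq_id>\d+);"
)


def find_decreasing_seqids(log_text):
    """Return one dict per 'Found decreasing SeqId' message in log_text."""
    hits = []
    for match in DECREASING_SEQID.finditer(log_text):
        hits.append({
            "table": match.group("table"),
            "region": match.group("region"),
            "pre_id": int(match.group("pre_id")),
            "seq_id": int(match.group("seq_id")),
        })
    return hits


# Sample text copied from the error quoted in this message.
sample = (
    "ERROR [RS_OPEN_REGION-hadoop2slave7:60020-2] regionserver.HRegion: Found\n"
    "decreasing SeqId. PreId=1517\n"
    "key=TRAFODION._MD_.OBJECTS/836a95deb60624ab3e620e44ae54fe15/1419;\n"
)

for hit in find_decreasing_seqids(sample):
    print(hit["table"], hit["region"], hit["pre_id"], hit["seq_id"])
```

Running it over the whole log would list every affected table and region together with the two sequence ids reported in each message.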

When the *bulkload* led to this, I think the reason may have been that the
load operation from Hive just creates HFiles rather than writing to the WAL.
But for the update statistics operation, I'm not sure how this happened,
because I am not clear on how update statistics works.


Qiao

On Fri, Sep 23, 2016 at 10:46 AM, Liu, Ming (Ming) <ming....@esgyn.cn> wrote:

> Hi, Qiao,
>
> Before this region failed to open, did you do a bulkload from Hive?
> I know you have hit the same issue several times before, but with a
> different Java error stack, so I want to confirm with you.
> You pasted the error stack from sqlci; is it possible to find the
> error stack in the Region Server log?
>
> Last time, it was something like the one below. Could you find the same
> issue this time? So, two questions: did you do a bulkload? And could you
> find the same error stack?
>
> 2016-09-08 16:44:36,327 ERROR [RS_OPEN_REGION-hadoop2slave7:60020-0]
> handler.OpenRegionHandler:
> Failed open of
> region=TRAFODION._MD_.COLUMNS,,1471946223350.b6191867e73d4203d3ac6fad3c860138.,
> starting to roll back the global memstore size.
> org.apache.hadoop.hbase.DroppedSnapshotException: region:
> TRAFODION._MD_.COLUMNS,,1471946223350.b6191867e73d4203d3ac6fad3c860138.
>                 at
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2243)
>                 at
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1972)
>                 at
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:3826)
>                 at
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionStores(HRegion.java:969)
>                 at
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:841)
>                 at
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:814)
>                 at
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5828)
>                 at
> org.apache.hadoop.hbase.regionserver.transactional.TransactionalRegion.openHRegion(TransactionalRegion.java:101)
>                 at
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5794)
>                 at
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5765)
>                 at
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5721)
>                 at
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5672)
>                 at
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:356)
>                 at
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:126)
>                 at
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>                 at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>                 at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>                 at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.AssertionError: Key \xB9"b*M3c\x00ADMCKID
> /#1:\x01/1473306352163/Put/vlen=8/seqid=1749 followed
> by a smaller key \xB9"b*M3c\x00ADMCKID
> /#1:\x01/1473306352163/Put/vlen=8/seqid=4003 in cf #1
>                 at
> org.apache.hadoop.hbase.regionserver.StoreScanner.checkScanOrder(StoreScanner.java:699)
>                 at
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:493)
>                 at
> org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:115)
>                 at
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:71)
>                 at
> org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:940)
>                 at
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2217)
>                 at
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2197)
>                 ... 17 more
>
> Ming
> -----Original Message-----
> From: 乔彦克 [mailto:qya...@gmail.com]
> Sent: Friday, September 23, 2016 9:59 AM
> To: d...@trafodion.incubator.apache.org;
> user@trafodion.incubator.apache.org
> Cc: Dave Birdsall <dave.birds...@esgyn.com>
> Subject: Re: Trafodion meta table region in hbase cannot be opened
>
> Thanks for your reply, Dave.
> The scan of "TRAFODION._MD_.VERSIONS" doesn't work because the table's
> region is not online and cannot be assigned.
> I had no choice but to delete all Trafodion tables in HBase. After that
> cleanup I restarted all the services, and then the system worked again.
> I've been stuck on this problem several times because regions failed to
> open. There may be some bugs when loading data into Trafodion from Hive,
> but I'm not sure.
>
> Best Regards,
> Qiao
>
> On Thu, Sep 22, 2016 at 11:52 PM, Dave Birdsall <dave.birds...@esgyn.com> wrote:
>
> > Hi Qiao,
> >
> >
> >
> > If you go into the hbase shell, and do the following:
> >
> >
> >
> > scan "TRAFODION._MD_.VERSIONS"
> >
> >
> >
> > What happens? If HBase is up and working correctly, and if nothing has
> > been corrupted, this should return 3 rows.
> >
> >
> >
> > Dave
> >
> >
> >
> > *From:* 乔彦克 [mailto:qya...@gmail.com]
> > *Sent:* Wednesday, September 21, 2016 8:13 PM
> > *To:* user@trafodion.incubator.apache.org
> > *Cc:* dev <d...@trafodion.incubator.apache.org>
> > *Subject:* Re: Trafodion meta table region in hbase cannot be opened
> >
> >
> >
> > I tried to assign the region by hand in the hbase shell, but the regions
> > still cannot be opened.
> >
> > Below are the errors logged when running a basic query.
> >
> > SQL>get tables;
> >
> >
> >
> > *** ERROR[1398] Error -704 occured while accessing the hbase
> > subsystem. Fix that error and make sure hbase is up and running. Error
> > Details:
> >
> > org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
> > attempts=21, exceptions:
> >
> > Thu Sep 22 10:26:02 CST 2016, null, java.net.SocketTimeoutException:
> > callTimeout=600000, callDuration=792843: row '' on table
> > 'TRAFODION._MD_.VERSIONS' at
> >
> > region=TRAFODION._MD_.VERSIONS,,1474445084590.0350ce973b1e43e17b2d14ec72b7f867.,
> > hostname=hadoop2slave7,60020,1474442934725, seqNum=2
> >
> > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:270)
> > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:203)
> > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:57)
> > org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
> > org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:294)
> > org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:269)
> > org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:141)
> > org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:136)
> > org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:886)
> > org.apache.hadoop.hbase.client.transactional.TransactionalTable.getScanner(TransactionalTable.java:813)
> > org.apache.hadoop.hbase.client.transactional.RMInterface.getScanner(RMInterface.java:428)
> > org.trafodion.sql.HTableClient.startScan(HTableClient.java:983)
> >
> > .  [2016-09-22 11:09:54]
> >
> >
> >
> > Best regards,
> >
> > Qiao
> >
> >
> >
> > On Thu, Sep 22, 2016 at 11:07 AM, 乔彦克 <qya...@gmail.com> wrote:
> >
> > Hi, dear group,
> >
> >         Yesterday, after I restarted the HBase regionservers (one was
> > down, and I wanted to restart all of them for load balancing), 7
> > Trafodion table regions could not be opened. I also cannot log in to
> > the Trafodion shell using 'trafci'.
> >
> > Has anyone run into this problem before?
> >
> > How can I recover my Trafodion to a normal state?
> >
> > [image: region not open.png]
> >
> >
> >
> > Any reply is appreciated.
> >
> > Thanks,
> >
> > Qiao
> >
>
