Hi, Qiao,

Before this region failed to open, did you do a bulk load from Hive?
I know you have hit the same issue several times before, but with different 
Java error stacks, so I want to confirm with you.
Also, you pasted the error stack from sqlci. Is it possible to find the 
corresponding error stack in the Region Server log?

Last time it was something like the stack below. Can you find the same error 
this time? So, two questions: did you do a bulk load, and can you find the same 
error stack?

2016-09-08 16:44:36,327 ERROR [RS_OPEN_REGION-hadoop2slave7:60020-0] handler.OpenRegionHandler: Failed open of region=TRAFODION._MD_.COLUMNS,,1471946223350.b6191867e73d4203d3ac6fad3c860138., starting to roll back the global memstore size.
org.apache.hadoop.hbase.DroppedSnapshotException: region: TRAFODION._MD_.COLUMNS,,1471946223350.b6191867e73d4203d3ac6fad3c860138.
                at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2243)
                at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1972)
                at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:3826)
                at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionStores(HRegion.java:969)
                at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:841)
                at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:814)
                at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5828)
                at org.apache.hadoop.hbase.regionserver.transactional.TransactionalRegion.openHRegion(TransactionalRegion.java:101)
                at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5794)
                at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5765)
                at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5721)
                at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5672)
                at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:356)
                at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:126)
                at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
                at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
                at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.AssertionError: Key \xB9"b*M3c\x00ADMCKID                  
         
                                                                                
         
                                                                                
         
                                         
/#1:\x01/1473306352163/Put/vlen=8/seqid=1749 followed
by a smaller key \xB9"b*M3c\x00ADMCKID                                          
         
                                                                                
         
                                                                                
         
                 /#1:\x01/1473306352163/Put/vlen=8/seqid=4003 in cf #1
                at org.apache.hadoop.hbase.regionserver.StoreScanner.checkScanOrder(StoreScanner.java:699)
                at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:493)
                at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:115)
                at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:71)
                at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:940)
                at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2217)
                at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2197)
                ... 17 more
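
In case it helps with searching the Region Server log, here is a minimal Python 
sketch that groups log lines into entries and prints the ones containing a 
marker string. The timestamp pattern and the "Failed open of" marker are taken 
from the entry above; the function name find_entries and the log path given on 
the command line are only my assumptions for illustration, not something HBase 
or Trafodion provides.

```python
import re
import sys
from typing import Iterable, List

# Assumption: every log entry begins with a timestamp such as
# "2016-09-08 16:44:36,327 "; all other lines (stack frames, "Caused by:")
# are treated as continuations of the previous entry.
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")


def find_entries(lines: Iterable[str], marker: str = "Failed open of") -> List[str]:
    """Return each full log entry (header plus continuation lines) containing marker."""
    entries: List[str] = []
    current: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if ENTRY_START.match(line):
            # A new entry starts, so close the previous one.
            if current:
                entries.append("\n".join(current))
            current = [line]
        elif current:
            current.append(line)
    if current:
        entries.append("\n".join(current))
    return [entry for entry in entries if marker in entry]


if __name__ == "__main__":
    # Hypothetical usage: python find_entries.py <path to region server log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for entry in find_entries(log):
            print(entry)
            print("-" * 40)
```

Passing marker="DroppedSnapshotException" or marker="AssertionError" instead 
would narrow the search to the exception seen last time.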

Ming
-----Original Message-----
From: 乔彦克 [mailto:qya...@gmail.com] 
Sent: Friday, September 23, 2016 9:59 AM
To: d...@trafodion.incubator.apache.org; user@trafodion.incubator.apache.org
Cc: Dave Birdsall <dave.birds...@esgyn.com>
Subject: Re: Trafodion meta table region in hbase cannot be opened

Thanks for your reply, Dave.
Scanning "TRAFODION._MD_.VERSIONS" doesn't work because the table's region is 
not online and cannot be assigned.
I had no choice but to delete all the Trafodion tables in HBase. After that 
cleanup I restarted all the services, and the system works again.
I have been stuck on this problem several times because regions failed to open. 
There may be a bug when loading data into Trafodion from Hive, but I am not 
sure.

Best Regards,
Qiao

On Thu, Sep 22, 2016 at 11:52 PM, Dave Birdsall <dave.birds...@esgyn.com> wrote:

> Hi Qiao,
>
>
>
> If you go into the hbase shell, and do the following:
>
>
>
> scan "TRAFODION._MD_.VERSIONS"
>
>
>
> What happens? If HBase is up and working correctly, and if nothing has 
> been corrupted, this should return 3 rows.
>
>
>
> Dave
>
>
>
> *From:* 乔彦克 [mailto:qya...@gmail.com]
> *Sent:* Wednesday, September 21, 2016 8:13 PM
> *To:* user@trafodion.incubator.apache.org
> *Cc:* dev <d...@trafodion.incubator.apache.org>
> *Subject:* Re: Trafodion meta table region in hbase cannot be opened
>
>
>
> I tried to assign the regions by hand in the hbase shell, but they still 
> cannot be opened.
>
> Below are the errors logged when I run a basic query.
>
> SQL>get tables;
>
>
>
> *** ERROR[1398] Error -704 occured while accessing the hbase 
> subsystem. Fix that error and make sure hbase is up and running. Error 
> Details:
>
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=21, exceptions:
>
> Thu Sep 22 10:26:02 CST 2016, null, java.net.SocketTimeoutException: callTimeout=600000, callDuration=792843: row '' on table 'TRAFODION._MD_.VERSIONS' at region=TRAFODION._MD_.VERSIONS,,1474445084590.0350ce973b1e43e17b2d14ec72b7f867., hostname=hadoop2slave7,60020,1474442934725, seqNum=2
>
> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:270)
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:203)
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:57)
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
> org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:294)
> org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:269)
> org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:141)
> org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:136)
> org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:886)
> org.apache.hadoop.hbase.client.transactional.TransactionalTable.getScanner(TransactionalTable.java:813)
> org.apache.hadoop.hbase.client.transactional.RMInterface.getScanner(RMInterface.java:428)
> org.trafodion.sql.HTableClient.startScan(HTableClient.java:983)
>
> .  [2016-09-22 11:09:54]
>
>
>
> Best regards,
>
> Qiao
>
>
>
> On Thu, Sep 22, 2016 at 11:07 AM, 乔彦克 <qya...@gmail.com> wrote:
>
> Hi, dear group,
>
>         Yesterday I restarted the HBase regionservers (one was down, and I 
> wanted to restart all of them for load balancing). After the restart, 
> 7 Trafodion table regions cannot be opened, and I cannot log in to the 
> Trafodion shell using 'trafci'.
>
> Has anyone run into this problem before?
>
> How can I recover my Trafodion to a normal state?
>
> [image: region not open.png]
>
>
>
> Any reply is appreciated.
>
> Thanks,
>
> Qiao
>
