[ 
https://issues.apache.org/jira/browse/HBASE-10492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13898410#comment-13898410
 ] 

stack commented on HBASE-10492:
-------------------------------

6minutes is eons, a body blow.  If you run oracle jvm, does it exhibit same 
latencies?

ibm jdk taking this long scheduling a thread is a problem, a problem for us if 
we are to run well on ibmjdk.   We should dig and figure if it something about 
the way this thread is scheduled in particular or if it is the case that any 
thread can be swapped out for pauses of this magnitude.

> open daughter regions can unpredictably take long time
> ------------------------------------------------------
>
>                 Key: HBASE-10492
>                 URL: https://issues.apache.org/jira/browse/HBASE-10492
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.96.0
>            Reporter: Jerry He
>
> I have seen during a stress testing client was getting 
> "RetriesExhaustedWithDetailsException: Failed 748 actions: 
> NotServingRegionException"
> On the master log, 2014-02-08 20:43 is the timestamp from OFFLINE to 
> SPLITTING_NEW, 2014-02-08 21:41 is the timestamp from SPLITTING_NEW to OPEN.
> The corresponding time period on the region sever log is:
> {code}
> 2014-02-08 20:44:12,662 WARN 
> org.apache.hadoop.hbase.regionserver.HRegionFileSystem: .regioninfo file not 
> found for region: 010c1981882d1a59201af5e2dc589d44
> 2014-02-08 20:44:12,666 WARN 
> org.apache.hadoop.hbase.regionserver.HRegionFileSystem: .regioninfo file not 
> found for region: c2eb9b7971ca7f3fed3da86df5b788e7
> {code}
> There were no INFO related to these two regions until: (at the end see this: 
> Split took 57mins, 16sec)
> {code}
> 2014-02-08 21:41:14,029 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
> Onlined c2eb9b7971ca7f3fed3da86df5b788e7; next sequenceid=213355
> 2014-02-08 21:41:14,031 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
> Onlined 010c1981882d1a59201af5e2dc589d44; next sequenceid=213354
> 2014-02-08 21:41:14,032 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: Post open deploy tasks 
> for 
> region=tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.
> 2014-02-08 21:41:14,054 INFO org.apache.hadoop.hbase.catalog.MetaEditor: 
> Updated row 
> tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.
>  with server=hdtest208.svl.ibm.com,60020,1391887547473
> 2014-02-08 21:41:14,054 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: Finished post open deploy 
> task for 
> tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.
> 2014-02-08 21:41:14,054 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: Post open deploy tasks 
> for 
> region=tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44.
> 2014-02-08 21:41:14,059 INFO org.apache.hadoop.hbase.regionserver.HStore: 
> Completed compaction of 10 file(s) in cf of 
> tpch_hb_1000_2.lineitem,^\x01\x8B\xE7(\x80\x01\x80\x93\xFD\x01\x01\x80\x00\x00\x00\xB5\x0E\xCC'\x01\x80\x00\x00\x03,1391918508561.1fbcfc0a792435dfd73ec5b0ef5c953c.
>  into 451be6df8c604993ae540b808d9cfa08(size=72.8 M), total size for store is 
> 2.4 G. This selection was in queue for 0sec, and took 1mins, 40sec to execute.
> 2014-02-08 21:41:14,059 INFO 
> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Completed 
> compaction: Request = 
> regionName=tpch_hb_1000_2.lineitem,^\x01\x8B\xE7(\x80\x01\x80\x93\xFD\x01\x01\x80\x00\x00\x00\xB5\x0E\xCC'\x01\x80\x00\x00\x03,1391918508561.1fbcfc0a792435dfd73ec5b0ef5c953c.,
>  storeName=cf, fileCount=10, fileSize=94.1 M, priority=9883, 
> time=1391924373278861000; duration=1mins, 40sec
> 2014-02-08 21:41:14,059 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
> Starting compaction on cf in region 
> tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.
> 2014-02-08 21:41:14,059 INFO org.apache.hadoop.hbase.regionserver.HStore: 
> Starting compaction of 10 file(s) in cf of 
> tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.
>  into 
> tmpdir=gpfs:/hbase/data/default/tpch_hb_1000_2.lineitem/c2eb9b7971ca7f3fed3da86df5b788e7/.tmp,
>  totalSize=709.7 M
> 2014-02-08 21:41:14,066 INFO org.apache.hadoop.hbase.catalog.MetaEditor: 
> Updated row 
> tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44. with 
> server=hdtest208.svl.ibm.com,60020,1391887547473
> 2014-02-08 21:41:14,066 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: Finished post open deploy 
> task for 
> tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44.
> 2014-02-08 21:41:14,190 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Region split, hbase:meta 
> updated, and report to master. 
> Parent=tpch_hb_1000_2.lineitem,,1391918508561.b576e8db65d56ec08db5ca900587c28d.,
>  new regions: 
> tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44., 
> tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7..
>  Split took 57mins, 16sec
> {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Reply via email to