Here is the region server log distilled down to events pertinent to the region 
move.  Not much being logged while the Store is being closed.
This region was about 1.4G compressed, which I consider small.

2015-10-15 08:34:28,510 INFO 
org.apache.hadoop.hbase.regionserver.RSRpcServices: Close 
dad6c71ed395df19220ef1056a110086, moving to 
hb17.prod1.connexity.net,60020,1444425924021
2015-10-15 08:34:28,510 DEBUG 
org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Processing 
close of 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
2015-10-15 08:34:28,511 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
Closing 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.:
 disabling compactions & flushes
2015-10-15 08:34:28,511 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Running close preflush of 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
2015-10-15 08:34:28,511 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Started memstore flush for 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.,
 current region memstore size 44.89 MB, and 1/1 column families' memstores are 
being flushed.
2015-10-15 08:34:28,511 WARN org.apache.hadoop.hbase.regionserver.wal.FSHLog: 
Couldn't find oldest seqNum for the region we are about to flush: 
[dad6c71ed395df19220ef1056a110086]
2015-10-15 08:34:29,137 INFO 
org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed, 
sequenceid=70839856845, memsize=44.9 M, hasBloomFilter=false, into tmp file 
hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/.tmp/3600b47839a945de9733cf17581458e0
2015-10-15 08:34:29,144 DEBUG 
org.apache.hadoop.hbase.regionserver.HRegionFileSystem: Committing store file 
hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/.tmp/3600b47839a945de9733cf17581458e0
 as 
hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/L/3600b47839a945de9733cf17581458e0
2015-10-15 08:34:29,150 INFO org.apache.hadoop.hbase.regionserver.HStore: Added 
hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/L/3600b47839a945de9733cf17581458e0,
 entries=190131, sequenceid=70839856845, filesize=3.2 M
2015-10-15 08:34:29,151 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Finished memstore flush of ~44.89 MB/47066832, currentsize=22.39 KB/22928 for 
region 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
 in 640ms, sequenceid=70839856845, compaction requested=false
2015-10-15 08:34:29,152 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
Updates disabled for region 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
2015-10-15 08:34:29,152 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Started memstore flush for 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.,
 current region memstore size 22.39 KB, and 1/1 column families' memstores are 
being flushed.
2015-10-15 08:34:29,152 WARN org.apache.hadoop.hbase.regionserver.wal.FSHLog: 
Couldn't find oldest seqNum for the region we are about to flush: 
[dad6c71ed395df19220ef1056a110086]
2015-10-15 08:34:29,225 INFO 
org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed, 
sequenceid=70839856876, memsize=22.4 K, hasBloomFilter=false, into tmp file 
hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/.tmp/a8a74549c7a44174b105cbed23cfaea1
2015-10-15 08:34:29,231 DEBUG 
org.apache.hadoop.hbase.regionserver.HRegionFileSystem: Committing store file 
hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/.tmp/a8a74549c7a44174b105cbed23cfaea1
 as 
hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/L/a8a74549c7a44174b105cbed23cfaea1
2015-10-15 08:34:29,279 INFO org.apache.hadoop.hbase.regionserver.HStore: Added 
hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/L/a8a74549c7a44174b105cbed23cfaea1,
 entries=128, sequenceid=70839856876, filesize=2.6 K
2015-10-15 08:34:29,280 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Finished memstore flush of ~22.39 KB/22928, currentsize=0 B/0 for region 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
 in 128ms, sequenceid=70839856876, compaction requested=true


2015-10-15 08:37:02,945 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Closed L
2015-10-15 08:37:02,954 DEBUG org.apache.hadoop.hbase.wal.WALSplitter: Wrote 
region 
seqId=hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/recovered.edits/70839856879.seqid
 to file, newSeqId=70839856879, maxSeqId=70838935131
2015-10-15 08:37:02,955 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Closed 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
2015-10-15 08:37:02,955 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: Adding moved region record: 
dad6c71ed395df19220ef1056a110086 to 
hb17.prod1.connexity.net,60020,1444425924021 as of 70838935131
2015-10-15 08:37:02,956 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
regionserver:60020-0x45049a3c1b80a8b, 
quorum=ccm2.prod1.connexity.net:2181,ccm1.prod1.connexity.net:2181,db2.prod1.connexity.net:2181,sql2.prod1.connexity.net:2181,ccm3.prod1.connexity.net:2181,
 baseZNode=/hbase Transitioning dad6c71ed395df19220ef1056a110086 from 
M_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED
2015-10-15 08:37:02,967 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
regionserver:60020-0x45049a3c1b80a8b, 
quorum=ccm2.prod1.connexity.net:2181,ccm1.prod1.connexity.net:2181,db2.prod1.connexity.net:2181,sql2.prod1.connexity.net:2181,ccm3.prod1.connexity.net:2181,
 baseZNode=/hbase Transitioned node dad6c71ed395df19220ef1056a110086 from 
M_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED
2015-10-15 08:37:02,967 DEBUG 
org.apache.hadoop.hbase.coordination.ZkCloseRegionCoordination: Set closed 
state in zk for 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
 on hb18.prod1.connexity.net,60020,1444427168636
2015-10-15 08:37:02,967 DEBUG 
org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Closed 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
~                                                                               
                                                                                
                                    






On 10/15/15, 11:05 AM, "Esteban Gutierrez" <[email protected]> wrote:

>Hi Randy,
>
>As Ted suggested a couple of stack traces would be helpful but if you could
>attach the HMaster and a couple of RS logs where you see this problem that
>would be great. Also, do you have a large regions that might need some time
>to flush? have you looked into any potential problem in HDFS?
>
>thanks,
>esteban.
>
>--
>Cloudera, Inc.
>
>
>On Thu, Oct 15, 2015 at 10:39 AM, Ted Yu <[email protected]> wrote:
>
>> Can you give a bit more detail on why block eviction was cause for the
>> slow region movement?
>>
>> Did you happen to take stack traces ?
>>
>> Thanks
>>
>> > On Oct 15, 2015, at 10:32 AM, Randy Fox <[email protected]> wrote:
>> >
>> > Hi,
>> >
>> > We just upgraded from 0.94 to 1.0.0 and have noticed that region moves
>> are super slow (order of minutes) whereas previously they where in the
>> seconds range.  After looking at the code, I think the time is spent
>> waiting for the blocks to be evicted from block cache.
>> >
>> > I wanted to verify that this theory is correct and see if there is
>> anything that can be done to speed up the moves.
>> >
>> > This is particular painful as we are trying to get our configs tuned to
>> the new SW and need to do rolling restarts which is taking almost 24 hours
>> on our cluster.  We also do our own manual rebalancing of regions across
>> RS’s and that task is also now painful.
>> >
>> >
>> > Thanks,
>> >
>> > Randy
>>

Reply via email to