[jira] [Commented] (HBASE-5312) Closed parent region present in Hlog.lastSeqWritten

Todd Lipcon (Commented) (JIRA) Wed, 08 Feb 2012 23:04:06 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204313#comment-13204313
 ]


Todd Lipcon commented on HBASE-5312:
------------------------------------

This bug has been around a while, see:
https://issues.apache.org/jira/browse/HBASE-3481?focusedCommentId=12987965&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12987965

We also experienced it on a customer cluster this week.

Any leads on this, Ram?
                
> Closed parent region present in Hlog.lastSeqWritten
> ---------------------------------------------------
>
>                 Key: HBASE-5312
>                 URL: https://issues.apache.org/jira/browse/HBASE-5312
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.5
>            Reporter: ramkrishna.s.vasudevan
>             Fix For: 0.90.7
>
>
> This is in reference to the mail sent in the dev mailing list
> "Closed parent region present in Hlog.lastSeqWritten".
> The sceanrio described is
> We had a region that was split into two daughters.  When the hlog roll tried 
> to flush the region there was an entry in the HLog.lastSeqWritten that was 
> not flushed or removed from the lastSeqWritten during the parent close.
> Because this flush was not happening subsequent flushes were getting blocked
> {code}
>  05:06:44,422 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many
>  hlogs: logs=122, maxlogs=32; forcing flush of 1 regions(s):
>  2acaf8e3acfd2e8a5825a1f6f0aca4a8
>  05:06:44,422 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed
>  to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null,
> requester=null
>  05:10:48,666 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many
>  hlogs: logs=123, maxlogs=32; forcing flush of 1 regions(s):
>  2acaf8e3acfd2e8a5825a1f6f0aca4a8
>  05:10:48,666 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed
>  to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null,
> requester=null
>  05:14:46,075 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many
>  hlogs: logs=124, maxlogs=32; forcing flush of 1 regions(s):
>  2acaf8e3acfd2e8a5825a1f6f0aca4a8
>  05:14:46,075 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed
>  to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null,
> requester=null
>  05:15:41,584 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many
>  hlogs: logs=125, maxlogs=32; forcing flush of 1 regions(s):
>  2acaf8e3acfd2e8a5825a1f6f0aca4a8
>  05:15:41,584 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed
>  to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null,
> {code}
> Lets see what happened for the region 2acaf8e3acfd2e8a5825a1f6f0aca4a8
> {code}
> 2012-01-06 00:30:55,214 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Renaming flushed file at 
> hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/2acaf8e3acfd2e8a5825a1f6f0aca4a8/.tmp/1755862026714756815
>  to 
> hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/2acaf8e3acfd2e8a5825a1f6f0aca4a8/value/973789709483406123
> 2012-01-06 00:30:58,946 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Instantiated 
> Htable_UFDR_016,049790700093168-04565200000,1325809837958.0ebe5bd7fcbc09ee074d5600b9d4e062.
> 2012-01-06 00:30:59,614 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Added 
> hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/2acaf8e3acfd2e8a5825a1f6f0aca4a8/value/973789709483406123,
>  entries=7537, sequenceid=20312223, memsize=4.2m, filesize=2.9m
> 2012-01-06 00:30:59,787 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Finished snapshotting, commencing flushing stores
> 2012-01-06 00:30:59,787 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
> Finished memstore flush of ~133.5m for region 
> Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8. in 
> 21816ms, sequenceid=20312223, compaction requested=true
> 2012-01-06 00:30:59,787 DEBUG 
> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested 
> for Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8. 
> because regionserver20020.cacheFlusher; priority=0, compaction queue size=5840
> {code}
> A user triggered split has been issued to this region which can be seen in 
> the above logs.
> The flushing of this region has resulted in a seq id 20312223.
> The region has been splitted and the parent region has been closed
> {code}
> 00:31:12,607 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: 
> Starting split of region 
> Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.
> 00:31:13,694 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing 
> Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.: 
> disabling compactions & flushes
> 00:31:13,694 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates 
> disabled for region 
> Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.
> 00:31:13,718 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed 
> Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.
> 00:31:39,552 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Offlined parent 
> region Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8. 
> in META
> 00:31:41,374 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded 
> hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/259d0c620c9105928e52713f4a5a252e/value/239119195230239381.2acaf8e3acfd2e8a5825a1f6f0aca4a8,
>  isReference=true, isBulkLoadResult=false, seqid=20201181, 
> majorCompaction=false
> 2012-01-06 00:31:42,529 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
> loaded 
> hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/52ff3c7c6df6e0337876bbca29cee84a/value/973789709483406123.2acaf8e3acfd2e8a5825a1f6f0aca4a8,
>  isReference=true, isBulkLoadResult=false, seqid=20312224, 
> majorCompaction=false
> 2012-01-06 00:31:42,532 DEBUG org.apache.hadoop.hbase.regionserver.Store: 
> loaded 
> hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/259d0c620c9105928e52713f4a5a252e/value/973789709483406123.2acaf8e3acfd2e8a5825a1f6f0aca4a8,
>  isReference=true, isBulkLoadResult=false, seqid=20312223, 
> majorCompaction=false
> INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Region split, 
> META updated, and report to master. 
> Parent=Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8., 
> new regions: 
> Htable_UFDR_031,00332,1325809872607.259d0c620c9105928e52713f4a5a252e., 
> Htable_UFDR_031,003732800093168-03594291912,1325809872607.52ff3c7c6df6e0337876bbca29cee84a..
>  Split took 29sec
> {code}
> In the above logs we can also see that the new daugher regions have the next 
> seq id as 20312223 and 20312224.
> {code}
> DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Found 1 hlogs to remove 
> out of total 4; oldest outstanding sequenceid is 20312224 from region 
> 2acaf8e3acfd2e8a5825a1f6f0aca4a8
> {code}
> Now we see that the parent region which was clsoed has a seq id 20312224 
> which is not flushed.
> So further flush are failing as the region is already removed from 
> onlineRegionList.
> The doubt here is before the region could be closed, a put has arrived for 
> this region.  But due to some reason the flush has  not happened for that. We 
> tried to dig this, but not able to get this problem again.
> 0.90.5 version + few 0.90.6 patches ( before 0.90.6RC0)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5312) Closed parent region present in Hlog.lastSeqWritten

Reply via email to