[
https://issues.apache.org/jira/browse/HBASE-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205214#comment-13205214
]
Jimmy Xiang commented on HBASE-5312:
------------------------------------
Have anyone seen this issue on 0.92 release? Could we add some logging so that
we will have some clue when it happens again?
> Closed parent region present in Hlog.lastSeqWritten
> ---------------------------------------------------
>
> Key: HBASE-5312
> URL: https://issues.apache.org/jira/browse/HBASE-5312
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.5
> Reporter: ramkrishna.s.vasudevan
> Fix For: 0.90.7
>
>
> This is in reference to the mail sent in the dev mailing list
> "Closed parent region present in Hlog.lastSeqWritten".
> The sceanrio described is
> We had a region that was split into two daughters. When the hlog roll tried
> to flush the region there was an entry in the HLog.lastSeqWritten that was
> not flushed or removed from the lastSeqWritten during the parent close.
> Because this flush was not happening subsequent flushes were getting blocked
> {code}
> 05:06:44,422 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many
> hlogs: logs=122, maxlogs=32; forcing flush of 1 regions(s):
> 2acaf8e3acfd2e8a5825a1f6f0aca4a8
> 05:06:44,422 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed
> to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null,
> requester=null
> 05:10:48,666 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many
> hlogs: logs=123, maxlogs=32; forcing flush of 1 regions(s):
> 2acaf8e3acfd2e8a5825a1f6f0aca4a8
> 05:10:48,666 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed
> to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null,
> requester=null
> 05:14:46,075 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many
> hlogs: logs=124, maxlogs=32; forcing flush of 1 regions(s):
> 2acaf8e3acfd2e8a5825a1f6f0aca4a8
> 05:14:46,075 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed
> to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null,
> requester=null
> 05:15:41,584 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many
> hlogs: logs=125, maxlogs=32; forcing flush of 1 regions(s):
> 2acaf8e3acfd2e8a5825a1f6f0aca4a8
> 05:15:41,584 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed
> to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null,
> {code}
> Lets see what happened for the region 2acaf8e3acfd2e8a5825a1f6f0aca4a8
> {code}
> 2012-01-06 00:30:55,214 INFO org.apache.hadoop.hbase.regionserver.Store:
> Renaming flushed file at
> hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/2acaf8e3acfd2e8a5825a1f6f0aca4a8/.tmp/1755862026714756815
> to
> hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/2acaf8e3acfd2e8a5825a1f6f0aca4a8/value/973789709483406123
> 2012-01-06 00:30:58,946 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Instantiated
> Htable_UFDR_016,049790700093168-04565200000,1325809837958.0ebe5bd7fcbc09ee074d5600b9d4e062.
> 2012-01-06 00:30:59,614 INFO org.apache.hadoop.hbase.regionserver.Store:
> Added
> hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/2acaf8e3acfd2e8a5825a1f6f0aca4a8/value/973789709483406123,
> entries=7537, sequenceid=20312223, memsize=4.2m, filesize=2.9m
> 2012-01-06 00:30:59,787 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Finished snapshotting, commencing flushing stores
> 2012-01-06 00:30:59,787 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Finished memstore flush of ~133.5m for region
> Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8. in
> 21816ms, sequenceid=20312223, compaction requested=true
> 2012-01-06 00:30:59,787 DEBUG
> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested
> for Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.
> because regionserver20020.cacheFlusher; priority=0, compaction queue size=5840
> {code}
> A user triggered split has been issued to this region which can be seen in
> the above logs.
> The flushing of this region has resulted in a seq id 20312223.
> The region has been splitted and the parent region has been closed
> {code}
> 00:31:12,607 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction:
> Starting split of region
> Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.
> 00:31:13,694 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing
> Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.:
> disabling compactions & flushes
> 00:31:13,694 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates
> disabled for region
> Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.
> 00:31:13,718 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed
> Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.
> 00:31:39,552 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Offlined parent
> region Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.
> in META
> 00:31:41,374 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded
> hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/259d0c620c9105928e52713f4a5a252e/value/239119195230239381.2acaf8e3acfd2e8a5825a1f6f0aca4a8,
> isReference=true, isBulkLoadResult=false, seqid=20201181,
> majorCompaction=false
> 2012-01-06 00:31:42,529 DEBUG org.apache.hadoop.hbase.regionserver.Store:
> loaded
> hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/52ff3c7c6df6e0337876bbca29cee84a/value/973789709483406123.2acaf8e3acfd2e8a5825a1f6f0aca4a8,
> isReference=true, isBulkLoadResult=false, seqid=20312224,
> majorCompaction=false
> 2012-01-06 00:31:42,532 DEBUG org.apache.hadoop.hbase.regionserver.Store:
> loaded
> hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/259d0c620c9105928e52713f4a5a252e/value/973789709483406123.2acaf8e3acfd2e8a5825a1f6f0aca4a8,
> isReference=true, isBulkLoadResult=false, seqid=20312223,
> majorCompaction=false
> INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Region split,
> META updated, and report to master.
> Parent=Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.,
> new regions:
> Htable_UFDR_031,00332,1325809872607.259d0c620c9105928e52713f4a5a252e.,
> Htable_UFDR_031,003732800093168-03594291912,1325809872607.52ff3c7c6df6e0337876bbca29cee84a..
> Split took 29sec
> {code}
> In the above logs we can also see that the new daugher regions have the next
> seq id as 20312223 and 20312224.
> {code}
> DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Found 1 hlogs to remove
> out of total 4; oldest outstanding sequenceid is 20312224 from region
> 2acaf8e3acfd2e8a5825a1f6f0aca4a8
> {code}
> Now we see that the parent region which was clsoed has a seq id 20312224
> which is not flushed.
> So further flush are failing as the region is already removed from
> onlineRegionList.
> The doubt here is before the region could be closed, a put has arrived for
> this region. But due to some reason the flush has not happened for that. We
> tried to dig this, but not able to get this problem again.
> 0.90.5 version + few 0.90.6 patches ( before 0.90.6RC0)
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira