[
https://issues.apache.org/jira/browse/HBASE-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780514#action_12780514
]
Lars George commented on HBASE-1994:
------------------------------------
Based on the above, the plan could be to use ZK to flag a split in progress and
then make the process more resilient:
- Flag in ZK that split is running
- Read log but do not delete it yet
- Write logs into region directories with a temp extension and a sequential
filename
- Once all of this has succeeded, delete the old logs and rename the split
logs to have a non-temp extension
- Clear ZK flag
That way, if something goes wrong, the original logs are still preserved until
they have been properly split. A split dying halfway through is also OK, as the
temp split files can safely be removed.
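The steps above can be sketched roughly as follows. This is only an
illustration under loud assumptions, not HBase code: it uses the local
filesystem through java.nio instead of HDFS, a marker file stands in for the
ZK flag, and each log line is assumed to have the form "region<TAB>edit". The
class name, the split-NNNN.log naming and the .tmp suffix are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: local files stand in for HDFS, a marker file for the ZK flag.
class ResilientSplitSketch {
    static final String TEMP_SUFFIX = ".tmp";

    // Sorted listing of the entries in dir matching glob; empty if dir is not a directory.
    static List<Path> list(Path dir, String glob) throws IOException {
        List<Path> out = new ArrayList<>();
        if (!Files.isDirectory(dir)) return out;
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(dir, glob)) {
            for (Path p : ds) out.add(p);
        }
        Collections.sort(out);
        return out;
    }

    static void split(Path logDir, Path regionsRoot, Path flag) throws IOException {
        if (Files.exists(flag)) {
            // A leftover flag means an earlier split died: drop its temp files.
            for (Path regionDir : list(regionsRoot, "*")) {
                for (Path tmp : list(regionDir, "*" + TEMP_SUFFIX)) Files.delete(tmp);
            }
        } else {
            Files.createFile(flag); // Step 1: flag that a split is running.
        }

        // Step 2: read the logs but do not delete them yet.
        List<Path> oldLogs = list(logDir, "*");
        Map<String, List<String>> editsByRegion = new TreeMap<>();
        for (Path log : oldLogs) {
            for (String line : Files.readAllLines(log)) {
                int tab = line.indexOf('\t');
                if (tab < 0) continue; // skip lines without a region prefix
                editsByRegion.computeIfAbsent(line.substring(0, tab), k -> new ArrayList<>())
                             .add(line.substring(tab + 1));
            }
        }

        // Step 3: write per-region files with a temp extension and a sequential name.
        List<Path> temps = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : editsByRegion.entrySet()) {
            Path regionDir = Files.createDirectories(regionsRoot.resolve(e.getKey()));
            int seq = list(regionDir, "split-*.log").size() + 1;
            Path tmp = regionDir.resolve(String.format("split-%04d.log%s", seq, TEMP_SUFFIX));
            Files.write(tmp, e.getValue());
            temps.add(tmp);
        }

        // Step 4: everything was written, so delete the old logs and rename the temp files.
        for (Path log : oldLogs) Files.delete(log);
        for (Path tmp : temps) {
            String name = tmp.getFileName().toString();
            String finalName = name.substring(0, name.length() - TEMP_SUFFIX.length());
            Files.move(tmp, tmp.resolveSibling(finalName), StandardCopyOption.ATOMIC_MOVE);
        }

        Files.delete(flag); // Step 5: clear the flag.
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("split-sketch");
        Path logs = Files.createDirectories(root.resolve("logs"));
        Files.write(logs.resolve("hlog.1"), List.of("r1\tput a", "r2\tput b"));
        split(logs, root.resolve("regions"), root.resolve("split-in-progress"));
        System.out.println(list(root.resolve("regions").resolve("r1"), "*"));
    }
}
```

Note that the two actions in the fourth step are not atomic here, so a crash
between deleting the old logs and finishing the renames is not handled by this
sketch. The flag file is also expected to live outside the log directory.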
> Master will lose hlog entries while splitting if region has empty
> oldlogfile.log
> --------------------------------------------------------------------------------
>
> Key: HBASE-1994
> URL: https://issues.apache.org/jira/browse/HBASE-1994
> Project: Hadoop HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.21.0
> Reporter: Cosmin Lehene
> Priority: Blocker
> Fix For: 0.21.0
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> I don't know yet how an empty oldlogfile.log can exist, but it happened.
> Master will fail to put the splits in the region oldlogfile.log if an empty
> oldlogfile.log already exists there.
> This is the master log after I artificially reproduced it by placing an empty
> oldlogfile.log in /hbase/.META./1028785192/oldlogfile.log and then killing the
> regionserver that was holding the .META. table:
> 2009-11-19 09:08:36,012 INFO org.apache.hadoop.hbase.regionserver.wal.HLog:
> Splitting 1 hlog(s) in hdfs://b0:9000/hbase/.logs/b4,60020,1258637492773
> 2009-11-19 09:08:36,012 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog:
> Splitting hlog 1 of 1:
> hdfs://b0:9000/hbase/.logs/b4,60020,1258637492773/hlog.dat.1258637493128,
> length=0
> 2009-11-19 09:08:36,019 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog:
> Adding queue for .META.,,1
> 2009-11-19 09:08:36,037 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog:
> Pushed=795 entries from
> hdfs://b0:9000/hbase/.logs/b4,60020,1258637492773/hlog.dat.1258637493128
> 2009-11-19 09:08:36,038 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog:
> Thread got 795 to process
> 2009-11-19 09:08:36,043 WARN org.apache.hadoop.hbase.regionserver.wal.HLog:
> Old hlog file hdfs://b0:9000/hbase/.META./1028785192/oldlogfile.log already
> exists. Copying existing file to new file
> 2009-11-19 09:08:36,079 WARN org.apache.hadoop.hbase.regionserver.wal.HLog:
> Got while writing region .META.,,1 log java.io.EOFException
> 2009-11-19 09:08:36,081 INFO org.apache.hadoop.hbase.regionserver.wal.HLog:
> hlog file splitting completed in 70 millis for
> hdfs://b0:9000/hbase/.logs/b4,60020,1258637492773