[
https://issues.apache.org/jira/browse/HBASE-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780981#action_12780981
]
Lars George commented on HBASE-1994:
------------------------------------
I am really torn after looking at HBASE-1364. It seems a lot of planning has
already gone into making splits work better. Is this issue simply a preliminary
fix to avoid the empty log files and to stop reusing the same file names? If
so, that is fine, but I would like to know what the overall plan is.
Before reading HBASE-1364 I had a similar thought about how to improve on my
suggestion above. I thought we could do this:
- Master sees an abandoned log and sets a marker in ZK for each region
contained in the log
- RegionServer, on start up or region open, checks the ZK marker and reads the
original file, extracting the HLogKey entries it needs and storing them in a
local file in the region (this is the distributed log split: not all RSs split
here, only those that need to read the logs locally anyway). It would also be
possible to put the data into a MemStore at the same time.
- The RS applies the log and flushes, then sets a semaphore next to the
original log to signal that it has completed.
- If a local copy was made as per above, the RS can delete it now
- RS clears the marker in ZK for its region(s)
- Once the master sees that all RSs have read and processed the log, it can
delete the original log
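
To make the steps above concrete, here is a rough in-memory sketch in Java. It
is only an illustration under assumptions: the class LogSplitSketch, its
methods, and the Entry type are made up for this example, the two maps stand
in for the ZK markers and the log files, and nothing here calls the real
ZooKeeper, HDFS, or HLog APIs. The local copy and the MemStore step are left
out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** In-memory stand-in for the proposed coordination; all names are hypothetical. */
class LogSplitSketch {
    /** One log entry: the region it belongs to plus an opaque edit. */
    static final class Entry {
        final String region;
        final String edit;
        Entry(String region, String edit) {
            this.region = region;
            this.edit = edit;
        }
    }

    // Stand-in for the abandoned log files: log name -> entries.
    private final Map<String, List<Entry>> logs = new HashMap<>();
    // Stand-in for the ZK markers: log name -> regions that still have to replay it.
    private final Map<String, Set<String>> markers = new HashMap<>();

    /** Master: registers an abandoned log and sets a marker per region found in it. */
    void masterMarksAbandonedLog(String logName, List<Entry> entries) {
        logs.put(logName, new ArrayList<>(entries));
        Set<String> regions = new TreeSet<>();
        for (Entry e : entries) {
            regions.add(e.region);
        }
        markers.put(logName, regions);
    }

    /** RegionServer: on region open, extracts the region's edits and clears its marker. */
    List<String> regionServerOpensRegion(String logName, String region) {
        Set<String> regions = markers.get(logName);
        if (regions == null || !regions.contains(region)) {
            return Collections.emptyList(); // no marker, nothing to replay
        }
        List<String> edits = new ArrayList<>();
        for (Entry e : logs.get(logName)) {
            if (e.region.equals(region)) {
                edits.add(e.edit);
            }
        }
        // A real server would apply the edits and flush before clearing the marker.
        regions.remove(region);
        return edits;
    }

    /** Master: deletes the original log only once no region marker remains. */
    boolean masterTryDeleteLog(String logName) {
        Set<String> regions = markers.get(logName);
        if (regions == null || !regions.isEmpty()) {
            return false;
        }
        markers.remove(logName);
        logs.remove(logName);
        return true;
    }

    public static void main(String[] args) {
        LogSplitSketch sketch = new LogSplitSketch();
        sketch.masterMarksAbandonedLog("hlog.dat.1", Arrays.asList(
                new Entry("regionA", "put-1"), new Entry("regionB", "put-2")));
        System.out.println(sketch.regionServerOpensRegion("hlog.dat.1", "regionA"));
        System.out.println(sketch.masterTryDeleteLog("hlog.dat.1"));
        System.out.println(sketch.regionServerOpensRegion("hlog.dat.1", "regionB"));
        System.out.println(sketch.masterTryDeleteLog("hlog.dat.1"));
    }
}
```

The point of the sketch is the ordering: the master may only delete the
original log after every region named in it has cleared its marker. A log with
no entries gets an empty marker set, so the master can delete it right away.
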
I am probably missing some intrinsic details, as I have read the HLog and
related code but have not debugged it to see whether I got all the steps
right. Let me know what you think and what you want me to do here.
> Master will lose hlog entries while splitting if region has empty
> oldlogfile.log
> --------------------------------------------------------------------------------
>
> Key: HBASE-1994
> URL: https://issues.apache.org/jira/browse/HBASE-1994
> Project: Hadoop HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.21.0
> Reporter: Cosmin Lehene
> Priority: Blocker
> Fix For: 0.21.0
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> I don't know yet how an empty oldlogfile.log can exist, however it happened.
> Master will fail to put the splits in the region oldlogfile.log if an empty
> oldlogfile.log already exists there.
> This is the master log after I artificially reproduced it by placing an empty
> oldlogfile.log in /hbase/.META./1028785192/oldlogfile.log and then killed the
> regionserver that was holding the .META. table
> 2009-11-19 09:08:36,012 INFO org.apache.hadoop.hbase.regionserver.wal.HLog:
> Splitting 1 hlog(s) in hdfs://b0:9000/hbase/.logs/b4,60020,1258637492773
> 2009-11-19 09:08:36,012 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog:
> Splitting hlog 1 of 1:
> hdfs://b0:9000/hbase/.logs/b4,60020,1258637492773/hlog.dat.1258637493128,
> length=0
> 2009-11-19 09:08:36,019 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog:
> Adding queue for .META.,,1
> 2009-11-19 09:08:36,037 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog:
> Pushed=795 entries from
> hdfs://b0:9000/hbase/.logs/b4,60020,1258637492773/hlog.dat.1258637493128
> 2009-11-19 09:08:36,038 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog:
> Thread got 795 to process
> 2009-11-19 09:08:36,043 WARN org.apache.hadoop.hbase.regionserver.wal.HLog:
> Old hlog file hdfs://b0:9000/hbase/.META./1028785192/oldlogfile.log already
> exists. Copying existing file to new file
> 2009-11-19 09:08:36,079 WARN org.apache.hadoop.hbase.regionserver.wal.HLog:
> Got while writing region .META.,,1 log java.io.EOFException
> 2009-11-19 09:08:36,081 INFO org.apache.hadoop.hbase.regionserver.wal.HLog:
> hlog file splitting completed in 70 millis for
> hdfs://b0:9000/hbase/.logs/b4,60020,1258637492773