[
https://issues.apache.org/jira/browse/HBASE-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Himanshu Vashishtha updated HBASE-6774:
---------------------------------------
Attachment: HBase-6774-approach.pdf
Hello,
I have attached a proposal (HBase-6774-approach.pdf). Suggestions are welcome.
Thanks.
> Immediate assignment of regions that don't have entries in HLog
> ---------------------------------------------------------------
>
> Key: HBASE-6774
> URL: https://issues.apache.org/jira/browse/HBASE-6774
> Project: HBase
> Issue Type: Improvement
> Components: master, regionserver
> Affects Versions: 0.95.2
> Reporter: Nicolas Liochon
> Assignee: Himanshu Vashishtha
> Attachments: HBase-6774-approach.pdf
>
>
> Today, after a failure is detected, the algorithm is:
> - split the logs
> - when all the logs are split, assign the regions
> But some regions may have no entries at all in the HLog. There are several
> reasons for this:
> - reference or historical tables, which are bulk written occasionally and
> then only read.
> - sequential rowkeys. In this case, most of the regions will be read only,
> but they can be hosted on a regionserver that receives a lot of writes.
> - tables flushed often for safety reasons. I'm thinking about meta here.
> For meta, we can imagine flushing very often. Hence the recovery time for
> meta would, in many cases, be just the failure detection time.
> There are different possible algos:
> Option 1)
> A new task is added, in parallel with the split. This task reads all the
> HLog files. If there is no entry for a region, this region is assigned.
> Pro: simple
> Cons: we need to read all the files, which adds an extra read.
> Option 2)
> The master writes in ZK the number of log files, per region.
> When the regionserver starts the split, it reads the full block (64 MB) and
> decrements the log file counter of the region. If it reaches 0, the
> assignment starts. At the end of its split, the region server decrements the
> counter as well. This allows assignment to start even if not all the HLog
> files are finished. It would also make some regions available even if there
> is an issue in one of the log files.
> Pro: parallel
> Cons: adds work for the region server. Requires reading the whole file
> before starting to write.
> Option 3)
> Add some metadata at the end of the log file. The last log file won't have
> metadata, because if we are recovering, the server crashed. But the
> others will. And the last log file should be smaller (half a block on average).
> Option 4) Still some metadata, but in a different file. Cons: writes are
> increased (but not by much, we just need to write the region once). Pros:
> if we lose the HLog files (major failure, no replica available) we can still
> continue with the regions that were not written at this stage.
> I think it should be done, even if none of the algorithms above is totally
> convincing yet. It's linked as well to locality and short circuit reads: with
> these two points, reading the file twice becomes much less of an issue, for
> example. My current preference would be to open the file twice in the region
> server: once for splitting as today, and once for a quick read looking for
> unused regions. Who knows, maybe it would even be faster this way, as the
> quick read thread would warm up the different caches for the splitting thread.
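
The "quick read looking for unused regions" idea from the description above (and Option 1) could be sketched roughly as follows. This is only an illustration and not HBase code: the function name `regions_without_entries`, the modelling of a log file as a list of `(region_name, payload)` tuples, and the region names are all assumptions made for the example.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical model of a log entry: (region name, opaque edit payload).
LogEntry = Tuple[str, bytes]


def regions_without_entries(
    hosted_regions: Iterable[str],
    log_files: Iterable[List[LogEntry]],
) -> Set[str]:
    """Return the regions of the failed server that appear in none of its logs.

    These are the regions that could be assigned immediately, without
    waiting for the log split to finish.
    """
    seen: Set[str] = set()
    for log in log_files:
        for region, _payload in log:
            seen.add(region)
    return set(hosted_regions) - seen


# Example: r2 and meta have no edits in any log, so they need no replay.
logs = [
    [("r1", b"put"), ("r3", b"put")],
    [("r1", b"delete")],
]
print(sorted(regions_without_entries(["r1", "r2", "r3", "meta"], logs)))
# → ['meta', 'r2']
```

The scan only collects region names and ignores the payloads, which is why it can be cheaper than the split itself. The cost is still one extra pass over every file, as noted in the cons of Option 1.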