[ 
https://issues.apache.org/jira/browse/HBASE-10278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864711#comment-13864711
 ] 

Himanshu Vashishtha commented on HBASE-10278:
---------------------------------------------

Thanks a lot for the reviews and comments, guys.
[~sershe] Yes, merging seqid-mvcc would help relax the limitation, but 
I think it would be good to have WAL switches compatible with 0.96/0.98 without 
any other dependency.
Yes, the switch happens when the current WAL becomes slow. It makes the new WAL 
active (this doesn't require rolling the new WAL). The new WAL takes the in-flight 
edits and then starts taking newer edits. Meanwhile, the slow old WAL is rolled 
in parallel. It's not taking any writes, so no throttling is required. If the 
hiccup lasts long, the new WAL might switch too. In the future, we could use 
some heuristic to monitor switches (current WAL size, last switch time, etc.).
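
To make the sequence concrete, here is a minimal sketch of the switch as described above. It is not the patch: every name in it (WalSwitchSketch, Wal, slowSyncMs, the "-rolled" suffix) is hypothetical, the sync latency is passed in rather than measured, and the roll of the slow WAL is only recorded instead of running in a background thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of WAL switching; none of these names are HBase APIs. */
final class WalSwitchSketch {

    /** Stand-in for a WAL file: it only remembers the sequence ids appended to it. */
    static final class Wal {
        final String name;
        final List<Long> appended = new ArrayList<>();
        Wal(String name) { this.name = name; }
    }

    private final long slowSyncMs;                         // assumed switch threshold
    private final List<Long> inFlight = new ArrayList<>(); // appended, not yet synced
    final List<String> rolled = new ArrayList<>();         // names of WALs rolled so far
    Wal active = new Wal("A");
    Wal standby = new Wal("B");

    WalSwitchSketch(long slowSyncMs) { this.slowSyncMs = slowSyncMs; }

    void append(long seqId) {
        active.appended.add(seqId);
        inFlight.add(seqId);
    }

    /** Simulates a sync on the active WAL that took observedMs milliseconds. */
    void sync(long observedMs) {
        if (observedMs > slowSyncMs) {
            switchWal();
        }
        inFlight.clear(); // edits are now durable on whichever WAL is active
    }

    private void switchWal() {
        Wal slow = active;
        active = standby;               // the standby is already open, so no roll is needed
        for (Long seqId : inFlight) {
            active.appended.add(seqId); // the new WAL takes the in-flight edits first
        }
        rolled.add(slow.name);          // a real implementation would roll this in parallel
        standby = new Wal(slow.name + "-rolled");
    }
}
```

With a threshold of 100 ms, appending edits 1 and 2, syncing in 10 ms, appending edit 3 and then syncing in 500 ms leaves WAL B active and holding edit 3, while WAL A holds edits 1 to 3 and is the one being rolled. Later appends go only to B.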

[~ram_krish]:
bq. the log roll for WAL A has to be completed by blocking all writes?
WAL A is not taking any new writes at this moment, as WAL B is the active one. 
I don't see any writes blocked by A's rolling. Read the above explanation and 
let me know if I am missing anything in your question.

bq.  If the rollwriter happens and at the same time we start taking writes on 
WAL B the above mentioned scenario happens. so in that case we may have out of 
order edits during log split if this RS crashes right ?.
So, the situation is that the RS crashes while switching? 
I do see duplicate edits in two WALs (as in-flight edits have to be appended on 
every switch), but I don't see out-of-order edits even in this case. Could you 
please explain how you see out-of-order edits?
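
As an illustration of why duplicates alone need not imply reordering, here is a small sketch. It assumes every edit carries its sequence id and that replay is keyed on that id. The class and method names are hypothetical, and this is not how log splitting is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch: replaying edits found in two WALs after a crash mid-switch. */
final class DuplicateReplaySketch {

    /** Returns the sequence ids in replay order, each one applied at most once. */
    static List<Long> replayOrder(List<Long> walA, List<Long> walB) {
        TreeSet<Long> bySeqId = new TreeSet<>(); // sorted, and drops duplicate ids
        bySeqId.addAll(walA);
        bySeqId.addAll(walB);
        return new ArrayList<>(bySeqId);
    }
}
```

If WAL A holds edits 1, 2, 3 and WAL B holds 3, 4 (edit 3 was in flight during the switch), replayOrder returns [1, 2, 3, 4]: the duplicate collapses and the order is preserved.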

Re: Other implementations:
The goal is to implement WAL Switching such that other HLog implementations 
(such as per-table MultiWAL) can re-use it. 
Yes, there is some refactoring required to make test classes use HLog as an 
interface (currently, they are calling 14 non-interface methods in FSHLog). 
There are methods which are implementation-specific, such as rollLog, 
getNumberOfWALs, etc. An HLog client (such as the Regionserver) shouldn't really 
care about those, but implementors do. I plan to do this refactoring here.
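
One possible shape for that split, purely as a sketch: the interface names WalClient and WalInternals and the InMemoryWal class are hypothetical, and only rollLog and getNumberOfWALs are method names taken from the existing code. The signatures here are assumptions, not the real HLog ones.

```java
/** Hypothetical client-facing view: what a region server would need. */
interface WalClient {
    long append(byte[] edit); // returns the sequence id assigned to the edit
    void sync();
}

/** Hypothetical implementor-facing view, for tests and tooling. */
interface WalInternals {
    void rollLog();
    int getNumberOfWALs();
}

/** Toy implementation so the split can be exercised without a filesystem. */
final class InMemoryWal implements WalClient, WalInternals {
    private int wals = 1;
    private long nextSeqId = 1;

    public long append(byte[] edit) { return nextSeqId++; }
    public void sync() { /* nothing to flush in memory */ }
    public void rollLog() { wals++; }
    public int getNumberOfWALs() { return wals; }
}
```

Tests that need rollLog or getNumberOfWALs would then depend on the implementor-facing view, while client code sees only append and sync.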

> Provide better write predictability
> -----------------------------------
>
>                 Key: HBASE-10278
>                 URL: https://issues.apache.org/jira/browse/HBASE-10278
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Himanshu Vashishtha
>            Assignee: Himanshu Vashishtha
>         Attachments: Multiwaldesigndoc.pdf
>
>
> Currently, HBase has one WAL per region server. 
> Whenever there is any latency in the write pipeline (due to whatever reasons 
> such as n/w blip, a node in the pipeline having a bad disk, etc), the overall 
> write latency suffers. 
> Jonathan Hsieh and I analyzed various approaches to tackle this issue. We 
> also looked at HBASE-5699, which talks about adding concurrent multi WALs. 
> Along with performance numbers, we also focussed on design simplicity, 
> minimum impact on MTTR & Replication, and compatibility with 0.96 and 0.98. 
> Considering all these parameters, we propose a new HLog implementation with 
> WAL Switching functionality.
> Please find attached the design doc for the same. It introduces the WAL 
> Switching feature, and experiments/results of a prototype implementation, 
> showing the benefits of this feature.
> The second goal of this work is to serve as a building block for concurrent 
> multiple WALs feature.
> Please review the doc.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
