[ 
https://issues.apache.org/jira/browse/HADOOP-12403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14740240#comment-14740240
 ] 

Hadoop QA commented on HADOOP-12403:
------------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 37s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear to include any new or modified tests.  Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac |   8m  5s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |  10m 14s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 22s | The applied patch generated 13 new checkstyle issues (total was 11, now 19). |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 43s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   0m 54s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | tools/hadoop tests |   1m 11s | Tests passed in hadoop-azure. |
| | |  40m 11s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-azure |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12755309/HADOOP-12403.03.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f103a70 |
| checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/7634/artifact/patchprocess/diffcheckstylehadoop-azure.txt |
| Findbugs warnings | https://builds.apache.org/job/PreCommit-HADOOP-Build/7634/artifact/patchprocess/newPatchFindbugsWarningshadoop-azure.html |
| hadoop-azure test log | https://builds.apache.org/job/PreCommit-HADOOP-Build/7634/artifact/patchprocess/testrun_hadoop-azure.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/7634/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/7634/console |


This message was automatically generated.

> Enable multiple writes in flight for HBase WAL writing
> ------------------------------------------------------
>
>                 Key: HADOOP-12403
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12403
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: tools
>            Reporter: Duo Xu
>            Assignee: Duo Xu
>         Attachments: HADOOP-12403.01.patch, HADOOP-12403.02.patch, 
> HADOOP-12403.03.patch
>
>
> Azure HDI HBase clusters use Azure blob storage as their file system. We 
> found that the bottleneck is writing to the write-ahead log (WAL). The latest 
> HBase WAL write model (HBASE-8755) uses multiple AsyncSyncer threads to sync 
> data to HDFS. However, our WASB driver is still based on a single-threaded 
> model, so when the sync threads call into the WASB layer, only one thread at 
> a time is allowed to send data to Azure Storage. This JIRA introduces a new 
> write model in the WASB layer that allows multiple writes in parallel.
> 1. Since we use page blobs for the WAL, this causes "holes" in the page blob, 
> because every write starts on a new page. We use the first two bytes of every 
> page to record the actual data size of that page.
> 2. When reading the WAL, we need to know its actual size, which is the sum of 
> the values stored in the first two bytes of every page. However, looping over 
> every page to compute the size would be very slow, considering that a normal 
> WAL is 128MB and each page is 512 bytes. So every time a write succeeds, a 
> blob metadata entry called "total_data_uploaded" is updated.
> 3. Although we allow multiple writes in flight, we need to make sure that the 
> sync threads that call into the WASB layer return in order. Reading the HBase 
> source file FSHLog.java, we find that every sync request is associated with a 
> transaction id. If a sync succeeds, all transactions prior to that 
> transaction id are assumed to be in Azure Storage. We use a queue to store 
> the sync requests and make sure they return to the HBase layer in order.
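
The three points of the description can be illustrated with a minimal Java sketch. It is not the code from the attached patches: the class and method names (WalPageSketch, frame, dataSize, InOrderAcks) are hypothetical, the big-endian byte order of the two-byte header is an assumption, and a TreeMap keyed by transaction id stands in for the queue mentioned in point 3. With 512-byte pages and a two-byte header, each page carries at most 510 bytes of data, and a 128MB WAL spans 134,217,728 / 512 = 262,144 pages, which is why summing the headers on every read is slow.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration only; names and byte order are assumptions.
class WalPageSketch {
    static final int PAGE_SIZE = 512;   // page size stated in the description
    static final int HEADER_SIZE = 2;   // first two bytes hold the page's data length
    static final int PAYLOAD_SIZE = PAGE_SIZE - HEADER_SIZE;

    /** Pads one write out to whole pages; each page starts with its data length. */
    static byte[] frame(byte[] data) {
        int pages = (data.length + PAYLOAD_SIZE - 1) / PAYLOAD_SIZE;
        ByteBuffer out = ByteBuffer.allocate(pages * PAGE_SIZE); // big-endian by default
        for (int p = 0; p < pages; p++) {
            int offset = p * PAYLOAD_SIZE;
            int len = Math.min(PAYLOAD_SIZE, data.length - offset);
            out.position(p * PAGE_SIZE);
            out.putShort((short) len);   // header: bytes of real data in this page
            out.put(data, offset, len);  // the rest of the page stays zero (the "hole")
        }
        return out.array();
    }

    /** Slow path: sums the two-byte headers of every page in the blob. */
    static long dataSize(byte[] blob) {
        ByteBuffer in = ByteBuffer.wrap(blob);
        long total = 0;
        for (int pos = 0; pos + PAGE_SIZE <= blob.length; pos += PAGE_SIZE) {
            total += in.getShort(pos) & 0xFFFF; // read the header as unsigned
        }
        return total;
    }

    /** Releases sync acknowledgements strictly in transaction-id order. */
    static class InOrderAcks {
        // txid -> whether its upload has finished; sorted so the oldest is first
        private final TreeMap<Long, Boolean> pending = new TreeMap<>();

        synchronized void submitted(long txid) {
            pending.put(txid, false);
        }

        /** Marks txid as uploaded; returns the ids that may now return, oldest first. */
        synchronized List<Long> completed(long txid) {
            pending.put(txid, true);
            List<Long> released = new ArrayList<>();
            while (!pending.isEmpty() && pending.firstEntry().getValue()) {
                released.add(pending.pollFirstEntry().getKey());
            }
            return released;
        }
    }
}
```

In this sketch, a writer would add the length of each successful write to a running total (the role that "total_data_uploaded" plays in the description) instead of calling dataSize on every read. A sync whose upload finishes early is held back by InOrderAcks until every earlier transaction id has also completed.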



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
