[ 
https://issues.apache.org/jira/browse/HADOOP-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464138#comment-16464138
 ] 

Thomas Marquardt commented on HADOOP-15446:
-------------------------------------------

I've attached HADOOP-15446-002.patch with the test fix.  Sorry about that, 
Steve.  I copied the tests from TestBlockBlobInputStream because it covers 
random access well, but I did not intend to include the performance checks.  
For a Page Blob, the last page must be read to determine the length of the 
data, so the network I/O is expected.  Each page has a 16-bit header that 
stores the length of the data in the page, and every page except the last is 
guaranteed to be full of data.  Since my storage account and Hadoop 
development machine are in the same region, I don't experience the latency you 
do, and I didn't notice that I had forgotten to remove the latency check.
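
As a rough illustration of why the last page has to be fetched, the length 
calculation can be sketched as follows.  This is a minimal sketch, not the 
PageBlobInputStream code: the class and method names are hypothetical, and the 
512-byte page size and big-endian header byte order are assumptions that the 
comment above does not state.

```java
import java.nio.ByteBuffer;

public class PageBlobLengthSketch {
    // Assumption: 512-byte pages (not stated in the comment above).
    static final int PAGE_SIZE = 512;
    // From the comment above: each page starts with a 16-bit header.
    static final int PAGE_HEADER_SIZE = 2;
    static final int PAGE_DATA_SIZE = PAGE_SIZE - PAGE_HEADER_SIZE;

    // Reads the 16-bit length header of a page as an unsigned value.
    // Assumption: the header is stored big-endian (ByteBuffer's default).
    static int readHeader(byte[] page) {
        return ByteBuffer.wrap(page, 0, PAGE_HEADER_SIZE).getShort() & 0xFFFF;
    }

    // All pages but the last are full, so only the last page's header is
    // needed to compute the total number of data bytes in the blob.
    static long dataLength(long pageCount, byte[] lastPage) {
        if (pageCount == 0) {
            return 0;
        }
        int lastPageDataLength = readHeader(lastPage);
        if (lastPageDataLength > PAGE_DATA_SIZE) {
            throw new IllegalStateException(
                "Header reports more data than a page can hold: "
                    + lastPageDataLength);
        }
        return (pageCount - 1) * PAGE_DATA_SIZE + lastPageDataLength;
    }

    public static void main(String[] args) {
        // Three pages: two full pages plus a last page holding 100 bytes.
        byte[] lastPage = new byte[PAGE_SIZE];
        ByteBuffer.wrap(lastPage).putShort((short) 100);
        System.out.println(dataLength(3, lastPage)); // prints 1120
    }
}
```

Under these assumptions the data length depends only on the page count and 
the header of the last page, which is why one read of the last page cannot be 
avoided when the stream is opened.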

With HADOOP-15446-002.patch, all WASB test cases are passing against my Azure 
storage account:

*$ mvn -T 1C -Dparallel-tests clean verify*

Tests run: 252, Failures: 0, Errors: 0, Skipped: 11
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
Tests run: 418, Failures: 0, Errors: 0, Skipped: 55
Tests run: 126, Failures: 0, Errors: 0, Skipped: 10

> WASB: PageBlobInputStream.skip breaks HBASE replication
> -------------------------------------------------------
>
>                 Key: HADOOP-15446
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15446
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/azure
>    Affects Versions: 2.9.0, 3.0.2
>            Reporter: Thomas Marquardt
>            Assignee: Thomas Marquardt
>            Priority: Major
>         Attachments: HADOOP-15446-001.patch, HADOOP-15446-002.patch
>
>
> Page Blobs are primarily used by HBASE.  HBASE replication, which apparently 
> has not been used with WASB until recently, performs non-sequential reads on 
> log files using PageBlobInputStream.  There are bugs in this stream 
> implementation which prevent skip and seek from working properly, and 
> eventually the stream state becomes corrupt and unusable.
> I believe this bug affects all releases of WASB/HADOOP.  It appears to be a 
> day-0 bug in PageBlobInputStream.  There were similar bugs opened in the past 
> (HADOOP-15042) but the issue was not properly fixed, and no test coverage was 
> added.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
