[ 
https://issues.apache.org/jira/browse/HBASE-20604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16680674#comment-16680674
 ] 

Duo Zhang commented on HBASE-20604:
-----------------------------------

Still no UT here? And I still do not know what is going on here. Why an Input 
stream could always give you an EOF but if you reset to the beginning then 
everything works fine? Could you please explain more on what's wrong with the 
CryptoInputStream?

Thanks.

> ProtobufLogReader#readNext can incorrectly loop to the same position in the 
> stream until the the WAL is rolled
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-20604
>                 URL: https://issues.apache.org/jira/browse/HBASE-20604
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication, wal
>    Affects Versions: 3.0.0
>            Reporter: Esteban Gutierrez
>            Assignee: Esteban Gutierrez
>            Priority: Critical
>             Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 2.0.3, 1.4.9, 2.1.2, 1.2.9
>
>         Attachments: HBASE-20604.002.patch, HBASE-20604.003.patch, 
> HBASE-20604.004.patch, HBASE-20604.005.patch, HBASE-20604.patch
>
>
> Every time we call {{ProtobufLogReader#readNext}} we consume the input stream 
> associated to the {{FSDataInputStream}} from the WAL that we are reading. 
> Under certain conditions, e.g. when using the encryption at rest 
> ({{CryptoInputStream}}) the stream can return partial data which can cause a 
> premature EOF that cause {{inputStream.getPos()}} to return to the same 
> origina position causing {{ProtobufLogReader#readNext}} to re-try over the 
> reads until the WAL is rolled.
> The side effect of this issue is that {{ReplicationSource}} can get stuck 
> until the WAL is rolled and causing replication delays up to an hour in some 
> cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to