[ https://issues.apache.org/jira/browse/HBASE-16766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Enis Soztutar updated HBASE-16766:
----------------------------------
Attachment: hbase-16766_v1.patch
Something like this.
> Do not rely on InputStream.available()
> ---------------------------------------
>
> Key: HBASE-16766
> URL: https://issues.apache.org/jira/browse/HBASE-16766
> Project: HBase
> Issue Type: Bug
> Components: wal
> Reporter: Enis Soztutar
> Assignee: Enis Soztutar
> Fix For: 2.0.0, 1.4.0
>
> Attachments: hbase-16766_v1.patch
>
>
> ProtobufLogReader relies on InputStream.available() to figure out whether we
> have exhausted the file. However, the InputStream.available() javadoc states:
> {code}
> * <p> Note that while some implementations of {@code InputStream} will return
> * the total number of bytes in the stream, many will not. It is
> * never correct to use the return value of this method to allocate
> * a buffer intended to hold all data in this stream.
> {code}
> HDFS and many other Hadoop file systems, as well as classes such as
> ByteBufferInputStream, return the number of remaining bytes, so the code
> works on top of HDFS. On other file systems, however, IS.available() may or
> may not return the remaining bytes. In one specific case, the ADLS wrapper FS
> used to implement the {{available()}} call with the correct (javadoc)
> semantics, which ended up causing data loss in WAL recovery. We have since
> fixed ADLS to implement the HDFS semantics, but we should fix HBase itself so
> that it does not rely on the available() call.
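
To illustrate the failure mode described above, here is a minimal sketch. It is not the attached patch and not ProtobufLogReader code. The names AvailableSketch, ZeroAvailableStream, countWithAvailable and countUntilEof are hypothetical, and the stream that always reports 0 from available() is an assumed stand-in for any file system stream that does not return the remaining byte count (it is not a model of ADLS). The sketch contrasts a loop that treats available() == 0 as end of file with one that stops only when read() returns -1.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class AvailableSketch {

  /**
   * Hypothetical stream whose available() always reports 0. The InputStream
   * contract only promises an estimate of bytes readable without blocking,
   * so 0 is a legal answer even when more data follows.
   */
  static class ZeroAvailableStream extends FilterInputStream {
    ZeroAvailableStream(InputStream in) {
      super(in);
    }

    @Override
    public int available() {
      return 0;
    }
  }

  /** Fragile: treats available() == 0 as "file exhausted". */
  static int countWithAvailable(InputStream in) throws IOException {
    int count = 0;
    while (in.available() > 0) {
      if (in.read() == -1) {
        break;
      }
      count++;
    }
    return count;
  }

  /** Robust: only read() returning -1 signals end of stream. */
  static int countUntilEof(InputStream in) throws IOException {
    int count = 0;
    while (in.read() != -1) {
      count++;
    }
    return count;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = {1, 2, 3, 4, 5};
    // The available()-based loop sees no bytes at all.
    System.out.println(countWithAvailable(
        new ZeroAvailableStream(new ByteArrayInputStream(data)))); // prints 0
    // The EOF-based loop reads every byte.
    System.out.println(countUntilEof(
        new ZeroAvailableStream(new ByteArrayInputStream(data)))); // prints 5
  }
}
```

In a WAL reader the first loop would stop before reading any entries, which is the kind of silent truncation the description warns about. A real fix would presumably also have to decide how to tell a clean end of file from a partially written trailing entry, which this sketch does not attempt.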
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)