[ https://issues.apache.org/jira/browse/HADOOP-12869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187773#comment-15187773 ]
Colin Patrick McCabe commented on HADOOP-12869: ----------------------------------------------- The documentation for {{InputStream#read}} says: {{If len is zero, then no bytes are read and 0 is returned; otherwise, there is an attempt to read at least one byte. If no byte is available because the stream is at end of file, the value -1 is returned; otherwise, at least one byte is read and stored into b.}} It seems clear that an {{InputStream}} which returns {{0}} when {{len != 0}} is not following the spec. Of course, we might have such an implementation somewhere in Hadoop, but it would be a bug. Perhaps we should throw an exception rather than looping, to make users aware of the bug in the wrapped {{InputStream}}. There is no guarantee that looping is the right thing to do in the case of a buggy {{InputStream}} (say, one which returns 0 on EOF rather than -1). I don't see how we can have a unit test for this, unless we deliberately mock up a buggy {{InputStream}}. > CryptoInputStream#read() may return incorrect result > ---------------------------------------------------- > > Key: HADOOP-12869 > URL: https://issues.apache.org/jira/browse/HADOOP-12869 > Project: Hadoop Common > Issue Type: Bug > Components: security > Affects Versions: 2.6.0, 2.7.0, 3.0.0 > Reporter: Dapeng Sun > Assignee: Dapeng Sun > Priority: Critical > Attachments: HADOOP-12869.001.patch, HADOOP-12869.002.patch > > > Here is the comment of {{FilterInputStream#read()}}: > {noformat} > /** > * Reads the next byte of data from this input stream. The value > * byte is returned as an <code>int</code> in the range > * <code>0</code> to <code>255</code>. If no byte is available > * because the end of the stream has been reached, the value > * <code>-1</code> is returned. This method blocks until input data > * is available, the end of the stream is detected, or an exception > * is thrown. > * <p> > * This method > * simply performs <code>in.read()</code> and returns the result. > * > * @return the next byte of data, or <code>-1</code> if the end of the > * stream is reached. > * @exception IOException if an I/O error occurs. > * @see java.io.FilterInputStream#in > */ > public int read() throws IOException { > return in.read(); > } > {noformat} > Here is the implementation of {{CryptoInputStream#read()}} in Hadoop Common: > {noformat} > @Override > public int read() throws IOException { > return (read(oneByteBuf, 0, 1) == -1) ? -1 : (oneByteBuf[0] & 0xff); > > } > {noformat} > The return value of {{read(oneByteBuf, 0, 1)}} maybe 1, -1 and 0: > For {{1}}: we should return the content of {{oneByteBuf}} > For {{-1}}: we should return {{-1}} to stand for the end of stream > For {{0}}: it means we didn't get decryption data back and it is not the end > of the stream, we should continue to decrypt the stream. But it return {{0}} > on {{read()}} in current implementation, it means the decrypted content is > {{0}} and it is incorrect. -- This message was sent by Atlassian JIRA (v6.3.4#6332)