steveloughran commented on code in PR #7367: URL: https://github.com/apache/hadoop/pull/7367#discussion_r1949623287
########## hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md: ########## @@ -175,6 +175,24 @@ must block until at least one byte is returned. Thus, for any data source of length greater than zero, repeated invocations of this `read()` operation will eventually read all the data. +#### Implementation Notes + +1. If the caller passes a `null` buffer, then an unchecked exception is thrown. The base JDK +`InputStream` implementation throws `NullPointerException`. HDFS historically used +`IllegalArgumentException`. Either of these are acceptable. +1. If the caller passes a negative value for `offset`, then `IndexOutOfBoundsException` is thrown. +1. If the caller passes a negative value for `length`, then an unchecked exception is thrown. The +base JDK `InputStream` implementation throws `IndexOutOfBoundsException`. HDFS historically used +`IllegalArgumentException`. Either of these are acceptable. +1. If the caller passes an `offset + length` that would run past the length of `buffer`, then +`IndexOutOfBoundsException` is thrown. +1. A read of `length` 0 is a no-op, and the returned `result` is 0. No exception is thrown, assuming +all other arguments are valid. +1. Reads through any method are expected to return the same data. Review Comment: ooh, good one this. It's implicit in the model of data as an array of bytes, but yes, nice to call out. ########## hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md: ########## @@ -175,6 +175,24 @@ must block until at least one byte is returned. Thus, for any data source of length greater than zero, repeated invocations of this `read()` operation will eventually read all the data. +#### Implementation Notes Review Comment: 1. is there a way to add these specification statements in the python statements? as that's designed to be what people write tests off. 2. Please you the SHALL/MUST/MAYR than other terms "is expected to" etc. Yes, yours is the better prose, but we want no ambiguity here ########## hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractOpenTest.java: ########## @@ -408,4 +411,137 @@ public void testFloatingPointLength() throws Throwable { .isEqualTo(len); } + @Test + public void testInputStreamReadNullBuffer() throws Throwable { + // The JDK base InputStream (and by extension LocalFSFileInputStream) throws + // NullPointerException. Historically, DFSInputStream has thrown IllegalArgumentException + // instead. Allow either behavior. + describe("Attempting to read into a null buffer should throw IllegalArgumentException or " + + "NullPointerException"); + Path path = methodPath(); + FileSystem fs = getFileSystem(); + int len = 4096; + createFile(fs, path, true, + dataset(len, 0x40, 0x80)); + try (FSDataInputStream is = fs.openFile(path).build().get()) { + Assertions.assertThatThrownBy(() -> is.read(null, 0, 10)) + .isInstanceOfAny(IllegalArgumentException.class, NullPointerException.class); Review Comment: not seen this before, interesting. I wonder if we could extend intercept() to take a list of classes. I prefer intercept because it does two things assertj doesn't * returns the assert for future analysis * if there is no exception raised, returns the result of any lambda-expression which doesn't return void. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org