Re: [PR] HADOOP-19400: Expand specification and contract test coverage for InputStream reads. [hadoop]

via GitHub Mon, 10 Feb 2025 09:58:06 -0800


steveloughran commented on code in PR #7367:
URL: https://github.com/apache/hadoop/pull/7367#discussion_r1949623287



##########
hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md:
##########
@@ -175,6 +175,24 @@ must block until at least one byte is returned. Thus, for 
any data source
 of length greater than zero, repeated invocations of this `read()` operation
 will eventually read all the data.
 
+#### Implementation Notes
+
+1. If the caller passes a `null` buffer, then an unchecked exception is 
thrown. The base JDK
+`InputStream` implementation throws `NullPointerException`. HDFS historically 
used
+`IllegalArgumentException`. Either of these are acceptable.
+1. If the caller passes a negative value for `offset`, then 
`IndexOutOfBoundsException` is thrown.
+1. If the caller passes a negative value for `length`, then an unchecked 
exception is thrown. The
+base JDK `InputStream` implementation throws `IndexOutOfBoundsException`. HDFS 
historically used
+`IllegalArgumentException`. Either of these are acceptable.
+1. If the caller passes an `offset + length` that would run past the length of 
`buffer`, then
+`IndexOutOfBoundsException` is thrown.
+1. A read of `length` 0 is a no-op, and the returned `result` is 0. No 
exception is thrown, assuming
+all other arguments are valid.
+1. Reads through any method are expected to return the same data.

Review Comment:
   ooh, good one this. It's implicit in the model of data as an array of bytes, 
but yes, nice to call out.



##########
hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md:
##########
@@ -175,6 +175,24 @@ must block until at least one byte is returned. Thus, for 
any data source
 of length greater than zero, repeated invocations of this `read()` operation
 will eventually read all the data.
 
+#### Implementation Notes

Review Comment:
   1. is there a way to add these specification statements in the python 
statements? as that's designed to be what people write tests off.
   2. Please you the SHALL/MUST/MAYR than other terms "is expected to" etc. 
Yes, yours is the better prose, but we want no ambiguity here



##########
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractOpenTest.java:
##########
@@ -408,4 +411,137 @@ public void testFloatingPointLength() throws Throwable {
         .isEqualTo(len);
   }
 
+  @Test
+  public void testInputStreamReadNullBuffer() throws Throwable {
+    // The JDK base InputStream (and by extension LocalFSFileInputStream) 
throws
+    // NullPointerException. Historically, DFSInputStream has thrown 
IllegalArgumentException
+    // instead. Allow either behavior.
+    describe("Attempting to read into a null buffer should throw 
IllegalArgumentException or " +
+        "NullPointerException");
+    Path path = methodPath();
+    FileSystem fs = getFileSystem();
+    int len = 4096;
+    createFile(fs, path, true,
+        dataset(len, 0x40, 0x80));
+    try (FSDataInputStream is = fs.openFile(path).build().get()) {
+      Assertions.assertThatThrownBy(() -> is.read(null, 0, 10))
+          .isInstanceOfAny(IllegalArgumentException.class, 
NullPointerException.class);

Review Comment:
   not seen this before, interesting.
   I wonder if we could extend intercept() to take a list of classes.
   
   I prefer intercept because it does two things assertj doesn't
   * returns the assert for future analysis
   * if there is no exception raised, returns the result of any 
lambda-expression which doesn't return void. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

Re: [PR] HADOOP-19400: Expand specification and contract test coverage for InputStream reads. [hadoop]

Reply via email to