rdblue commented on code in PR #6532:
URL: https://github.com/apache/iceberg/pull/6532#discussion_r1063851662


##########
python/pyiceberg/avro/decoder.py:
##########
@@ -50,10 +50,10 @@ def read(self, n: int) -> bytes:
             raise ValueError(f"Requested {n} bytes to read, expected positive 
integer.")
         read_bytes = self._input_stream.read(n)
         read_len = len(read_bytes)
-        if read_len <= 0:
-            raise EOFError
+        if read_len < 0:
+            raise EOFError(f"Got negative length: {read_len}")
         elif read_len != n:
-            raise ValueError(f"Read {len(read_bytes)} bytes, expected {n} 
bytes")
+            raise EOFError(f"Read {read_len} bytes, expected {n} bytes")

Review Comment:
   I don't think this is quite correct. If read_len is 0 then `EOFError` is 
correct. But it is perfectly fine for a read to return fewer bytes than 
requested. From [RawIOBase.read 
docs](https://docs.python.org/3/library/io.html#io.RawIOBase.read):
   
   > Read up to size bytes from the object and return them. As a convenience, 
if size is unspecified or -1, all bytes until EOF are returned. Otherwise, only 
one system call is ever made. Fewer than size bytes may be returned if the 
operating system call returns fewer than size bytes.
   
   If fewer bytes are returned this can either loop and call read until the 
required bytes are returned ("read fully"), or it can return a shorter array. 
Since this is a higher-level API, I think it should have read fully behavior.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to