emkornfield commented on a change in pull request #4021:
URL: https://github.com/apache/iceberg/pull/4021#discussion_r798181988



##########
File path: python/src/iceberg/io/base.py
##########
@@ -24,7 +24,40 @@
 """
 
 from abc import ABC, abstractmethod
-from typing import Union
+from typing import Protocol, Union, runtime_checkable
+
+
+@runtime_checkable
+class InputStream(Protocol):
+    def read(self, n: int) -> bytes:
+        ...

Review comment:
       I don't think NativeFile is the right thing to return. NativeFile is 
Arrow's file abstraction, and most of its implementations are native C++. 
[PythonFile](https://arrow.apache.org/docs/python/generated/pyarrow.PythonFile.html?highlight=pythonfile#pyarrow.PythonFile)
 is the adapter from file-like Python objects to Arrow's file interface. I think 
IOBase probably guarantees the file-like interface PythonFile needs, and looking at the 
[source](https://github.com/apache/arrow/blob/e9e16c9da7a76718640f2b3f23200a3755790011/python/pyarrow/io.pxi#L679)
 it seems there are some isinstance checks for IOBase, with duck typing otherwise.
   
   What we probably want to do, in whatever code passes these interfaces to 
Arrow, is check whether the object is an instance of a wrapper around NativeFile 
and, if so, pass the native file instead. That avoids the GIL and the non-zero 
indirection costs mentioned in the PythonFile docs.
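
A rough sketch of the unwrapping check, in case it helps the discussion. Everything except `InputStream` is made up for illustration: `NativeFileWrapper`, its `native_file` attribute, and `unwrap_for_arrow` are hypothetical names, and `FakeNativeFile` is only a stand-in for `pyarrow.NativeFile` so that the snippet runs without pyarrow installed.

```python
import io
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class InputStream(Protocol):
    def read(self, n: int) -> bytes:
        ...


class FakeNativeFile:
    """Stand-in for pyarrow.NativeFile (assumption: used only to keep this runnable)."""

    def __init__(self, data: bytes) -> None:
        self._buf = io.BytesIO(data)

    def read(self, n: int) -> bytes:
        return self._buf.read(n)


class NativeFileWrapper:
    """Hypothetical wrapper: satisfies InputStream but keeps the native handle."""

    def __init__(self, native_file: FakeNativeFile) -> None:
        self.native_file = native_file

    def read(self, n: int) -> bytes:
        return self.native_file.read(n)


def unwrap_for_arrow(stream: InputStream) -> Any:
    """Pick the object to hand to Arrow.

    If the stream wraps a native file, return the native file itself so Arrow
    does not have to call back into Python for every read. Anything else is
    returned unchanged and would go through the file-like adapter path.
    """
    if isinstance(stream, NativeFileWrapper):
        return stream.native_file
    return stream
```

With something like this, the call site would pass `unwrap_for_arrow(stream)` to Arrow rather than `stream`, so only plain Python file-like objects pay the adapter cost.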




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


