samredai commented on a change in pull request #4021:
URL: https://github.com/apache/iceberg/pull/4021#discussion_r799082313



##########
File path: python/src/iceberg/io/base.py
##########
@@ -24,7 +24,40 @@
 """
 
 from abc import ABC, abstractmethod
-from typing import Union
+from typing import Protocol, Union, runtime_checkable
+
+
+@runtime_checkable
+class InputStream(Protocol):
+    def read(self, n: int) -> bytes:
+        ...

Review comment:
       @emkornfield I think I follow. It sounds like you're saying that 
wherever the library uses FileIO instances, we have to support two kinds: those 
that produce file-like objects (which we pass into `PythonFile` and then into 
pyarrow) and those that produce `pyarrow.NativeFile` instances, which can be 
passed to pyarrow directly.
   
   In that case, instead of defining Protocols, should we just use type hints 
on the `InputFile` and `OutputFile` classes to specify that `open` and `create` 
can return either 
[typing.IO](https://docs.python.org/3/library/typing.html#typing.IO) or 
`pyarrow.NativeFile`?
   ```py
   from abc import ABC, abstractmethod
   from typing import IO, Union
   
   from pyarrow import NativeFile
   
   class InputFile(ABC):
       ...
       @abstractmethod
       def open(self) -> Union[IO, NativeFile]:
           ...
   
   class OutputFile(ABC):
       ...
       @abstractmethod
       def create(self) -> Union[IO, NativeFile]:
           ...
   ```
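
To make the idea a bit more concrete, here is a rough sketch of how a caller might consume that return type. `InMemoryInputFile` and `to_arrow_file` are hypothetical names used only for illustration and are not part of this PR. The sketch assumes pyarrow is installed whenever `to_arrow_file` is called, and it keeps the `NativeFile` import behind `TYPE_CHECKING` so that the module itself still imports without pyarrow.

```python
import io
from abc import ABC, abstractmethod
from typing import IO, TYPE_CHECKING, Union

if TYPE_CHECKING:
    # Only evaluated by type checkers, so pyarrow stays optional at import time.
    from pyarrow import NativeFile


class InputFile(ABC):
    @abstractmethod
    def open(self) -> "Union[IO, NativeFile]":
        """Return a readable file-like object or a pyarrow NativeFile."""


class InMemoryInputFile(InputFile):
    """Hypothetical implementation that returns a plain file-like object."""

    def __init__(self, data: bytes) -> None:
        self._data = data

    def open(self) -> IO:
        return io.BytesIO(self._data)


def to_arrow_file(stream: "Union[IO, NativeFile]") -> "NativeFile":
    """Hypothetical helper: pass a NativeFile through, wrap anything else."""
    import pyarrow as pa  # imported lazily; assumes pyarrow is installed

    if isinstance(stream, pa.NativeFile):
        return stream
    # PythonFile adapts a Python file-like object to pyarrow's file interface.
    return pa.PythonFile(stream, mode="r")
```

With something like this, the branching on the stream type would live in one place, and FileIO implementations could return whichever of the two they naturally produce.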




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


