kylebarron commented on code in PR #825:
URL: https://github.com/apache/datafusion-python/pull/825#discussion_r1731350333


##########
python/datafusion/context.py:
##########
@@ -586,19 +586,30 @@ def from_pydict(
         """
         return DataFrame(self.ctx.from_pydict(data, name))
 
-    def from_arrow_table(
-        self, data: pyarrow.Table, name: str | None = None
-    ) -> DataFrame:
-        """Create a :py:class:`~datafusion.dataframe.DataFrame` from an Arrow 
table.
+    def from_arrow(self, data: Any, name: str | None = None) -> DataFrame:
+        """Create a :py:class:`~datafusion.dataframe.DataFrame` from an Arrow 
source.
+
+        The Arrow data source can be any object that implements either
+        ``__arrow_c_stream__`` or ``__arrow_c_array__``. For the latter, it 
must return
+        a struct array. Common examples of sources from pyarrow include
 
         Args:
-            data: Arrow table.
+            data: Arrow data source.
             name: Name of the DataFrame.
 
         Returns:
             DataFrame representation of the Arrow table.
         """
-        return DataFrame(self.ctx.from_arrow_table(data, name))
+        return DataFrame(self.ctx.from_arrow(data, name))
+
+    def from_arrow_table(

Review Comment:
   Do you want to apply a `@deprecated` decorator to signify that it may be 
removed in the future?
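
For illustration, a minimal sketch of what that could look like, assuming a hand-rolled decorator built on the standard `warnings` module. The `deprecated` helper and the toy `Context` class are hypothetical stand-ins, not the project's actual code; `typing_extensions.deprecated` (or `warnings.deprecated` on Python 3.13+) would be an alternative to rolling your own.

```python
import functools
import warnings


def deprecated(reason):
    """Hypothetical helper: emit a DeprecationWarning on every call."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated: {reason}",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not the wrapper
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


class Context:
    """Toy stand-in for the session context; not the real datafusion class."""

    def from_arrow(self, data, name=None):
        # Placeholder return value so the forwarding is easy to observe.
        return ("from_arrow", data, name)

    @deprecated("use from_arrow() instead")
    def from_arrow_table(self, data, name=None):
        # The old name keeps working by forwarding to the new method.
        return self.from_arrow(data, name)
```

Existing callers of `from_arrow_table` would keep working and see a `DeprecationWarning` that points them to `from_arrow`.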



##########
python/datafusion/context.py:
##########
@@ -586,19 +586,30 @@ def from_pydict(
         """
         return DataFrame(self.ctx.from_pydict(data, name))
 
-    def from_arrow_table(
-        self, data: pyarrow.Table, name: str | None = None
-    ) -> DataFrame:
-        """Create a :py:class:`~datafusion.dataframe.DataFrame` from an Arrow 
table.
+    def from_arrow(self, data: Any, name: str | None = None) -> DataFrame:
+        """Create a :py:class:`~datafusion.dataframe.DataFrame` from an Arrow 
source.
+
+        The Arrow data source can be any object that implements either
+        ``__arrow_c_stream__`` or ``__arrow_c_array__``. For the latter, it 
must return
+        a struct array. Common examples of sources from pyarrow include

Review Comment:
   In both cases the source must emit struct arrays. Any Arrow array can be
passed through `__arrow_c_stream__`. Canonically, to transfer a DataFrame you
send a stream of struct arrays, each of which is unpacked into the columns of a
`RecordBatch`. But the stream doesn't have to carry struct arrays: you can also
transfer a `Series` through `__arrow_c_stream__`, where each batch in the stream
iterator is just a primitive array.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

