[GitHub] [arrow] dongjoon-hyun commented on a change in pull request #12153: ARROW-15338: [Python] Add `pyarrow.orc.read_table` API

GitBox Tue, 18 Jan 2022 20:21:11 -0800


dongjoon-hyun commented on a change in pull request #12153:
URL: https://github.com/apache/arrow/pull/12153#discussion_r787338358




##########
File path: python/pyarrow/orc.py
##########
@@ -175,3 +176,33 @@ def write_table(table, where):
     writer = ORCWriter(where)
     writer.write(table)
     writer.close()
+
+
+def read_table(source, columns=None, filesystem=None):
+    """
+    Read a table from ORC format
+
+    Parameters
+    ----------
+    source : str, pyarrow.NativeFile, or file-like object
+        If a string passed, can be a single file name or directory name. For
+        file-like objects, only read a single file. Use pyarrow.BufferReader to
+        read a file contained in a bytes or buffer-like object.
+    columns : list
+        If not None, only these columns will be read from the file. A column
+        name may be a prefix of a nested field, e.g. 'a' will select 'a.b',
+        'a.c', and 'a.d.e'. If empty, no columns will be read. Note
+        that the table will still have the correct num_rows set despite having
+        no columns.

Review comment:
       Since it needs a fix on `ORCFile` itself, can I handle it later in a separate PR, @jorisvandenbossche?
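
       For reference, the prefix rule stated in the `columns` docstring above ('a' selects 'a.b', 'a.c', and 'a.d.e') can be sketched in plain Python. This is only an illustration under stated assumptions: `select_columns` and the list of dotted leaf paths are hypothetical names introduced here, and the code is not how `ORCFile` or this PR resolves columns.

```python
def select_columns(leaf_paths, columns=None):
    """Hypothetical helper, not part of pyarrow.

    Return the dotted leaf paths matched by `columns`, where a column
    name matches itself or any nested field underneath it.
    """
    if columns is None:
        # No selection given: keep every leaf field.
        return list(leaf_paths)
    selected = []
    for path in leaf_paths:
        for name in columns:
            # Match on whole path components, so 'a' matches 'a.b'
            # but does not match an unrelated field such as 'ab'.
            if path == name or path.startswith(name + "."):
                selected.append(path)
                break
    return selected


# Assumed example schema, flattened to dotted leaf paths.
leaf_paths = ["a.b", "a.c", "a.d.e", "ab", "x"]
print(select_columns(leaf_paths, ["a"]))
print(select_columns(leaf_paths, []))
```

       An empty list selects nothing, which corresponds to the "no columns will be read" case in the docstring; the sketch says nothing about how `num_rows` is preserved in that case.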




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

