[jira] [Commented] (ARROW-12681) [Python] Expose IpcReadOptions to ipc facility

Joris Van den Bossche (Jira) Tue, 26 Oct 2021 05:29:11 -0700


    [ 
https://issues.apache.org/jira/browse/ARROW-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17434325#comment-17434325
 ]


Joris Van den Bossche commented on ARROW-12681:
-----------------------------------------------

In the context of ARROW-14470 for the Feather reader, we have been looking 
into IpcReadOptions.

Some observations / questions:

- For writing, we already expose IpcWriteOptions in Python, so exposing 
IpcReadOptions as well would be consistent with that, although I agree that 
adding a {{columns}} keyword would be more user friendly. 
- For the other readers we have, a {{columns}} keyword for reading only a 
subset of columns is typically exposed in the "read" function. For 
RecordBatchFileReader, however, the options are passed when opening the 
reader. So in the Python API it would be {{RecordBatchFileReader(source, 
columns=...).read_all()}} rather than 
{{RecordBatchFileReader(source).read_all(columns=...)}}. Are we OK with that 
discrepancy on the Python side? 
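
To make the two call shapes in the second point concrete, here is a toy 
sketch. It uses plain-Python stand-ins, not the pyarrow API: the class names 
{{OpenTimeReader}} and {{ReadTimeReader}} are hypothetical, and a dict of 
lists plays the role of the IPC file.

```python
from typing import Dict, List, Optional, Sequence

# Stand-in for an IPC file: column name -> values. This is NOT pyarrow.
Table = Dict[str, List[int]]


def _project(table: Table, columns: Optional[Sequence[str]]) -> Table:
    """Return only the requested columns, or all of them if columns is None."""
    if columns is None:
        return dict(table)
    return {name: table[name] for name in columns}


class OpenTimeReader:
    """Hypothetical reader: the column subset is fixed when the reader is opened."""

    def __init__(self, source: Table, columns: Optional[Sequence[str]] = None):
        self._source = source
        self._columns = columns

    def read_all(self) -> Table:
        return _project(self._source, self._columns)


class ReadTimeReader:
    """Hypothetical reader: the column subset is chosen on each read call."""

    def __init__(self, source: Table):
        self._source = source

    def read_all(self, columns: Optional[Sequence[str]] = None) -> Table:
        return _project(self._source, columns)


source = {"a": [1, 2], "b": [3, 4], "c": [5, 6]}

# Shape 1: options passed when opening the reader.
open_time = OpenTimeReader(source, columns=["a", "c"]).read_all()

# Shape 2: options passed to the read call, as in the other readers.
read_time = ReadTimeReader(source).read_all(columns=["a", "c"])
```

Both shapes return the same subset here. The difference is that with the 
first shape the selection applies to every later read from that reader, 
while the second allows a different subset per call.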

> [Python] Expose IpcReadOptions to ipc facility
> ----------------------------------------------
>
>                 Key: ARROW-12681
>                 URL: https://issues.apache.org/jira/browse/ARROW-12681
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Francois Saint-Jacques
>            Priority: Minor
>
> I would like to be able to read only a subset of columns from a given IPC 
> file. To do this, we need to expose the EXPERIMENTAL (is it still?) 
> IpcReadOptions.included_fields option. The file is on remote storage and 
> cannot be mmapped, so I want to minimize network transfer.
> I do not know the best way to "pythonize" IpcReadOptions and would need 
> help on this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
