[
https://issues.apache.org/jira/browse/ARROW-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901124#comment-16901124
]
Neal Richardson commented on ARROW-5977:
----------------------------------------
All of R's main CSV readers support this. One way they all expose it is by
letting you give a null type for some columns when you specify column types
explicitly. A couple of the readers also let you specify, by name or position,
which columns to keep or drop.
I think this is a good idea not just in the context of reading a CSV itself but
also for the Datasets framework, where we are lazily reading chunks of data as
needed and trying to be efficient with memory usage.
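
To make the two approaches above concrete, here is a minimal sketch in plain
Python using only the standard-library csv module. It is not pyarrow's or any
R reader's API: the function name read_csv_subset and its keep, drop, and
column_types parameters are hypothetical, chosen only to illustrate selecting
columns by name or position and skipping columns that are given a null type.

```python
import csv
import io


def read_csv_subset(source, keep=None, drop=None, column_types=None):
    """Read CSV from a file-like object, retaining only selected columns.

    Hypothetical illustration, not a real library API.
    keep / drop: column names (str) or zero-based positions (int).
    column_types: maps a column name to a converter; None means "skip it".
    Returns a dict mapping column name to a list of values.
    """
    if keep is not None and drop is not None:
        raise ValueError("pass either keep or drop, not both")
    column_types = column_types or {}

    reader = csv.reader(source)
    header = next(reader)

    def to_index(col):
        # Accept either a position or a column name.
        return col if isinstance(col, int) else header.index(col)

    if keep is not None:
        indices = [to_index(c) for c in keep]
    else:
        dropped = {to_index(c) for c in (drop or [])}
        indices = [i for i in range(len(header)) if i not in dropped]

    # A column explicitly typed as None (a "null type") is not read.
    indices = [
        i for i in indices
        if not (header[i] in column_types and column_types[header[i]] is None)
    ]
    # Columns without an explicit type are kept as strings.
    converters = [column_types.get(header[i], str) for i in indices]

    columns = {header[i]: [] for i in indices}
    for row in reader:
        # Only the selected fields are converted and stored.
        for i, convert in zip(indices, converters):
            columns[header[i]].append(convert(row[i]))
    return columns


DATA = "a,b,c\n1,x,2.5\n3,y,4.5\n"

# Keep by name and by position.
print(read_csv_subset(io.StringIO(DATA), keep=["a", 2]))
# Drop by name.
print(read_csv_subset(io.StringIO(DATA), drop=["b"]))
# Skip a column by giving it a null type.
print(read_csv_subset(io.StringIO(DATA),
                      column_types={"a": int, "b": None, "c": float}))
```

All three calls retain columns a and c and never store column b. Note that
this sketch still parses every field of every row; a real reader could go
further and skip type conversion and allocation for the dropped columns
entirely, which is where the memory savings would come from.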
> [C++] [Python] Method for read_csv to limit which columns are read?
> -------------------------------------------------------------------
>
> Key: ARROW-5977
> URL: https://issues.apache.org/jira/browse/ARROW-5977
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++, Python
> Affects Versions: 0.14.0
> Reporter: Jordan Samuels
> Priority: Major
> Labels: csv
>
> In pandas there is pd.read_csv(usecols=...) but I can't see a way to do this
> in pyarrow.