[ 
https://issues.apache.org/jira/browse/ARROW-17313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17575600#comment-17575600
 ] 

Antoine Pitrou commented on ARROW-17313:
----------------------------------------

That said, it can also be implemented in a "dumb" way: call 
{{RandomAccessFile::GetStream}} to get a subset of the input file, and let the 
CSV reader try to make sense of it.
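
To illustrate what "let the CSV reader try to make sense of it" means in
practice, here is a minimal sketch using only the Python standard library.
It is not Arrow code: slicing a bytes object stands in for
{{RandomAccessFile::GetStream}}, the csv module stands in for the Arrow CSV
reader, and the name read_raw_range is hypothetical.

```python
import csv
import io


def read_raw_range(data: bytes, offset: int, nbytes: int) -> list[list[str]]:
    """Parse bytes [offset, offset + nbytes) as CSV without any alignment."""
    # Stand-in for a sub-stream over part of the file (assumption: ASCII/UTF-8
    # input whose slice boundaries do not split a multi-byte character).
    stream = io.BytesIO(data[offset:offset + nbytes])
    text = io.TextIOWrapper(stream, encoding="utf-8", newline="")
    return list(csv.reader(text))


data = b"a,b\n1,2\n3,4\n5,6\n"

# The slice starts and ends in the middle of a row, so the first and last
# parsed rows are fragments of real rows.
rows = read_raw_range(data, 5, 8)
print(rows)
```

With an unaligned range, the first and last rows come back truncated, so a
caller taking this route has to either choose aligned offsets or discard the
partial rows itself.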

> Add Byte Range to CSV Reader ReadOptions
> ----------------------------------------
>
>                 Key: ARROW-17313
>                 URL: https://issues.apache.org/jira/browse/ARROW-17313
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Python
>            Reporter: Ziheng Wang
>            Assignee: Ziheng Wang
>            Priority: Major
>
> Sometimes it's desirable to just read a portion of a CSV. The best way to do 
> that is to pass in a list of byte ranges to CSV read options that specify 
> where in the CSV you want to read. These byte ranges don't necessarily have 
> to be aligned on line break boundaries; the CSV reader should just read until 
> the end of the line, and skip anything before the first line break in a byte 
> range.  
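
The alignment rule described in the issue can be sketched as follows, using
only the Python standard library. This is an illustration under stated
assumptions, not the Arrow implementation: the names aligned_slice and
read_range are hypothetical, a range starting at offset 0 is assumed to keep
its first line, and line breaks inside quoted fields are not handled.

```python
import csv
import io


def aligned_slice(data: bytes, start: int, end: int) -> bytes:
    """Return the whole lines selected by the byte range [start, end)."""
    if start > 0:
        # Skip everything up to and including the first line break at or
        # after the range start.
        first_break = data.find(b"\n", start)
        if first_break == -1:
            return b""
        start = first_break + 1
    # Read past the range end until the current line is complete.
    last_break = data.find(b"\n", end)
    end = len(data) if last_break == -1 else last_break + 1
    return data[start:max(start, end)]


def read_range(data: bytes, start: int, end: int) -> list[list[str]]:
    """Parse the aligned slice as CSV rows."""
    text = aligned_slice(data, start, end).decode("utf-8")
    return list(csv.reader(io.StringIO(text, newline="")))


data = b"a,b\n1,2\n3,4\n5,6\n"

# Two adjacent, unaligned ranges together yield every row exactly once.
first = read_range(data, 0, 6)
second = read_range(data, 6, len(data))
print(first, second)
```

Under these assumptions, each row is assigned to the range that contains the
line break preceding it, so adjacent ranges neither drop nor duplicate rows.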



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
