[ 
https://issues.apache.org/jira/browse/ARROW-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761432#comment-16761432
 ] 

Wes McKinney commented on ARROW-1089:
-------------------------------------

This can be addressed once we are able to stream through Parquet datasets. The 
next obvious ask would be to stream them to IPC format on disk, then memory-map 
the result.

> [C++/Python] Add API to write an Arrow stream into either the stream or file 
> formats on disk
> --------------------------------------------------------------------------------------------
>
>                 Key: ARROW-1089
>                 URL: https://issues.apache.org/jira/browse/ARROW-1089
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++, Python
>            Reporter: Wes McKinney
>            Priority: Major
>             Fix For: 0.14.0
>
>
> For Arrow streams with unknown size, it would be useful to be able to write 
> the data to disk either as a stream or as the file format (for random access) 
> with minimal overhead; i.e. we would avoid record batch IPC loading and write 
> the raw messages directly to disk.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
