[ 
https://issues.apache.org/jira/browse/ARROW-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839627#comment-16839627
 ] 

Chao Sun edited comment on ARROW-5317 at 5/14/19 5:19 PM:
----------------------------------------------------------

[~wesmckinn], [~andygrove] could you add [~FabioBatSilva] to the contributor 
list so we can assign this Jira to him? Thanks.


was (Author: csun):
[~wesmckinn] @andygrove: could you add [~FabioBatSilva] into the contributor 
list so we can assign this Jira to him? Thanks.

> [Rust] [Parquet] impl IntoIterator for SerializedFileReader
> -----------------------------------------------------------
>
>                 Key: ARROW-5317
>                 URL: https://issues.apache.org/jira/browse/ARROW-5317
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Rust
>            Reporter: Fabio Batista da Silva
>            Priority: Minor
>
> This is a follow up to [https://github.com/apache/arrow/issues/4301].
> The current row iterator, *RowIter*, borrows the *FileReader*, so the user 
> has to keep the file reader alive for as long as the iterator is alive.
> This makes it hard to iterate over multiple *FileReader* / *RowIter* 
> instances.
> {code:java}
> fn main() {
>     let path1 = Path::new("path-to/1.snappy.parquet");
>     let path2 = Path::new("path-to/2.snappy.parquet");
>     let vec = vec![path1, path2];
>     let it = vec.iter()
>         .map(|p| {
>             File::open(p).unwrap()
>         })
>         .map(|f| {
>             SerializedFileReader::new(f).unwrap()
>         })
>         .flat_map(|reader| -> RowIter {
>             RowIter::from_file(None, &reader).unwrap()
> //|             |                        |
> //|             |                        `reader` is borrowed here
> //|             returns a value referencing data owned by the current function
>         })
>     ;
>     for r in it {
>         println!("{}", r);
>     }
> }
> {code}
> One solution could be to implement a row iterator that takes ownership of 
> the reader, perhaps by implementing *std::iter::IntoIterator* for 
> *SerializedFileReader*:
> {code:java}
> ....
> .map(|p| {
>     File::open(p).unwrap()
> })
> .map(|f| {
>     SerializedFileReader::new(f).unwrap()
> })
> .flat_map(|r| r.into_iter())
> ....
> {code}
>  
> Happy to put up a PR for this.
>  Please let me know if this makes sense, or if you already have some way of 
> doing this.

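For illustration, the owning-iterator idea from the description could look roughly like the sketch below. Note that `Reader`, `OwnedRowIter`, and `all_rows` are hypothetical stand-ins that only model the ownership pattern with in-memory rows. They are not the parquet crate's types or API, and a real implementation for *SerializedFileReader* would likely differ.

```rust
// Hypothetical stand-in for a file reader: it simply owns its rows.
// This is NOT the parquet crate's SerializedFileReader.
pub struct Reader {
    rows: Vec<String>,
}

impl Reader {
    pub fn new(rows: Vec<String>) -> Self {
        Reader { rows }
    }
}

// Hypothetical owning iterator: it holds the reader by value, so no borrow
// has to outlive the closure that created the reader.
pub struct OwnedRowIter {
    reader: Reader,
    next: usize,
}

impl Iterator for OwnedRowIter {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        // Clone the row at the current position, if there is one.
        let row = self.reader.rows.get(self.next).cloned();
        if row.is_some() {
            self.next += 1;
        }
        row
    }
}

// Consuming the reader yields the owning iterator.
impl IntoIterator for Reader {
    type Item = String;
    type IntoIter = OwnedRowIter;

    fn into_iter(self) -> OwnedRowIter {
        OwnedRowIter { reader: self, next: 0 }
    }
}

// Chains the rows of several readers. Each reader is moved into its
// iterator, so the closure returns nothing that borrows a local value.
pub fn all_rows(readers: Vec<Reader>) -> Vec<String> {
    readers.into_iter().flat_map(|r| r.into_iter()).collect()
}
```

Because each iterator owns its reader, the `flat_map` closure compiles without the "returns a value referencing data owned by the current function" error shown in the description.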


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
