[jira] [Commented] (ARROW-8258) [Rust] [Parquet] ArrowReader fails on some timestamp types
[ https://issues.apache.org/jira/browse/ARROW-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17070666#comment-17070666 ]

Renjie Liu commented on ARROW-8258:

I think the root cause is here: [https://github.com/apache/arrow/blob/master/rust/parquet/src/arrow/array_reader.rs#L220]

The array reader converts only the data buffer and leaves the data type incorrect. I'll submit a PR to fix it this week.

> [Rust] [Parquet] ArrowReader fails on some timestamp types
>
> Key: ARROW-8258
> URL: https://issues.apache.org/jira/browse/ARROW-8258
> Project: Apache Arrow
> Issue Type: Bug
> Components: Rust
> Reporter: Andy Grove
> Assignee: Andy Grove
> Priority: Major
> Fix For: 0.17.0
>
> I discovered this bug with this query:
> {code:java}
> > SELECT tpep_pickup_datetime FROM taxi LIMIT 1;
> General("InvalidArgumentError(\"column types must match schema types, expected Timestamp(Microsecond, None) but found UInt64 at column index 0\")")
> {code}
> The parquet reader detects this schema when reading from the file:
> {code:java}
> Schema {
>     fields: [
>         Field { name: "tpep_pickup_datetime", data_type: Timestamp(Microsecond, None), nullable: true, dict_id: 0, dict_is_ordered: false }
>     ],
>     metadata: {}
> }
> {code}
> The struct array read from the file contains:
> {code:java}
> [PrimitiveArray
> [
>   156731800800,
>   156731935700,
>   156732009200,
>   156732115100,
> {code}
> When the Parquet arrow reader creates the record batch, the following validation logic fails:
> {code:java}
> for i in 0..columns.len() {
>     if columns[i].len() != len {
>         return Err(ArrowError::InvalidArgumentError(
>             "all columns in a record batch must have the same length".to_string(),
>         ));
>     }
>     if columns[i].data_type() != schema.field(i).data_type() {
>         return Err(ArrowError::InvalidArgumentError(format!(
>             "column types must match schema types, expected {:?} but found {:?} at column index {}",
>             schema.field(i).data_type(),
>             columns[i].data_type(),
>             i)));
>     }
> }
> {code}

--
This message was sent by Atlassian Jira (v8.3.4#803005)
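The mismatch reported in ARROW-8258 can be reproduced in miniature. The sketch below is not the Arrow code: `DataType`, `TimeUnit`, `Column`, `validate` and `relabel` are hypothetical stand-ins. It assumes only what the thread states, namely that the values were converted while the column kept the `UInt64` label. `relabel` illustrates the idea of the fix mentioned in the comment and is not the actual patch.

```rust
/// Hypothetical, simplified stand-ins for Arrow's type descriptors.
#[derive(Debug, Clone, PartialEq)]
pub enum TimeUnit {
    Microsecond,
}

#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    UInt64,
    Timestamp(TimeUnit, Option<String>),
}

/// Hypothetical column: raw values plus the type label the reader attached.
pub struct Column {
    pub data_type: DataType,
    pub values: Vec<u64>,
}

/// Same two checks as the validation loop quoted in the issue.
pub fn validate(schema: &[DataType], columns: &[Column], len: usize) -> Result<(), String> {
    for (i, (column, expected)) in columns.iter().zip(schema).enumerate() {
        if column.values.len() != len {
            return Err("all columns in a record batch must have the same length".to_string());
        }
        if &column.data_type != expected {
            return Err(format!(
                "column types must match schema types, expected {:?} but found {:?} at column index {}",
                expected, column.data_type, i
            ));
        }
    }
    Ok(())
}

/// Keeps the converted values and attaches the type the schema expects.
pub fn relabel(column: Column, data_type: DataType) -> Column {
    Column { data_type, values: column.values }
}
```

With a `UInt64` label the check fails with the message from the report; after `relabel` the same values pass.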
[jira] [Commented] (ARROW-3085) [Rust] Add an adapter for parquet.
[ https://issues.apache.org/jira/browse/ARROW-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994054#comment-16994054 ]

Renjie Liu commented on ARROW-3085:

Yes, I think so.

> [Rust] Add an adapter for parquet.
>
> Key: ARROW-3085
> URL: https://issues.apache.org/jira/browse/ARROW-3085
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Rust
> Reporter: Renjie Liu
> Assignee: Renjie Liu
> Priority: Major
> Labels: parquet
[jira] [Commented] (ARROW-4059) [Rust] Parquet/Arrow Integration
[ https://issues.apache.org/jira/browse/ARROW-4059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994052#comment-16994052 ]

Renjie Liu commented on ARROW-4059:

[~nevi_me] No, we can close this now.

> [Rust] Parquet/Arrow Integration
>
> Key: ARROW-4059
> URL: https://issues.apache.org/jira/browse/ARROW-4059
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Rust
> Reporter: Chao Sun
> Priority: Major
>
> This is the umbrella JIRA for implementing Parquet/Arrow integration.
[jira] [Created] (ARROW-7348) [Rust] Add api to return references of buffer of null bitmap.
Renjie Liu created ARROW-7348:

Summary: [Rust] Add api to return references of buffer of null bitmap.
Key: ARROW-7348
URL: https://issues.apache.org/jira/browse/ARROW-7348
Project: Apache Arrow
Issue Type: Improvement
Reporter: Renjie Liu
Assignee: Renjie Liu
[jira] [Created] (ARROW-7312) [Rust] ArrowError should implement std::error::Error
Renjie Liu created ARROW-7312:

Summary: [Rust] ArrowError should implement std::error::Error
Key: ARROW-7312
URL: https://issues.apache.org/jira/browse/ARROW-7312
Project: Apache Arrow
Issue Type: Improvement
Reporter: Renjie Liu
Assignee: Renjie Liu

ArrowError should implement this trait so that other crates can handle errors from this crate more easily.
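What the request in ARROW-7312 buys downstream crates can be sketched as follows. The enum here is a hypothetical one-variant stand-in, not the real `ArrowError`, and `check_same_len` and `total_len` are invented for illustration. Once `Display` and `std::error::Error` are implemented, the `?` operator can box the error into `Box<dyn Error>` without any glue code in the caller.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical one-variant stand-in for the crate's error type.
#[derive(Debug)]
pub enum ArrowError {
    InvalidArgumentError(String),
}

impl fmt::Display for ArrowError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ArrowError::InvalidArgumentError(msg) => write!(f, "Invalid argument error: {}", msg),
        }
    }
}

/// `Debug` + `Display` are the only requirements; the trait's methods all have defaults.
impl Error for ArrowError {}

/// Library-side function that returns the concrete error type.
pub fn check_same_len(a: &[i32], b: &[i32]) -> Result<(), ArrowError> {
    if a.len() != b.len() {
        return Err(ArrowError::InvalidArgumentError(format!(
            "length mismatch: {} vs {}",
            a.len(),
            b.len()
        )));
    }
    Ok(())
}

/// Caller in another crate: `?` converts ArrowError into Box<dyn Error>.
pub fn total_len(a: &[i32], b: &[i32]) -> Result<usize, Box<dyn Error>> {
    check_same_len(a, b)?;
    Ok(a.len() + b.len())
}
```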
[jira] [Commented] (ARROW-6890) [Rust] [Parquet] ArrowReader fails with seg fault
[ https://issues.apache.org/jira/browse/ARROW-6890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987900#comment-16987900 ]

Renjie Liu commented on ARROW-6890:

[~andygrove] Are you going to retry with the new version of the arrow reader?

> [Rust] [Parquet] ArrowReader fails with seg fault
>
> Key: ARROW-6890
> URL: https://issues.apache.org/jira/browse/ARROW-6890
> Project: Apache Arrow
> Issue Type: Bug
> Components: Rust
> Affects Versions: 1.0.0
> Reporter: Andy Grove
> Assignee: Renjie Liu
> Priority: Major
> Fix For: 1.0.0
>
> ArrowReader fails with a seg fault when trying to read an unsupported type, like Utf8. We should have it return an Err instead of causing a segmentation fault.
>
> See [https://github.com/apache/arrow/pull/5641] for a reproducible test.
[jira] [Commented] (ARROW-3706) [Rust] Add record batch reader trait.
[ https://issues.apache.org/jira/browse/ARROW-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982219#comment-16982219 ]

Renjie Liu commented on ARROW-3706:

[~nevi_me] Yes, please close it.

> [Rust] Add record batch reader trait.
>
> Key: ARROW-3706
> URL: https://issues.apache.org/jira/browse/ARROW-3706
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Rust
> Reporter: Renjie Liu
> Assignee: Renjie Liu
> Priority: Major
>
> Add a RecordBatchReader trait.
[jira] [Created] (ARROW-7113) [Rust] Buffer should accept memory owned by others
Renjie Liu created ARROW-7113:

Summary: [Rust] Buffer should accept memory owned by others
Key: ARROW-7113
URL: https://issues.apache.org/jira/browse/ARROW-7113
Project: Apache Arrow
Issue Type: Improvement
Reporter: Renjie Liu
Assignee: Renjie Liu

Currently the Rust Buffer always assumes that the memory passed to it is owned by the Buffer itself, and frees that memory when the Buffer is dropped. This is inconvenient in cross-language environments such as JNI.
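One way to picture the change requested in ARROW-7113 is an ownership flag that `Drop` consults. This is a minimal sketch under that assumption, not the design the project adopted. The `Buffer` below is a hypothetical stand-in, and `from_owned`, `from_unowned` and `as_slice` are invented names.

```rust
use std::ptr;
use std::slice;

/// Hypothetical stand-in for an Arrow buffer.
pub struct Buffer {
    ptr: *mut u8,
    len: usize,
    /// When false, the memory belongs to someone else (for example a JVM)
    /// and must not be freed here.
    owned: bool,
}

impl Buffer {
    /// Takes ownership of the bytes; they are freed when the buffer is dropped.
    pub fn from_owned(data: Vec<u8>) -> Self {
        let boxed: Box<[u8]> = data.into_boxed_slice();
        let len = boxed.len();
        let ptr = Box::into_raw(boxed) as *mut u8;
        Buffer { ptr, len, owned: true }
    }

    /// Wraps memory owned by the caller.
    ///
    /// # Safety
    /// `ptr` must stay valid for reads of `len` bytes for as long as the
    /// returned buffer is alive.
    pub unsafe fn from_unowned(ptr: *mut u8, len: usize) -> Self {
        Buffer { ptr, len, owned: false }
    }

    pub fn as_slice(&self) -> &[u8] {
        // Sound as long as the constructor contracts above were respected.
        unsafe { slice::from_raw_parts(self.ptr, self.len) }
    }
}

impl Drop for Buffer {
    fn drop(&mut self) {
        if self.owned {
            // Rebuild the boxed slice so that its allocation is released.
            unsafe {
                drop(Box::from_raw(ptr::slice_from_raw_parts_mut(self.ptr, self.len)));
            }
        }
        // Unowned memory is left untouched; its real owner frees it.
    }
}
```

Dropping a buffer created with `from_unowned` leaves the caller's allocation intact, which is the behaviour a JNI caller would need.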
[jira] [Commented] (ARROW-6971) [Rust] Replace "RecordBatchReader" with "BatchIterator"
[ https://issues.apache.org/jira/browse/ARROW-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957506#comment-16957506 ]

Renjie Liu commented on ARROW-6971:

I have discussed this with [~andygrove]. RecordBatchReader can't implement Sync + Send because it uses some unsafe techniques.

> [Rust] Replace "RecordBatchReader" with "BatchIterator"
>
> Key: ARROW-6971
> URL: https://issues.apache.org/jira/browse/ARROW-6971
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Rust, Rust - DataFusion
> Affects Versions: 0.15.0
> Reporter: Paddy Horan
> Assignee: Paddy Horan
> Priority: Minor
> Fix For: 1.0.0
>
> As part of the recent reader work we introduced
> {code:java}
> // arrow::record_batch::RecordBatchReader{code}
> but in datafusion we have
> {code:java}
> // datafusion::physical_plan::BatchIterator
> {code}
> These two traits are almost identical (BatchIterator implements Send + Sync whereas RecordBatchReader does not). I propose we replace RecordBatchReader with BatchIterator (i.e. move it to arrow, as it's generally useful outside of datafusion) and update parquet and datafusion accordingly.
> [~andygrove] [~liurenjie1024] do you see any issues with this?
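The practical difference between the two traits is the `Send + Sync` bound. The sketch below uses hypothetical types: `Batch`, `VecBatches`, `next_batch` and `sum_on_worker` are invented, and the trait is not the DataFusion definition. It shows what the bound allows: a boxed reader can be moved to another thread. A reader built on types that are not thread-safe cannot satisfy the bound, which is the obstacle the comment describes.

```rust
use std::thread;

/// Hypothetical batch type standing in for a record batch.
pub struct Batch(pub Vec<i64>);

/// Hypothetical trait modeled on the description: the supertrait bound is
/// what lets a boxed reader cross thread boundaries.
pub trait BatchIterator: Send + Sync {
    fn next_batch(&mut self) -> Option<Batch>;
}

/// Simple in-memory implementation used for the demonstration.
pub struct VecBatches {
    pub batches: Vec<Batch>,
}

impl BatchIterator for VecBatches {
    fn next_batch(&mut self) -> Option<Batch> {
        if self.batches.is_empty() {
            None
        } else {
            Some(self.batches.remove(0))
        }
    }
}

/// Moves the reader to a worker thread and sums every value it yields.
pub fn sum_on_worker(mut reader: Box<dyn BatchIterator>) -> i64 {
    thread::spawn(move || {
        let mut total = 0;
        while let Some(batch) = reader.next_batch() {
            total += batch.0.iter().sum::<i64>();
        }
        total
    })
    .join()
    .expect("worker thread panicked")
}
```

Without `Send` in the trait definition, the `thread::spawn` call would not compile for `Box<dyn BatchIterator>`.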
[jira] [Created] (ARROW-6948) [Rust] [Parquet] Fix bool array support in arrow reader.
Renjie Liu created ARROW-6948:

Summary: [Rust] [Parquet] Fix bool array support in arrow reader.
Key: ARROW-6948
URL: https://issues.apache.org/jira/browse/ARROW-6948
Project: Apache Arrow
Issue Type: Bug
Reporter: Renjie Liu
Assignee: Renjie Liu
[jira] [Commented] (ARROW-4392) [Rust] Implement high-level Parquet writer
[ https://issues.apache.org/jira/browse/ARROW-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949180#comment-16949180 ]

Renjie Liu commented on ARROW-4392:

[~andygrove] Yes, it looks interesting to me. I'll take this if [~csun] is not available.

> [Rust] Implement high-level Parquet writer
>
> Key: ARROW-4392
> URL: https://issues.apache.org/jira/browse/ARROW-4392
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Rust
> Reporter: Chao Sun
> Assignee: Chao Sun
> Priority: Major
> Fix For: 1.0.0
>
> We only have a low-level parquet writer at the moment, which requires the user to specify values, definition levels, repetition levels, etc. This is inconvenient. Instead, we should offer a high-level Parquet writer that hides these details.
[jira] [Created] (ARROW-6845) Setup process to generate random data for integration tests
Renjie Liu created ARROW-6845:

Summary: Setup process to generate random data for integration tests
Key: ARROW-6845
URL: https://issues.apache.org/jira/browse/ARROW-6845
Project: Apache Arrow
Issue Type: Improvement
Reporter: Renjie Liu
Assignee: Renjie Liu
[jira] [Commented] (ARROW-6774) [Rust] Reading parquet file is slow
[ https://issues.apache.org/jira/browse/ARROW-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16945027#comment-16945027 ]

Renjie Liu commented on ARROW-6774:

This is part of a reader for reading parquet files into arrow arrays. It's almost complete, and we still have one PR ([https://github.com/apache/arrow/pull/5523]) waiting for review, which contains documentation and examples.

> [Rust] Reading parquet file is slow
>
> Key: ARROW-6774
> URL: https://issues.apache.org/jira/browse/ARROW-6774
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Rust
> Affects Versions: 0.15.0
> Reporter: Adam Lippai
> Priority: Major
>
> Using the example at [https://github.com/apache/arrow/tree/master/rust/parquet] is slow.
> The snippet
> {code:none}
> let reader = SerializedFileReader::new(file).unwrap();
> let mut iter = reader.get_row_iter(None).unwrap();
> let start = Instant::now();
> while let Some(record) = iter.next() {}
> let duration = start.elapsed();
> println!("{:?}", duration);
> {code}
> runs for 17 seconds on a ~160 MB parquet file.
> If there is a more efficient way to load a parquet file, it would be nice to add it to the readme.
> P.S.: My goal is to construct an ndarray from it; I'd be happy for any tips.
[jira] [Commented] (ARROW-6700) [Rust] [DataFusion] Use new parquet arrow reader
[ https://issues.apache.org/jira/browse/ARROW-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938198#comment-16938198 ]

Renjie Liu commented on ARROW-6700:

[~andygrove] Could you assign this ticket to me?

> [Rust] [DataFusion] Use new parquet arrow reader
>
> Key: ARROW-6700
> URL: https://issues.apache.org/jira/browse/ARROW-6700
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Rust, Rust - DataFusion
> Reporter: Andy Grove
> Priority: Major
> Fix For: 1.0.0
>
> Once [https://github.com/apache/arrow/pull/5378] is merged, DataFusion should be updated to use this new array reader support instead of the current parquet reader code in the DataFusion crate.
[jira] [Created] (ARROW-6069) [Rust] [Parquet] Implement Converter to convert record reader to arrow primitive array.
Renjie Liu created ARROW-6069:

Summary: [Rust] [Parquet] Implement Converter to convert record reader to arrow primitive array.
Key: ARROW-6069
URL: https://issues.apache.org/jira/browse/ARROW-6069
Project: Apache Arrow
Issue Type: Sub-task
Reporter: Renjie Liu
Assignee: Renjie Liu
[jira] [Created] (ARROW-5901) [Rust] Implement PartialEq to compare array and json values
Renjie Liu created ARROW-5901:

Summary: [Rust] Implement PartialEq to compare array and json values
Key: ARROW-5901
URL: https://issues.apache.org/jira/browse/ARROW-5901
Project: Apache Arrow
Issue Type: New Feature
Reporter: Renjie Liu
Assignee: Renjie Liu

Useful in tests.
[jira] [Created] (ARROW-5823) [Rust] Fix build break.
Renjie Liu created ARROW-5823:

Summary: [Rust] Fix build break.
Key: ARROW-5823
URL: https://issues.apache.org/jira/browse/ARROW-5823
Project: Apache Arrow
Issue Type: Bug
Reporter: Renjie Liu
Assignee: Renjie Liu

The Rust build breaks because of some changes in the array builder. However, the CI scripts do not detect this error because --all-targets is missing from the cargo build command.
[jira] [Created] (ARROW-5792) [Rust] [Parquet] A visitor trait for parquet types.
Renjie Liu created ARROW-5792:

Summary: [Rust] [Parquet] A visitor trait for parquet types.
Key: ARROW-5792
URL: https://issues.apache.org/jira/browse/ARROW-5792
Project: Apache Arrow
Issue Type: New Feature
Reporter: Renjie Liu
Assignee: Renjie Liu

Useful in dealing with parquet types.
[jira] [Created] (ARROW-5755) [Rust] [Parquet] Add derived clone for Type
Renjie Liu created ARROW-5755:

Summary: [Rust] [Parquet] Add derived clone for Type
Key: ARROW-5755
URL: https://issues.apache.org/jira/browse/ARROW-5755
Project: Apache Arrow
Issue Type: New Feature
Reporter: Renjie Liu
Assignee: Renjie Liu

Add clone for Type.
[jira] [Created] (ARROW-5463) [Rust] Implement AsRef for Buffer
Renjie Liu created ARROW-5463:

Summary: [Rust] Implement AsRef for Buffer
Key: ARROW-5463
URL: https://issues.apache.org/jira/browse/ARROW-5463
Project: Apache Arrow
Issue Type: New Feature
Components: Rust
Reporter: Renjie Liu
Assignee: Renjie Liu

Implement AsRef ArrowNativeType for Buffer
[jira] [Created] (ARROW-5316) [Rust] Interfaces for gandiva bindings.
Renjie Liu created ARROW-5316:

Summary: [Rust] Interfaces for gandiva bindings.
Key: ARROW-5316
URL: https://issues.apache.org/jira/browse/ARROW-5316
Project: Apache Arrow
Issue Type: Sub-task
Reporter: Renjie Liu
Assignee: Renjie Liu

Create interfaces to demonstrate high level design and ideas.
[jira] [Created] (ARROW-5315) [Rust] Gandiva binding.
Renjie Liu created ARROW-5315:

Summary: [Rust] Gandiva binding.
Key: ARROW-5315
URL: https://issues.apache.org/jira/browse/ARROW-5315
Project: Apache Arrow
Issue Type: New Feature
Reporter: Renjie Liu

Add gandiva binding for rust.
[jira] [Created] (ARROW-5298) [Rust] Add debug implementation for Buffer
Renjie Liu created ARROW-5298:

Summary: [Rust] Add debug implementation for Buffer
Key: ARROW-5298
URL: https://issues.apache.org/jira/browse/ARROW-5298
Project: Apache Arrow
Issue Type: Improvement
Components: Rust
Reporter: Renjie Liu
Assignee: Renjie Liu

Default debug implementation is not good enough for debugging.
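For readers unfamiliar with the mechanics behind ARROW-5298, a hand-written `Debug` implementation looks roughly like this. The ticket does not say which fields the real implementation prints, so the `Buffer` type below and the choice of length plus contents are assumptions made purely for illustration.

```rust
use std::fmt;

/// Hypothetical stand-in; the real Buffer has a different layout.
pub struct Buffer {
    data: Vec<u8>,
}

impl Buffer {
    pub fn new(data: Vec<u8>) -> Self {
        Buffer { data }
    }
}

impl fmt::Debug for Buffer {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Print the length first, then the bytes themselves as a list.
        write!(f, "Buffer {{ len: {}, data: ", self.data.len())?;
        f.debug_list().entries(self.data.iter()).finish()?;
        write!(f, " }}")
    }
}
```

Formatting `Buffer::new(vec![1, 2, 3])` with `{:?}` yields `Buffer { len: 3, data: [1, 2, 3] }`.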
[jira] [Created] (ARROW-5281) [Rust] [Parquet] Move DataPageBuilder to test_common
Renjie Liu created ARROW-5281:

Summary: [Rust] [Parquet] Move DataPageBuilder to test_common
Key: ARROW-5281
URL: https://issues.apache.org/jira/browse/ARROW-5281
Project: Apache Arrow
Issue Type: Improvement
Reporter: Renjie Liu
Assignee: Renjie Liu

DataPageBuilder is a helpful tool for mocking test page data. It's worth moving it to test_common so that other parts can reuse it.
[jira] [Updated] (ARROW-5281) [Rust] [Parquet] Move DataPageBuilder to test_common
[ https://issues.apache.org/jira/browse/ARROW-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Renjie Liu updated ARROW-5281:

Component/s: Rust

> [Rust] [Parquet] Move DataPageBuilder to test_common
>
> Key: ARROW-5281
> URL: https://issues.apache.org/jira/browse/ARROW-5281
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Rust
> Reporter: Renjie Liu
> Assignee: Renjie Liu
> Priority: Major
>
> DataPageBuilder is a helpful tool for mocking test page data. It's worth moving it to test_common so that other parts can reuse it.
[jira] [Created] (ARROW-5162) [Rust] [Parquet] Rename mod reader to arrow.
Renjie Liu created ARROW-5162:

Summary: [Rust] [Parquet] Rename mod reader to arrow.
Key: ARROW-5162
URL: https://issues.apache.org/jira/browse/ARROW-5162
Project: Apache Arrow
Issue Type: Improvement
Reporter: Renjie Liu
Assignee: Renjie Liu

Rename mod to arrow.
[jira] [Created] (ARROW-5127) [Rust] [Parquet] Add page iterator
Renjie Liu created ARROW-5127:

Summary: [Rust] [Parquet] Add page iterator
Key: ARROW-5127
URL: https://issues.apache.org/jira/browse/ARROW-5127
Project: Apache Arrow
Issue Type: Sub-task
Reporter: Renjie Liu
Assignee: Renjie Liu

Adds a page iterator for column reader.
[jira] [Created] (ARROW-5126) [Rust] [Parquet] Convert parquet column desc to arrow data type
Renjie Liu created ARROW-5126:

Summary: [Rust] [Parquet] Convert parquet column desc to arrow data type
Key: ARROW-5126
URL: https://issues.apache.org/jira/browse/ARROW-5126
Project: Apache Arrow
Issue Type: New Feature
Components: Rust
Reporter: Renjie Liu
Assignee: Renjie Liu
[jira] [Created] (ARROW-4634) [Rust] [Parquet] Reorganize test_common mod to allow more test util codes.
Renjie Liu created ARROW-4634:

Summary: [Rust] [Parquet] Reorganize test_common mod to allow more test util codes.
Key: ARROW-4634
URL: https://issues.apache.org/jira/browse/ARROW-4634
Project: Apache Arrow
Issue Type: Improvement
Components: Rust
Reporter: Renjie Liu
Assignee: Renjie Liu

Currently the test_common mod is just one file, and it may become messy as we add more test utils to it. I propose making test_common a directory with multiple sub-mods.
[jira] [Created] (ARROW-4525) [Rust] [Parquet] Convert ArrowError to ParquetError
Renjie Liu created ARROW-4525:

Summary: [Rust] [Parquet] Convert ArrowError to ParquetError
Key: ARROW-4525
URL: https://issues.apache.org/jira/browse/ARROW-4525
Project: Apache Arrow
Issue Type: Bug
Components: Rust
Reporter: Renjie Liu
Assignee: Renjie Liu

We need to enable conversion from ArrowError to ParquetError. This is useful when integrating arrow with parquet, e.g. when reading parquet data into arrow.
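The conversion asked for in ARROW-4525 is conventionally expressed with a `From` implementation, which is what lets `?` translate one error type into the other. A minimal sketch, assuming hypothetical one-variant versions of both enums (the real types have more variants, and `build_array` and `read_column` are invented for illustration):

```rust
/// Hypothetical one-variant stand-in for the Arrow error type.
#[derive(Debug, PartialEq)]
pub enum ArrowError {
    InvalidArgumentError(String),
}

/// Hypothetical stand-in for the Parquet error type, with a variant that
/// carries the message of an underlying Arrow error.
#[derive(Debug, PartialEq)]
pub enum ParquetError {
    ArrowError(String),
}

impl From<ArrowError> for ParquetError {
    fn from(e: ArrowError) -> Self {
        match e {
            ArrowError::InvalidArgumentError(msg) => ParquetError::ArrowError(msg),
        }
    }
}

/// Arrow-side helper that fails with an ArrowError.
fn build_array(values: &[i32], len: usize) -> Result<Vec<i32>, ArrowError> {
    if values.len() != len {
        return Err(ArrowError::InvalidArgumentError(format!(
            "expected {} values but found {}",
            len,
            values.len()
        )));
    }
    Ok(values.to_vec())
}

/// Parquet-side reader: `?` applies the From impl above automatically.
pub fn read_column(values: &[i32], len: usize) -> Result<Vec<i32>, ParquetError> {
    let array = build_array(values, len)?;
    Ok(array)
}
```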
[jira] [Commented] (ARROW-4061) [Rust] [Parquet] Implement "spaced" version for non-dictionary encoding/decoding
[ https://issues.apache.org/jira/browse/ARROW-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753720#comment-16753720 ]

Renjie Liu commented on ARROW-4061:

[~csun] Thanks.

> [Rust] [Parquet] Implement "spaced" version for non-dictionary encoding/decoding
>
> Key: ARROW-4061
> URL: https://issues.apache.org/jira/browse/ARROW-4061
> Project: Apache Arrow
> Issue Type: Sub-task
> Components: Rust
> Reporter: Chao Sun
> Assignee: Chao Sun
> Priority: Major
>
> To support Parquet/Arrow encoding/decoding, we need to implement a "spaced" version where slots for null values should be filled with undefined bytes.
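The "spaced" layout described in ARROW-4061 can be illustrated with a small function: densely packed non-null values are spread out so that every row, null or not, gets a slot. This is a sketch of the idea only. `decode_spaced` and its signature are hypothetical, validity is passed as a `bool` slice rather than a bitmap, and null slots are filled with zero as a placeholder where the ticket allows undefined bytes.

```rust
/// Spreads `dense` (non-null values only, in row order) into one slot per row.
/// `valid[i]` tells whether row `i` holds a value.
///
/// Panics if `dense` has fewer values than `valid` has `true` entries.
pub fn decode_spaced(dense: &[i32], valid: &[bool]) -> Vec<i32> {
    // Zero is only a placeholder for null slots; their content is unspecified.
    let mut out = vec![0; valid.len()];
    let mut next = dense.iter();
    for (slot, is_valid) in out.iter_mut().zip(valid) {
        if *is_valid {
            *slot = *next.next().expect("fewer values than valid slots");
        }
    }
    out
}
```

For example, three values decoded against the validity pattern `[true, false, true, true, false]` land in slots 0, 2 and 3 of a five-element output, so value indices line up with row indices.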
[jira] [Updated] (ARROW-4365) [Rust] [Parquet] Implement RecordReader
[ https://issues.apache.org/jira/browse/ARROW-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Renjie Liu updated ARROW-4365:

Issue Type: Sub-task (was: Bug)
Parent: ARROW-4059

> [Rust] [Parquet] Implement RecordReader
>
> Key: ARROW-4365
> URL: https://issues.apache.org/jira/browse/ARROW-4365
> Project: Apache Arrow
> Issue Type: Sub-task
> Components: Rust
> Reporter: Renjie Liu
> Assignee: Renjie Liu
> Priority: Minor
>
> RecordReader reads logical records into memory; this is the prerequisite for ColumnReader.
[jira] [Commented] (ARROW-4061) [Rust] [Parquet] Implement "spaced" version for non-dictionary encoding/decoding
[ https://issues.apache.org/jira/browse/ARROW-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753662#comment-16753662 ]

Renjie Liu commented on ARROW-4061:

Hi [~csun], are you working on this? This is a blocker for other parts of the arrow reader. I can take this if you are not available.

> [Rust] [Parquet] Implement "spaced" version for non-dictionary encoding/decoding
>
> Key: ARROW-4061
> URL: https://issues.apache.org/jira/browse/ARROW-4061
> Project: Apache Arrow
> Issue Type: Sub-task
> Components: Rust
> Reporter: Chao Sun
> Assignee: Chao Sun
> Priority: Major
>
> To support Parquet/Arrow encoding/decoding, we need to implement a "spaced" version where slots for null values should be filled with undefined bytes.
[jira] [Created] (ARROW-4365) [Rust] [Parquet] Implement RecordReader
Renjie Liu created ARROW-4365:

Summary: [Rust] [Parquet] Implement RecordReader
Key: ARROW-4365
URL: https://issues.apache.org/jira/browse/ARROW-4365
Project: Apache Arrow
Issue Type: Bug
Components: Rust
Reporter: Renjie Liu
Assignee: Renjie Liu

RecordReader reads logical records into memory; this is the prerequisite for ColumnReader.
[jira] [Created] (ARROW-4219) [Rust] [Parquet] Implement ArrowReader
Renjie Liu created ARROW-4219:

Summary: [Rust] [Parquet] Implement ArrowReader
Key: ARROW-4219
URL: https://issues.apache.org/jira/browse/ARROW-4219
Project: Apache Arrow
Issue Type: Sub-task
Components: Rust
Reporter: Renjie Liu
Assignee: Renjie Liu

ArrowReader reads parquet into arrow. In this ticket our goal is to implement get_schema and to read row groups into a record batch iterator.
[jira] [Updated] (ARROW-4218) [Rust] [Parquet] Implement ColumnReader
[ https://issues.apache.org/jira/browse/ARROW-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Renjie Liu updated ARROW-4218:

Summary: [Rust] [Parquet] Implement ColumnReader (was: [Rust][Parquet]Implement ColumnReader)

> [Rust] [Parquet] Implement ColumnReader
>
> Key: ARROW-4218
> URL: https://issues.apache.org/jira/browse/ARROW-4218
> Project: Apache Arrow
> Issue Type: Sub-task
> Components: Rust
> Reporter: Renjie Liu
> Assignee: Renjie Liu
> Priority: Major
>
> ColumnReader reads the columns of a parquet file into arrow arrays. This is the first step for reading parquet into arrow.
[jira] [Created] (ARROW-4218) [Rust][Parquet]Implement ColumnReader
Renjie Liu created ARROW-4218:

Summary: [Rust][Parquet]Implement ColumnReader
Key: ARROW-4218
URL: https://issues.apache.org/jira/browse/ARROW-4218
Project: Apache Arrow
Issue Type: Sub-task
Components: Rust
Reporter: Renjie Liu
Assignee: Renjie Liu

ColumnReader reads the columns of a parquet file into arrow arrays. This is the first step for reading parquet into arrow.
[jira] [Assigned] (ARROW-4060) [Rust] Add Parquet/Arrow schema converter
[ https://issues.apache.org/jira/browse/ARROW-4060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Renjie Liu reassigned ARROW-4060:

Assignee: Renjie Liu

> [Rust] Add Parquet/Arrow schema converter
>
> Key: ARROW-4060
> URL: https://issues.apache.org/jira/browse/ARROW-4060
> Project: Apache Arrow
> Issue Type: Sub-task
> Components: Rust
> Reporter: Chao Sun
> Assignee: Renjie Liu
> Priority: Major
>
> We should support conversion from Parquet to Arrow schema.
[jira] [Created] (ARROW-3706) [Rust] Add record batch reader trait.
Renjie Liu created ARROW-3706:

Summary: [Rust] Add record batch reader trait.
Key: ARROW-3706
URL: https://issues.apache.org/jira/browse/ARROW-3706
Project: Apache Arrow
Issue Type: New Feature
Components: Rust
Reporter: Renjie Liu
Assignee: Renjie Liu
Fix For: 0.12.0

Add a RecordBatchReader trait.
[jira] [Created] (ARROW-3085) [Rust] Add an adapter for parquet.
Renjie Liu created ARROW-3085:

Summary: [Rust] Add an adapter for parquet.
Key: ARROW-3085
URL: https://issues.apache.org/jira/browse/ARROW-3085
Project: Apache Arrow
Issue Type: New Feature
Reporter: Renjie Liu
Assignee: Renjie Liu
Fix For: 0.11.0
[jira] [Created] (ARROW-2852) [Rust] Mark Array as Sync and Send
Renjie Liu created ARROW-2852:

Summary: [Rust] Mark Array as Sync and Send
Key: ARROW-2852
URL: https://issues.apache.org/jira/browse/ARROW-2852
Project: Apache Arrow
Issue Type: Bug
Components: Rust
Affects Versions: 0.9.0
Reporter: Renjie Liu
Assignee: Renjie Liu

Since arrays are immutable containers, it would be safe to mark them as Sync and Send. This is useful for processing in multithreaded environments.
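The reasoning in ARROW-2852 can be sketched with a toy array. The layout below is an assumption made for illustration: `Int32Array` here is hypothetical and stores a raw pointer, which is one common reason a Rust type is not `Send` or `Sync` automatically. Because the data is never mutated after construction, the `unsafe impl` lines are justified, and the array can then be shared between threads through an `Arc`. `parallel_sum` is likewise an invented helper.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical immutable array that reads through a raw pointer.
pub struct Int32Array {
    ptr: *const i32,
    len: usize,
    /// Keeps the allocation behind `ptr` alive; never modified.
    _owner: Vec<i32>,
}

// Sound only because the values are never mutated after construction.
unsafe impl Send for Int32Array {}
unsafe impl Sync for Int32Array {}

impl Int32Array {
    pub fn new(values: Vec<i32>) -> Self {
        Int32Array { ptr: values.as_ptr(), len: values.len(), _owner: values }
    }

    pub fn len(&self) -> usize {
        self.len
    }

    pub fn value(&self, i: usize) -> i32 {
        assert!(i < self.len, "index out of bounds");
        // In bounds, and the allocation is kept alive by `_owner`.
        unsafe { *self.ptr.add(i) }
    }
}

/// Sums the array on `parts` worker threads that all share one Arc.
pub fn parallel_sum(array: Arc<Int32Array>, parts: usize) -> i64 {
    assert!(parts > 0, "need at least one worker");
    let chunk = (array.len() + parts - 1) / parts;
    let handles: Vec<_> = (0..parts)
        .map(|p| {
            let array = Arc::clone(&array);
            thread::spawn(move || {
                let start = (p * chunk).min(array.len());
                let end = ((p + 1) * chunk).min(array.len());
                (start..end).map(|i| array.value(i) as i64).sum::<i64>()
            })
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .sum()
}
```

Removing the two `unsafe impl` lines makes `thread::spawn` reject the closure, since `Arc<Int32Array>` is `Send` only when the array is both `Send` and `Sync`.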
[jira] [Assigned] (ARROW-2435) [Rust] Add memory pool abstraction.
[ https://issues.apache.org/jira/browse/ARROW-2435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Renjie Liu reassigned ARROW-2435:

Assignee: Renjie Liu

> [Rust] Add memory pool abstraction.
>
> Key: ARROW-2435
> URL: https://issues.apache.org/jira/browse/ARROW-2435
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Rust
> Affects Versions: 0.9.0
> Reporter: Renjie Liu
> Assignee: Renjie Liu
> Priority: Major
> Labels: pull-request-available
>
> Add a memory pool abstraction like the one in the C++ API.
[jira] [Created] (ARROW-2435) [Rust] Add memory pool abstraction.
Renjie Liu created ARROW-2435:

Summary: [Rust] Add memory pool abstraction.
Key: ARROW-2435
URL: https://issues.apache.org/jira/browse/ARROW-2435
Project: Apache Arrow
Issue Type: Improvement
Components: Rust
Affects Versions: 0.9.0
Reporter: Renjie Liu

Add a memory pool abstraction like the one in the C++ API.