[GitHub] [arrow] markhildreth edited a comment on issue #6972: ARROW-8287: [Rust] Add "pretty" util to help with printing tabular output of RecordBatches

2020-04-23 Thread GitBox


markhildreth edited a comment on issue #6972:
URL: https://github.com/apache/arrow/pull/6972#issuecomment-618506903


   From a purely practical standpoint, this PR is ready for further review and 
merging. If approved, I would probably add some minor JIRA issues for the 
following:
   * Trying to avoid the type inference issue.
   * Creating a type that can be used to iterate over multiple batches with 
statically guaranteed same schemas (looks like `RecordBatchReader` is close to 
what we would want).
   
   Additionally, I’ll create a new issue and PR to cover moving DataFusion to 
use this API.
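
As a rough illustration of the second follow-up item above, here is a minimal sketch of a collection that only ever holds batches sharing one schema. `Schema`, `Batch`, `SchemaMismatch` and `UniformBatches` are hypothetical stand-ins invented for this sketch, not Arrow types, so that it compiles without the arrow crate. Note that the check here happens at runtime when a batch is added; the guarantee is then carried by the type, which is weaker than a fully static guarantee.

```rust
use std::fmt;

// Hypothetical stand-ins for Arrow's schema and record batch types, kept
// minimal so the sketch compiles on its own.
#[derive(Debug, Clone, PartialEq)]
pub struct Schema {
    /// (column name, type name) pairs.
    pub fields: Vec<(String, String)>,
}

#[derive(Debug, Clone)]
pub struct Batch {
    pub schema: Schema,
    pub rows: Vec<Vec<String>>,
}

/// Returned when a batch does not match the collection's schema.
#[derive(Debug, PartialEq)]
pub struct SchemaMismatch {
    /// Position the rejected batch would have taken in the collection.
    pub batch_index: usize,
}

impl fmt::Display for SchemaMismatch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "batch {} has a different schema", self.batch_index)
    }
}

/// A collection whose batches are all known to share `schema`.
pub struct UniformBatches {
    schema: Schema,
    batches: Vec<Batch>,
}

impl UniformBatches {
    pub fn new(schema: Schema) -> Self {
        UniformBatches { schema, batches: Vec::new() }
    }

    /// Accepts the batch only if its schema equals the collection's schema.
    pub fn push(&mut self, batch: Batch) -> Result<(), SchemaMismatch> {
        if batch.schema != self.schema {
            return Err(SchemaMismatch { batch_index: self.batches.len() });
        }
        self.batches.push(batch);
        Ok(())
    }

    pub fn schema(&self) -> &Schema {
        &self.schema
    }

    /// Every batch yielded here was validated against `schema` in `push`.
    pub fn iter(&self) -> std::slice::Iter<'_, Batch> {
        self.batches.iter()
    }
}
```

A printing helper could then take `&UniformBatches` instead of a slice of batches, so that a mixed-schema input is rejected when the collection is built rather than showing up as a strange table.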



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] markhildreth edited a comment on issue #6972: ARROW-8287: [Rust] Add "pretty" util to help with printing tabular output of RecordBatches

2020-04-23 Thread GitBox


markhildreth edited a comment on issue #6972:
URL: https://github.com/apache/arrow/pull/6972#issuecomment-618506903


   From a purely practical standpoint, this PR is ready for further review and 
merging. If approved, I would probably add some minor JIRA issues for the 
following:
   * Trying to avoid the type inference issue.
   * Creating a type that can be used to iterate over multiple batches with 
statically guaranteed same schemas.
   
   Additionally, I’ll create a new issue and PR to cover moving DataFusion to 
use this API.







[GitHub] [arrow] markhildreth edited a comment on issue #6972: ARROW-8287: [Rust] Add "pretty" util to help with printing tabular output of RecordBatches

2020-04-23 Thread GitBox


markhildreth edited a comment on issue #6972:
URL: https://github.com/apache/arrow/pull/6972#issuecomment-618501806


   @andygrove Thanks for the feedback. I have updated the PR with a less leaky 
API. I also tweaked the parquet test to work around the new type inference 
changes.
   
   @nevi-me True for a single batch, but it's possible to create multiple 
batches from different sources. For example:
   
   ```
   let schema1 = ...
   let csv1 = ...
   let batch1 = ...
   
   let schema2 = ... // different schema
   let csv2 = ...
   let batch2 = ...
   
   print_batches(&[batch1, batch2]);
   ```
   
   As I said, this is probably not something to worry about too much right now, 
but I'll probably add an issue for later to revisit if that's alright. 
Interestingly, this code wouldn't even necessarily crash; you would just get an 
odd-looking table:
   
   ```
   +---------------------------------------+-----------+-----------+-------+
   | city                                  | lat       | lng       |       |
   +---------------------------------------+-----------+-----------+-------+
   | Elgin, Scotland, the UK               | 57.653484 | -3.335724 |       |
   | Stoke-on-Trent, Staffordshire, the UK | 53.002666 | -2.179404 |       |
   | Solihull, Birmingham, UK              | 52.412811 | -1.778197 |       |
   | Cardiff, Cardiff county, UK           | 51.481583 | -3.17909  |       |
   | 1                                     | 1.1       | 1.11      | true  |
   | 2                                     | 2.2       | 2.22      | true  |
   | 3                                     | 0         | 3.33      | true  |
   | 4                                     | 4.4       |           | false |
   | 5                                     | 6.6       |           | false |
   +---------------------------------------+-----------+-----------+-------+
   
   ```







[GitHub] [arrow] markhildreth edited a comment on issue #6972: ARROW-8287: [Rust] Add "pretty" util to help with printing tabular output of RecordBatches

2020-04-23 Thread GitBox


markhildreth edited a comment on issue #6972:
URL: https://github.com/apache/arrow/pull/6972#issuecomment-618501806


   @andygrove Thanks for the feedback. I have updated the PR with a less leaky 
API. I also fixed the type inference problem that was caused by the new 
dependency.
   
   @nevi-me True for a single batch, but it's possible to create multiple 
batches from different sources. For example:
   
   ```
   let schema1 = ...
   let csv1 = ...
   let batch1 = ...
   
   let schema2 = ... // different schema
   let csv2 = ...
   let batch2 = ...
   
   print_batches(&[batch1, batch2]);
   ```
   
   As I said, this is probably not something to worry about too much right now, 
but I'll probably add an issue for later to revisit if that's alright. 
Interestingly, this code wouldn't even necessarily crash; you would just get an 
odd-looking table:
   
   ```
   +---------------------------------------+-----------+-----------+-------+
   | city                                  | lat       | lng       |       |
   +---------------------------------------+-----------+-----------+-------+
   | Elgin, Scotland, the UK               | 57.653484 | -3.335724 |       |
   | Stoke-on-Trent, Staffordshire, the UK | 53.002666 | -2.179404 |       |
   | Solihull, Birmingham, UK              | 52.412811 | -1.778197 |       |
   | Cardiff, Cardiff county, UK           | 51.481583 | -3.17909  |       |
   | 1                                     | 1.1       | 1.11      | true  |
   | 2                                     | 2.2       | 2.22      | true  |
   | 3                                     | 0         | 3.33      | true  |
   | 4                                     | 4.4       |           | false |
   | 5                                     | 6.6       |           | false |
   +---------------------------------------+-----------+-----------+-------+
   
   ```


