jorgecarleitao commented on issue #12102:
URL: https://github.com/apache/arrow/issues/12102#issuecomment-1008330881


   Hi. Thanks for the ping, @nealrichardson.
   
   Thanks for the initiative, @multimeric , super cool!
   
   Note that the C data interface is designed for _intra_-process 
communication: R would be running in the same process as Polars.
   
   Polars uses an unofficial Rust implementation of Arrow (arrow2), so we have 
to use its API here. Say you have a Polars DataFrame in Rust. You can extract 
any of its series via the index operator `[]`. A series is just a vector of 
Arrow arrays, which you get via 
[`.chunks`](https://docs.rs/polars/latest/polars/series/trait.SeriesTrait.html#method.chunks).
 At this point we can disregard Polars and just focus on Arrow. To export each 
of the arrays, you need 3 steps:
   
   1. allocate two empty FFI interfaces (two Rust Boxes with the FFI-compatible structs)
   2. write the array and schema to them
   3. call the corresponding function to import the two from R
   
   * [Step 1](https://github.com/jorgecarleitao/arrow2/blob/main/arrow-pyarrow-integration-testing/src/lib.rs#L75)
   * [Step 2](https://github.com/jorgecarleitao/arrow2/blob/main/arrow-pyarrow-integration-testing/src/lib.rs#L81)
   * [Step 3 (in Python)](https://github.com/jorgecarleitao/arrow2/blob/main/arrow-pyarrow-integration-testing/src/lib.rs#L89)
   
   I am not very familiar with R, but I think that Step 3 amounts to calling 
`Array$import_from_c` from R. Note that all of these steps are `O(1)`: only 
pointers are exchanged and no data is copied, so they incur essentially no 
performance cost (a core idea of the Arrow format).
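
   To make the three steps concrete, here is a minimal sketch that uses only 
the Rust standard library, not arrow2 or Polars. The two structs are written 
out by hand following the layout in the Arrow C data interface specification; 
in practice the `ffi` module of arrow2 provides them. `ArrayOwner`, 
`SchemaOwner`, `export_i32`, `import_i32` and `roundtrip` are hypothetical 
names invented for this illustration, and `import_i32` only stands in for the 
consumer side (the role `Array$import_from_c` would play in R). It handles a 
single `int32` array without nulls.

```rust
use std::ffi::{c_void, CStr, CString};
use std::os::raw::c_char;
use std::ptr;

// Field order follows the Arrow C data interface specification.
#[allow(dead_code)]
#[repr(C)]
pub struct ArrowSchema {
    format: *const c_char,
    name: *const c_char,
    metadata: *const c_char,
    flags: i64,
    n_children: i64,
    children: *mut *mut ArrowSchema,
    dictionary: *mut ArrowSchema,
    release: Option<unsafe extern "C" fn(*mut ArrowSchema)>,
    private_data: *mut c_void,
}

#[allow(dead_code)]
#[repr(C)]
pub struct ArrowArray {
    length: i64,
    null_count: i64,
    offset: i64,
    n_buffers: i64,
    n_children: i64,
    buffers: *mut *const c_void,
    children: *mut *mut ArrowArray,
    dictionary: *mut ArrowArray,
    release: Option<unsafe extern "C" fn(*mut ArrowArray)>,
    private_data: *mut c_void,
}

// Hypothetical owners of the memory that the exported structs point into.
struct ArrayOwner { values: Vec<i32>, buffers: [*const c_void; 2] }
struct SchemaOwner { format: CString, name: CString }

unsafe extern "C" fn release_array(array: *mut ArrowArray) {
    unsafe {
        drop(Box::from_raw((*array).private_data as *mut ArrayOwner));
        (*array).release = None; // a null release marks the struct as released
    }
}

unsafe extern "C" fn release_schema(schema: *mut ArrowSchema) {
    unsafe {
        drop(Box::from_raw((*schema).private_data as *mut SchemaOwner));
        (*schema).release = None;
    }
}

// Step 2: write the array and its schema into the two (empty) structs.
fn export_i32(values: Vec<i32>, name: &str, array: &mut ArrowArray, schema: &mut ArrowSchema) {
    let raw = Box::into_raw(Box::new(ArrayOwner { values, buffers: [ptr::null(); 2] }));
    let owner = unsafe { &mut *raw };
    // buffers[0] is the validity bitmap (null here: no nulls), buffers[1] the values.
    owner.buffers[1] = owner.values.as_ptr() as *const c_void;
    array.length = owner.values.len() as i64;
    array.n_buffers = 2;
    array.buffers = owner.buffers.as_mut_ptr();
    array.release = Some(release_array);
    array.private_data = raw as *mut c_void;

    let format = CString::new("i").unwrap(); // "i" is the format string for int32
    let name = CString::new(name).unwrap();
    let raw = Box::into_raw(Box::new(SchemaOwner { format, name }));
    let owner = unsafe { &*raw };
    schema.format = owner.format.as_ptr();
    schema.name = owner.name.as_ptr();
    schema.release = Some(release_schema);
    schema.private_data = raw as *mut c_void;
}

// Step 3 (stand-in): the consumer reads through the pointers, then releases.
// It copies the values only so that this sketch can return something to print;
// a real consumer would keep the pointers and call `release` when done.
unsafe fn import_i32(array: *mut ArrowArray, schema: *mut ArrowSchema) -> (String, Vec<i32>) {
    unsafe {
        let format = CStr::from_ptr((*schema).format).to_str().unwrap().to_owned();
        let data = *(*array).buffers.add(1) as *const i32;
        let values = std::slice::from_raw_parts(data, (*array).length as usize).to_vec();
        if let Some(release) = (*array).release { release(array) }
        if let Some(release) = (*schema).release { release(schema) }
        (format, values)
    }
}

fn roundtrip(values: Vec<i32>) -> (String, Vec<i32>) {
    // Step 1: allocate two empty structs (all fields zero/null, `release` is None).
    let mut array: Box<ArrowArray> = Box::new(unsafe { std::mem::zeroed() });
    let mut schema: Box<ArrowSchema> = Box::new(unsafe { std::mem::zeroed() });
    export_i32(values, "a", &mut array, &mut schema);
    unsafe { import_i32(&mut *array, &mut *schema) }
}

fn main() {
    let (format, values) = roundtrip(vec![1, 2, 3]);
    println!("format = {format}, values = {values:?}");
}
```

   Note that the export in Step 2 only stores pointers and a length, so its 
cost does not depend on the number of values. The boxed structs stay owned by 
the producer, while the data they point to is freed by the `release` callback 
that the consumer invokes.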

