sgilmore10 opened a new pull request, #41817:
URL: https://github.com/apache/arrow/pull/41817

   ### Rationale for this change
   
   This pull request adds two new APIs for importing and exporting 
`arrow.tabular.RecordBatch` instances using the C Data Interface format.
   
   **Example:**
   ```matlab
   >> T = table((1:3)', ["A"; "B"; "C"]);
   >> expected = arrow.recordBatch(T)
   
   expected = 
   
     Arrow RecordBatch with 3 rows and 2 columns:
   
       Schema:
   
           Var1: Float64 | Var2: String
   
       First Row:
   
           1 | "A"
   
   >> cArray = arrow.c.Array();
   >> cSchema = arrow.c.Schema();
   
   % Export the RecordBatch to the C Data Interface format
   >> expected.export(cArray.Address, cSchema.Address);
   
   % Import the RecordBatch from the C Data Interface format
   >> actual = arrow.tabular.RecordBatch.import(cArray, cSchema)
   
   actual = 
   
     Arrow RecordBatch with 3 rows and 2 columns:
   
       Schema:
   
           Var1: Float64 | Var2: String
   
       First Row:
   
           1 | "A"
   
   % The RecordBatch is the same after round-tripping to the C Data Interface format
   >> isequal(actual, expected)
   
   ans =
   
     logical
   
      1
   
   ```
   ### What changes are included in this PR?
   
   1. Added a new method `arrow.tabular.RecordBatch.export` for exporting 
`RecordBatch` objects to the C Data Interface format.
   2. Added a new static method `arrow.tabular.RecordBatch.import` for 
importing `RecordBatch` objects from the C Data Interface format.
   3. Added a new internal class `arrow.c.internal.RecordBatchImporter` for 
importing `RecordBatch` objects from the C Data Interface format.
   
   ### Are these changes tested?
   
   Yes.
   
   1. Added a new test file, `matlab/test/arrow/c/tRoundtripRecordBatch.m`, which 
contains basic round-trip tests for importing and exporting `RecordBatch` objects.
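   
   For illustration, a round-trip test of this kind might look roughly like the 
   sketch below. It is not the contents of `tRoundtripRecordBatch.m`: the class 
   name `tRoundtripSketch` and the test method name are hypothetical, and the 
   sketch only reuses the calls shown in the example above together with the 
   standard `matlab.unittest.TestCase` base class.
   
   ```matlab
   classdef tRoundtripSketch < matlab.unittest.TestCase
       % Hypothetical test class, for illustration only.
   
       methods (Test)
           function RoundTripPreservesRecordBatch(testCase)
               % Build a small RecordBatch from a MATLAB table
               T = table((1:3)', ["A"; "B"; "C"]);
               expected = arrow.recordBatch(T);
   
               % Allocate the C structs that receive the exported data
               cArray = arrow.c.Array();
               cSchema = arrow.c.Schema();
   
               % Export to the C Data Interface format, then import it back
               expected.export(cArray.Address, cSchema.Address);
               actual = arrow.tabular.RecordBatch.import(cArray, cSchema);
   
               % The imported RecordBatch should equal the original
               testCase.verifyTrue(isequal(actual, expected));
           end
       end
   end
   ```
   
   Assuming the file is saved as `tRoundtripSketch.m` on the MATLAB path, it 
   could be run with `runtests("tRoundtripSketch")`.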
   
   ### Are there any user-facing changes?
   
   Yes.
   
   1. Two new user-facing methods were added to `arrow.tabular.RecordBatch`. 
The first is the instance method `export(cArrowArrayAddress, 
cArrowSchemaAddress)`. The second is the static method 
`arrow.tabular.RecordBatch.import(cArray, cSchema)`. These APIs export and 
import `RecordBatch` objects using the C Data Interface format.
   
   
   ### Future Directions
   
   1. Add integration tests for sharing data between MATLAB/mlarrow and 
Python/pyarrow running in the same process using the [MATLAB interface to 
Python](https://www.mathworks.com/help/matlab/call-python-libraries.html).
   2. Add support for the Arrow [C stream interface 
format](https://arrow.apache.org/docs/format/CStreamInterface.html).
   
   ### Notes
   
   1. Thanks to @kevingurney for the help with this feature! 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
