sgilmore10 opened a new pull request, #41817:
URL: https://github.com/apache/arrow/pull/41817
### Rationale for this change
This pull request adds two new APIs for importing and exporting
`arrow.tabular.RecordBatch` instances using the C Data Interface format.
**Example:**
```matlab
>> T = table((1:3)', ["A"; "B"; "C"]);
>> expected = arrow.recordBatch(T)
expected =
Arrow RecordBatch with 3 rows and 2 columns:
Schema:
Var1: Float64 | Var2: String
First Row:
1 | "A"
>> cArray = arrow.c.Array();
>> cSchema = arrow.c.Schema();
% Export the RecordBatch to C Data Interface Format
>> expected.export(cArray.Address, cSchema.Address);
% Import the RecordBatch from C Data Interface Format
>> actual = arrow.tabular.RecordBatch.import(cArray, cSchema)
actual =
Arrow RecordBatch with 3 rows and 2 columns:
Schema:
Var1: Float64 | Var2: String
First Row:
1 | "A"
% The RecordBatch is the same after round-tripping to the C Data Interface format
>> isequal(actual, expected)
ans =
logical
1
```
### What changes are included in this PR?
1. Added a new method `arrow.tabular.RecordBatch.export` for exporting
`RecordBatch` objects to the C Data Interface format.
2. Added a new static method `arrow.tabular.RecordBatch.import` for
importing `RecordBatch` objects from the C Data Interface format.
3. Added a new internal class `arrow.c.internal.RecordBatchImporter` for
importing `RecordBatch` objects from the C Data Interface format.
### Are these changes tested?
Yes.
1. Added a new test file `matlab/test/arrow/c/tRoundtripRecordBatch.m` which
has basic round-trip tests for importing and exporting `RecordBatch` objects.
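
For illustration only, a round-trip test of this kind might look like the sketch below. It is not the contents of `tRoundtripRecordBatch.m`: the class name `tRoundtripSketch` and the test method name are hypothetical, and the only Arrow calls used are the ones shown in the example above.

```matlab
% Hypothetical sketch; save as tRoundtripSketch.m on the MATLAB path.
classdef tRoundtripSketch < matlab.unittest.TestCase

    methods (Test)
        function roundTripPreservesRecordBatch(testCase)
            % Build a small RecordBatch from a MATLAB table.
            T = table((1:3)', ["A"; "B"; "C"]);
            expected = arrow.recordBatch(T);

            % Allocate the C Data Interface structs to export into.
            cArray = arrow.c.Array();
            cSchema = arrow.c.Schema();

            % Export, then import from the same structs.
            expected.export(cArray.Address, cSchema.Address);
            actual = arrow.tabular.RecordBatch.import(cArray, cSchema);

            % The imported RecordBatch should equal the original.
            testCase.verifyTrue(isequal(actual, expected));
        end
    end

end
```

Assuming the MATLAB Arrow interface is installed, the sketch could be run with `runtests("tRoundtripSketch")`.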
### Are there any user-facing changes?
Yes.
1. Two new user-facing methods were added to `arrow.tabular.RecordBatch`.
The first is the instance method
`arrow.tabular.RecordBatch.export(cArrowArrayAddress, cArrowSchemaAddress)`.
The second is the static method
`arrow.tabular.RecordBatch.import(cArray, cSchema)`.
These APIs export and import `RecordBatch` objects using the C Data Interface
format.
### Future Directions
1. Add integration tests for sharing data between MATLAB/mlarrow and
Python/pyarrow running in the same process using the [MATLAB interface to
Python](https://www.mathworks.com/help/matlab/call-python-libraries.html).
2. Add support for the Arrow [C stream interface
format](https://arrow.apache.org/docs/format/CStreamInterface.html).
### Notes
1. Thanks to @kevingurney for the help with this feature!