paleolimbot opened a new pull request, #464:
URL: https://github.com/apache/arrow-nanoarrow/pull/464
This PR implements building columns buffer-wise for the types where this
makes sense. (I'm still working out the details of how to inject null
handling here.)
```python
import nanoarrow as na
from nanoarrow import visitor
import pyarrow as pa

batch = pa.record_batch({"col1": [1, 2, 3], "col2": ["a", "b", "c"]})
batch_with_nulls = pa.record_batch(
    {"col1": [1, None, 3], "col2": ["a", "b", None]}
)

# Either builds a buffer or a list depending on column types
visitor.to_columns(batch)
#> (['col1', 'col2'],
#>  [nanoarrow.c_lib.CBuffer(int64[24 b] 1 2 3), ['a', 'b', 'c']])

# One can inject a null handler (a few experimental ones provided)
visitor.to_columns(
    batch_with_nulls, handle_nulls=visitor.nulls_as_masked_array()
)
#> (['col1', 'col2'],
#>  [masked_array(data=[1, --, 3],
#>                mask=[False, True, False],
#>                fill_value=999999,
#>                dtype=int64),
#>   ['a', 'b', None]])

visitor.to_columns(batch_with_nulls, handle_nulls=visitor.nulls_as_sentinel())
#> (['col1', 'col2'], [array([ 1., nan, 3.]), ['a', 'b', None]])
```
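For readers unfamiliar with the idea, here is a rough NumPy-only sketch of what a null handler conceptually has to do with a values buffer and an Arrow-style validity bitmap. It is not the code in this PR: the helper names `unpack_validity`, `as_masked_array`, and `as_sentinel` are hypothetical and exist only for illustration, and the sketch assumes a fixed-width integer column with no offset.

```python
import numpy as np


def unpack_validity(bitmap: bytes, length: int) -> np.ndarray:
    """Expand an Arrow-style validity bitmap (least-significant bit first)
    into a boolean array where True means the element is valid."""
    bits = np.unpackbits(np.frombuffer(bitmap, dtype=np.uint8), bitorder="little")
    return bits[:length].astype(bool)


def as_masked_array(data: np.ndarray, is_valid: np.ndarray) -> np.ma.MaskedArray:
    """Hypothetical handler: keep the values buffer as-is and mask the nulls."""
    return np.ma.masked_array(data, mask=~is_valid)


def as_sentinel(data: np.ndarray, is_valid: np.ndarray, sentinel=np.nan) -> np.ndarray:
    """Hypothetical handler: cast to float64 (a copy) and overwrite nulls
    with a sentinel value."""
    out = data.astype(np.float64)
    out[~is_valid] = sentinel
    return out


# Values buffer for [1, None, 3]; the slot under the null holds arbitrary
# data (0 here), which is why the validity bitmap is needed.
data = np.frombuffer(np.array([1, 0, 3], dtype=np.int64).tobytes(), dtype=np.int64)
validity = bytes([0b00000101])  # bits 0 and 2 set: elements 0 and 2 are valid

is_valid = unpack_validity(validity, len(data))
masked = as_masked_array(data, is_valid)
filled = as_sentinel(data, is_valid)
```

The masked-array variant leaves the integer buffer untouched, while the sentinel variant has to change the dtype to float64 so that `nan` can be represented, which matches the difference visible in the two outputs above.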