jorisvandenbossche commented on issue #39816:
URL: https://github.com/apache/arrow/issues/39816#issuecomment-1913097288
I quickly tested your MRE vs pyarrow using nanoarrow-python to inspect the
data:
PyArrow:
```
import pyarrow as pa
import nanoarrow as na
schema = pa.schema([("interval", pa.month_day_nano_interval())])
tbl = pa.Table.from_arrays([pa.array(
    [
        None,
        pa.scalar((1, 1, 1), type=pa.month_day_nano_interval()),
        pa.scalar((42, 42, 42), type=pa.month_day_nano_interval()),
        None,
    ]
)], schema=schema)
In [5]: stream = na.c_array_stream(tbl)
In [6]: arr = stream.get_next().child(0)
In [7]: arr
Out[7]:
<nanoarrow.c_lib.CArray interval_month_day_nano>
- length: 4
- offset: 0
- null_count: 2
- buffers: (140484108394496, 140484108394560)
- dictionary: NULL
- children[0]:
In [8]: na.c_array_view(arr)
Out[8]:
<nanoarrow.c_lib.CArrayView>
- storage_type: 'interval_month_day_nano'
- length: 4
- offset: 0
- null_count: 2
- buffers[2]:
  - <bool validity[1 b] 01100000>
  - <interval_month_day_nano data[64 b] (0, 0, 0) (1, 1, 1) (42, 42, 42) (0, ...>
- dictionary: NULL
- children[0]:
```
Your MRE:
```
In [1]: import nanoarrow_mre
In [2]: capsule = nanoarrow_mre.get_interval_capsule()
In [3]: import nanoarrow as na
In [4]: stream = na.c_lib.CArrayStream._import_from_c_capsule(capsule)
In [5]: stream
Out[5]:
<nanoarrow.c_lib.CArrayStream>
- get_schema(): struct<interval_column: interval_month_day_nano>
In [6]: arr = stream.get_next().child(0)
In [7]: arr
Out[7]:
<nanoarrow.c_lib.CArray interval_month_day_nano>
- length: 4
- offset: 0
- null_count: 2
- buffers: (94736573435584, 94736573573504)
- dictionary: NULL
- children[0]:
In [8]: na.c_array_view(arr)
Out[8]:
<nanoarrow.c_lib.CArrayView>
- storage_type: 'interval_month_day_nano'
- length: 4
- offset: 0
- null_count: 2
- buffers[2]:
  - <bool validity[1 b] 00111111>
  - <interval_month_day_nano data[64 b] (0, 0, 0) (1, 1, 1) (42, 42, 42) (0, ...>
- dictionary: NULL
- children[0]:
```
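So the data buffer itself looks good (the (1, 1, 1) and (42, 42, 42) values are
still there); it is the validity bitmap that is wrong. For the four elements,
pyarrow produces the bits 0110, while your MRE produces 0011. That masks the
(1, 1, 1) value and does not mask the 4th value, making its (0, 0, 0)
placeholder visible.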
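
To make the comparison concrete, here is a minimal sketch of how a validity bitmap maps to elements. It is not taken from nanoarrow or from your MRE: the helper names `pack_validity` and `validity_bits` are hypothetical, and the byte values are reconstructed from the printed bit strings above, assuming the repr lists bits in element order. Arrow stores validity least-significant bit first, so element `i` lives in bit `i % 8` of byte `i // 8`.

```python
def pack_validity(valid):
    """Pack a list of booleans into an Arrow-style validity bitmap (LSB first)."""
    out = bytearray((len(valid) + 7) // 8)
    for i, is_valid in enumerate(valid):
        if is_valid:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def validity_bits(bitmap, length, offset=0):
    """Decode the first `length` elements of an LSB-first validity bitmap."""
    return [
        bool((bitmap[(offset + i) // 8] >> ((offset + i) % 8)) & 1)
        for i in range(length)
    ]


# [None, (1, 1, 1), (42, 42, 42), None] should pack to a single byte 0b00000110.
expected_bitmap = pack_validity([False, True, True, False])

# Assumed byte behind the MRE's printed "00111111": bits 2..7 set.
mre_bitmap = bytes([0b11111100])

print(validity_bits(expected_bitmap, 4))  # [False, True, True, False]
print(validity_bits(mre_bitmap, 4))       # [False, False, True, True]
```

Under these assumptions the MRE's bitmap marks elements 2 and 3 as valid instead of 1 and 2, which matches what the array view shows.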