AlenkaF commented on issue #15068:
URL: https://github.com/apache/arrow/issues/15068#issuecomment-4397699665
I can replicate this issue on `main` with a smaller example Copilot drafted
for me:
```python
import io

import pyarrow as pa


class MyExtType(pa.ExtensionType):
    def __init__(self):
        pa.ExtensionType.__init__(self, pa.null(), "test.ext_type")

    def __arrow_ext_serialize__(self):
        return b""

    @classmethod
    def __arrow_ext_deserialize__(cls, storage_type, serialized):
        return MyExtType()


pa.register_extension_type(MyExtType())

# Build a table with an ext(null) column alongside other columns
ext_array = pa.ExtensionArray.from_storage(MyExtType(), pa.nulls(3))
table = pa.table({"ext": ext_array, "ints": [1, 2, 3]})

# IPC roundtrip
buf = io.BytesIO()
with pa.ipc.new_file(buf, table.schema) as writer:
    writer.write_table(table)
buf.seek(0)
result = pa.ipc.open_file(buf).read_all()
```
with `result` being:
```python
In [13]: result
Out[13]:
pyarrow.Table
ext: extension<test.ext_type<MyExtType>>
ints: int64
----
ext: [3 nulls]
ints: [<Invalid array: Buffer #1 too small in array of type int64 and length
3: expected at least 24 byte(s), got 0
/Users/alenkafrim/Repos/arrow/cpp/src/arrow/array/validate.cc:118
ValidateLayout(*data.type)>]
```
I will add this to the umbrella issue I am working on so that we can
prioritize extension type support in PyArrow.