This is an automated email from the ASF dual-hosted git repository.
jorisvandenbossche pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/main by this push:
new 6a9e2d53b5 GH-38575: [Python] Include metadata when creating pa.schema from PyCapsule (#41538)
6a9e2d53b5 is described below
commit 6a9e2d53b5cdd0f387bfcd44e9549f122fac93e5
Author: Jacob Hayes <[email protected]>
AuthorDate: Fri May 17 03:07:02 2024 -0400
GH-38575: [Python] Include metadata when creating pa.schema from PyCapsule (#41538)
### Rationale for this change
Fixes the `pa.schema` metadata being dropped, as reported in #38575. The
regression was introduced in #37797.
### What changes are included in this PR?
Passes through the `metadata` to the short-circuited `Schema` created with
`_import_from_c_capsule`.
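
The intended behavior can be sketched with a small pure-Python model. This is not pyarrow's code: `ToySchema`, `Wrapper`, `toy_schema`, and the `__toy_export__` hook are hypothetical stand-ins for `pa.Schema`, the test's wrapper class, `pa.schema`, and `__arrow_c_schema__`. The sketch assumes that the exported form carries the schema's own metadata and that `with_metadata` replaces metadata rather than merging it, which matches the assertions in the updated test.

```python
def _to_bytes(value):
    # Metadata keys and values are stored as bytes; mimic that here.
    return value if isinstance(value, bytes) else str(value).encode("utf-8")


class ToySchema:
    """Hypothetical stand-in for a schema: field names plus optional metadata."""

    def __init__(self, names, metadata=None):
        self.names = list(names)
        self.metadata = (
            None
            if metadata is None
            else {_to_bytes(k): _to_bytes(v) for k, v in metadata.items()}
        )

    def with_metadata(self, metadata):
        # Return a new schema whose metadata is replaced, not merged.
        return ToySchema(self.names, metadata)

    def __toy_export__(self):
        # Hypothetical export hook; the exported form includes the metadata.
        return (tuple(self.names), self.metadata)


class Wrapper:
    """Object that only exposes the export hook, like the wrapper in the test."""

    def __init__(self, schema):
        self.schema = schema

    def __toy_export__(self):
        return self.schema.__toy_export__()


def toy_schema(fields, metadata=None):
    if hasattr(fields, "__toy_export__"):
        # Short-circuit path: import the exported schema directly.
        names, imported_metadata = fields.__toy_export__()
        result = ToySchema(names, imported_metadata)
        # The fix: apply an explicit `metadata` argument instead of ignoring it.
        if metadata is not None:
            result = result.with_metadata(metadata)
        return result
    return ToySchema(fields, metadata)


original = ToySchema(["field_name"], metadata={"a": "b"})
kept = toy_schema(Wrapper(original))
overridden = toy_schema(Wrapper(original), metadata={"a": "c"})
```

In this model, `kept.metadata` retains the metadata carried by the wrapped schema, while `overridden.metadata` reflects the explicitly passed mapping.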
### Are these changes tested?
Yes - added `metadata` to the existing test.
### Are there any user-facing changes?
I'm not sure this quite meets the `(b) a bug that caused incorrect or
invalid data to be produced` condition, but I added the note below to be safe,
since the resulting schema is "incorrect" (it broke some round-trip tests on my
end after a pyarrow update):
**This PR contains a "Critical Fix".**
* GitHub Issue: #38575
Lead-authored-by: Jacob Hayes <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
---
python/pyarrow/tests/test_types.py | 5 ++++-
python/pyarrow/types.pxi | 5 ++++-
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/python/pyarrow/tests/test_types.py b/python/pyarrow/tests/test_types.py
index 4f66a6f416..f7b6040f51 100644
--- a/python/pyarrow/tests/test_types.py
+++ b/python/pyarrow/tests/test_types.py
@@ -1331,10 +1331,13 @@ def test_schema_import_c_schema_interface():
def __arrow_c_schema__(self):
return self.schema.__arrow_c_schema__()
- schema = pa.schema([pa.field("field_name", pa.int32())])
+ schema = pa.schema([pa.field("field_name", pa.int32())], metadata={"a": "b"})
+ assert schema.metadata == {b"a": b"b"}
wrapped_schema = Wrapper(schema)
assert pa.schema(wrapped_schema) == schema
+ assert pa.schema(wrapped_schema).metadata == {b"a": b"b"}
+ assert pa.schema(wrapped_schema, metadata={"a": "c"}).metadata == {b"a": b"c"}
def test_field_import_c_schema_interface():
diff --git a/python/pyarrow/types.pxi b/python/pyarrow/types.pxi
index 018099ae7e..480f19c81d 100644
--- a/python/pyarrow/types.pxi
+++ b/python/pyarrow/types.pxi
@@ -5332,7 +5332,10 @@ def schema(fields, metadata=None):
if isinstance(fields, Mapping):
fields = fields.items()
elif hasattr(fields, "__arrow_c_schema__"):
- return Schema._import_from_c_capsule(fields.__arrow_c_schema__())
+ result = Schema._import_from_c_capsule(fields.__arrow_c_schema__())
+ if metadata is not None:
+ result = result.with_metadata(metadata)
+ return result
for item in fields:
if isinstance(item, tuple):