This is an automated email from the ASF dual-hosted git repository. jorisvandenbossche pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/main by this push:
     new 6a9e2d53b5 GH-38575: [Python] Include metadata when creating pa.schema from PyCapsule (#41538)
6a9e2d53b5 is described below

commit 6a9e2d53b5cdd0f387bfcd44e9549f122fac93e5
Author: Jacob Hayes <jacob.r.ha...@gmail.com>
AuthorDate: Fri May 17 03:07:02 2024 -0400

    GH-38575: [Python] Include metadata when creating pa.schema from PyCapsule (#41538)

    ### Rationale for this change

    Fixes the dropped `pa.schema` metadata reported in #38575, which was introduced in #37797.

    ### What changes are included in this PR?

    Passes through the `metadata` to the short-circuited `Schema` created with `_import_from_c_capsule`.

    ### Are these changes tested?

    Yes - added `metadata` to the existing test.

    ### Are there any user-facing changes?

    I'm not sure this quite rises to the `(b) a bug that caused incorrect or invalid data to be produced,` condition, but I added that note to be safe since the resulting schema is "incorrect" (and broke some round-trip tests on my end after a pyarrow update):

    **This PR contains a "Critical Fix".**

    * GitHub Issue: #38575

    Lead-authored-by: Jacob Hayes <jacob.r.ha...@gmail.com>
    Co-authored-by: Joris Van den Bossche <jorisvandenboss...@gmail.com>
    Signed-off-by: Joris Van den Bossche <jorisvandenboss...@gmail.com>
---
 python/pyarrow/tests/test_types.py | 5 ++++-
 python/pyarrow/types.pxi           | 5 ++++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/python/pyarrow/tests/test_types.py b/python/pyarrow/tests/test_types.py
index 4f66a6f416..f7b6040f51 100644
--- a/python/pyarrow/tests/test_types.py
+++ b/python/pyarrow/tests/test_types.py
@@ -1331,10 +1331,13 @@ def test_schema_import_c_schema_interface():
         def __arrow_c_schema__(self):
             return self.schema.__arrow_c_schema__()
 
-    schema = pa.schema([pa.field("field_name", pa.int32())])
+    schema = pa.schema([pa.field("field_name", pa.int32())], metadata={"a": "b"})
+    assert schema.metadata == {b"a": b"b"}
     wrapped_schema = Wrapper(schema)
 
     assert pa.schema(wrapped_schema) == schema
+    assert pa.schema(wrapped_schema).metadata == {b"a": b"b"}
+    assert pa.schema(wrapped_schema, metadata={"a": "c"}).metadata == {b"a": b"c"}
 
 
 def test_field_import_c_schema_interface():
diff --git a/python/pyarrow/types.pxi b/python/pyarrow/types.pxi
index 018099ae7e..480f19c81d 100644
--- a/python/pyarrow/types.pxi
+++ b/python/pyarrow/types.pxi
@@ -5332,7 +5332,10 @@ def schema(fields, metadata=None):
     if isinstance(fields, Mapping):
         fields = fields.items()
     elif hasattr(fields, "__arrow_c_schema__"):
-        return Schema._import_from_c_capsule(fields.__arrow_c_schema__())
+        result = Schema._import_from_c_capsule(fields.__arrow_c_schema__())
+        if metadata is not None:
+            result = result.with_metadata(metadata)
+        return result
 
     for item in fields:
         if isinstance(item, tuple):
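
For readers who want to see the control flow of the `types.pxi` change without building pyarrow, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the library's implementation: `ToySchema`, `export`, `import_from`, and `toy_schema` are hypothetical stand-ins for `pa.Schema`, `__arrow_c_schema__`, `Schema._import_from_c_capsule`, and `pa.schema`. The toy also keeps metadata as plain strings, whereas pyarrow stores it as bytes.

```python
from collections.abc import Mapping


class ToySchema:
    """Hypothetical stand-in for pa.Schema (not part of pyarrow)."""

    def __init__(self, fields, metadata=None):
        self.fields = list(fields)
        self.metadata = dict(metadata) if metadata is not None else None

    def with_metadata(self, metadata):
        # Return a new schema with the metadata replaced; self is unchanged.
        return ToySchema(self.fields, metadata)

    def export(self):
        # Hypothetical analogue of __arrow_c_schema__: the exported form
        # carries both the fields and the metadata.
        return {"fields": list(self.fields), "metadata": self.metadata}

    @classmethod
    def import_from(cls, exported):
        # Hypothetical analogue of Schema._import_from_c_capsule.
        return cls(exported["fields"], exported["metadata"])


def toy_schema(fields, metadata=None):
    """Mirror the branch structure of the patched schema() factory."""
    if isinstance(fields, Mapping):
        fields = fields.items()
    elif hasattr(fields, "export"):
        # Short-circuit path: import the exported schema, then apply an
        # explicitly passed metadata argument on top of the imported one.
        result = ToySchema.import_from(fields.export())
        if metadata is not None:
            result = result.with_metadata(metadata)
        return result
    return ToySchema(fields, metadata)


source = ToySchema([("field_name", "int32")], metadata={"a": "b"})

# Without an explicit argument, the imported metadata is kept.
assert toy_schema(source).metadata == {"a": "b"}

# With an explicit argument, it replaces the imported metadata.
assert toy_schema(source, metadata={"a": "c"}).metadata == {"a": "c"}
```

The two assertions at the end correspond to the two metadata assertions added to `test_schema_import_c_schema_interface` in the diff above.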