This is an automated email from the ASF dual-hosted git repository.

jorisvandenbossche pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/main by this push:
     new 6a9e2d53b5 GH-38575: [Python] Include metadata when creating pa.schema from PyCapsule (#41538)
6a9e2d53b5 is described below

commit 6a9e2d53b5cdd0f387bfcd44e9549f122fac93e5
Author: Jacob Hayes <jacob.r.ha...@gmail.com>
AuthorDate: Fri May 17 03:07:02 2024 -0400

    GH-38575: [Python] Include metadata when creating pa.schema from PyCapsule (#41538)
    
    ### Rationale for this change
    
    Fixes the dropped `pa.schema` metadata reported in #38575, a regression introduced in #37797.
    
    ### What changes are included in this PR?
    
    Passes `metadata` through to the `Schema` that the `__arrow_c_schema__` short-circuit creates with `_import_from_c_capsule`.
    
    ### Are these changes tested?
    
    Yes - added `metadata` to the existing test.
    
    ### Are there any user-facing changes?
    
    I'm not sure this quite rises to the `(b) a bug that caused incorrect or invalid data to be produced` condition, but I added that note to be safe, since the resulting schema is "incorrect" (and it broke some round-trip tests on my end after a pyarrow update):
    
    **This PR contains a "Critical Fix".**
    
    * GitHub Issue: #38575
    
    Lead-authored-by: Jacob Hayes <jacob.r.ha...@gmail.com>
    Co-authored-by: Joris Van den Bossche <jorisvandenboss...@gmail.com>
    Signed-off-by: Joris Van den Bossche <jorisvandenboss...@gmail.com>
---
 python/pyarrow/tests/test_types.py | 5 ++++-
 python/pyarrow/types.pxi           | 5 ++++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/python/pyarrow/tests/test_types.py b/python/pyarrow/tests/test_types.py
index 4f66a6f416..f7b6040f51 100644
--- a/python/pyarrow/tests/test_types.py
+++ b/python/pyarrow/tests/test_types.py
@@ -1331,10 +1331,13 @@ def test_schema_import_c_schema_interface():
         def __arrow_c_schema__(self):
             return self.schema.__arrow_c_schema__()
 
-    schema = pa.schema([pa.field("field_name", pa.int32())])
+    schema = pa.schema([pa.field("field_name", pa.int32())], metadata={"a": "b"})
+    assert schema.metadata == {b"a": b"b"}
     wrapped_schema = Wrapper(schema)
 
     assert pa.schema(wrapped_schema) == schema
+    assert pa.schema(wrapped_schema).metadata == {b"a": b"b"}
+    assert pa.schema(wrapped_schema, metadata={"a": "c"}).metadata == {b"a": b"c"}
 
 
 def test_field_import_c_schema_interface():
diff --git a/python/pyarrow/types.pxi b/python/pyarrow/types.pxi
index 018099ae7e..480f19c81d 100644
--- a/python/pyarrow/types.pxi
+++ b/python/pyarrow/types.pxi
@@ -5332,7 +5332,10 @@ def schema(fields, metadata=None):
     if isinstance(fields, Mapping):
         fields = fields.items()
     elif hasattr(fields, "__arrow_c_schema__"):
-        return Schema._import_from_c_capsule(fields.__arrow_c_schema__())
+        result = Schema._import_from_c_capsule(fields.__arrow_c_schema__())
+        if metadata is not None:
+            result = result.with_metadata(metadata)
+        return result
 
     for item in fields:
         if isinstance(item, tuple):
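
The `types.pxi` hunk above replaces an early return that ignored the caller's `metadata` argument. The control flow can be sketched in plain Python without pyarrow. Everything here is a hypothetical stand-in: `FakeSchema`, `schema_before`, and `schema_after` are illustrative names, and `__arrow_c_schema__` hands back the schema object directly instead of a real PyCapsule. It assumes `with_metadata` returns a copy with the metadata replaced, which is consistent with the assertions added to the test.

```python
class FakeSchema:
    """Hypothetical stand-in for pyarrow.Schema; not the real class."""

    def __init__(self, fields, metadata=None):
        self.fields = list(fields)
        self.metadata = metadata

    def with_metadata(self, metadata):
        # Return a copy whose metadata is replaced by the given mapping.
        return FakeSchema(self.fields, dict(metadata))


class Wrapper:
    """Exposes a schema through a capsule-style hook, like the test's Wrapper."""

    def __init__(self, schema):
        self.schema = schema

    def __arrow_c_schema__(self):
        # Assumption: the real protocol returns a PyCapsule; this sketch
        # returns the schema itself to stay dependency-free.
        return self.schema


def schema_before(fields, metadata=None):
    if hasattr(fields, "__arrow_c_schema__"):
        # Early return: the caller's `metadata` is never consulted.
        return fields.__arrow_c_schema__()
    return FakeSchema(fields, metadata)


def schema_after(fields, metadata=None):
    if hasattr(fields, "__arrow_c_schema__"):
        result = fields.__arrow_c_schema__()
        # Apply explicit metadata only when the caller supplied it.
        if metadata is not None:
            result = result.with_metadata(metadata)
        return result
    return FakeSchema(fields, metadata)


wrapped = Wrapper(FakeSchema(["field_name"], metadata={"a": "b"}))
before = schema_before(wrapped, metadata={"a": "c"})
after = schema_after(wrapped, metadata={"a": "c"})
inherited = schema_after(wrapped)
print(before.metadata, after.metadata, inherited.metadata)
# prints: {'a': 'b'} {'a': 'c'} {'a': 'b'}
```

In this sketch, `schema_before` keeps the wrapped schema's own metadata even when the caller asks for something else, while `schema_after` honors the explicit argument and falls back to the wrapped schema's metadata when none is given.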
