Re: [PR] [SPARK-55159][PYTHON] Consolidate pandas-to-Arrow conversion utilities in serializers [spark]

via GitHub Thu, 12 Feb 2026 12:03:35 -0800


Yicong-Huang commented on code in PR #54125:
URL: https://github.com/apache/spark/pull/54125#discussion_r2800817048



##########
python/pyspark/sql/tests/test_conversion.py:
##########
@@ -144,6 +149,199 @@ def test_wrap_struct_empty_batch(self):
         self.assertEqual(wrapped.num_columns, 1)
 
 
+@unittest.skipIf(not have_pyarrow, pyarrow_requirement_message)
+class PandasToArrowConversionTests(unittest.TestCase):
+    def test_convert(self):
+        """Test basic DataFrame/Series to Arrow RecordBatch conversion."""
+        import pandas as pd
+        import pyarrow as pa
+
+        # Basic DataFrame conversion
+        df = pd.DataFrame({"a": [1, 2, 3], "b": [1.0, 2.0, 3.0]})
+        schema = StructType([StructField("a", IntegerType()), StructField("b", DoubleType())])
+        result = PandasToArrowConversion.convert(df, schema)

Review Comment:
   The `convert` API is stable. Future PRs will focus on internal improvements 
(e.g., SPARK-55502 to eliminate `is_udtf`, and elevating `coerce_arrow_array` to 
the batch level). The tests cover the core paths and will remain valid through 
those changes.



