LucaCanali commented on code in PR #35391:
URL: https://github.com/apache/spark/pull/35391#discussion_r933096674


##########
python/pyspark/sql/tests/test_pandas_udf_scalar.py:
##########
@@ -134,6 +134,30 @@ def test_pandas_udf_nested_arrays(self):
         result = df.select(tokenize("vals").alias("hi"))
         self.assertEqual([Row(hi=[["hi", "boo"]]), Row(hi=[["bye", "boo"]])], result.collect())
 
+    def test_pandas_array_struct(self):
+        # SPARK-38098: Support Array of Struct for Pandas UDFs and toPandas
+        # import numpy as np
+
+        @pandas_udf("Array<struct<col1:string, col2:long, col3:double>>")
+        def return_cols(cols):
+            # self.assertEqual(type(cols), pd.Series)
+            # self.assertEqual(type(cols[0]), np.ndarray)
+            # self.assertEqual(type(cols[0][0]), dict)

Review Comment:
   Thank you, @ueshin, for the review and comments.
   I have made the proposed changes.
   As for `import numpy as np`, I have now added it explicitly, but outside
   the UDF, to be consistent with the other tests there that use numpy.
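
For readers following the thread, here is a minimal sketch of the shape that the commented-out assertions in the diff describe. It uses only pandas and numpy and does not start Spark. `make_batch` and `check_batch` are hypothetical helpers, and the hand-built Series is an assumption standing in for the batch a pandas UDF would receive for an `Array<struct<...>>` column. numpy is imported at module level, outside the function, as in the comment above.

```python
import numpy as np
import pandas as pd


def make_batch():
    """Hypothetical stand-in for a pandas UDF input batch (built by hand)."""
    rows = [
        [{"col1": "a", "col2": 1, "col3": 1.0}, {"col1": "b", "col2": 2, "col3": 2.0}],
        [{"col1": "c", "col2": 3, "col3": 3.0}],
    ]
    # One object-dtype array per row, so every element stays a plain dict.
    outer = np.empty(len(rows), dtype=object)
    for i, row in enumerate(rows):
        inner = np.empty(len(row), dtype=object)
        for j, struct in enumerate(row):
            inner[j] = struct
        outer[i] = inner
    return pd.Series(outer)


def check_batch(cols):
    """Mirrors the three commented-out assertions from the test in the diff."""
    assert type(cols) is pd.Series
    assert type(cols[0]) is np.ndarray
    assert type(cols[0][0]) is dict
    return cols
```

Calling `check_batch(make_batch())` returns the Series unchanged if all three type checks pass. Whether a real UDF batch has exactly these types depends on the Spark and Arrow versions in use.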



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

