HeartSaVioR commented on code in PR #38796:
URL: https://github.com/apache/spark/pull/38796#discussion_r1032072872
##########
sql/core/src/test/scala/org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite.scala:
##########
@@ -803,4 +803,49 @@ class FlatMapGroupsInPandasWithStateSuite extends StateStoreMetricsTest {
total = Seq(1), updated = Seq(1), droppedByWatermark = Seq(0), removed = Some(Seq(1)))
)
}
+
+ test("SPARK-41260: applyInPandasWithState - NumPy instances to JVM rows in state") {
+ assume(shouldTestPandasUDFs)
+
+ val pythonScript =
+ """
+ |import pandas as pd
+ |import numpy as np
+ |from pyspark.sql.types import StructType, StructField, StringType
+ |
+ |tpe = StructType([
+ | StructField("key", StringType()),
+ | StructField("valueAsString", StringType())])
+ |
+ |def func(key, pdf_iter, state):
+ | np_value = np.int64(1) # NumPy instance
Review Comment:
Could we please think of (or check) a few more widely used types and add
them to this test as well? It doesn't need to be exhaustive, but a single
check (`np.int64`) seems too narrow.
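To make the request concrete, here is a minimal sketch of what a broader check might cover. The list of types is only an assumption about what counts as "widely used", and the `np_values`, `py_values`, and `py_types` names are hypothetical; none of this is taken from the PR. It relies only on NumPy's `.item()`, which returns the corresponding built-in Python scalar.

```python
import numpy as np

# Hypothetical sample of commonly used NumPy scalar types (an assumption,
# not the list used in the PR).
np_values = {
    "int32": np.int32(1),
    "int64": np.int64(1),
    "float32": np.float32(1.5),
    "float64": np.float64(1.5),
    "bool_": np.bool_(True),
    "str_": np.str_("a"),
}

# .item() converts each NumPy scalar to the matching built-in Python scalar.
py_values = {name: value.item() for name, value in np_values.items()}

# Record the resulting Python type names so a test can compare them.
py_types = {name: type(value).__name__ for name, value in py_values.items()}
```

Something along these lines could replace the single `np.int64(1)` value in `func`, with one state field per type, though the exact set of types is a judgment call.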
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]