Re: [PR] [SPARK-44746][Python] Add more Python UDTF documentation for functions that accept input tables [spark]

via GitHub Tue, 05 Mar 2024 13:08:47 -0800


dtenedor commented on code in PR #45375:
URL: https://github.com/apache/spark/pull/45375#discussion_r1513503230



##########
examples/src/main/python/sql/udtf.py:
##########
@@ -210,6 +210,75 @@ def eval(self, row: Row):
     # +---+
 
 
+def python_udtf_table_argument_with_partitioning(spark: SparkSession) -> None:
+
+    from pyspark.sql.functions import udtf
+    from pyspark.sql.types import Row
+
+    # Define and register a UDTF.
+    @udtf(returnType="a: string, b: int")
+    class FilterUDTF:
+        def __init__(self):
+            self.key = ""
+            self.max = 0
+
+        def eval(self, row: Row):
+            self.key = row["a"]
+            self.max = max(self.max, row["b"])
+
+        def terminate(self):
+            yield self.key, self.max
+
+    spark.udtf.register("filter_udtf", FilterUDTF)
+
+    # Create an input table with some example values.
+    spark.sql("DROP TABLE IF EXISTS values_table")
+    spark.sql("CREATE TABLE values_table (a STRING, b INT)")
+    spark.sql("INSERT INTO values_table VALUES ('abc', 2), ('abc', 4), ('def', 6), ('def', 8)")
+    spark.table("values_table").show()
+

Review Comment:
   Sure, this is done.
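
For readers following the hunk above, here is a minimal pure-Python sketch of what `FilterUDTF` would compute if the input table is partitioned by column `a`. That partitioning is an assumption: the function name suggests it, but the quoted hunk ends before the query appears. The `run_partitioned` helper is hypothetical and only imitates the "one UDTF instance per partition" behavior; it is not a Spark API, and plain dicts stand in for `Row` objects.

```python
from collections import defaultdict


class FilterUDTF:
    """Same eval/terminate logic as the class in the diff, without the @udtf decorator."""

    def __init__(self):
        self.key = ""
        self.max = 0

    def eval(self, row):
        # Remember the partition key and track the largest "b" seen so far.
        self.key = row["a"]
        self.max = max(self.max, row["b"])

    def terminate(self):
        # Emit one output row per instance once all input rows are consumed.
        yield self.key, self.max


def run_partitioned(rows, partition_col):
    """Hypothetical helper: group rows by partition_col and run a fresh
    FilterUDTF instance over each group (assumed partitioning semantics)."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_col]].append(row)

    results = []
    for key in sorted(partitions):
        handler = FilterUDTF()
        for row in partitions[key]:
            handler.eval(row)
        results.extend(handler.terminate())
    return results


# The same four rows that the example inserts into values_table.
rows = [
    {"a": "abc", "b": 2},
    {"a": "abc", "b": 4},
    {"a": "def", "b": 6},
    {"a": "def", "b": 8},
]
result = run_partitioned(rows, "a")
print(result)
```

Under that assumption each distinct value of `a` yields a single row holding the maximum `b` for that key.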



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

