Re: [PR] [SPARK-50392][PYTHON] DataFrame conversion to table argument in Spark Classic [spark]

via GitHub Fri, 27 Dec 2024 00:27:07 -0800


xinrong-meng commented on code in PR #49055:
URL: https://github.com/apache/spark/pull/49055#discussion_r1898335713



##########
python/pyspark/sql/tests/test_udtf.py:
##########
@@ -1064,6 +1074,71 @@ def eval(self, row: Row):
         func = udtf(TestUDTF, returnType="a: int")
         return func
 
+    def test_df_asTable_chaining_methods(self):
+        class TestUDTF:
+            def eval(self, row: Row):
+                yield row["key"], row["value"]
+
+            def terminate(self):
+                if False:
+                    yield
+
+        func = udtf(TestUDTF, returnType="key: int, value: string")
+        df = self.spark.createDataFrame([(1, "a"), (1, "b"), (2, "c"), (2, "d")], ["key", "value"])
+        assertDataFrameEqual(
+            func(df.asTable().orderBy(df.value)),

Review Comment:
   Multiple `PARTITION BY` clauses are not supported; the parser rejects them, as in
   
   ```
   [PARSE_SYNTAX_ERROR] Syntax error at or near 'PARTITION'. SQLSTATE: 42601 (line 1, pos 87)
   
   == SQL ==
   
   SELECT * FROM test_udtf(TABLE (SELECT id FROM range(0, 8)) PARTITION BY id ORDER BY id PARTITION BY id)
   ---------------------------------------------------------------------------------------^^^
   ```
   or
   ```
   == SQL ==
   SELECT * FROM test_udtf(TABLE (SELECT id FROM range(0, 8)) PARTITION BY id PARTITION BY id)
   ---------------------------------------------------------------------------^^^
   
   ```
   
   Let me adjust that.



   Adjusted.
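
A minimal sketch of the constraint described above, assuming a builder that renders the same `TABLE (...) PARTITION BY ... ORDER BY ...` text as the SQL in the error output. `TableArgSketch` and its methods are hypothetical names for illustration only; this is not PySpark's implementation of `asTable()`. It rejects a second `PARTITION BY`, and a `PARTITION BY` that follows `ORDER BY`, before any SQL is produced.

```python
from typing import List, Optional


class TableArgSketch:
    """Hypothetical stand-in for a table argument (not a PySpark class)."""

    def __init__(self, query: str) -> None:
        self._query = query
        self._partition_by: Optional[List[str]] = None
        self._order_by: Optional[List[str]] = None

    def partition_by(self, *cols: str) -> "TableArgSketch":
        # Mirrors the second parser error: PARTITION BY ... PARTITION BY ...
        if self._partition_by is not None:
            raise ValueError("PARTITION BY may be specified only once")
        # Mirrors the first parser error: PARTITION BY after ORDER BY
        if self._order_by is not None:
            raise ValueError("PARTITION BY must come before ORDER BY")
        self._partition_by = list(cols)
        return self

    def order_by(self, *cols: str) -> "TableArgSketch":
        self._order_by = list(cols)
        return self

    def to_sql(self) -> str:
        # Render clauses in the order the SQL examples above use.
        parts = [f"TABLE ({self._query})"]
        if self._partition_by:
            parts.append("PARTITION BY " + ", ".join(self._partition_by))
        if self._order_by:
            parts.append("ORDER BY " + ", ".join(self._order_by))
        return " ".join(parts)
```

With this sketch, `TableArgSketch("SELECT id FROM range(0, 8)").partition_by("id").order_by("id")` renders a single `PARTITION BY` followed by `ORDER BY`, while a further `.partition_by("id")` raises `ValueError` instead of reaching the parser.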



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
