Re: [PR] [SPARK-45620][PYTHON] Fix user-facing APIs related to Python UDTF to use camelCase [spark]

via GitHub Mon, 23 Oct 2023 00:57:51 -0700


LuciferYang commented on PR #43470:
URL: https://github.com/apache/spark/pull/43470#issuecomment-1774628434


   It seems some Scala-side tests have been failing since this PR was merged:
   
   - https://github.com/apache/spark/actions/runs/6608112828/job/17946379914
   - https://github.com/apache/spark/actions/runs/6608763491/job/17947949938
   
   <img width="1228" alt="image" src="https://github.com/apache/spark/assets/1475305/b52bcbdf-faa7-46c5-a99f-0116e4a9788a">
   
   I tested `PythonUDTFSuite` locally:
   
   1. Before this PR
   
   ```
   # [SPARK-44753][PYTHON][CONNECT] XML: pyspark sql xml reader writer
   git reset --hard 9f675c54a56e8165e24e84a83c186c949ced5be8
   build/sbt clean "sql/testOnly org.apache.spark.sql.execution.python.PythonUDTFSuite"
   ```
   then 
   
   ```
   [info] Run completed in 8 seconds, 301 milliseconds.
   [info] Total number of tests run: 9
   [info] Suites: completed 1, aborted 0
   [info] Tests: succeeded 9, failed 0, canceled 0, ignored 0, pending 0
   [info] All tests passed.
   ```
   
   2. After this PR
   
   ```
   # [SPARK-45620][PYTHON] Fix user-facing APIs related to Python UDTF to use camelCase
   git reset --hard e3ba9cf0403ade734f87621472088687e533b2cd
   build/sbt clean "sql/testOnly org.apache.spark.sql.execution.python.PythonUDTFSuite"
   ```
   
   then 
   
   ```
   15:46:02.673 WARN org.apache.spark.sql.catalyst.analysis.SimpleTableFunctionRegistry: The function testudtf replaced a previously registered function.
   [info] - SPARK-44503: Specify PARTITION BY and ORDER BY for TABLE arguments *** FAILED *** (420 milliseconds)
   [info]   org.apache.spark.sql.AnalysisException: [TABLE_VALUED_FUNCTION_FAILED_TO_ANALYZE_IN_PYTHON] Failed to analyze the Python user defined table function: Traceback (most recent call last):
   [info]   File "/Users/yangjie01/SourceCode/git/spark-mine-sbt/python/pyspark/sql/worker/analyze_udtf.py", line 119, in main
   [info]     result = handler.analyze(*args, **kwargs)  # type: ignore[attr-defined]
   [info]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   [info]   File "<string>", line 21, in analyze
   [info] AttributeError: 'AnalyzeArgument' object has no attribute 'is_table'
   [info]  SQLSTATE: 38000; line 8 pos 5
   [info]   at org.apache.spark.sql.errors.QueryCompilationErrors$.tableValuedFunctionFailedToAnalyseInPythonError(QueryCompilationErrors.scala:1985)
   [info]   at org.apache.spark.sql.execution.python.UserDefinedPythonTableFunctionAnalyzeRunner.receiveFromPython(UserDefinedPythonFunction.scala:229)
   [info]   at org.apache.spark.sql.execution.python.UserDefinedPythonTableFunctionAnalyzeRunner.receiveFromPython(UserDefinedPythonFunction.scala:186)
   [info]   at org.apache.spark.sql.execution.python.PythonPlannerRunner.runInPython(PythonPlannerRunner.scala:103)
   
   ...
   [info] - SPARK-45402: Add UDTF API for 'analyze' to return a buffer to consume on class creation *** FAILED *** (39 milliseconds)
   [info]   org.apache.spark.sql.AnalysisException: [TABLE_VALUED_FUNCTION_FAILED_TO_ANALYZE_IN_PYTHON] Failed to analyze the Python user defined table function: Traceback (most recent call last):
   [info]   File "/Users/yangjie01/SourceCode/git/spark-mine-sbt/python/pyspark/sql/worker/analyze_udtf.py", line 119, in main
   [info]     result = handler.analyze(*args, **kwargs)  # type: ignore[attr-defined]
   [info]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   [info]   File "<string>", line 16, in analyze
   [info] AttributeError: 'AnalyzeArgument' object has no attribute 'data_type'
   [info]  SQLSTATE: 38000; line 1 pos 14
   ...
   [info] Run completed in 8 seconds, 26 milliseconds.
   [info] Total number of tests run: 9
   [info] Suites: completed 1, aborted 0
   [info] Tests: succeeded 7, failed 2, canceled 0, ignored 0, pending 0
   [info] *** 2 TESTS FAILED ***
   [error] Failed tests:
   [error]      org.apache.spark.sql.execution.python.PythonUDTFSuite
   [error] (sql / Test / testOnly) sbt.TestsFailedException: Tests unsuccessful
   ```
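
A minimal sketch of the likely failure mode, not the actual pyspark code. It assumes that the rename changed the argument fields from `is_table` / `data_type` to `isTable` / `dataType`, and that the UDTF source run by the Scala suite (the `File "<string>"` frames above) still uses the old spelling. The `AnalyzeArgument` class below is a hypothetical stand-in, and both `analyze_*` helpers are illustrative names:

```python
from dataclasses import dataclass
from typing import Any, Optional


# Hypothetical stand-in for the argument object handed to `analyze`.
# Assumption: after the rename, its fields are camelCase.
@dataclass(frozen=True)
class AnalyzeArgument:
    dataType: str
    value: Optional[Any]
    isTable: bool


def analyze_old_style(arg: AnalyzeArgument) -> bool:
    # Pre-rename spelling: this attribute no longer exists on the object.
    return arg.is_table  # type: ignore[attr-defined]


def analyze_new_style(arg: AnalyzeArgument) -> bool:
    # Post-rename spelling.
    return arg.isTable


arg = AnalyzeArgument(dataType="int", value=None, isTable=True)

try:
    analyze_old_style(arg)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
print(analyze_new_style(arg))
```

If that assumption holds, updating the Python snippets used by the Scala tests to the camelCase names should clear both failures.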
   
   The GitHub Actions run for this PR passed, apparently because the PR did not touch any Scala code, so all Scala-side tests were skipped.
   
   - https://github.com/apache/spark/actions/runs/6607626526/job/17945278871
   
   <img width="1708" alt="image" src="https://github.com/apache/spark/assets/1475305/32485985-c02e-43c2-8085-0792618cbbfc">
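
To illustrate how path-based test selection can miss this kind of cross-language breakage, here is a rough sketch. It is not Spark's actual CI logic: the module names, path prefixes, and the changed-file list are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical module map: module name -> source path prefixes.
MODULE_PATHS: Dict[str, List[str]] = {
    "pyspark-sql": ["python/pyspark/sql/"],
    "sql": ["sql/core/", "sql/catalyst/"],
}


def modules_to_test(changed_files: List[str]) -> Set[str]:
    """Select a module when any changed file lives under one of its prefixes."""
    selected: Set[str] = set()
    for module, prefixes in MODULE_PATHS.items():
        if any(f.startswith(p) for f in changed_files for p in prefixes):
            selected.add(module)
    return selected


# A Python-only change (illustrative file name).
python_only_change = ["python/pyspark/sql/udtf.py"]
selected = modules_to_test(python_only_change)
print(sorted(selected))
```

With a selection rule like this, a Scala suite that executes Python code at test time is never scheduled for a Python-only change unless the dependency between the two modules is declared explicitly.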
   
   
   Could you take a look? Thanks @ueshin 


