khalidmammadov opened a new pull request, #37748:
URL: https://github.com/apache/spark/pull/37748
### What changes were proposed in this pull request?
PySpark atan2, hypot, pow and pmod functions marked as accepting float type
as argument but produce error when called together. For example:
```
>>> from pyspark.sql.functions import *
>>> df = spark.range(1)
>>> df.select(atan2(1.1, 2.1)).first()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File
"/home/khalid/dev/p/SparkTest/spark_venv3.10/lib/python3.10/site-packages/pyspark/sql/functions.py",
line 962, in atan2
return _invoke_binary_math_function("atan2", col1, col2)
File
"/home/khalid/dev/p/SparkTest/spark_venv3.10/lib/python3.10/site-packages/pyspark/sql/functions.py",
line 111, in _invoke_binary_math_function
return _invoke_function(
File
"/home/khalid/dev/p/SparkTest/spark_venv3.10/lib/python3.10/site-packages/pyspark/sql/functions.py",
line 85, in _invoke_function
return Column(jf(*args))
File
"/home/khalid/dev/p/SparkTest/spark_venv3.10/lib/python3.10/site-packages/pyspark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py",
line 1321, in __call__
File
"/home/khalid/dev/p/SparkTest/spark_venv3.10/lib/python3.10/site-packages/pyspark/sql/utils.py",
line 190, in deco
return f(*a, **kw)
File
"/home/khalid/dev/p/SparkTest/spark_venv3.10/lib/python3.10/site-packages/pyspark/python/lib/py4j-0.10.9.5-src.zip/py4j/protocol.py",
line 330, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling
z:org.apache.spark.sql.functions.atan2. Trace:
py4j.Py4JException: Method atan2([class java.lang.Double, class
java.lang.Double]) does not exist
at
py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:318)
at
py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:339)
at py4j.Gateway.invoke(Gateway.java:276)
at
py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at
py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
at java.lang.Thread.run(Thread.java:748)
```
Although, this call could be written outside the Spark and supplied as a
literal it's still good to be consistent with SQL API and help user to avoid
seeing error.
after the fix:
```
>>> from pyspark.sql.functions import *
>>> df = spark.range(1)
>>> df.select(atan2(1.1, 2.1)).first()
Row(ATAN2(1.1, 2.1)=0.4825132950224769)
```
### Why are the changes needed?
Improve user experience. Bug fix
### Does this PR introduce _any_ user-facing change?
Yes, no errors
### How was this patch tested?
```
./python/run-tests
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]