[jira] [Commented] (SPARK-37348) PySpark pmod function

Hyukjin Kwon (Jira) Tue, 16 Nov 2021 20:40:06 -0800


    [ https://issues.apache.org/jira/browse/SPARK-37348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17444938#comment-17444938 ]


Hyukjin Kwon commented on SPARK-37348:
--------------------------------------

The question should be how often it is used in the Python world, and how
common it is. Many SQL expressions (e.g., regex_* expressions) do not exist
in the Python, Scala, etc. APIs, and they are left unimplemented on purpose.

> PySpark pmod function
> ---------------------
>
>                 Key: SPARK-37348
>                 URL: https://issues.apache.org/jira/browse/SPARK-37348
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 3.2.0
>            Reporter: Tim Schwab
>            Priority: Minor
>
> Because Spark is built on the JVM, in PySpark, F.lit(-1) % F.lit(2) returns 
> -1. However, the modulus is often desired instead of the remainder.
>  
> There is a [PMOD() function in Spark 
> SQL|https://spark.apache.org/docs/latest/api/sql/#pmod], but [not in 
> PySpark|https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql.html#functions].
>  So at the moment, the two options for getting the modulus are to use 
> F.expr("pmod(A, B)"), or to create a helper function such as:
>  
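> {code:python}
> from pyspark.sql import functions as F
>
> def pmod(dividend, divisor):
>     # Branch on the remainder, not the dividend: -4 % 2 is 0 and must stay 0.
>     remainder = dividend % divisor
>     return F.when(remainder < 0, (remainder + divisor) % divisor).otherwise(remainder){code}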
>  
>  
> Neither is optimal - pmod should be native to PySpark as it is in Spark SQL.
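
For reference, the difference between the JVM-style remainder and the
positive modulus discussed in the issue can be sketched in plain Python,
without Spark. The names jvm_rem and pmod below are hypothetical helpers for
illustration only, not PySpark APIs, and the sketch assumes integer inputs
and a nonzero divisor.

```python
def jvm_rem(a: int, n: int) -> int:
    """Truncated remainder: the result takes the sign of the dividend,
    which is how % behaves on the JVM (Python's own % takes the sign
    of the divisor instead)."""
    r = abs(a) % abs(n)
    return -r if a < 0 else r


def pmod(a: int, n: int) -> int:
    """Modulus built on the truncated remainder; the result is
    non-negative whenever the divisor is positive."""
    r = jvm_rem(a, n)
    # Branch on the remainder, not on the dividend: for exact multiples
    # such as (-4, 2) the remainder is already 0 and must stay 0.
    return jvm_rem(r + n, n) if r < 0 else r


print(jvm_rem(-1, 2))  # -1
print(pmod(-1, 2))     # 1
print(pmod(-4, 2))     # 0
```

Branching on the sign of the dividend alone would return 2 for (-4, 2),
which is why the sketch tests the remainder instead.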


