rwpenney commented on a change in pull request #30745:
URL: https://github.com/apache/spark/pull/30745#discussion_r579826841
##########
File path: python/pyspark/sql/functions.py
##########
@@ -222,6 +222,45 @@ def sum_distinct(col):
return _invoke_function_over_column("sum_distinct", col)
+def product(col, scale=1.0):
+ """
+ Aggregate function: returns the product of the values in a group.
+
+ .. versionadded:: 3.2.0
+
+ Parameters
+ ----------
+ col : str, :class:`Column`
+ column containing values to be multiplied together
+ scale : float
Review comment:
   Agreed, the `scale` parameter isn't there to improve precision. (Perhaps
my "0.01" example wasn't ideal; 1/128 might have been a better choice, being
less likely to cause noise in the least significant bits.) The aim, as you say,
is to allow the user to compute the *scaled* product, on the assumption that
they can allow for the *overall* scaling (i.e. the factor of 0.01^N or 2^(-7N))
in some other way at subsequent stages of their calculations. Clearly, there's
no point in having this scaling within `product()` if the user just multiplies
the result by scale^(-N) afterwards.
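   Purely for illustration (not part of the patch): a minimal plain-Python
sketch of the bookkeeping I have in mind, with made-up values. A per-element
scale of 1/128 changes the product of N values by an overall factor of
2^(-7N), which the caller undoes once at the end.
```python
import math

values = [150.0, 220.0, 90.0, 310.0]   # made-up data, just to show the bookkeeping
scale = 1.0 / 128                      # i.e. 2**-7, exactly representable in binary
n = len(values)

# Scaled product: each element is multiplied by `scale` before the reduction.
scaled = math.prod(v * scale for v in values)

# The caller recovers the unscaled product by dividing out scale**n once.
recovered = scaled / scale ** n

assert math.isclose(recovered, math.prod(values))
```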
   Again, you're right that the user could just use `product(0.01 * $"col")`,
but it seemed worth offering an overload that invites the user to consider a
scaling when they can predict something about the order of magnitude of their
product, and that might make their intention a little clearer than just
multiplying `$"col"` by 0.01. It also allows a little more optimization when
`scale` is the result of a calculation or comes from a configuration parameter
and turns out to be 1.
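   To spell out that alternative in Python rather than the Scala `$"col"`
syntax, here is a quick sketch assuming the `product` function from this PR,
with a throwaway DataFrame and an invented column name:
```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(100.0,), (200.0,), (50.0,)], ["col1"])

# Caller-side scaling: multiply the column by the per-element factor before
# aggregating, instead of passing a separate `scale` argument.
df.agg(F.product(F.col("col1") * 0.01).alias("scaled_product")).show()
```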