zhengruifeng commented on code in PR #53548:
URL: https://github.com/apache/spark/pull/53548#discussion_r3308876717
##########
python/pyspark/sql/functions/builtin.py:
##########
@@ -26968,6 +26968,138 @@ def kll_sketch_agg_double(
return _invoke_function_over_columns(fn, col, lit(k))
+@_try_remote_functions
+def kll_merge_agg_bigint(
+ col: "ColumnOrName",
+ k: Optional[Union[int, Column]] = None,
+) -> Column:
+ """
+ Aggregate function: merges binary KllLongsSketch representations and returns the
+ merged sketch. The optional k parameter controls the size and accuracy of the merged
+ sketch (range 8-65535). If k is not specified, the merged sketch adopts the k value
+ from the first input sketch.
+
+ .. versionadded:: 4.1.0
Review Comment:
Normally we don't cherry-pick new features to old branches, and after this
PR, these new functions have actually been available only since 4.1.2.
This causes confusion when I audit public APIs by comparing 4.2.0 vs 4.1.0.
Will fix in https://github.com/apache/spark/pull/56135
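
For illustration only, here is a minimal sketch of how a `versionadded` mismatch like this could be caught. It is not the audit tooling used for Spark. The helper names (`versionadded_by_function`, `mismatched`), the trimmed `SAMPLE` source, and the `expected` mapping are all hypothetical; the sketch only assumes that each public function carries a `.. versionadded::` directive in its docstring.

```python
import ast
import re

# Matches a Sphinx directive such as ".. versionadded:: 4.1.0".
VERSIONADDED = re.compile(r"\.\. versionadded::\s*(\S+)")


def versionadded_by_function(source):
    """Map each top-level function name to its versionadded value (or None)."""
    result = {}
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node) or ""
            match = VERSIONADDED.search(doc)
            result[node.name] = match.group(1) if match else None
    return result


def mismatched(source, expected):
    """Return {name: (annotated, expected)} where the two versions differ."""
    found = versionadded_by_function(source)
    return {
        name: (found.get(name), version)
        for name, version in expected.items()
        if found.get(name) != version
    }


# Hypothetical, trimmed stand-in for the function added in this PR.
SAMPLE = '''
def kll_merge_agg_bigint(col, k=None):
    """
    Aggregate function: merges binary KllLongsSketch representations.

    .. versionadded:: 4.1.0
    """
'''

# Assumed first-release version, taken from the comment above.
print(mismatched(SAMPLE, {"kll_merge_agg_bigint": "4.1.2"}))
```

With the sample above, the annotated 4.1.0 is reported against the expected 4.1.2, which is the kind of discrepancy that shows up when comparing 4.2.0 against 4.1.0.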
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]