Yikun commented on a change in pull request #34212:
URL: https://github.com/apache/spark/pull/34212#discussion_r793225136
##########
File path: python/pyspark/pandas/series.py
##########
@@ -4483,6 +4487,181 @@ def replace(
         return self._with_new_scol(current)  # TODO: dtype?
 
+    def combine(
+        self,
+        other: Union[Scalar, "Series"],
+        func: Callable,
+        fill_value: Optional[Any] = None,
+    ) -> "Series":
+        """
+        Combine the Series with a Series or scalar according to `func`.
+
+        Combine the Series and `other` using `func` to perform elementwise
+        selection for combined Series.
+        `fill_value` is assumed when value is missing at some index
+        from one of the two objects being combined.
+
+        .. versionadded:: 3.3.0
+
+        .. note:: This API executes the function once to infer the type which is
+            potentially expensive, for instance, when the dataset is created after
+            aggregations or sorting.
+
+            To avoid this, specify return type in ``func``, for instance, as below:
+
+            >>> def foo(x, y) -> np.int32:

Review comment:
   Or maybe just give a `max` example here; it would read more naturally when users see the doctest below.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
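
For reference, a minimal sketch of what a `max`-based example could look like. It uses plain pandas `Series.combine` rather than the pandas-on-Spark method under review, on the assumption that the new API is meant to mirror pandas behaviour. The helper name `larger` and the sample data are hypothetical. The return annotation is there only to illustrate the docstring's advice about declaring a return type; plain pandas ignores it.

```python
import pandas as pd

# Hypothetical sample data; "duck" exists only in s2.
s1 = pd.Series({"falcon": 330.0, "eagle": 160.0})
s2 = pd.Series({"falcon": 345.0, "eagle": 200.0, "duck": 30.0})


# Hypothetical helper: wraps the built-in max and declares a return type,
# which is what the docstring note recommends for pandas-on-Spark.
def larger(x: float, y: float) -> float:
    return max(x, y)


# fill_value=0 stands in for the value missing from s1 at index "duck".
combined = s1.combine(s2, larger, fill_value=0)

# duck -> 30.0, eagle -> 200.0, falcon -> 345.0
print(combined)
```

Whether the same call works unchanged against `pyspark.pandas.Series` depends on the final shape of this PR, so treat it as an illustration of the suggestion rather than a proposed doctest.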