[GitHub] [spark] itholic opened a new pull request, #39790: [SPARK-42094][PS] Support `fill_value` for `ps.Series.(add|radd)`

via GitHub Sun, 29 Jan 2023 00:49:21 -0800


itholic opened a new pull request, #39790:
URL: https://github.com/apache/spark/pull/39790


   ### What changes were proposed in this pull request?
   
   This PR proposes to support `fill_value` for `ps.Series.(add|radd)`.
   
   **Note**: Currently, `fill_value` is only supported when `other` is a 
scalar value (int, str, float). When `other` is a container type, `self` and 
`other` must be combined to compare the values at each position and check 
whether the data in both corresponding Series (or list, tuple) locations is 
missing, which is potentially very expensive in pandas API on Spark.
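
To illustrate the distinction in the note above, here is a minimal pure-Python sketch of the `fill_value` semantics. It is not the code in this PR: the helper names `is_missing`, `add_scalar`, and `add_elementwise` are hypothetical, and plain lists stand in for Series. With a scalar `other`, only `self` needs to be inspected. With a container `other`, every position needs both sides inspected together.

```python
import math


def is_missing(x):
    """Treat None and float NaN as missing (an assumption of this sketch)."""
    return x is None or (isinstance(x, float) and math.isnan(x))


def add_scalar(values, other, fill_value=None):
    """Scalar `other`: only `values` can contain missing entries."""
    out = []
    for v in values:
        if is_missing(v) and fill_value is not None:
            v = fill_value  # substitute before the addition
        out.append(math.nan if is_missing(v) else v + other)
    return out


def add_elementwise(left, right, fill_value=None):
    """Container `other`: both sides must be compared at each position."""
    out = []
    for l, r in zip(left, right):
        l_na, r_na = is_missing(l), is_missing(r)
        if l_na and r_na:
            out.append(math.nan)  # missing on both sides stays missing
        elif (l_na or r_na) and fill_value is None:
            out.append(math.nan)  # nothing to fill with
        else:
            l = fill_value if l_na else l
            r = fill_value if r_na else r
            out.append(l + r)
    return out


scalar_result = add_scalar([1.0, 2.0, math.nan, 4.0], 6, fill_value=20)
pair_result = add_elementwise([1.0, math.nan, math.nan],
                              [math.nan, 2.0, math.nan], fill_value=0)
```

The scalar path is a single pass over one column, while the element-wise path needs the two inputs aligned position by position, which is the step the note describes as potentially expensive on distributed data.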
   
   ### Why are the changes needed?
   
   To provide basic support for `fill_value` in `ps.Series.(add|radd)`.
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes.
   
   **Before**
   ```python
   >>> psser  # pandas-on-Spark Series
   0    1.0
   1    2.0
   2    NaN
   3    4.0
   dtype: float64
   
   >>> psser.add(6, fill_value=20)
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: add() got an unexpected keyword argument 'fill_value'
   ```
   
   **After**
   ```python
   >>> psser  # pandas-on-Spark Series
   0    1.0
   1    2.0
   2    NaN
   3    4.0
   dtype: float64
   
   >>> psser.add(6, fill_value=20)
   0     7.0
   1     8.0
   2    26.0
   3    10.0
   dtype: float64
   ```
   
   ### How was this patch tested?
   
   

