(spark) branch master updated: [SPARK-46553][PS] `FutureWarning` for `interpolate` with object dtype

gurwls223 Tue, 02 Jan 2024 00:46:11 -0800

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new c5dd72cff124 [SPARK-46553][PS] `FutureWarning` for `interpolate` with object dtype
c5dd72cff124 is described below

commit c5dd72cff1243507f47623c0f697873977b23380
Author: Haejoon Lee <haejoon....@databricks.com>
AuthorDate: Tue Jan 2 17:45:52 2024 +0900

    [SPARK-46553][PS] `FutureWarning` for `interpolate` with object dtype
    
    ### What changes were proposed in this pull request?
    
    This PR proposes to issue a `FutureWarning` for `(DataFrame|Series).interpolate` with object dtype.
    
    ### Why are the changes needed?
    
    To match pandas behavior. Using object dtype with `interpolate` is deprecated in pandas and will raise an exception in a future version, so we should issue the same kind of warning that pandas does.
    
    ### Does this PR introduce _any_ user-facing change?
    
    Given the DataFrame below,
    
    ```python
    >>> psdf = ps.DataFrame({"A": ['a', 'b', 'c'], "B": [1, 2, 3]})
    >>> psdf
       A  B
    0  a  1
    1  b  2
    2  c  3
    ```
    
    **Before**
    
    ```python
    >>> psdf.interpolate()  # Excludes the object-dtype column without any warning, unlike pandas
       B
    0  1
    1  2
    2  3
    ```
    
    **After**
    ```python
    >>> psdf.interpolate()  # Issuing a proper warning
    FutureWarning: DataFrame.interpolate with object dtype is deprecated and will raise in a future version. Convert to a specific numeric type before interpolating.
      warnings.warn(
       B
    0  1
    1  2
    2  3
    ```
    
    ### How was this patch tested?
    
    No behavior changes, so the existing CI should pass.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #44550 from itholic/SPARK-46553.
    
    Authored-by: Haejoon Lee <haejoon....@databricks.com>
    Signed-off-by: Hyukjin Kwon <gurwls...@apache.org>
---
 python/pyspark/pandas/frame.py  | 7 +++++++
 python/pyspark/pandas/series.py | 6 ++++++
 2 files changed, 13 insertions(+)

diff --git a/python/pyspark/pandas/frame.py b/python/pyspark/pandas/frame.py
index 9846dc0ae10b..a7edac5509b1 100644
--- a/python/pyspark/pandas/frame.py
+++ b/python/pyspark/pandas/frame.py
@@ -6126,6 +6126,13 @@ defaultdict(<class 'list'>, {'col..., 'col...})]
             raise ValueError("invalid limit_direction: '{}'".format(limit_direction))
         if (limit_area is not None) and (limit_area not in ["inside", "outside"]):
             raise ValueError("invalid limit_area: '{}'".format(limit_area))
+        for dtype in self.dtypes.values:
+            if dtype == "object":
+                warnings.warn(
+                    "DataFrame.interpolate with object dtype is deprecated and will raise in a "
+                    "future version. Convert to a specific numeric type before interpolating.",
+                    FutureWarning,
+                )
 
         numeric_col_names = []
         for label in self._internal.column_labels:
diff --git a/python/pyspark/pandas/series.py b/python/pyspark/pandas/series.py
index 6d7a7c1f2e56..a35e19545d5a 100644
--- a/python/pyspark/pandas/series.py
+++ b/python/pyspark/pandas/series.py
@@ -2231,6 +2231,12 @@ class Series(Frame, IndexOpsMixin, Generic[T]):
         limit_direction: Optional[str] = None,
         limit_area: Optional[str] = None,
     ) -> "Series":
+        if self.dtype == "object":
+            warnings.warn(
+                "Series.interpolate with object dtype is deprecated and will raise in a "
+                "future version. Convert to a specific numeric type before interpolating.",
+                FutureWarning,
+            )
         if method not in ["linear"]:
             raise NotImplementedError("interpolate currently works only for method='linear'")
         if (limit is not None) and (not limit > 0):
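
Both hunks above use the standard `warnings.warn(..., FutureWarning)` pattern. The following is a minimal sketch of how a caller might observe such a warning, using only the Python standard library. `warn_for_object_dtypes` and `collect_future_warnings` are hypothetical stand-ins written for illustration and are not part of `pyspark.pandas`; dtypes are passed as plain strings as a simplifying assumption.

```python
import warnings


def warn_for_object_dtypes(dtypes):
    """Hypothetical stand-in for the dtype check added to DataFrame.interpolate."""
    for dtype in dtypes:
        if dtype == "object":
            warnings.warn(
                "DataFrame.interpolate with object dtype is deprecated and will raise in a "
                "future version. Convert to a specific numeric type before interpolating.",
                FutureWarning,
            )


def collect_future_warnings(dtypes):
    """Run the check and return every FutureWarning it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" disables de-duplication so each warn() call is recorded.
        warnings.simplefilter("always")
        warn_for_object_dtypes(dtypes)
    return [w for w in caught if issubclass(w.category, FutureWarning)]


caught = collect_future_warnings(["object", "int64"])
print(len(caught))  # prints 1: only the object-dtype column triggers the warning
```

Because the DataFrame hunk calls `warnings.warn` inside a loop, a frame with several object-dtype columns reaches the call once per such column. Under Python's default warning filter, repeats from the same source location are typically displayed only once.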


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
