This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 10c0777  [SPARK-37263][PYTHON] Add PandasAPIOnSparkAdviceWarning class
10c0777 is described below

commit 10c0777eb4cb05cdcdf776959cb09efae7577b20
Author: itholic <[email protected]>
AuthorDate: Fri Nov 12 10:10:16 2021 +0900

    [SPARK-37263][PYTHON] Add PandasAPIOnSparkAdviceWarning class
    
    ### What changes were proposed in this pull request?
    
    This PR proposes adding a warning class, `PandasAPIOnSparkAdviceWarning`, so that
    users can manually turn the advice warning off by using `warnings.simplefilter`.
    
    The `PandasAPIOnSparkAdviceWarning` is issued by default as below:
    
    ```python
    >>> psdf.to_pandas()
    
    /Users/haejoon.lee/Desktop/git_store/spark/python/pyspark/pandas/utils.py:971: PandasAPIOnSparkAdviceWarning: `to_pandas` loads all data into the driver's memory. It should only be used if the resulting pandas DataFrame is expected to be small.
      warnings.warn(message, PandasAPIOnSparkAdviceWarning)
       A
    0  1
    1  2
    2  3
    3  4
    
    ```
    
    To silence the advice warning, pass the `PandasAPIOnSparkAdviceWarning`
    class to `warnings.simplefilter`, as below:
    
    ```python
    >>> import warnings
    >>> from pyspark.pandas.utils import PandasAPIOnSparkAdviceWarning
    >>> with warnings.catch_warnings():
    ...     warnings.simplefilter('ignore', PandasAPIOnSparkAdviceWarning)
    ...     psdf.to_pandas()
    ...
       A
    0  1
    1  2
    2  3
    3  4
    ```
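
The filtering pattern above can be tried without a Spark session. The following is a minimal sketch under stated assumptions: `AdviceWarning` is a hypothetical stand-in for `PandasAPIOnSparkAdviceWarning`, the local `log_advice` only imitates the patched helper, and `categories_seen` is an illustrative name that does not exist in PySpark. It shows that ignoring the dedicated category leaves unrelated warnings visible.

```python
import warnings


class AdviceWarning(Warning):
    """Hypothetical stand-in for PandasAPIOnSparkAdviceWarning."""


def log_advice(message: str) -> None:
    # Imitates the patched helper: emit the advice under the dedicated category.
    warnings.warn(message, AdviceWarning)


def categories_seen(suppress_advice: bool) -> list:
    """Emit one advice warning and one UserWarning; return the categories that surface."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # start from a known filter state
        if suppress_advice:
            # Inserted at the front of the filter list, so it wins over "always".
            warnings.simplefilter("ignore", AdviceWarning)
        log_advice("`to_pandas` loads all data into the driver's memory.")
        warnings.warn("unrelated warning", UserWarning)
    return [w.category.__name__ for w in caught]


print(categories_seen(suppress_advice=False))  # ['AdviceWarning', 'UserWarning']
print(categories_seen(suppress_advice=True))   # ['UserWarning']
```

Because the filter is installed inside `warnings.catch_warnings()`, the previous filter state is restored when the block exits.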
    
    ### Why are the changes needed?
    
    The advice messages can be too verbose, and some users may not need to
    see them.
    
    ### Does this PR introduce _any_ user-facing change?
    
    The warning category issued by `log_advice` changes from `UserWarning` to
    `PandasAPIOnSparkAdviceWarning`.
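
One consequence worth checking in existing code: the diff in this commit derives the new class from `Warning`, not from `UserWarning`, so a filter that targets `UserWarning` should no longer match the advice. The sketch below illustrates this with a hypothetical stand-in class `AdviceWarning` and an illustrative helper name; neither is part of PySpark.

```python
import warnings


class AdviceWarning(Warning):
    """Hypothetical stand-in; same base class as in the patch."""


def surfaces_despite_userwarning_filter(category) -> bool:
    """Return True if a warning of `category` still surfaces when UserWarning is ignored."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        warnings.simplefilter("ignore", UserWarning)
        warnings.warn("advice", category)
    return len(caught) == 1


print(surfaces_despite_userwarning_filter(UserWarning))    # False
print(surfaces_despite_userwarning_filter(AdviceWarning))  # True
```

Code that previously silenced the advice through a `UserWarning` filter would therefore need to name the new class instead.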
    
    ### How was this patch tested?
    
    Manually tested.
    
    Closes #34550 from itholic/SPARK-37263.
    
    Authored-by: itholic <[email protected]>
    Signed-off-by: Hyukjin Kwon <[email protected]>
---
 python/pyspark/pandas/utils.py | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/python/pyspark/pandas/utils.py b/python/pyspark/pandas/utils.py
index e07d416..be71d70 100644
--- a/python/pyspark/pandas/utils.py
+++ b/python/pyspark/pandas/utils.py
@@ -65,6 +65,10 @@ ERROR_MESSAGE_CANNOT_COMBINE = (
 SPARK_CONF_ARROW_ENABLED = "spark.sql.execution.arrow.pyspark.enabled"
 
 
+class PandasAPIOnSparkAdviceWarning(Warning):
+    pass
+
+
 def same_anchor(
     this: Union["DataFrame", "IndexOpsMixin", "InternalFrame"],
     that: Union["DataFrame", "IndexOpsMixin", "InternalFrame"],
@@ -964,7 +968,7 @@ def log_advice(message: str) -> None:
     for the existing pandas/PySpark users who may not be familiar with distributed environments
     or the behavior of pandas.
     """
-    warnings.warn(message, UserWarning)
+    warnings.warn(message, PandasAPIOnSparkAdviceWarning)
 
 
 def _test() -> None:
