[ https://issues.apache.org/jira/browse/SPARK-33730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247827#comment-17247827 ]

Shril Kumar commented on SPARK-33730:
-------------------------------------

I was inspecting the use of print statements in PySpark. Please confirm whether
my reading below is correct.

 
{code:python}
# pyspark/worker.py

try:
    (soft_limit, hard_limit) = resource.getrlimit(total_memory)
    msg = "Current mem limits: {0} of max {1}\n".format(soft_limit, hard_limit)
    print(msg, file=sys.stderr)

    # convert to bytes
    new_limit = memory_limit_mb * 1024 * 1024

    if soft_limit == resource.RLIM_INFINITY or new_limit < soft_limit:
        msg = "Setting mem limits to {0} of max {1}\n".format(new_limit, new_limit)
        print(msg, file=sys.stderr)
        resource.setrlimit(total_memory, (new_limit, new_limit))
except (resource.error, OSError, ValueError) as e:
    # not all systems support resource limits, so warn instead of failing
    print("WARN: Failed to set memory limit: {0}\n".format(e), file=sys.stderr)
{code}
Do you want these print statements changed to {{warnings.warn}}?
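
If so, a minimal sketch of what the {{except}} branch could look like with
{{warnings.warn}} ({{ResourceWarning}} is just my assumption for the category,
not something this issue prescribes; {{total_memory}} and {{new_limit}} are as
in the snippet above):

{code:python}
import resource
import warnings

try:
    resource.setrlimit(total_memory, (new_limit, new_limit))
except (resource.error, OSError, ValueError) as e:
    # Route the failure through the warnings machinery instead of writing
    # to stderr directly. Note that ResourceWarning is only a guess at the
    # category and is ignored by default, so a custom category or a filter
    # would be needed to keep the message visible to end users.
    warnings.warn("Failed to set memory limit: {0}".format(e), ResourceWarning)
{code}

The two informational prints ("Current mem limits" / "Setting mem limits") are
status output rather than warnings, so they may be better suited to a logger
than to {{warnings.warn}}.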

> Standardize warning types
> -------------------------
>
>                 Key: SPARK-33730
>                 URL: https://issues.apache.org/jira/browse/SPARK-33730
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PySpark
>    Affects Versions: 3.1.0
>            Reporter: Hyukjin Kwon
>            Priority: Major
>
> We should use warnings properly per
> [https://docs.python.org/3/library/warnings.html#warning-categories]
> In particular,
>  - we should use {{FutureWarning}} instead of {{DeprecationWarning}} in the
> places where warnings should be shown to end-users by default.
>  - we should _maybe_ think about customizing the stacklevel
> ([https://docs.python.org/3/library/warnings.html#warnings.warn]) the way
> pandas does.
>  - ...
> The current warnings are a bit messy and somewhat arbitrary.
> To be more explicit, we'll have to fix:
> {code:python}
> pyspark/context.py:                    warnings.warn(
> pyspark/context.py:                warnings.warn(
> pyspark/ml/classification.py:                warnings.warn("weightCol is ignored, "
> pyspark/ml/clustering.py:        warnings.warn("Deprecated in 3.0.0. It will be removed in future versions. Use "
> pyspark/mllib/classification.py:        warnings.warn(
> pyspark/mllib/feature.py:            warnings.warn("Both withMean and withStd are false. The model does nothing.")
> pyspark/mllib/regression.py:        warnings.warn(
> pyspark/mllib/regression.py:        warnings.warn(
> pyspark/mllib/regression.py:        warnings.warn(
> pyspark/rdd.py:        warnings.warn("mapPartitionsWithSplit is deprecated; "
> pyspark/rdd.py:        warnings.warn(
> pyspark/shell.py:    warnings.warn("Failed to initialize Spark session.")
> pyspark/shuffle.py:            warnings.warn("Please install psutil to have better "
> pyspark/sql/catalog.py:        warnings.warn(
> pyspark/sql/catalog.py:        warnings.warn(
> pyspark/sql/column.py:            warnings.warn(
> pyspark/sql/column.py:            warnings.warn(
> pyspark/sql/context.py:            warnings.warn(
> pyspark/sql/context.py:        warnings.warn(
> pyspark/sql/context.py:        warnings.warn(
> pyspark/sql/context.py:        warnings.warn(
> pyspark/sql/context.py:        warnings.warn(
> pyspark/sql/dataframe.py:        warnings.warn(
> pyspark/sql/dataframe.py:                warnings.warn("to_replace is a dict and value is not None. value will be ignored.")
> pyspark/sql/functions.py:    warnings.warn("Deprecated in 2.1, use degrees instead.", DeprecationWarning)
> pyspark/sql/functions.py:    warnings.warn("Deprecated in 2.1, use radians instead.", DeprecationWarning)
> pyspark/sql/functions.py:    warnings.warn("Deprecated in 2.1, use approx_count_distinct instead.", DeprecationWarning)
> pyspark/sql/pandas/conversion.py:                    warnings.warn(msg)
> pyspark/sql/pandas/conversion.py:                    warnings.warn(msg)
> pyspark/sql/pandas/conversion.py:                    warnings.warn(msg)
> pyspark/sql/pandas/conversion.py:                    warnings.warn(msg)
> pyspark/sql/pandas/conversion.py:                    warnings.warn(msg)
> pyspark/sql/pandas/functions.py:        warnings.warn(
> pyspark/sql/pandas/group_ops.py:        warnings.warn(
> pyspark/sql/session.py:                warnings.warn("Fall back to non-hive support because failing to access HiveConf, "
> {code}
> PySpark also prints warnings via plain {{print}} in some places. We should
> see whether those should be switched to {{warnings.warn}} as well.
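
For reference, the {{FutureWarning}} plus explicit {{stacklevel}} pattern
mentioned above (the way pandas reports deprecations at the user's call site)
looks roughly like this; {{old_api}} and {{new_api}} are placeholder names,
not actual PySpark functions:

{code:python}
import warnings

def old_api():
    # stacklevel=2 attributes the warning to the caller of old_api()
    # rather than to this line, so end users see their own call site.
    warnings.warn(
        "old_api is deprecated; use new_api instead.",
        FutureWarning,
        stacklevel=2,
    )
{code}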


