This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 04836babb7a [SPARK-41989][PYTHON] Avoid breaking logging config from pyspark.pandas
04836babb7a is described below

commit 04836babb7a1a2aafa7c65393c53c42937ef75a4
Author: Stefaan Lippens <[email protected]>
AuthorDate: Thu Jan 12 18:24:30 2023 +0900

    [SPARK-41989][PYTHON] Avoid breaking logging config from pyspark.pandas
    
    ### What changes were proposed in this pull request?
    
    See https://issues.apache.org/jira/browse/SPARK-41989 for an in-depth explanation.
    
    Short summary: `pyspark/pandas/__init__.py` calls `logging.warning()` at import time, which can silently call `logging.basicConfig()`.
    As a result, importing `pyspark.pandas` (directly or indirectly) can unknowingly break a user's own logging setup (e.g. one based on `logging.basicConfig()` or related functions). `logging.getLogger(...).warning()` does not trigger this behavior.
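
The behavior described above can be reproduced with the standard library alone. The following is a minimal sketch, not part of the patch: the function name `demo` and the logger name `some.library` are made up for illustration. It assumes CPython's documented behavior that the module-level `logging.warning()` calls `logging.basicConfig()` when the root logger has no handlers, and that a later `logging.basicConfig(...)` call does nothing once the root logger has a handler.

```python
import logging


def demo():
    """Return (root handlers after a named-logger warning,
    root handlers after logging.warning(), root level after basicConfig)."""
    root = logging.getLogger()
    # Start from a clean root logger so the result is deterministic.
    for handler in list(root.handlers):
        root.removeHandler(handler)
    root.setLevel(logging.WARNING)

    # A named logger emits through the last-resort handler and leaves
    # the root logger's handler list untouched.
    logging.getLogger("some.library").warning("warning via named logger")
    after_named = len(root.handlers)

    # The module-level function implicitly calls basicConfig(), which
    # installs a StreamHandler on the root logger.
    logging.warning("warning via logging.warning()")
    after_module = len(root.handlers)

    # The user's own setup arrives too late: basicConfig() is a no-op
    # once the root logger already has a handler (unless force=True).
    logging.basicConfig(level=logging.DEBUG)
    return after_named, after_module, root.level


if __name__ == "__main__":
    print(demo())
```

In this sketch the user's `basicConfig(level=logging.DEBUG)` call is ignored because the earlier `logging.warning()` already configured the root logger, which is the kind of breakage this change is meant to avoid.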
    
    ### Does this PR introduce _any_ user-facing change?
    
    User-defined logging setups will be more predictable.
    
    ### How was this patch tested?
    
    Manual testing so far.
    I'm not sure it's worthwhile to cover this with a unit test.
    
    Closes #39516 from soxofaan/SPARK-41989-pyspark-pandas-logging-setup.
    
    Authored-by: Stefaan Lippens <[email protected]>
    Signed-off-by: Hyukjin Kwon <[email protected]>
---
 python/pyspark/pandas/__init__.py | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/python/pyspark/pandas/__init__.py b/python/pyspark/pandas/__init__.py
index 5864025ad2a..980aeab2bee 100644
--- a/python/pyspark/pandas/__init__.py
+++ b/python/pyspark/pandas/__init__.py
@@ -47,9 +47,7 @@ if (
     LooseVersion(pyarrow.__version__) >= LooseVersion("2.0.0")
     and "PYARROW_IGNORE_TIMEZONE" not in os.environ
 ):
-    import logging
-
-    logging.warning(
+    warnings.warn(
         "'PYARROW_IGNORE_TIMEZONE' environment variable was not set. It is required to "
         "set this environment variable to '1' in both driver and executor sides if you use "
         "pyarrow>=2.0.0. "


