This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 195da56 [SPARK-37624][PYTHON][DOCS] Suppress warnings for live
pandas-on-Spark quickstart notebooks
195da56 is described below
commit 195da5623c36ce54e85bc6584f7b49107899ffff
Author: Hyukjin Kwon <[email protected]>
AuthorDate: Mon Dec 13 17:55:45 2021 +0900
[SPARK-37624][PYTHON][DOCS] Suppress warnings for live pandas-on-Spark
quickstart notebooks
### What changes were proposed in this pull request?
This PR proposes to suppress warnings, in live pandas-on-Spark quickstart
notebooks, as below:
<img width="1109" alt="Screen Shot 2021-12-13 at 2 02 45 PM"
src="https://user-images.githubusercontent.com/6477701/145756407-03b9d94e-a082-42a1-8052-d14b4989ae86.png">
<img width="1103" alt="Screen Shot 2021-12-13 at 2 02 25 PM"
src="https://user-images.githubusercontent.com/6477701/145756419-b5dfa729-96fa-4646-bb26-40302d33632b.png">
<img width="1100" alt="Screen Shot 2021-12-13 at 2 02 20 PM"
src="https://user-images.githubusercontent.com/6477701/145756420-b8e1b105-495b-4b1c-ab3c-4d47793ba80e.png">
<img width="1027" alt="Screen Shot 2021-12-13 at 1 32 05 PM"
src="https://user-images.githubusercontent.com/6477701/145756424-bf93a4a1-2587-49fb-9abb-e2f6de032e48.png">
### Why are the changes needed?
This is a user-facing quickstart that is interactive shall, and showing a
lot of warnings makes difficult to follow the output. Note that we also set a
lower log4j level to interpreters.
### Does this PR introduce _any_ user-facing change?
Users will see clean output in live quickstart notebook:
https://mybinder.org/v2/gh/apache/spark/9e614e265f?filepath=python%2Fdocs%2Fsource%2Fgetting_started%2Fquickstart_ps.ipynb
### How was this patch tested?
I manually tested at
https://mybinder.org/v2/gh/HyukjinKwon/spark/cleanup-ps-quickstart?labpath=python%2Fdocs%2Fsource%2Fgetting_started%2Fquickstart_ps.ipynb
Closes #34875 from HyukjinKwon/cleanup-ps-quickstart.
Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
---
binder/postBuild | 9 +++++++++
python/docs/source/getting_started/quickstart_ps.ipynb | 10 +++++-----
2 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/binder/postBuild b/binder/postBuild
index 146243b..733eafe 100644
--- a/binder/postBuild
+++ b/binder/postBuild
@@ -33,3 +33,12 @@ else
fi
pip install plotly "pyspark[sql,ml,mllib,pandas_on_spark]$SPECIFIER$VERSION"
+
+# Set 'PYARROW_IGNORE_TIMEZONE' to surpress warnings from PyArrow.
+echo "export PYARROW_IGNORE_TIMEZONE=1" >> ~/.profile
+
+# Surpress warnings from Spark jobs, and UI progress bar.
+mkdir -p ~/.ipython/profile_default/startup
+echo """from pyspark.sql import SparkSession
+SparkSession.builder.config('spark.ui.showConsoleProgress',
'false').getOrCreate().sparkContext.setLogLevel('FATAL')
+""" > ~/.ipython/profile_default/startup/00-init.py
diff --git a/python/docs/source/getting_started/quickstart_ps.ipynb
b/python/docs/source/getting_started/quickstart_ps.ipynb
index 87796ae..494e08d 100644
--- a/python/docs/source/getting_started/quickstart_ps.ipynb
+++ b/python/docs/source/getting_started/quickstart_ps.ipynb
@@ -1619,7 +1619,7 @@
"metadata": {},
"outputs": [],
"source": [
- "prev = spark.conf.get(\"spark.sql.execution.arrow.enabled\") # Keep its
default value.\n",
+ "prev = spark.conf.get(\"spark.sql.execution.arrow.pyspark.enabled\") #
Keep its default value.\n",
"ps.set_option(\"compute.default_index_type\", \"distributed\") # Use
default index prevent overhead.\n",
"import warnings\n",
"warnings.filterwarnings(\"ignore\") # Ignore warnings coming from Arrow
optimizations."
@@ -1639,7 +1639,7 @@
}
],
"source": [
- "spark.conf.set(\"spark.sql.execution.arrow.enabled\", True)\n",
+ "spark.conf.set(\"spark.sql.execution.arrow.pyspark.enabled\", True)\n",
"%timeit ps.range(300000).to_pandas()"
]
},
@@ -1657,7 +1657,7 @@
}
],
"source": [
- "spark.conf.set(\"spark.sql.execution.arrow.enabled\", False)\n",
+ "spark.conf.set(\"spark.sql.execution.arrow.pyspark.enabled\", False)\n",
"%timeit ps.range(300000).to_pandas()"
]
},
@@ -1668,7 +1668,7 @@
"outputs": [],
"source": [
"ps.reset_option(\"compute.default_index_type\")\n",
- "spark.conf.set(\"spark.sql.execution.arrow.enabled\", prev) # Set its
default value back."
+ "spark.conf.set(\"spark.sql.execution.arrow.pyspark.enabled\", prev) #
Set its default value back."
]
},
{
@@ -14486,4 +14486,4 @@
},
"nbformat": 4,
"nbformat_minor": 1
-}
\ No newline at end of file
+}
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]