Re: [PR] [SPARK-46686][PYTHON][CONNECT] Basic support of SparkSession based Python UDF profiler [spark]

via GitHub Wed, 17 Jan 2024 11:08:59 -0800


xinrong-meng commented on code in PR #44697:
URL: https://github.com/apache/spark/pull/44697#discussion_r1456338510



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -2927,6 +2927,17 @@ object SQLConf {
       // show full stacktrace in tests but hide in production by default.
       .createWithDefault(Utils.isTesting)
 
+  val PYTHON_UDF_PROFILER =
+    buildConf("spark.sql.pyspark.udf.profiler")
+      .doc("Configure the Python/Pandas UDF profiler by enabling or disabling it " +
+        "with the option to choose between \"perf\" and \"memory\" types, " +
+        "or unsetting the config disables the profiler. This is disabled by default.")
+      .version("4.0.0")
+      .stringConf
+      .transform(_.toLowerCase(Locale.ROOT))
+      .checkValues(Set("perf", "memory"))

Review Comment:
   I noticed there are multiple user-facing references to the current "perf" profiler: [Python Profilers for UDFs](https://spark.apache.org/docs/latest/api/python/development/debugging.html#id7) and [Workers profiling](https://www.databricks.com/blog/how-profile-pyspark). It would be great if we could make them consistent.
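
   For readers following along, here is a rough Python sketch of what the `.transform(_.toLowerCase(Locale.ROOT))` and `.checkValues(Set("perf", "memory"))` steps in the hunk above amount to. The names `ALLOWED_PROFILERS` and `normalize_profiler_conf` are hypothetical and are not part of Spark. Treating an unset value as "profiler disabled" is an assumption taken from the doc string, since the hunk shown here ends before the config's default is declared.

```python
from typing import Optional

# Hypothetical stand-in for the value set passed to checkValues in the diff.
ALLOWED_PROFILERS = frozenset({"perf", "memory"})


def normalize_profiler_conf(value: Optional[str]) -> Optional[str]:
    """Mimic the lowercase-then-validate behavior sketched in the diff.

    Returns None when the config is unset (assumed to mean "disabled").
    """
    if value is None:
        return None
    # str.lower() is locale-independent, similar in spirit to Locale.ROOT.
    normalized = value.lower()
    if normalized not in ALLOWED_PROFILERS:
        raise ValueError(
            f"Unsupported profiler type {value!r}; "
            f"expected one of {sorted(ALLOWED_PROFILERS)}"
        )
    return normalized
```

   Under these assumptions, `"PERF"` and `"perf"` select the same profiler, while any other string is rejected when it is set rather than when a UDF runs.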



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

