Luca Canali created SPARK-34265:
-----------------------------------
Summary: Add SQLMetrics instrumentation to Python UDF
Key: SPARK-34265
URL: https://issues.apache.org/jira/browse/SPARK-34265
Project: Spark
Issue Type: Improvement
Components: PySpark, SQL
Affects Versions: 3.1.1
Reporter: Luca Canali
This proposes adding SQLMetrics instrumentation to Python UDF execution. The
aim is to improve monitoring and performance troubleshooting of Python code
called by Spark via UDF, Pandas UDF, or MapPartitions.
The new metrics are exposed to end users in the Web UI, in the SQL tab, for
the execution steps related to Python UDF execution, namely BatchEvalPython,
ArrowEvalPython, AggregateInPandas, FlatMapGroupsInPandas,
FlatMapCoGroupsInPandas, and WindowInPandas.
See also the attached screenshot.