Luca Canali created SPARK-34265:
-----------------------------------

             Summary: Add SQLMetrics instrumentation to Python UDF
                 Key: SPARK-34265
                 URL: https://issues.apache.org/jira/browse/SPARK-34265
             Project: Spark
          Issue Type: Improvement
          Components: PySpark, SQL
    Affects Versions: 3.1.1
            Reporter: Luca Canali


This proposes to add SQLMetrics instrumentation for Python UDF. This is aimed 
at improving monitoring and performance troubleshooting of Python code called 
by Spark, via UDF, Pandas UDF, or mapPartitions.
The introduced metrics are exposed to end users via the Web UI, in 
the SQL tab, for the execution steps related to Python UDF execution, namely 
BatchEvalPython, ArrowEvalPython, AggregateInPandas, FlatMapGroupsInPandas, 
FlatMapCoGroupsInPandas, and WindowInPandas.
See also the attached screenshot.
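
To check whether a query contains any of the execution steps listed above, and so would show the proposed metrics in the SQL tab, the plan text can be scanned for those operator names. The following is only a minimal sketch: the helper name find_python_udf_operators is hypothetical, the short descriptions of which PySpark feature usually produces each operator are my own annotations rather than part of this proposal, and the sample plan text is hand-written for illustration.

```python
import re

# Operators named in this issue. The descriptions are illustrative
# annotations (an assumption of this sketch), not part of the proposal.
PYTHON_UDF_OPERATORS = {
    "BatchEvalPython": "row-at-a-time Python UDF",
    "ArrowEvalPython": "scalar Pandas UDF",
    "AggregateInPandas": "grouped aggregate Pandas UDF",
    "FlatMapGroupsInPandas": "groupBy(...).applyInPandas",
    "FlatMapCoGroupsInPandas": "cogroup(...).applyInPandas",
    "WindowInPandas": "Pandas UDF evaluated over a window",
}


def find_python_udf_operators(plan_text):
    """Return the Python UDF operators found in plan_text.

    Names are returned once each, in order of first appearance.
    Hypothetical helper: it only matches whole words in the text.
    """
    found = []
    for token in re.findall(r"\w+", plan_text):
        if token in PYTHON_UDF_OPERATORS and token not in found:
            found.append(token)
    return found


# Hand-written plan text in the style printed by DataFrame.explain().
sample_plan = """
== Physical Plan ==
*(2) Project [pythonUDF0#5 AS plus_one(id)#6]
+- ArrowEvalPython [plus_one(id#0L)], [pythonUDF0#5], 200
   +- BatchEvalPython [f(id#0L)], [pythonUDF0#3]
      +- *(1) Range (0, 10, step=1, splits=8)
"""

for name in find_python_udf_operators(sample_plan):
    print(name, "<-", PYTHON_UDF_OPERATORS[name])
```

Matching on whole words keeps FlatMapGroupsInPandas and FlatMapCoGroupsInPandas distinct and ignores column names such as pythonUDF0.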



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
