[ 
https://issues.apache.org/jira/browse/SPARK-34344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280760#comment-17280760
 ] 

Arpan Bhandari commented on SPARK-34344:
----------------------------------------

[~hyukjin.kwon]: I have updated the description; let me know if you need more 
details on this.

> Have functionality to trace back Spark SQL queries from the application ID 
> that got submitted on YARN
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-34344
>                 URL: https://issues.apache.org/jira/browse/SPARK-34344
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Shell, Spark Submit
>    Affects Versions: 1.6.3, 2.3.0, 2.4.5
>            Reporter: Arpan Bhandari
>            Priority: Major
>
> We need the application ID from the resource manager to be mapped to the 
> specific Spark SQL query that was executed under that application ID, so that 
> the query can be traced back.
> For example, if I run a query using the Spark shell:
> spark.sql("select dt.d_year,item.i_brand_id brand_id,item.i_brand 
> brand,sum(ss_ext_sales_price) sum_agg from date_dim dt,store_sales,item where 
> dt.d_date_sk = store_sales.ss_sold_date_sk and store_sales.ss_item_sk = 
> item.i_item_sk and item.i_manufact_id = 436 and dt.d_moy=12 group by 
> dt.d_year,item.i_brand,item.i_brand_id order by dt.d_year,sum_agg 
> desc,brand_id limit 100").show();
> When I look at the event logs or the history server, I don't see the query 
> text anywhere, although the query plan is there, so it is difficult to trace 
> back which query was actually submitted.



