[ 
https://issues.apache.org/jira/browse/SPARK-26363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16720259#comment-16720259
 ] 

ASF GitHub Bot commented on SPARK-26363:
----------------------------------------

gengliangwang opened a new pull request #23310: [SPARK-26363][WebUI] Remove 
redundant field `executorLogs` in TaskData
URL: https://github.com/apache/spark/pull/23310
 
 
   ## What changes were proposed in this pull request?
   
   https://github.com/apache/spark/pull/21688 added a new field `executorLogs` 
to `TaskData` in `api.scala`. This is problematic for three reasons:
   1. Judging by its name, the field does not belong in `TaskData`.
   2. It is redundant with `ExecutorSummary`.
   3. For each row in the task table, the executor log value is looked up in the 
KV store every time, which can be avoided for better performance.
   
![image](https://user-images.githubusercontent.com/1097932/49946230-841c7680-ff29-11e8-8b83-d8f7553bfe5e.png)
   
   
   This PR proposes reusing the executor details returned by the "/allexecutors" 
request, so that the API data structure is cleaner and redundant KV store 
queries are avoided.
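
As a rough illustration of the reuse described above (a sketch, not the code in this PR): the executor list is indexed once by executor id, and each task row takes its log links from that index instead of triggering its own lookup. The interfaces are simplified, hypothetical stand-ins for the REST payloads; `indexLogsByExecutor`, `attachExecutorLogs`, `TaskRow`, and the URLs are assumptions made for the example.

```typescript
// Simplified, hypothetical shapes; the real payloads carry many more fields.
type LogUrls = { [logName: string]: string };

interface ExecutorSummary {
  id: string;
  executorLogs: LogUrls;
}

// TaskData no longer carries executorLogs; it only names its executor.
interface TaskData {
  taskId: number;
  executorId: string;
}

// Row shape used by the task table after the join (illustrative name).
interface TaskRow extends TaskData {
  executorLogs: LogUrls;
}

// Index the "/allexecutors" result once, keyed by executor id.
function indexLogsByExecutor(
  executors: ExecutorSummary[]
): { [executorId: string]: LogUrls } {
  const byId: { [executorId: string]: LogUrls } = {};
  executors.forEach((e) => {
    byId[e.id] = e.executorLogs;
  });
  return byId;
}

// Attach log links to every task row with one in-memory lookup per row.
function attachExecutorLogs(
  tasks: TaskData[],
  executors: ExecutorSummary[]
): TaskRow[] {
  const logsById = indexLogsByExecutor(executors);
  return tasks.map((t) => ({
    taskId: t.taskId,
    executorId: t.executorId,
    // Unknown executors fall back to an empty set of links.
    executorLogs: logsById[t.executorId] || {},
  }));
}

// Placeholder data standing in for the two REST responses.
const sampleExecutors: ExecutorSummary[] = [
  { id: "1", executorLogs: { stdout: "http://worker-1.example.invalid/stdout" } },
  { id: "2", executorLogs: { stdout: "http://worker-2.example.invalid/stdout" } },
];
const sampleTasks: TaskData[] = [
  { taskId: 0, executorId: "1" },
  { taskId: 1, executorId: "1" },
  { taskId: 2, executorId: "2" },
  { taskId: 3, executorId: "9" },
];
const taskRows = attachExecutorLogs(sampleTasks, sampleExecutors);
```

Tasks 0 and 1 share executor "1", so both rows are served from the same index entry, and the number of executor lookups no longer grows with the number of task rows.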
   (Before https://github.com/apache/spark/pull/21688, the stage page used a hash 
map to avoid duplicate executor log lookups, but I think reusing the result of 
"/allexecutors" is better.)
   ## How was this patch tested?
   
   Manual check



> Remove redundant field `executorLogs` in TaskData
> -------------------------------------------------
>
>                 Key: SPARK-26363
>                 URL: https://issues.apache.org/jira/browse/SPARK-26363
>             Project: Spark
>          Issue Type: Improvement
>          Components: Web UI
>    Affects Versions: 3.0.0
>            Reporter: Gengliang Wang
>            Priority: Major
>
> https://github.com/apache/spark/pull/21688 added a new field `executorLogs` 
> to `TaskData` in `api.scala`. This is problematic for three reasons:
> 1. Judging by its name, the field does not belong in `TaskData`.
> 2. It is redundant with `ExecutorSummary`.
> 3. For each row in the task table, the executor log value is looked up in the 
> KV store every time, which can be avoided for better performance at large scale.
> This PR proposes reusing the executor details returned by the "/allexecutors" 
> request, so that the API data structure is cleaner and redundant KV store 
> queries are avoided.


