[jira] [Commented] (TEZ-3351) Handle ATS performance issues for Hive-LLAP.

Hitesh Shah (JIRA) Mon, 18 Jul 2016 13:54:05 -0700

    [ 
https://issues.apache.org/jira/browse/TEZ-3351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383038#comment-15383038
 ]


Hitesh Shah commented on TEZ-3351:
----------------------------------

General questions:
   - Is ATSHistoryLogLevel sufficient to control the amount of data pushed to 
ATS? What if DAGs have 5000 counters each? Would ATS handle that easily? 
Do we need a knob to stop publishing counters at certain levels? Additionally, 
why do we need so many logging levels? 

Comments on the patch: 
   - ATSHistoryLogLevel sounds ATS-specific. ATS is actually a plugin, so it 
seems wrong to add a plugin-specific API to the generic APIs of TezClient and 
DAGClient. A generic API to set the logging level makes sense, though. 
  - "private static final int DAGS_PER_GROUP = 1000;" - why 1000? What kind of 
perf impact will we see if we scale this value upwards or downwards? Can this 
be changed dynamically per app? 
  - Why do the TIMELINE_GROUPID-related vars belong in TezDAGId? 
 
{code}
    } else {
      // dagId does not exist, lets check at AM level.
      if (!amAtsHistoryLogLevel.shouldLog(eventType.getAtsHistoryLogLevel())) {
        return false;
      }
    }
{code}
  - This approach is incorrect for recovery cases. If the DAG object exists, it 
should be easy enough to retrieve the log level from it and add it back to the 
map. This does raise an issue for cache misses, though. 
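
To illustrate the fallback order suggested above (per-DAG map first, then the DAG object, and only then the AM-level setting), here is a rough sketch in plain Java. All names in it (LogLevel and its values, DagLevelCache, the lookup function standing in for the DAG object) are hypothetical and are not taken from the patch or from Tez:

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical log level type; the values are assumptions for illustration only.
enum LogLevel {
  NONE, AM, DAG, ALL;

  // An event is logged when its level is no more verbose than the configured one.
  boolean shouldLog(LogLevel eventLevel) {
    return eventLevel.ordinal() <= this.ordinal();
  }
}

// Hypothetical per-DAG level cache with a recovery-aware fallback.
final class DagLevelCache {
  private final Map<String, LogLevel> levelByDagId = new ConcurrentHashMap<>();
  private final LogLevel amLevel;
  // Stand-in for "ask the DAG object"; returns empty when the DAG is unknown.
  private final Function<String, Optional<LogLevel>> dagLookup;

  DagLevelCache(LogLevel amLevel, Function<String, Optional<LogLevel>> dagLookup) {
    this.amLevel = amLevel;
    this.dagLookup = dagLookup;
  }

  boolean shouldLog(String dagId, LogLevel eventLevel) {
    LogLevel level = levelByDagId.get(dagId);
    if (level == null) {
      // Cache miss (e.g. after recovery): consult the DAG object before the AM level.
      Optional<LogLevel> recovered = dagLookup.apply(dagId);
      if (recovered.isPresent()) {
        level = recovered.get();
        levelByDagId.put(dagId, level); // add it back to the map
      } else {
        level = amLevel; // DAG really is unknown, fall back to the AM setting
      }
    }
    return level.shouldLog(eventLevel);
  }

  boolean isCached(String dagId) {
    return levelByDagId.containsKey(dagId);
  }
}
```

With this ordering, a recovered DAG that was submitted with a stricter level than the AM keeps that level instead of silently inheriting the AM setting.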

  - testATSLogLevelNone() - not sure how this is actually testing that there is 
no data being generated by the history logger?
  - there needs to be additional testing for a level that does not match 
ALL/NONE  
  - ATSVHistoryLoggingService has not been changed?

  - "Find better way to wait for the events to be drained." - timing-based 
tests have a tendency to be flaky. It would be good to change this to something 
more deterministic. 
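
One possible way to make the drain wait deterministic is to enqueue a marker behind the pending events and block on it, rather than sleeping. The sketch below assumes a single consumer thread reading a FIFO queue; DrainableLogger and its methods are hypothetical stand-ins, not the actual history logging service:

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical asynchronous logger: one worker thread drains a FIFO queue.
final class DrainableLogger {
  private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
  private final List<String> seen = new CopyOnWriteArrayList<>();

  DrainableLogger() {
    Thread worker = new Thread(() -> {
      try {
        while (true) {
          queue.take().run(); // process events strictly in arrival order
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    worker.setDaemon(true);
    worker.start();
  }

  void handle(String event) {
    queue.add(() -> seen.add(event));
  }

  // Enqueues a marker after all pending events and waits until the worker reaches it.
  // The timeout only bounds a failing test; a passing test never sleeps for it.
  boolean awaitDrained(long timeout, TimeUnit unit) throws InterruptedException {
    CountDownLatch marker = new CountDownLatch(1);
    queue.add(marker::countDown);
    return marker.await(timeout, unit);
  }

  int handledCount() {
    return seen.size();
  }
}
```

A test would call handle(...) for each event, then awaitDrained(...), and only then assert on what was published. Because the queue is FIFO with a single consumer, the marker cannot run before the events enqueued ahead of it.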


 

 

> Handle ATS performance issues for Hive-LLAP.
> --------------------------------------------
>
>                 Key: TEZ-3351
>                 URL: https://issues.apache.org/jira/browse/TEZ-3351
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Harish Jaiprakash
>            Assignee: Harish Jaiprakash
>         Attachments: TEZ-3351.01.patch, TEZ-3351.WIP.01.patch
>
>
> With Hive-LLAP, we have subsecond queries and hence can run lots of queries 
> in a small interval. This can overload ATS with a lot of events and create a 
> large number of files in HDFS. This jira is to reduce the performance impact 
> of ATS logging by Tez.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
