[ 
https://issues.apache.org/jira/browse/TEZ-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-1698:
----------------------------------
    Attachment: TEZ-1698.1.patch

Both ResourceCalculatorPlugin and ResourceCalculatorProcessTree end up opening 
lots of file handles, and both of them live in YARN.  Attaching a simple patch 
that allows users to disable the resource calculator in TaskCounterUpdater.  

> Use ResourceCalculatorPlugin instead of ResourceCalculatorProcessTree in Tez
> ----------------------------------------------------------------------------
>
>                 Key: TEZ-1698
>                 URL: https://issues.apache.org/jira/browse/TEZ-1698
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.5.2
>            Reporter: Gopal V
>            Assignee: Rajesh Balamohan
>         Attachments: ProcfsBasedProcessTree.png, ProcfsFiles.png, 
> TEZ-1698.1.patch
>
>
> ResourceCalculatorProcessTree scrapes all of /proc/ for PIDs which are part of 
> the current task's process group.
> This is mostly wasted work in Tez. YARN has to do this because it has the PID 
> of the container-executor process (bash) and has to trace the bash -> java 
> spawn inheritance; Tez has no such need.
> !ProcfsBasedProcessTree.png!
> The latency effect of this is less clearly visible with the profiler turned 
> on, as it is primarily related to the rate of syscalls plus overhead in the 
> kernel (via the following codepath in YARN).
> !ProcfsFiles.png!
> {code}
>   private List<String> getProcessList() {
>     String[] processDirs = (new File(procfsDir)).list();
> ...
>     for (String dir : processDirs) {
>       try {
>         if ((new File(procfsDir, dir)).isDirectory()) {
>           processList.add(dir);
>         }
> ...
>   public void updateProcessTree() {
>     if (!pid.equals(deadPid)) {
>       // Get the list of processes
>       List<String> processList = getProcessList();
> ...
>       for (String proc : processList) {
>         // Get information for each process
>         ProcessInfo pInfo = new ProcessInfo(proc);
>         if (constructProcessInfo(pInfo, procfsDir) != null) {
>           allProcessInfo.put(proc, pInfo);
>           if (proc.equals(this.pid)) {
>             me = pInfo; // cache 'me'
>             processTree.put(proc, pInfo);
>           }
>         }
>       }
> {code}
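
For illustration only, here is a minimal Java sketch (not taken from the patch 
or from YARN) contrasting the full /proc scan in the excerpt above with a 
lookup that touches a single process. The names ProcScanSketch, listAllPidDirs 
and readStat are hypothetical, and the numeric-name filter is an assumption, 
since the excerpt elides that part of getProcessList().

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the YARN or Tez implementation.
class ProcScanSketch {

  // Same shape as getProcessList() in the excerpt: list the procfs directory
  // and check every entry, so the cost grows with the number of processes on
  // the host rather than with the size of the task's own process tree.
  static List<String> listAllPidDirs(Path procfsDir) throws IOException {
    List<String> pids = new ArrayList<>();
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(procfsDir)) {
      for (Path entry : entries) {
        String name = entry.getFileName().toString();
        // Assumption: only numeric directory names are treated as PIDs.
        if (name.matches("[1-9][0-9]*") && Files.isDirectory(entry)) {
          pids.add(name);
        }
      }
    }
    return pids;
  }

  // Cheaper alternative when the PID of interest is already known:
  // read exactly one stat file instead of walking every process.
  static String readStat(Path procfsDir, String pid) throws IOException {
    byte[] raw = Files.readAllBytes(procfsDir.resolve(pid).resolve("stat"));
    return new String(raw, StandardCharsets.UTF_8).trim();
  }

  public static void main(String[] args) throws IOException {
    Path procfs = Paths.get("/proc");
    if (!Files.isDirectory(procfs)) {
      System.out.println("No /proc on this platform; nothing to scan.");
      return;
    }
    // A full scan visits one directory per process on the host.
    System.out.println("PID directories visited by a full scan: "
        + listAllPidDirs(procfs).size());
    // A single-process lookup reads one file ("self" resolves to this JVM).
    System.out.println("Own stat line length: "
        + readStat(procfs, "self").length());
  }
}
```

On a busy host the first number is typically in the hundreds or thousands, 
and the excerpt's updateProcessTree() then opens a stat file for each of those 
entries on every update, which is where the syscall rate comes from.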



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
