[ 
https://issues.apache.org/jira/browse/TEZ-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106151#comment-14106151
 ] 

Siddharth Seth commented on TEZ-1363:
-------------------------------------

In TezLocalAMRMClientAsync
- containerRequests - a LinkedMap / map would likely work better (if a local 
job ever has lots of tasks)
- LocalContainerAllocator should be a daemon. Also the unnecessary fields from 
the constructor could be dropped.
- RegisterApplicationMasterResponse should contain min / max capability since 
the TaskScheduler may have some error checking w.r.t this. Not too sure about 
ACLs being null.
- getTopPriority - would be more efficient if Priority is maintained 
independently instead of scanning the entire list.
- getMatchingRequests - Does capability / resourceName need to be considered ?
- getMatchingRequestsForTopPriority - should be using the top priority to 
return a match.
A lot of this logic already exists in TezAMRMClientAsync, in conjunction with 
AMRMClientAsync. Is there a way to re-use the logic ? (I doubt AMRMClientAsycn 
allows overriding methods which make the final request out to YARN after 
updating it's local structures). 

In LocalContainerAllocator
{code}
+  void handleRemoveContainerRequest(RemoveContainerRequest request) {
+    LOG.info("Processing remove container request: " + request.req);
+    // Otherwise the allocation will happen.
+    if (!containerRequestQueue.remove(request.req)) {
+      LOG.warn("Unable to find and remove container request: " + request.req);
+    }
+  }
{code}
Don't think this will work. request.req is a CookieContainerRequest - the 
corresponding AddContainerRequest ("ContainerRequest") in the queue will have 
to be removed. A map based removal would be better - instead of having to scan 
through the queue. Alternately, could just retain a list of cancelled ones and 
match against the launch list.

{code}
+      NodeId nodeId = NodeId.newInstance("127.0.0.1", 0);
+      String nodeHttpAddress = "127.0.0.1:0";
{code}
Should be picked up from the environment, to stay consistent.

LocalContainerLauncher would have to inform the LocalContainerAllocator about 
NM_STOP_REQUESTS which have been sent out - otherwise I don't think 
LocalContainerAllocator would find out about completed containers. Similarly, 
will have to look at what happens in case of failures. I'm not sure 
S_CONTAINER_DEALLOCATE message from AMContainer actually makes it over to 
AMRMClient (via the TaskScheduler).

Merging the LocalContainerLauncher / LocalContainerAllocator may help with this 
- or at least passing messages between them.



> Make use of the regular scheduler when running in LocalMode
> -----------------------------------------------------------
>
>                 Key: TEZ-1363
>                 URL: https://issues.apache.org/jira/browse/TEZ-1363
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Siddharth Seth
>            Assignee: Jonathan Eagles
>         Attachments: TEZ-1363-v1.patch, TEZ-1363-v2.patch, TEZ-1363-v3.patch
>
>
> In TEZ-708, we decided to introduce a new scheduler for local mode - to keep 
> things simple initially, and get local mode working.
> Eventually, however, scheduling should go through the regular task scheduler 
> - which should be able to get containers from YARN / LocalAllocator / other 
> sources - and treat them as a regular container for scheduling purposes.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to