[ 
https://issues.apache.org/jira/browse/TEZ-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated TEZ-3000:
-------------------------
    Attachment: TEZ-3000.patch

Linked another similar jira. Here is the draft patch. There are a couple of issues.

* {{TestTaskSchedulerHelpers#CapturingEventHandler}} uses LinkedList, which is 
not thread safe.
* The interaction between {{YarnTaskSchedulerService}} and 
{{TaskSchedulerManager}} is async via {{TaskSchedulerContextImplWrapper}}. The 
test code calls {{drainableAppCallback.drain();}} before verifying 
{{taskAllocated}} in most places, but the call is missing in some places.
* {{TestTaskSchedulerHelpers}}'s serviceStart swaps the TaskScheduler for a 
spy object. As a result, {{NODE_LOCAL_ASSIGNER}}, which is created when the 
{{YarnTaskSchedulerService}} object is constructed, refers to the actual 
{{YarnTaskSchedulerService}}, while the test code refers to the spy object. 
That causes a synchronization mismatch, as it appears the spy object uses a 
different synchronized lock from the actual object.
* It seems cleaner to move "setting 
{{YarnTaskSchedulerService#drainedDelayedContainersForTest}} to false" from 
test code to the time when the delayed container is added to the queue.
* Some cleanup in the test code.
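
To illustrate the first two points, here is a minimal, self-contained sketch of the "drain before verify" pattern with a thread-safe capture list. It is not the patch itself: {{AsyncCallbacks}}, {{DrainDemo}} and their methods are hypothetical names, and a single-threaded executor stands in for the async wrapper.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for an async callback wrapper; not the Tez class.
class AsyncCallbacks {
  private final ExecutorService executor = Executors.newSingleThreadExecutor();
  // Thread-safe list: written by the callback thread, read by the test thread.
  private final List<String> allocated = new CopyOnWriteArrayList<>();

  // Queues the callback and returns immediately, as an async wrapper would.
  void taskAllocated(String task) {
    executor.execute(() -> allocated.add(task));
  }

  // Blocks until every callback queued before this call has run.
  // Relies on the single-threaded executor running tasks in FIFO order.
  void drain() {
    try {
      executor.submit(() -> { }).get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    }
  }

  List<String> allocatedTasks() {
    return allocated;
  }

  void shutdown() {
    executor.shutdown();
  }
}

class DrainDemo {
  // Publishes `count` callbacks, drains, and returns how many are visible.
  static int allocateThenDrain(int count) {
    AsyncCallbacks callbacks = new AsyncCallbacks();
    try {
      for (int i = 0; i < count; i++) {
        callbacks.taskAllocated("task-" + i);
      }
      callbacks.drain(); // without this, the size check below can race
      return callbacks.allocatedTasks().size();
    } finally {
      callbacks.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println(allocateThenDrain(100));
  }
}
```

If the {{drain()}} call is removed, the size check can run before the callback thread has delivered everything, which is the kind of intermittent "Wanted but not invoked" failure reported in this jira.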

These fixes have been tested by running the tests for many hours.
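
The spy locking mismatch mentioned in the list above can be reproduced without Mockito. The sketch below is only an illustration under one assumption: that the spy is a separate object whose fields are copied from the original, which is how the Mockito documentation describes {{spy()}}. {{Scheduler}}, {{helperTarget}} and {{copyLikeSpy}} are hypothetical names.

```java
// Hypothetical scheduler; it stands in for any class whose helper captures `this`.
class Scheduler {
  // Set at construction time, like a helper that keeps a reference to its owner.
  Object helperTarget = this;

  // Runs while holding this instance's monitor and reports whether that is
  // also the monitor the helper would synchronize on.
  synchronized boolean holdsHelperLock() {
    return Thread.holdsLock(helperTarget);
  }
}

class SpyLockDemo {
  // Imitates a spy made by copying: a new instance with the original's field values.
  static Scheduler copyLikeSpy(Scheduler real) {
    Scheduler copy = new Scheduler();
    copy.helperTarget = real.helperTarget; // still points at the real object
    return copy;
  }

  public static void main(String[] args) {
    Scheduler real = new Scheduler();
    Scheduler spyLike = copyLikeSpy(real);
    System.out.println(real.holdsHelperLock());    // true
    System.out.println(spyLike.holdsHelperLock()); // false
  }
}
```

A synchronized call made through the copy holds the copy's monitor, so it does not exclude code that synchronizes on the original object.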

> TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption fails
> -----------------------------------------------------------------
>
>                 Key: TEZ-3000
>                 URL: https://issues.apache.org/jira/browse/TEZ-3000
>             Project: Apache Tez
>          Issue Type: Sub-task
>    Affects Versions: 0.8.1-alpha
>            Reporter: Jeff Zhang
>         Attachments: TEZ-3000.patch
>
>
> https://builds.apache.org/job/PreCommit-TEZ-Build/1381//testReport/org.apache.tez.dag.app.rm/TestContainerReuse/testReuseWithTaskSpecificLaunchCmdOption/
> {code}
> Error Message
> Wanted but not invoked:
> taskSchedulerManagerForTest.taskAllocated(
>     0,
>     Mock for TaskAttempt, hashCode: 829607670,
>     <any>,
>     Container: [ContainerId: container_1_0001_01_000002, NodeId: host3:0, 
> NodeHttpAddress: host3:0, Resource: <memory:1024, vCores:1>, Priority: 1, 
> Token: null, ]
> );
> -> at 
> org.apache.tez.dag.app.rm.TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption(TestContainerReuse.java:686)
> However, there were other interactions with this mock:
> taskSchedulerManagerForTest.init(
>     Configuration: core-default.xml, core-site.xml, yarn-default.xml, 
> yarn-site.xml
> );
> -> at 
> org.apache.tez.dag.app.rm.TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption(TestContainerReuse.java:537)
> taskSchedulerManagerForTest.setConfig(
>     Configuration: core-default.xml, core-site.xml, yarn-default.xml, 
> yarn-site.xml
> );
> -> at 
> org.apache.tez.dag.app.rm.TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption(TestContainerReuse.java:537)
> taskSchedulerManagerForTest.serviceInit(
>     Configuration: core-default.xml, core-site.xml, yarn-default.xml, 
> yarn-site.xml
> );
> -> at 
> org.apache.tez.dag.app.rm.TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption(TestContainerReuse.java:537)
> taskSchedulerManagerForTest.start();
> -> at 
> org.apache.tez.dag.app.rm.TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption(TestContainerReuse.java:538)
> taskSchedulerManagerForTest.serviceStart();
> -> at 
> org.apache.tez.dag.app.rm.TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption(TestContainerReuse.java:538)
> taskSchedulerManagerForTest.instantiateSchedulers(
>     "host",
>     0,
>     "",
>     Mock for AppContext, hashCode: 321692161
> );
> -> at 
> org.apache.tez.dag.app.rm.TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption(TestContainerReuse.java:538)
> taskSchedulerManagerForTest.getContainerSignatureMatcher();
> -> at 
> org.apache.tez.dag.app.rm.TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption(TestContainerReuse.java:538)
> taskSchedulerManagerForTest.getConfig();
> -> at 
> org.apache.tez.dag.app.rm.TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption(TestContainerReuse.java:538)
> taskSchedulerManagerForTest.setApplicationRegistrationData(
>     0,
>     Mock for Resource, hashCode: 1463810428,
>     Mock for Map, hashCode: 689203364,
>     null
> );
> -> at 
> org.apache.tez.dag.app.rm.TaskSchedulerContextImpl.setApplicationRegistrationData(TaskSchedulerContextImpl.java:92)
> taskSchedulerManagerForTest.getSpyTaskScheduler();
> -> at 
> org.apache.tez.dag.app.rm.TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption(TestContainerReuse.java:540)
> taskSchedulerManagerForTest.handleEvent(
>     EventType: S_TA_LAUNCH_REQUEST
> );
> -> at 
> org.apache.tez.dag.app.rm.TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption(TestContainerReuse.java:577)
> taskSchedulerManagerForTest.handleEvent(
>     EventType: S_TA_LAUNCH_REQUEST
> );
> -> at 
> org.apache.tez.dag.app.rm.TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption(TestContainerReuse.java:580)
> taskSchedulerManagerForTest.taskAllocated(
>     0,
>     Mock for TaskAttempt, hashCode: 365305781,
>     EventType: S_TA_LAUNCH_REQUEST,
>     Container: [ContainerId: container_1_0001_01_000001, NodeId: host1:0, 
> NodeHttpAddress: host1:0, Resource: <memory:1024, vCores:1>, Priority: 1, 
> Token: null, ]
> );
> -> at 
> org.apache.tez.dag.app.rm.TaskSchedulerContextImpl.taskAllocated(TaskSchedulerContextImpl.java:65)
> taskSchedulerManagerForTest.handleEvent(
>     EventType: S_TA_ENDED
> );
> -> at 
> org.apache.tez.dag.app.rm.TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption(TestContainerReuse.java:594)
> taskSchedulerManagerForTest.containerBeingReleased(
>     0,
>     container_1_0001_01_000001
> );
> -> at 
> org.apache.tez.dag.app.rm.TaskSchedulerContextImpl.containerBeingReleased(TaskSchedulerContextImpl.java:75)
> taskSchedulerManagerForTest.handleEvent(
>     EventType: S_TA_LAUNCH_REQUEST
> );
> -> at 
> org.apache.tez.dag.app.rm.TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption(TestContainerReuse.java:627)
> taskSchedulerManagerForTest.handleEvent(
>     EventType: S_TA_LAUNCH_REQUEST
> );
> -> at 
> org.apache.tez.dag.app.rm.TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption(TestContainerReuse.java:630)
> taskSchedulerManagerForTest.taskAllocated(
>     0,
>     Mock for TaskAttempt, hashCode: 1149896995,
>     EventType: S_TA_LAUNCH_REQUEST,
>     Container: [ContainerId: container_1_0001_01_000002, NodeId: host2:0, 
> NodeHttpAddress: host2:0, Resource: <memory:1024, vCores:1>, Priority: 1, 
> Token: null, ]
> );
> -> at 
> org.apache.tez.dag.app.rm.TaskSchedulerContextImpl.taskAllocated(TaskSchedulerContextImpl.java:65)
> taskSchedulerManagerForTest.handleEvent(
>     EventType: S_TA_ENDED
> );
> -> at 
> org.apache.tez.dag.app.rm.TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption(TestContainerReuse.java:642)
> taskSchedulerManagerForTest.containerBeingReleased(
>     0,
>     container_1_0001_01_000002
> );
> -> at 
> org.apache.tez.dag.app.rm.TaskSchedulerContextImpl.containerBeingReleased(TaskSchedulerContextImpl.java:75)
> taskSchedulerManagerForTest.handleEvent(
>     EventType: S_TA_LAUNCH_REQUEST
> );
> -> at 
> org.apache.tez.dag.app.rm.TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption(TestContainerReuse.java:679)
> taskSchedulerManagerForTest.handleEvent(
>     EventType: S_TA_LAUNCH_REQUEST
> );
> -> at 
> org.apache.tez.dag.app.rm.TestContainerReuse.testReuseWithTaskSpecificLaunchCmdOption(TestContainerReuse.java:680)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
