[ https://issues.apache.org/jira/browse/TWILL-116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623244#comment-14623244 ]
ASF GitHub Bot commented on TWILL-116:
--------------------------------------
Github user hsaputra commented on a diff in the pull request:
https://github.com/apache/incubator-twill/pull/52#discussion_r34409804
--- Diff: twill-yarn/src/test/java/org/apache/twill/yarn/EchoServerTestRun.java ---
@@ -107,6 +112,30 @@ public void run() {
controller.changeInstances("EchoServer", 2);
Assert.assertTrue(waitForSize(echoServices, 2, 120));
+ // Test restart on instances for runnable
+
+ TimeUnit.SECONDS.sleep(6L);
--- End diff --
I could change the test to wait for the right resource report instead of
sleeping. The helper would return null if the timeout is reached while the
runnable still has the same containers.
```
@Test
public void testEchoServer() {
  ...
  // Test restart on instances for runnable
  Map<Integer, String> instanceIdToContainerId = Maps.newHashMap();
  ResourceReport report = waitForResourceReport(controller, "EchoServer", 5L,
                                                TimeUnit.SECONDS, 2, null);
  Collection<TwillRunResources> runResources = report.getRunnableResources("EchoServer");
  for (TwillRunResources twillRunResources : runResources) {
    instanceIdToContainerId.put(twillRunResources.getInstanceId(),
                                twillRunResources.getContainerId());
  }
  controller.restartAllInstances("EchoServer");
  Assert.assertTrue(waitForSize(echoServices, 2, 120));
  report = waitForResourceReport(controller, "EchoServer", 5L,
                                 TimeUnit.SECONDS, 2, instanceIdToContainerId);
  Assert.assertTrue(report != null);
  ...
}

/**
 * Helper method that waits for a resource report, because
 * {@link TwillController#getResourceReport()} could return null if the
 * application has not fully started.
 *
 * To avoid sleeping for a fixed time, if instanceIdToContainerId is passed,
 * the container ids are compared to the ones recorded before. Returns null
 * if the timeout is reached first.
 */
@Nullable
private ResourceReport waitForResourceReport(TwillController controller,
                                             String runnable, long timeout,
                                             TimeUnit timeoutUnit,
                                             int numOfResources,
                                             @Nullable Map<Integer, String> instanceIdToContainerId) {
  Stopwatch stopwatch = new Stopwatch();
  stopwatch.start();
  do {
    ResourceReport report = controller.getResourceReport();
    if (report == null || report.getRunnableResources(runnable) == null) {
      // Report not available yet, wait and retry.
      Uninterruptibles.sleepUninterruptibly(100, TimeUnit.MILLISECONDS);
    } else if (report.getRunnableResources(runnable).size() != numOfResources) {
      // Not all instances are reported yet, wait and retry.
      Uninterruptibles.sleepUninterruptibly(100, TimeUnit.MILLISECONDS);
    } else {
      if (instanceIdToContainerId == null) {
        return report;
      } else {
        Collection<TwillRunResources> runResources = report.getRunnableResources(runnable);
        boolean isSameContainer = false;
        for (TwillRunResources twillRunResources : runResources) {
          int instanceId = twillRunResources.getInstanceId();
          if (twillRunResources.getContainerId().equals(instanceIdToContainerId.get(instanceId))) {
            // Found the same container id, so wait again.
            isSameContainer = true;
            break;
          }
        }
        if (!isSameContainer) {
          return report;
        } else {
          Uninterruptibles.sleepUninterruptibly(100, TimeUnit.MILLISECONDS);
        }
      }
    }
  } while (stopwatch.elapsedTime(timeoutUnit) < timeout);
  return null;
}
```
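The compare-instead-of-sleep idea in the helper above can also be shown without the Twill classes. The following is only a minimal sketch under stated assumptions: `ContainerPoll`, `allContainersChanged` and `waitForNewContainers` are hypothetical names, and a plain `Map<Integer, String>` of instance id to container id stands in for the resource report. It is not code from the pull request.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical, dependency-free illustration of polling until every
// instance is reported on a container different from the one it had before.
public class ContainerPoll {

  /** Returns true only if no instance still maps to its previous container id. */
  static boolean allContainersChanged(Map<Integer, String> before,
                                      Map<Integer, String> current) {
    for (Map.Entry<Integer, String> entry : current.entrySet()) {
      if (entry.getValue().equals(before.get(entry.getKey()))) {
        return false;
      }
    }
    return true;
  }

  /**
   * Polls the supplier until it yields a mapping of the expected size whose
   * container ids all differ from the ones in "before".
   * Returns null if the timeout is reached or the thread is interrupted.
   */
  static Map<Integer, String> waitForNewContainers(Supplier<Map<Integer, String>> supplier,
                                                   Map<Integer, String> before,
                                                   int expectedSize,
                                                   long timeout, TimeUnit unit) {
    long deadline = System.nanoTime() + unit.toNanos(timeout);
    do {
      Map<Integer, String> current = supplier.get();
      if (current != null && current.size() == expectedSize
          && allContainersChanged(before, current)) {
        return current;
      }
      try {
        TimeUnit.MILLISECONDS.sleep(10);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        return null;
      }
    } while (System.nanoTime() - deadline < 0);
    return null;
  }
}
```

As in the helper above, a null return means the timeout expired before all instances were seen on new containers, so the caller can assert on non-null instead of sleeping for a fixed time.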
> Support for restart instances of runnable in an application
> -----------------------------------------------------------
>
> Key: TWILL-116
> URL: https://issues.apache.org/jira/browse/TWILL-116
> Project: Apache Twill
> Issue Type: New Feature
> Components: core
> Reporter: Albert Shau
> Assignee: Henry Saputra
> Fix For: 0.6.0-incubating
>
> Attachments: TWILL-116-design-4.pdf, TWILL-116-design-5.pdf,
> TWILL-116-design-6.pdf, TWILL-116-design-7.pdf, TWILL-116-design-final-2.pdf
>
>
> Once an application is running, it would be good to be able to stop, start,
> and restart a specific runnable of the application without affecting other
> runnables.
> For example, I may be running multiple services in a single application, with
> each service as a different runnable. One of my services gets into an invalid
> state. I now want to restart just that runnable and not the other ones that
> are running properly.