markusthoemmes commented on a change in pull request #3132: Separate container
removal from job completion.
URL:
https://github.com/apache/incubator-openwhisk/pull/3132#discussion_r159183508
##########
File path:
core/invoker/src/main/scala/whisk/core/containerpool/ContainerProxy.scala
##########
@@ -248,16 +250,20 @@ class ContainerProxy(
// Sending the message to self on a failure will cause the message
// to ultimately be sent back to the parent (which will retry it)
// when container removal is done.
- case Failure(_) => self ! job
+ case Failure(_) =>
+ rescheduleJob = true
+ self ! job
}
.flatMap(_ => initializeAndRun(data.container, job))
.map(_ => WarmedData(data.container, job.msg.user.namespace,
job.action, Instant.now))
.pipeTo(self)
goto(Running)
- // timeout or removing
- case Event(StateTimeout | Remove, data: WarmedData) =>
destroyContainer(data.container)
+ // container is reclaimed by the pool or it has become too old
+ case Event(StateTimeout | Remove, data: WarmedData) =>
+ rescheduleJob = true // to suppress sending message to the pool and not double count
+ destroyContainer(data.container)
Review comment:
I don't think the job is guaranteed to actually be rescheduled in this case.
Since the flag suppresses the message to the pool, that would lead to a nasty
lock situation if too many containers are reclaimed due to timeouts.
In fact, I think this specific situation needs to be resolved differently,
since at the point of this transition you can't yet know whether or not you'll
resend a job.
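To make the concern concrete, here is a rough, self-contained sketch in plain Scala (no Akka). It is not the ContainerProxy code: `Pool`, `inUse`, `containerRemoved`, `removeWithEagerFlag` and `removeWithLateDecision` are hypothetical names, and the pool bookkeeping is an assumed simplification. It only contrasts setting the flag at the transition with deciding once it is known whether a job actually arrived during removal.

```scala
object RescheduleSketch {
  // Hypothetical stand-in for the pool's bookkeeping (not the real ContainerPool).
  final class Pool(val capacity: Int) {
    var inUse: Int = 0                    // slots the pool believes are occupied
    var rescheduled: List[String] = Nil   // jobs handed back for a retry

    def containerRemoved(): Unit = inUse -= 1
    def rescheduleJob(job: String): Unit = { inUse -= 1; rescheduled ::= job }
    def hasFreeSlot: Boolean = inUse < capacity
  }

  // Variant A: the flag is set at the transition, before anyone knows
  // whether a job will show up while the container is being removed.
  def removeWithEagerFlag(pool: Pool, jobDuringRemoval: Option[String]): Unit = {
    val rescheduleJob = true
    jobDuringRemoval.foreach(pool.rescheduleJob)
    // With the flag already set, the pool is never told about the removal.
    if (!rescheduleJob) pool.containerRemoved()
  }

  // Variant B: decide only once removal is done, based on what actually happened.
  def removeWithLateDecision(pool: Pool, jobDuringRemoval: Option[String]): Unit =
    jobDuringRemoval match {
      case Some(job) => pool.rescheduleJob(job)   // the resend carries the accounting
      case None      => pool.containerRemoved()   // nothing to resend: free the slot
    }

  def main(args: Array[String]): Unit = {
    val eager = new Pool(capacity = 2)
    eager.inUse = 2
    removeWithEagerFlag(eager, None)   // timeout, no job arrived
    removeWithEagerFlag(eager, None)
    println(s"eager flag: inUse=${eager.inUse}, hasFreeSlot=${eager.hasFreeSlot}")

    val late = new Pool(capacity = 2)
    late.inUse = 2
    removeWithLateDecision(late, None)
    removeWithLateDecision(late, None)
    println(s"late decision: inUse=${late.inUse}, hasFreeSlot=${late.hasFreeSlot}")
  }
}
```

In this toy model, two timeouts without a resent job leave the eager variant's pool believing both slots are still taken, which is the kind of lock I'm worried about. Whether the real pool accounting behaves this way is an assumption on my side.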