travigd opened a new pull request #4978:
URL: https://github.com/apache/openwhisk/pull/4978


   <!--- Provide a concise summary of your changes in the Title -->
   
   ## Description
   <!--- Provide a detailed description of your changes. -->
   <!--- Include details of what problem you are solving and how your changes 
are tested. -->
   
   The motivation for this change is that some activations trigger the 
creation of new containers, which can be a very long operation (e.g., if 
creating the container triggers a scale-up in cluster size), and OpenWhisk 
blocks (with respect to that activation) until the container is created. This 
is not ideal, since other containers often become free and ready to service 
the request in the meantime.
   
   This change:
   * Immediately moves cold start containers into the busy queue and waits for 
them to send `NeedWork` to the `ContainerPool` (this means that the activation 
will run on the next available container that can service the request)
   * Adds a `PreRun` event that is sent to the container to tell it how to 
initialize
   
   ### Current Issues
   This doesn't currently pass all the tests, but I suspect I'm at the limit of 
what I can do here (as I don't have lots of other context around the codebase 
and the project overall).
   
   * There is now an awkward parallel-but-not-quite pathway between prewarm 
and cold-start containers (both now have an initialization step, but prewarm 
containers also have the stem-cell-differentiation step). It would be nice to 
simplify that if possible.
   
   
   > Quick note on my terminology, since I think I understand how these words 
are used, but want to make sure:
   > * action = "specification for how to handle a request" (e.g., using a 
blackbox image or the nodejs runtime)
   > * activation = "specific request that is handled by an action"
   
   The run buffer only considers the current head of the run queue (for 
activations that couldn't be immediately sent to containers), which is not 
ideal. Imagine this scenario:
   * Run A (using image 1) triggers a cold start, A is enqueued on the run 
buffer
   * Run B (using image 2) triggers a cold start, B is enqueued on the run 
buffer (position 1)
   * Container 2 finishes initialization (maybe it didn't require pulling an 
image while image 1 did have to be pulled)
   * The ContainerPool gets `NeedWork` from container 2, moves container 2 into 
the free pool, triggers `processBufferOrFeed()`
   * `processBufferOrFeed` re-sends the first run on the buffer, which is run A
   * There are no containers available to service run A (since container 1 is 
still initializing)
   * Run A is re-enqueued and nothing else happens (even though run B could 
have been serviced)
   * Container 1 finishes initializing, sends `NeedWork`
   * Run A is de-queued again, and now handled
   * Run B is never actually handled???
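
The scenario above can be reproduced with a toy Python model. This is only a sketch, not OpenWhisk code: `HeadOnlyPool`, `enqueue`, `need_work`, and `process_buffer` are hypothetical names, and the model assumes that each `NeedWork` causes exactly one attempt to dispatch the head of the run buffer.

```python
from collections import deque


class HeadOnlyPool:
    """Toy model (not OpenWhisk code): only the head of the run buffer is retried."""

    def __init__(self):
        self.run_buffer = deque()  # pending runs as (run_name, image)
        self.free = set()          # images with an idle, initialized container
        self.handled = []          # runs that were dispatched, in order

    def enqueue(self, run, image):
        self.run_buffer.append((run, image))

    def need_work(self, image):
        """A container for `image` reports that it is ready for work."""
        self.free.add(image)
        self.process_buffer()

    def process_buffer(self):
        # Only the head is considered; if it cannot be served, nothing else is tried.
        if not self.run_buffer:
            return
        run, image = self.run_buffer[0]
        if image in self.free:
            self.run_buffer.popleft()
            self.free.remove(image)
            self.handled.append(run)


pool = HeadOnlyPool()
pool.enqueue("A", "image1")
pool.enqueue("B", "image2")
pool.need_work("image2")      # container 2 is ready first
stalled = list(pool.handled)  # still empty: run A blocks the head of the buffer
pool.need_work("image1")      # container 1 is ready
print(stalled, pool.handled, list(pool.run_buffer))
```

In this model, run B stays in the buffer even though a container for image 2 is idle, and it is only dispatched if something else triggers another pass over the buffer. Whether the real `ContainerPool` behaves the same way is exactly the open question in the last bullet.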
   
   One solution might be to add another layer between `ContainerPool` and 
`ContainerProxy`, something like an `ActionPool`. The `ActionPool` would handle 
creating new containers and serving requests in the order in which they're 
received. That gets messy when you need to restrict the total size of the 
`ContainerPool`, and also when there is resource contention: what happens when 
two different actions want to scale up the number of containers but the 
container pool is at its max size? Right now, it would attempt to reap some old 
containers to service the next activation, so it's fair with respect to 
activation order.
   
   A (potentially simpler) solution might just be to have per-action buffers, 
so when you get a `NeedWork` corresponding to action 1, dequeue from that 
specific buffer.
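
A minimal Python sketch of the per-action buffer idea follows. It is a toy model under stated assumptions, not a proposal for the actual Scala implementation: `PerActionBuffers` and its methods are hypothetical names, and the sketch ignores the pool size limit and cross-action fairness.

```python
from collections import defaultdict, deque


class PerActionBuffers:
    """Toy model (not OpenWhisk code): one FIFO buffer per action."""

    def __init__(self):
        self.buffers = defaultdict(deque)  # action -> pending runs
        self.handled = []                  # runs that were dispatched, in order

    def enqueue(self, run, action):
        self.buffers[action].append(run)

    def need_work(self, action):
        """A container for `action` is ready: dequeue from that action's buffer only."""
        pending = self.buffers.get(action)
        if pending:
            run = pending.popleft()
            self.handled.append(run)
            return run
        return None  # nothing is waiting for this action; the container stays idle


buffers = PerActionBuffers()
buffers.enqueue("A", "action1")
buffers.enqueue("B", "action2")
first = buffers.need_work("action2")   # container 2 finishes initializing first
second = buffers.need_work("action1")  # container 1 finishes later
print(first, second)
```

Here run B is served as soon as its container is ready, regardless of run A still waiting. The trade-off is that activations are no longer served in global arrival order, only in arrival order within each action.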
   
   ## Related issue and scope
   <!--- Please include a link to a related issue if there is one. -->
   #4974 
   
   ## My changes affect the following components
   <!--- Select below all system components that are affected by your change. -->
   <!--- Enter an `x` in all applicable boxes. -->
   - [ ] API
   - [ ] Controller
   - [ ] Message Bus (e.g., Kafka)
   - [ ] Loadbalancer
   - [ ] Invoker
   - [ ] Intrinsic actions (e.g., sequences, conductors)
   - [ ] Data stores (e.g., CouchDB)
   - [ ] Tests
   - [ ] Deployment
   - [ ] CLI
   - [ ] General tooling
   - [ ] Documentation
   
   ## Types of changes
   <!--- What types of changes does your code introduce? Use `x` in all the 
boxes that apply: -->
   - [ ] Bug fix (generally a non-breaking change which closes an issue).
   - [ ] Enhancement or new feature (adds new functionality).
   - [ ] Breaking change (a bug fix or enhancement which changes existing 
behavior).
   
   ## Checklist:
   <!--- Please review the points below which help you make sure you've covered 
all aspects of the change you're making. -->
   
   - [ ] I signed an [Apache 
CLA](https://github.com/apache/openwhisk/blob/master/CONTRIBUTING.md).
   - [ ] I reviewed the [style 
guides](https://github.com/apache/openwhisk/wiki/Contributing:-Git-guidelines#code-readiness)
 and followed the recommendations (Travis CI will check :).
   - [ ] I added tests to cover my changes.
   - [ ] My changes require further changes to the documentation.
   - [ ] I updated the documentation where necessary.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

