Rethinking the architecture to more fully exploit the capabilities of the underlying container orchestration platforms is pretty exciting. I think there are lots of interesting ideas to explore about how best to schedule the workload.
As noted in the architecture proposal [1], improving log processing for user containers is a key piece of this roadmap, even though it is logically an orthogonal issue. Initial experience with the KubernetesContainerFactory indicates that post-facto log enrichment, adding the activation id to each log line after the fact, is a serious bottleneck. It adds complexity to the system and measurably reduces system performance by delaying the re-use of action containers until the logs can be extracted and processed.

I believe what we really want is an openwhisk-aware log driver that dynamically injects the current activation id into every log line as soon as it is written. Then the user container logs, already properly enriched when they are generated, can be fed directly into the platform logging system with no post-processing needed.

If the low-level container runtime is docker 17.09 or newer, I think we could achieve this by writing a logging driver plugin [2] that extends docker's json logging driver. For non-blackbox containers, I think we "just" need the /run method to update a shared location, accessible to the logging driver plugin, with the current activation id before it invokes the user code. As log lines are produced, the plugin reads that location and injects the activation id into each json-formatted log line. For blackbox containers, we could have our dockerskeleton do the same thing, but users running their own action runner would have to opt in to the protocol somehow. Warning: I haven't looked into how flushing works with these drivers, so I'm not sure this really works... we need to make sure we don't enrich a log line with the wrong activation id because of delayed flushing.
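To make the enrichment step concrete, here's a rough Go sketch of what the plugin would do to each line. The log/stream/time fields are docker's json-file format; the activationId field name and the enrich helper are just placeholders I made up, not an actual implementation:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// logEntry mirrors docker's json-file log format, plus a hypothetical
// activationId field that our plugin would add.
type logEntry struct {
	Log          string `json:"log"`
	Stream       string `json:"stream"`
	Time         string `json:"time"`
	ActivationID string `json:"activationId,omitempty"`
}

// enrich parses one raw json log line and stamps it with the activation id
// read from the shared location updated by /run.
func enrich(raw []byte, activationID string) ([]byte, error) {
	var e logEntry
	if err := json.Unmarshal(raw, &e); err != nil {
		return nil, err
	}
	e.ActivationID = activationID
	return json.Marshal(&e)
}

func main() {
	line := []byte(`{"log":"hello from the action\n","stream":"stdout","time":"2018-03-05T12:00:00Z"}`)
	enriched, err := enrich(line, "a1b2c3d4")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(enriched))
}
```

The point being that the driver only needs a read of the shared location per line; no handshake with the invoker is on the critical path.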
If we're running on Kubernetes, we might decide to use a streaming sidecar container, as shown in [3], instead of a logging driver plugin, and have the controller interact with the sidecar to update the current activation id (or have the sidecar read it from a shared memory location updated by /run, to minimize the differences between deployment platforms). I'm not sure this works as well, since the sidecar might fall behind in processing the logs, so we might still need a handshake somewhere.

A third option would be to extend our current sentineled log design by also writing a "START_WHISK_ACTIVATION_LOG <ACTIVATION_ID>" line in the /run method before invoking the user code. We'd still have to post-process the log files, but that work could be decoupled from the critical path: the post-processor would have the activation id available in the log files themselves, would not need to handshake with the controller at all, and thus we could offload all logging to a node-level log processing/forwarding agent. Option 3 would be really easy to implement and is independent of the details of the low-level log driver, but it doesn't eliminate the need to post-process the logs. It just makes it easier to move that processing off any critical path.

Thoughts?

--dave

[1] https://cwiki.apache.org/confluence/display/OPENWHISK/OpenWhisk+future+architecture
[2] https://docs.docker.com/v17.09/engine/admin/logging/plugins/
[3] https://kubernetes.io/docs/concepts/cluster-administration/logging/
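P.S. To illustrate option 3, here's a rough Go sketch of the decoupled post-processor: it walks the raw log lines, remembers the id from the most recent sentinel line, and tags every ordinary line with it. Types and names here are hypothetical, not proposed code:

```go
package main

import (
	"fmt"
	"strings"
)

// Sentinel written by /run before invoking the user code (option 3).
const startSentinel = "START_WHISK_ACTIVATION_LOG "

// taggedLine pairs a raw log line with the activation id it belongs to.
type taggedLine struct {
	ActivationID string
	Line         string
}

// postProcess tags each ordinary line with the activation id announced by
// the most recent sentinel line; no controller handshake is needed.
func postProcess(lines []string) []taggedLine {
	var out []taggedLine
	current := ""
	for _, l := range lines {
		if strings.HasPrefix(l, startSentinel) {
			current = strings.TrimPrefix(l, startSentinel)
			continue
		}
		out = append(out, taggedLine{ActivationID: current, Line: l})
	}
	return out
}

func main() {
	raw := []string{
		"START_WHISK_ACTIVATION_LOG aaa111",
		"hello from activation one",
		"START_WHISK_ACTIVATION_LOG bbb222",
		"hello from activation two",
	}
	for _, t := range postProcess(raw) {
		fmt.Printf("[%s] %s\n", t.ActivationID, t.Line)
	}
}
```

Since everything the post-processor needs is in the log stream itself, this loop could run in a node-level forwarding agent, completely off the invoker's critical path.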