Dear whiskers, I am starting this thread to discuss a better way to collect action logs. Since this commit (https://github.com/apache/incubator-openwhisk/commit/57475367b509fd2d4c14f5678d0c26642c52cc91) was merged, TPS for blocking calls has dropped significantly, to about 1/10 of the previous results.
I think the logic behind that PR is quite fair, but such a huge performance degradation is surely not what we want. I ran some simple benchmarks and found that the cause is that log collection takes a long time. (With the log limit set to 0, I observed high TPS.)

A related issue is log collection with action concurrency. AFAIK, the logs of each invocation are delimited by a special string, the so-called "sentinel". This works well when there is no concurrency (the concurrency limit is 1). But once concurrency is enabled, logs are interleaved and the sentinel is no longer effective as is. So currently, if we want to enable concurrency, we need to either disable log collection or use a different LogStore implementation with external log collection capability.

These issues imply the need for a new way to collect and manage logs. The current solution depends on container logs: ContainerProxy directly accesses the container log file and collects the logs as an Akka Stream. Since this involves disk IO, it is relatively slow.

Without having considered it deeply, I think one option is to include logs in the activation response from the action container. With concurrency, each invocation keeps its own buffered stream to write logs to and creates an activation response that includes them. Since each concurrent invocation has its own buffered stream, the logs of each invocation can be segregated. And since this happens in memory only, I expect it to show better performance in general.

One issue is the log limit. The current maximum log limit is 10MB. But I have been wondering whether there is a real case where one function generates 10MB of logs per invocation. If so, would that really be meaningful for users? I think logs of that size would not be easy to look through. If we can reduce the maximum limit to 1MB or less, it might be effective to collect logs this way.

Since I have not pondered this deeply, there might be some side effects. Please share any ideas or feedback.
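
To make the per-invocation buffer idea a bit more concrete, here is a rough sketch in Python. It is not OpenWhisk code and not a proposal for the actual runtime protocol: the names (BoundedLogBuffer, invoke, log, the "logs" and "logsTruncated" response fields) are all hypothetical, and the 1MB limit is just the reduced value floated above. It only illustrates that logs written by concurrent invocations can stay separated in memory without a sentinel, and that a size limit can be enforced at write time.

```python
import asyncio
import contextvars

# Hypothetical limit for this sketch (1MB, the reduced maximum discussed above).
LOG_LIMIT_BYTES = 1024 * 1024


class BoundedLogBuffer:
    """In-memory log buffer for a single invocation; drops lines past the limit."""

    def __init__(self, limit):
        self.limit = limit
        self.size = 0
        self.lines = []
        self.truncated = False

    def write(self, line):
        encoded_len = len(line.encode("utf-8"))
        if self.truncated or self.size + encoded_len > self.limit:
            # Once the limit is hit, mark the buffer and ignore further writes.
            self.truncated = True
            return
        self.size += encoded_len
        self.lines.append(line)


# Each asyncio task gets its own copy of the context, so every concurrent
# invocation sees only its own buffer here.
_current_log = contextvars.ContextVar("current_log")


def log(message):
    """What action code would call instead of writing to stdout."""
    _current_log.get().write(message)


async def invoke(action, params, limit=LOG_LIMIT_BYTES):
    """Run one invocation and return a response that carries its own logs."""
    buf = BoundedLogBuffer(limit)
    _current_log.set(buf)
    result = await action(params)
    return {"result": result, "logs": buf.lines, "logsTruncated": buf.truncated}


async def hello(params):
    """Example action that logs before and after yielding to other invocations."""
    log(f"start {params['name']}")
    await asyncio.sleep(0)  # yield so concurrent invocations interleave
    log(f"end {params['name']}")
    return {"greeting": f"hello {params['name']}"}


async def run_concurrently(names):
    # gather() wraps each coroutine in its own task, hence its own context.
    return await asyncio.gather(*(invoke(hello, {"name": n}) for n in names))
```

Running asyncio.run(run_concurrently(["a", "b"])) interleaves the two invocations, yet each response contains only its own two log lines. A real runtime would also have to capture plain stdout/stderr writes, which this sketch does not attempt.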

Thanks
Regards
Dominic
