Thanks, James, for sharing with the dev list. I think you did a great job working out the mystery behind polling and large payloads.
It was a hard nut to crack. +1

-- Carlos

On Mon, Sep 24, 2018 at 3:03 PM James W Dubee <[email protected]> wrote:

> Hey Rodric,
>
> Sure, I split up the two changes into different PRs. The defect fix is now
> located here: https://github.com/apache/incubator-openwhisk/pull/4040.
> I'll use the PR from my original email for removal of DB polling.
>
> Regards,
> James Dubee
>
>
> From: Rodric Rabbah <[email protected]>
> To: [email protected]
> Date: 09/21/2018 02:34 PM
> Subject: Re: Proposal to Remove Artifact Store Polling for Blocking
> Invocations
> ------------------------------
>
> Thanks James for the explanation and patches. It sounds like there should
> be two separate PRs, one to address the bug and the other to remove
> polling. What do you think?
>
> -r
>
> > On Sep 21, 2018, at 1:09 PM, James W Dubee <[email protected]> wrote:
> >
> > Hello OpenWhisk developers,
> >
> > When a blocking action is invoked, the controller waits for that action's
> > response from the invoker and also polls the artifact store for the same
> > response. Usually blocking invocation responses are obtained from the
> > invoker. However, there are instances when the invocation response is
> > retrieved from the artifact store instead. From observation, the most
> > likely scenario for a blocking activation to be retrieved from the artifact
> > store is when an action generates a response that exceeds the maximum
> > allowed Kafka message size for the "completed" topic. However, this
> > situation should not occur as large action responses are meant to be
> > truncated by the invoker to the allowed maximum Kafka message size for the
> > corresponding topic.
> >
> > Currently artifact store polling for activation records is masking a bug
> > involving large action responses. While OpenWhisk provides a configuration
> > value, whisk.activation.payload.max, that one would assume adjusts the
> > maximum activation record size, this configuration value only adjusts the
> > Kafka topic that is used to schedule actions for invocation. Instead, the
> > Kafka topic used to communicate the completion of an action always uses
> > the default value for KAFKA_MESSAGE_MAX_BYTES, which is ~1MB.
> > Additionally, the invoker truncates action responses to the
> > whisk.activation.payload.max value even though
> > whisk.activation.payload.max is not being applied properly to the
> > "completed" Kafka topic. Moreover, this truncation does not account for
> > data added to the action response by the Kafka producer during
> > serialization, so an action response may fail to be sent to the
> > "completed" topic even if its actual action response size adheres to the
> > topic's size limitations. As a result, any action response whose size plus
> > the serialization overhead added by the Kafka producer exceeds ~1MB will be
> > retrieved via artifact store polling.
> >
> > Performance degradation appears to occur when an activation record is
> > retrieved via artifact store polling. Artifact store polling occurs every
> > 15 seconds for a blocking invocation. Since the response of an action that
> > generates a payload greater than ~1MB cannot be sent through the
> > "completed" Kafka topic, that action's activation record must be retrieved
> > via polling. Even though such an action may complete in milliseconds, the
> > end user will not get back the activation response for at least 15 seconds
> > due to the polling logic in the controller.
> >
> > I have submitted a pull request to remove the polling mechanism and also
> > fix the large action response bug. The pull request can be found here:
> > https://github.com/apache/incubator-openwhisk/pull/4033.
> >
> > Regards,
> > James Dubee
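
For anyone following the thread, the truncation problem James describes can be illustrated with a small sketch. This is not OpenWhisk code: the JSON envelope below merely stands in for whatever the Kafka producer adds during serialization, and the names serialize, naive_truncate, and truncate_to_fit are hypothetical. The point is only that truncating the payload to the limit is not the same as making the serialized message fit the limit.

```python
import json


def serialize(activation_id: str, response: str) -> bytes:
    """Hypothetical wire format: a JSON envelope wrapped around the response."""
    envelope = {"activationId": activation_id, "response": response}
    return json.dumps(envelope).encode("utf-8")


def naive_truncate(response: str, limit: int) -> str:
    """Truncate the payload alone to `limit` bytes, ignoring the envelope."""
    return response.encode("utf-8")[:limit].decode("utf-8", errors="ignore")


def truncate_to_fit(activation_id: str, response: str, limit: int) -> str:
    """Return the longest prefix of `response` whose serialized form fits `limit`."""
    if len(serialize(activation_id, "")) > limit:
        raise ValueError("limit is smaller than the envelope alone")
    # Serialized size never shrinks as the prefix grows, so binary search works.
    lo, hi = 0, len(response)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if len(serialize(activation_id, response[:mid])) <= limit:
            lo = mid
        else:
            hi = mid - 1
    return response[:lo]


if __name__ == "__main__":
    limit = 1000  # stand-in for the topic's maximum message size
    big = "x" * 2000
    too_big = serialize("a1", naive_truncate(big, limit))
    fits = serialize("a1", truncate_to_fit("a1", big, limit))
    print(len(too_big) > limit, len(fits) <= limit)
```

Measuring the serialized size directly, rather than subtracting a fixed overhead, also covers payloads that grow when escaped (quotes, non-ASCII characters), at the cost of a few extra serializations.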
