Re: change the default action context to omit api key

2019-02-13 Thread Christian Bickel
Hi Rodric,

I agree that the key should not be passed to the action if it is not
required.
But in my opinion, existing actions should continue to work without any
update. I'm OK, though, if all newly created actions get the default of
not having an API key.

Greetings
Christian

On Wed, Feb 13, 2019 at 22:53, Rodric Rabbah wrote:

> Thanks for the quick initial feedback.
> I've opened a PR that excludes just the API key.
> https://github.com/apache/incubator-openwhisk/pull/4284
>
> This will be a breaking change - actions that are deployed already and
> which need the key will need to be updated.
> I added the annotation `-a provide-api-key true`.
>
> I think the default should be no key but we have to address the change in
> behavior.
>
> -r
>
> On Wed, Feb 13, 2019 at 4:32 PM David P Grove  wrote:
>
> > +1 my thoughts exactly.
> >
> > --dave
> >
> > Tyson Norris wrote on 02/13/2019 04:28:35 PM:
> > >
> > > I agree the api_key is bad when not using e.g. the OW npm package
> > > within the action. +1 for using an annotation to enable this.
> > >
> > > activation_id is required to do the right thing for logging with
> > > concurrency enabled - but I'm also not sure what risk it is to
> > > include that? It will still be in the response header anyway, right?
> > >
> > > Namespace + action - similar to activation_id, this is already
> > > available to the client and may have some convenience for action
> > > devs (especially with logging concurrent activations)
> > >
> > > From my perspective, I would just change the api_key to be
> > > explicitly passed, and leave the rest as-is.
> > >
> >
>
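
The gating discussed in this thread could look roughly like the sketch below. This is illustrative only, not the actual OpenWhisk controller code: the helper name is invented, while the `__OW_*` variables and the `provide-api-key` annotation are the ones mentioned in the thread and the PR.

```python
# Sketch: only inject the API key into the action's environment when the
# action has explicitly opted in via the `provide-api-key` annotation.

def build_action_env(annotations, api_key, activation_id, namespace, action_name):
    """Assemble the environment passed to the action container (illustrative)."""
    env = {
        "__OW_ACTIVATION_ID": activation_id,
        "__OW_NAMESPACE": namespace,
        "__OW_ACTION_NAME": action_name,
    }
    # Include the key only when the action opted in with the annotation.
    if annotations.get("provide-api-key") is True:
        env["__OW_API_KEY"] = api_key
    return env

# Existing actions that need the key opt in via `-a provide-api-key true`;
# newly created actions default to no key.
env = build_action_env({"provide-api-key": True}, "uid:key", "abc123",
                       "guest", "hello")
assert "__OW_API_KEY" in env

env = build_action_env({}, "uid:key", "abc123", "guest", "hello")
assert "__OW_API_KEY" not in env
```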


Don't differentiate between blackbox and whitebox images

2018-11-28 Thread Christian Bickel
Hi,

in the past we divided the invokers into blackbox and whitebox invokers.
This has the effect that one of these two types could be overloaded while
the other still has free capacity.

Does anyone see issues with using every invoker for everything?
Or does anyone see any action items that need to be addressed before we
invoke every action type on every invoker?

If not, I'll open a PR to stop differentiating between these types.

Greetings
Christian Bickel


Re: Relieve CouchDB on high load

2018-11-26 Thread Christian Bickel
Hi Rodric,

thanks a lot for your feedback. I changed my PR to use the quota instead of
switching it off completely.

Now the controller checks how many activations have been written in the
current minute and passes the remaining quota on. When trying to save an
activation record (of a trigger, a sequence or an action), this quota is
checked, and the record is only saved if the quota is above 0.
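
A minimal sketch of such a per-minute quota is shown below. This is illustrative, not the actual controller implementation; class and method names are invented. The counter resets each minute, a record is only persisted while the remaining quota is above 0, and a quota of 0 disables all stores.

```python
import time

class ActivationStoreQuota:
    """Per-minute quota for storing activation records (illustrative sketch)."""

    def __init__(self, per_minute_limit, clock=time.time):
        self.limit = per_minute_limit
        self.clock = clock
        self.window = int(clock() // 60)  # current minute
        self.stored = 0

    def try_store(self):
        """Return True if an activation record may still be stored this minute."""
        window = int(self.clock() // 60)
        if window != self.window:        # new minute: reset the counter
            self.window = window
            self.stored = 0
        if self.stored >= self.limit:    # quota exhausted (or limit == 0)
            return False
        self.stored += 1
        return True

quota = ActivationStoreQuota(per_minute_limit=2)
assert [quota.try_store() for _ in range(3)] == [True, True, False]

# Setting the quota to 0 disables all stores, as Rodric suggested below.
assert ActivationStoreQuota(per_minute_limit=0).try_store() is False
```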

Greetings
Christian

On Wed, Oct 24, 2018 at 17:38, Rodric Rabbah wrote:

> Thanks for the additional information Christian - I took a look at the PR
> and added a few comments. I think you can approach this as another quota
> (number of activations stored per minute or per hour) and setting the quota
> to 0 would disable all stores.
>
> I also don't like the way this was done in the artifact store - it's a fast
> hack, but bleeds through the abstractions. We should not do this in my
> opinion. At the least, the invoker (and controller) wasted effort and held
> up resources gathering logs etc only to be thrown away later.
>
> -r
>


Re: Relieve CouchDB on high load

2018-10-24 Thread Christian Bickel
Hi Markus,

thanks a lot for your response.

I definitely agree that all limits should be reviewed and reworked.

The reason why I've chosen this limit as a per-namespace limit and not a
per-action limit is to give the operator of OpenWhisk the ability to
protect their own database.
If the limit were implemented on a per-action basis, the user could
always change it back and flood the database during very high-load
intervals.

That said, adding this as a per-action limit, like ningyougan proposed in
the PR, definitely makes sense. But I would see that as a separate PR.

Greetings
Christian

On Tue, Oct 23, 2018 at 16:23, Markus Thömmes <markusthoem...@apache.org> wrote:

> Hi Christian,
>
> given the recent work towards getting logs to a separate store to relieve
> CouchDB and to make it possible to move it into a store that's more
> appropriate for logging load and some other bits that are already
> implemented (removing the DB based polling, being able to disable log
> collection), I think it makes sense to be able to disable activation
> writing as well! In that sense, yes I do agree on going forward with the
> proposal + implementation.
>
> One thing that has me worried a bit is the way we're handling global,
> namespaced and per-action limits today. I'm wondering if at some point
> we'll need to consolidate those and make them cascade reliably. For this
> specific knob, for example, I think it would make sense to have one at the
> system level and on the action/trigger level accordingly. For some of our
> limits that might be true today already, I feel like it might not be for
> all though (although I admittedly haven't double-checked).
>
> Cheers,
> Markus
>
> On Tue, Oct 23, 2018 at 14:22, Christian Bickel <cbic...@apache.org> wrote:
>
> > Hi developers,
> >
> > in some performance tests in the past we've seen the following issues
> > with Cloudant/CouchDB:
> > - not all activations can be stored (there are error logs in the
> > invoker that storing was not possible)
> > - during bursts, CouchDB needs to process each document to update all
> > views in the activations DB. If CouchDB is not able to process them
> > immediately because of a queue, calling these views returns an error,
> > or the result will be outdated (when calling them with stale). The
> > impact is that there is a delay for all users of the system until
> > their activations appear after calling `activation list`.
> >
> > To avoid a negative impact of some high-load users on the system, the
> > proposal is not to store activations in the activation store for some
> > specified namespaces.
> > My proposal for an implementation is to put this flag into the limits
> > document in the subjects database.
> > This means it can only be set by the administrator with wskadmin.
> >
> > I already opened a PR with this proposal:
> > https://github.com/apache/incubator-openwhisk/pull/4078
> >
> > Do you agree on going forward with this proposal and implementation?
> >
> > Thanks a lot in advance for your feedback.
> >
> > Greetings
> > Christian
> >
>


Relieve CouchDB on high load

2018-10-23 Thread Christian Bickel
Hi developers,

in some performance tests in the past we've seen the following issues
with Cloudant/CouchDB:
- not all activations can be stored (there are error logs in the invoker
that storing was not possible)
- during bursts, CouchDB needs to process each document to update all
views in the activations DB. If CouchDB is not able to process them
immediately because of a queue, calling these views returns an error, or
the result will be outdated (when calling them with stale). The impact is
that there is a delay for all users of the system until their activations
appear after calling `activation list`.

To avoid a negative impact of some high-load users on the system, the
proposal is not to store activations in the activation store for some
specified namespaces.
My proposal for an implementation is to put this flag into the limits
document in the subjects database.
This means it can only be set by the administrator with wskadmin.
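
The flag in the limits document might be sketched as follows. The document shape and the `storeActivations` field name are assumptions for illustration, not the actual schema from the PR.

```python
# Hypothetical shape of a per-namespace limits document carrying the flag.
limits_doc = {
    "_id": "someNamespace/limits",
    "storeActivations": False,  # proposed flag: skip the activation store
}

def should_store_activation(limits):
    # Default to storing when the flag is absent, so existing namespaces
    # keep their current behaviour.
    return limits.get("storeActivations", True)

assert should_store_activation(limits_doc) is False
assert should_store_activation({}) is True
```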

I already opened a PR with this proposal:
https://github.com/apache/incubator-openwhisk/pull/4078

Do you agree on going forward with this proposal and implementation?

Thanks a lot in advance for your feedback.

Greetings
Christian


Active acks from invoker to controller

2018-09-25 Thread Christian Bickel
Hi,

today, we execute the user action in the invoker, send the active-ack
back to the controller and collect logs afterwards.
This has the following implications:
- the controller receives the active-ack, so it thinks the slot on the
invoker is free again.
- BUT the invoker is still collecting logs, which means that the next
activation has to wait until log collection is finished.
Especially when log collection takes long (e.g. because of high CPU
load on the invoker machine), user actions have to wait longer and
longer over time.

If this happens, you will read the following message in the invoker:
`Rescheduling Run message, too many message in the pool, freePoolSize:
0 containers and 0 MB, busyPoolSize ...`

But it definitely makes sense to send the active-ack (at least for
blocking activations) to the controller as fast as possible, because
the controller should answer the request as fast as possible.

So my proposal is to differentiate between blocking and non-blocking
activations. The invoker already knows today whether an activation is
blocking or not.
If the activation is non-blocking, we wait with the active-ack until
log collection is finished.
If the activation is blocking, we send an active-ack with a field
indicating that log collection is not finished yet (like today), and a
second active-ack after log collection is finished.

With this behaviour, the user gets their response as fast as possible
for blocking activations, and the load balancer waits to dispatch until
the slot is freed up.
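
The proposed ordering can be sketched as below. This is an illustrative model, not the invoker's actual code; function names and ack fields are invented for clarity.

```python
# Sketch of the proposed active-ack ordering:
# - blocking: first ack immediately (result ready, logs pending, slot still
#   busy), then collect logs, then a second ack that frees the slot.
# - non-blocking: collect logs first, then a single ack that frees the slot.

def run_activation(blocking, send_ack, collect_logs):
    if blocking:
        send_ack({"slotFree": False, "logsFinished": False})
        collect_logs()
        send_ack({"slotFree": True, "logsFinished": True})
    else:
        collect_logs()
        send_ack({"slotFree": True, "logsFinished": True})

events = []
run_activation(True, events.append, lambda: events.append("logs"))
# Blocking: the controller can answer the request before logs are collected.
assert events[0] == {"slotFree": False, "logsFinished": False}
assert events[1] == "logs"

events = []
run_activation(False, events.append, lambda: events.append("logs"))
# Non-blocking: a single ack, sent only after log collection.
assert events == ["logs", {"slotFree": True, "logsFinished": True}]
```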

I also did a test to verify performance.
For this test, I took a system with 100 invokers and space for 32
256MB actions on each invoker. (Two controllers, 1 Kafka)
I used our gatling test `BlockingInvokeOneActionSimulation`. The
action of the test writes one log line and returns the input parameters
again.
The test executed all activations blocking, which means that two
active-acks were sent per activation.
I used 2880 parallel connections, which should result in 90% system
utilisation (blackbox-fraction is set to 0).
As you can see, this scenario generates the highest possible number of
active-acks.
The results: throughput per second is at 97% compared to the current
master. The response times are also nearly the same. So there is nearly
no regression in the worst-case scenario.
In addition, I looked for the log message I mentioned above in the
invoker. It has not been written in the test with my changes, but
thousands of times on the master.
For non-blocking requests I don't expect any regression, but the
waiting time on the invoker should be lower.

Another valid approach would be to wait with the active-ack until log
collection is finished, independent of blocking or non-blocking.
If the action is executed blocking, we could say that it's the user's
responsibility to not log too much, or to set the log limit to 0, to get
fast responses.

Does anyone have an opinion on which of the two approaches we should
pursue? Or does anyone have another idea?

Greetings
Christian


Limit of binary actions

2018-07-09 Thread Christian Bickel
Hey,

a few days ago we found a little bug:

On creating an action, the code is sent to the controller. Afterwards
the controller checks that the code is not too big. If the code itself
is sent directly (e.g. uploading a single `.js` file), the limit is
48MB (today).
If an archive is uploaded, it will be encoded with base64. The problem
here is that base64 has an overhead of 1/3. This means that if you have
an archive with a size of e.g. 45MB, 60MB will be sent to the
controller, so the request will be rejected. This is not what the user
expects, as a limit of 48MB was advertised.
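
The 4/3 overhead can be verified directly: base64 encodes every 3 input bytes as 4 output characters.

```python
import base64

MB = 1024 * 1024

# A 45 MB archive becomes 60 MB on the wire after base64 encoding.
archive = b"\x00" * (45 * MB)
encoded = base64.b64encode(archive)
assert len(encoded) == 60 * MB  # 45 MB * 4/3

# Consequently, with a 48 MB request limit, the largest archive that still
# fits after encoding is 48 MB * 3/4 = 36 MB.
assert 48 * MB * 3 // 4 == 36 * MB
```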

I opened a PR to fix this behaviour:
https://github.com/apache/incubator-openwhisk/pull/3835

As this change potentially raises the action size in the database, I
wanted to ask whether anyone has any concerns with this PR.

If not, I will go forward with it.

Greetings



Re: Performance tests of OpenWhisk

2018-03-19 Thread Christian Bickel
As I said, we only have a very small Travis machine.

But I think it's at least worth a try: if we get consistent performance
numbers in Travis, we can set a threshold and fail if the result is below
this threshold.
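
Such a gate could be as simple as the following (the numbers and the function name are hypothetical):

```python
# Sketch of a CI performance gate: fail the build when measured throughput
# drops below a fixed threshold established from earlier consistent runs.

def performance_gate(measured_rps, threshold_rps):
    """Return True (pass) if throughput meets the threshold."""
    return measured_rps >= threshold_rps

assert performance_gate(measured_rps=105.0, threshold_rps=100.0)
assert not performance_gate(measured_rps=88.0, threshold_rps=100.0)
```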

Greetings
Christian

March 19, 2018 12:03 PM, "Rodric Rabbah"  wrote:

> Great - are you considering also a Travis CI matrix?
> 
> -r


Hot standby controller

2018-02-13 Thread Christian Bickel
Hi,

some time ago, we started working on the scale-out of the controllers.
One of our first steps was to deploy two controllers, but use the second
one as a hot-standby controller. At the time, this was an important step,
because we had to handle cache invalidation first and solve
load-balancing issues.
In the meantime all these issues have been solved.

We added a flag that decides whether all controllers are used round robin
or the second acts as hot standby. This flag has been set to "use all
controllers round robin" for months now.

In my opinion, we don't need the hot-standby ability anymore.
Does anyone have another opinion?

If not, I'll go forward with the following PR and get it merged:
https://github.com/apache/incubator-openwhisk/pull/3266

Greetings
Christian Bickel


Removal of controllers without an index

2017-09-01 Thread Christian Bickel
Hi,

In June, we added the ability to deploy hot-standby controllers (commit
fdbf073:
https://github.com/apache/incubator-openwhisk/commit/fdbf073a386c33aed1b06ed93eaea39ee4382c7b).
With this commit, the container name of the controllers changed from
controller to controller0, controller1, ...
To not break the update of existing deployments, we left the cleanup of
containers called controller in the cleanup script of the controller.
Since the renaming of the containers happened a long time ago, we can
remove this old code now.
This will be done with the following PR:
https://github.com/apache/incubator-openwhisk/pull/2688

If you still have a running environment with a controller without an
index in its name and you want to update this environment, you can remove
this controller with `docker rm controller` before you update the
environment.

Greetings
Christian


Adaption of hosts-files

2017-08-21 Thread Christian Bickel
Hi all,

last week, Markus and I created a pull request that makes it possible to
deploy several controllers and invokers on one machine with Ansible:
https://github.com/apache/incubator-openwhisk/pull/2633

After this PR is merged, it is required that "ansible_host" is specified
for each host in the environment's hosts file. Otherwise there will be
errors in the Ansible task `properties.yml` (which is also executed by
logs.yml, postdeploy.yml and routemgmt.yml).

This is already done for the default environments. A change is only
needed if you have your own environment.

Thanks for your understanding.

Greetings
Christian


Re: I want to subscribe

2017-06-26 Thread Christian Bickel
Hi SangHeon,

Currently I'm also working on controller HA. I'll put the mail that I
wrote to the dev list last week about this topic at the bottom of this
mail.

But to summarise it shortly:
- it is already possible to deploy a hot-standby controller.
- the next step will be to share the cache and the state of the
controllers.
- Some ideas I already have are written down here:
https://github.com/apache/incubator-openwhisk/wiki/Design-discussion-on-HA-of-controller

If you have some more ideas, please let me know about them and/or write a
comment in the wiki.
I'm looking forward to working with you.

Greetings
Christian


-
Mail about controller scale-out:

Hi OpenWhisk-developers,

currently we have the problem that we can only deploy one controller.
The reason for this is some state that is not shared across the
controllers, which would result in a system that is not stable anymore.

To be able to deploy more than one controller, I have already looked into
this topic.
I've created a little document to keep you on track and present my ideas:
https://github.com/apache/incubator-openwhisk/wiki/Design-discussion-on-HA-of-controller
Maybe you also have some ideas/concerns that I missed so far. Just leave
them as a comment.

There is also already a PR which addresses the first items described in
the document above:
https://github.com/apache/incubator-openwhisk/pull/2205

Greetings
Christian



June 26, 2017 1:53 AM, "Matt Rutkowski"  wrote:

> Hi SangHeon,
> 
> You can self subscribe to the list by following the instructions here:
> http://openwhisk.incubator.apache.org/contact
> 
> which has the following instructions:
> 
> To subscribe to the list, send a message to:
> dev-subscr...@openwhisk.incubator.apache.org
> 
> you should get a confirmation almost immediately.
> 
> Kind regards,
> Matt 
> 
> From: SangHeon Lee 
> To: d...@openwhisk.incubator.apache.org
> Date: 06/25/2017 06:47 PM
> Subject: I want to subscribe
> 
> Hi, I want to subscribe openwhisk’s mailing list
> I’m interested in making controller HA
> thanks


Scale out of the controller

2017-06-21 Thread Christian Bickel
Hi OpenWhisk-developers,

currently we have the problem that we can only deploy one controller.
The reason for this is some state that is not shared across the
controllers, which would result in a system that is not stable anymore.

To be able to deploy more than one controller, I have already looked into
this topic.
I've created a little document to keep you on track and present my ideas:
https://github.com/apache/incubator-openwhisk/wiki/Design-discussion-on-HA-of-controller
Maybe you also have some ideas/concerns that I missed so far. Just leave
them as a comment.

There is also already a PR which addresses the first items described in
the document above:
https://github.com/apache/incubator-openwhisk/pull/2205

Greetings
Christian