Re: Reminder: Tech Interchange meeting tomorrow, Wed Aug 15th

2018-08-14 Thread Dragos Dascalita Haut
Unfortunately I'm on a flight during the meeting and I won't be able to attend.


Re AI Actions, I agree with Rodric: the time might have been a bit short. I'm 
incorporating the feedback I'm receiving into the wiki page and hopefully it 
will be in better shape for our next meeting. If anyone has more thoughts, 
please let me know, update the wiki, or add comments directly.



From: Rodric Rabbah 
Sent: Tuesday, August 14, 2018 6:23:24 AM
To: dev@openwhisk.apache.org
Subject: Re: Reminder: Tech Interchange meeting tomorrow, Wed Aug 15th

Looks like a nice agenda. Thanks Ben for hosting this one. I can’t make it 
unfortunately but will catch the replay.

Dragos’ AI actions might be another topic, although maybe the runway is too short.

-r

> On Aug 14, 2018, at 9:09 AM, Ben Browning  wrote:
>
> Greetings!
>
> Our next tech interchange call is Wednesday, August 15th at 11am US
> Eastern - that's tomorrow! Use the attached .ics file if you'd like to
> add a reminder to your calendar.
>
> Call details:
> Web Meeting: Tech Interchange (bi-weekly):
> - Day-Time: Wednesdays, 11AM EDT (Eastern US), 5PM CEST (Central Europe),
> 3PM UTC, 11PM CST (Beijing)
> - Zoom: https://zoom.us/my/asfopenwhisk
>
> Based on recent mailing list and Slack discussions, here are some
> proposed discussion topics. If you'd like to speak about one of these
> or have another topic, please email or message me on Slack before our
> meeting. I'll send out an updated agenda shortly before the meeting
> tomorrow.
>
> * 0.9.0 release update (Vincent?)
> * website update (Matt & Priti?)
> * knative update (Ben or Markus)
> * system env vars in user containers (Vadim, Markus, Rodric, Chetan,
> Carlos, Tyson, Dragos?)
> * pluggable API gateways (Henry, Rodric, Dragos?)
> * AI actions (Dragos?)
> * BDD function and performance tests (Rahul, Martin, Markus?)
> * Recap of recent notable changes (?)
> * Anything else - let me know!
>
>
> Thanks!
>
> Ben
> 


Re: logging baby step -- worth pursuing?

2018-08-14 Thread Dragos Dascalita Haut
"...we should be able to fully
process the logs offline and in a streaming manner and get the needed
activation id injected into every logline..."


+1. IIRC, for concurrent activations Tyson Norris and Dan McWeeney were going 
down this path as well. Having this natively supported by all OpenWhisk 
runtimes can only make things easier.


From: David P Grove 
Sent: Tuesday, August 14, 2018 2:29:12 PM
To: dev@openwhisk.apache.org
Subject: logging baby step -- worth pursuing?



Even if we think structured logging is the right eventual goal, it could
take a while to get there (especially since it is changing functionality
users may have grown accustomed to).

However, for non-concurrent, non-blackbox runtimes we could make a small,
not-user visible change, that could enable fully offline and streaming log
processing.  We already generate an end-of-log sentinel to stdout/stderr
for these runtimes.  If we also generated a start-of-log sentinel to
stdout/stderr that included the activation id, we should be able to fully
process the logs offline and in a streaming manner and get the needed
activation id injected into every logline.

Is this worth pursuing?   I'm motivated to get log processing out of the
Invoker/ContainerRouter so we can push ahead with some of the scheduler
redesign. Without tackling logging, I don't think we'll be able to assess
the true scalability potential of the new scheduling architectures.

--dave


Re: Proposal on a future architecture of OpenWhisk

2018-08-14 Thread Dragos Dascalita Haut
Markus, I appreciate the enhancements you mentioned in the wiki, and I'm very 
much in line with the ideas you brought up there.



"...having the ContainerManager be a cluster singleton..."

I was just about to reply with the same idea :)

In addition, I was thinking we can leverage Akka Distributed Data [1] to keep 
all ContainerRouter actors eventually consistent. When creating a new 
container, the ContainerManager can write with a consistency "WriteAll"; it 
would be a little slower but it would improve consistency.
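
To make the trade-off concrete, here is a toy model in plain Python. It is not Akka code; `Replica`, `write` and `gossip` are names made up for this sketch, and it only assumes that each ContainerRouter keeps a map from action name to container ids that merges by set union. A local write is visible on one replica until a gossip round runs, while a "write all" is visible everywhere as soon as it returns.

```python
from typing import Dict, List, Set


class Replica:
    """One ContainerRouter's local view: action name -> set of container ids."""

    def __init__(self) -> None:
        self.containers: Dict[str, Set[str]] = {}

    def add(self, action: str, container_id: str) -> None:
        self.containers.setdefault(action, set()).add(container_id)

    def merge(self, other: "Replica") -> None:
        # Set union is commutative and idempotent, so merges may arrive in any order.
        for action, ids in other.containers.items():
            self.containers.setdefault(action, set()).update(ids)


def write(replicas: List[Replica], action: str, container_id: str, write_all: bool) -> None:
    """Record a new container on the first replica only, or on every replica."""
    targets = replicas if write_all else replicas[:1]
    for replica in targets:
        replica.add(action, container_id)


def gossip(replicas: List[Replica]) -> None:
    """One full anti-entropy round: every replica merges every other replica's state."""
    for target in replicas:
        for source in replicas:
            if source is not target:
                target.merge(source)
```

The slower path (`write_all=True`) touches every replica before returning, which is the cost being traded for consistency here.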


The "edge-case" isn't clear to me b/c I'm coming from the assumption that it 
doesn't matter which ContainerRouter handles the next request, given that all 
actors have the same data. Maybe you can help me understand the edge case 
better?


Re the Knative approach, can you expand on why the execution layer/data plane would be 
replaced entirely by Knative serving? I think Knative serving handles very 
well some cases like API requests, but it's not designed to guarantee 
concurrency restrictions like "1 request at a time per container" - something 
that AI Actions need.
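
As a rough illustration of the "1 request at a time per container" restriction, a minimal sketch follows. `ExclusivePool` is a hypothetical name, not an OpenWhisk or Knative class; it just tracks which containers are busy and refuses to hand out one that is already serving a request.

```python
from typing import Dict, Iterable, Optional


class ExclusivePool:
    """Tracks busy containers so each one serves at most one request at a time."""

    def __init__(self, container_ids: Iterable[str]) -> None:
        self.busy: Dict[str, bool] = {cid: False for cid in container_ids}

    def acquire(self) -> Optional[str]:
        """Return an idle container id and mark it busy, or None if all are busy."""
        for cid, in_use in self.busy.items():
            if not in_use:
                self.busy[cid] = True
                return cid
        return None  # the caller has to queue the request or ask for a new container

    def release(self, cid: str) -> None:
        """Mark a container idle again once its request has completed."""
        self.busy[cid] = False
```

Whatever sits in front of the containers would need to enforce something equivalent to `acquire`/`release` for the guarantee to hold.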


Thanks,

dragos


[1] - https://doc.akka.io/docs/akka/2.5/distributed-data.html



From: David P Grove 
Sent: Tuesday, August 14, 2018 2:15:13 PM
To: dev@openwhisk.apache.org
Subject: Re: Proposal on a future architecture of OpenWhisk




"Markus Thömmes"  wrote on 08/14/2018 10:06:49
AM:
>
> I just published a revision on the initial proposal I made. I still owe a
> lot of sequence diagrams for the container distribution, sorry for taking
> so long on that, I'm working on it.
>
> I did include a clear separation of concerns into the proposal, where
> user-facing abstractions and the execution (load balancing, scaling) of
> functions are loosely coupled. That enables us to exchange the execution
> system while not changing anything in the Controllers at all (to an
> extent). The interface to talk to the execution layer is HTTP.
>

Nice writeup!

For me, the part of the design I'm wondering about is the separation of the
ContainerManager and the ContainerRouter and having the ContainerManager be
a cluster singleton. With Kubernetes blinders on, it seems more natural to
me to fuse the ContainerManager into each of the ContainerRouter instances
(since there is very little to the ContainerManager except (a) talking to
Kubernetes and (b) keeping track of which Containers it has handed out to
which ContainerRouters -- a task which is eliminated if we fuse them).

The main challenge is dealing with your "edge case" where the optimal
number of containers to create to execute a function is less than the
number of ContainerRouters.  I suspect this is actually an important case
to handle well for large-scale deployments of OpenWhisk.  Having 20ish
ContainerRouters on a large cluster seems plausible, and then we'd expect a
long tail of functions where the optimal number of container instances is
less than 20.

I wonder if we can partially mitigate this problem by doing some amount of
smart routing in the Controller.  For example, the first level of routing
could be based on the kind of the action (nodejs:6, python, etc).  That
could then vector to per-runtime ContainerRouters which dynamically
auto-scale based on load.  Since there doesn't have to be a fixed division
of actual execution resources to each ContainerRouter this could work.  It
also lets us easily keep stemcells for multiple runtimes without worrying about
wasting too many resources.

How do you want to deal with design alternatives?  Should I be adding to
the wiki page?  Doing something else?

--dave


logging baby step -- worth pursuing?

2018-08-14 Thread David P Grove


Even if we think structured logging is the right eventual goal, it could
take a while to get there (especially since it is changing functionality
users may have grown accustomed to).

However, for non-concurrent, non-blackbox runtimes we could make a small,
not-user visible change, that could enable fully offline and streaming log
processing.  We already generate an end-of-log sentinel to stdout/stderr
for these runtimes.  If we also generated a start-of-log sentinel to
stdout/stderr that included the activation id, we should be able to fully
process the logs offline and in a streaming manner and get the needed
activation id injected into every logline.
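
A minimal sketch of what the offline/streaming processor could look like, assuming one stream (stdout or stderr) from a non-concurrent container. Both marker strings below are assumptions for illustration: the start marker does not exist today, and the end marker is only meant to resemble the current end-of-log sentinel.

```python
from typing import Iterable, Iterator, Optional, Tuple

# Both marker strings are assumptions made for this sketch.
START_PREFIX = "XXX_THE_START_OF_A_WHISK_ACTIVATION_XXX "
END_MARKER = "XXX_THE_END_OF_A_WHISK_ACTIVATION_XXX"


def tag_log_lines(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (activation_id, logline) pairs from a raw container log stream."""
    current: Optional[str] = None
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(START_PREFIX):
            # The start sentinel carries the activation id for the lines that follow.
            current = line[len(START_PREFIX):].strip()
        elif line == END_MARKER:
            current = None
        elif current is not None:
            yield current, line
        # Lines outside any start/end pair are dropped in this sketch.
```

Because the function only keeps the current activation id as state, it can run over a live stream or over a stored log file, with no help from the Invoker.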

Is this worth pursuing?   I'm motivated to get log processing out of the
Invoker/ContainerRouter so we can push ahead with some of the scheduler
redesign. Without tackling logging, I don't think we'll be able to assess
the true scalability potential of the new scheduling architectures.

--dave


Re: Proposal on a future architecture of OpenWhisk

2018-08-14 Thread David P Grove



"Markus Thömmes"  wrote on 08/14/2018 10:06:49
AM:
>
> I just published a revision on the initial proposal I made. I still owe a
> lot of sequence diagrams for the container distribution, sorry for taking
> so long on that, I'm working on it.
>
> I did include a clear separation of concerns into the proposal, where
> user-facing abstractions and the execution (load balancing, scaling) of
> functions are loosely coupled. That enables us to exchange the execution
> system while not changing anything in the Controllers at all (to an
> extent). The interface to talk to the execution layer is HTTP.
>

Nice writeup!

For me, the part of the design I'm wondering about is the separation of the
ContainerManager and the ContainerRouter and having the ContainerManager be
a cluster singleton. With Kubernetes blinders on, it seems more natural to
me to fuse the ContainerManager into each of the ContainerRouter instances
(since there is very little to the ContainerManager except (a) talking to
Kubernetes and (b) keeping track of which Containers it has handed out to
which ContainerRouters -- a task which is eliminated if we fuse them).

The main challenge is dealing with your "edge case" where the optimal
number of containers to create to execute a function is less than the
number of ContainerRouters.  I suspect this is actually an important case
to handle well for large-scale deployments of OpenWhisk.  Having 20ish
ContainerRouters on a large cluster seems plausible, and then we'd expect a
long tail of functions where the optimal number of container instances is
less than 20.

I wonder if we can partially mitigate this problem by doing some amount of
smart routing in the Controller.  For example, the first level of routing
could be based on the kind of the action (nodejs:6, python, etc).  That
could then vector to per-runtime ContainerRouters which dynamically
auto-scale based on load.  Since there doesn't have to be a fixed division
of actual execution resources to each ContainerRouter this could work.  It
also lets us easily keep stemcells for multiple runtimes without worrying about
wasting too many resources.
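
A rough sketch of the first-level routing idea, assuming the Controller holds a pool of ContainerRouters per action kind and picks one round-robin. `KindRouter` and the router names are hypothetical; how pools are resized based on load is left out.

```python
from typing import Dict, List


class KindRouter:
    """First-level routing in the Controller: pick a ContainerRouter by action kind."""

    def __init__(self, pools: Dict[str, List[str]]) -> None:
        self.pools = {kind: list(routers) for kind, routers in pools.items()}
        self.next_index = {kind: 0 for kind in pools}

    def route(self, kind: str) -> str:
        """Return the next ContainerRouter for this kind, round-robin."""
        routers = self.pools[kind]  # raises KeyError for an unknown kind
        index = self.next_index[kind] % len(routers)
        self.next_index[kind] = index + 1
        return routers[index]

    def scale(self, kind: str, routers: List[str]) -> None:
        """Replace the pool for one kind, e.g. after auto-scaling on load."""
        self.pools[kind] = list(routers)
        self.next_index.setdefault(kind, 0)
```

With pools split per kind, a rarely used runtime can sit behind a single ContainerRouter, which shrinks the "fewer containers than routers" case for its functions.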

How do you want to deal with design alternatives?  Should I be adding to
the wiki page?  Doing something else?

--dave


Re: AI Actions as a first-class citizen in OpenWhisk

2018-08-14 Thread James Thomas
Dragos,

Great wiki page! Since I've been playing with TensorFlow.js on OpenWhisk, I
definitely think there's a sweet spot for running certain ML tasks using
serverless platforms. The suggestions in the wiki page all make sense to me.

Access to the GPU is one of the biggest barriers to using more complex
models and operations (and even training longer-term). Nvidia does have a
Docker version that allows you to pass through GPUs (
https://github.com/NVIDIA/nvidia-docker).

Another issue I found was the performance difference between warm and cold
activations when using customised run images. This could have been resolved
if either there were a pre-warmed Node.js image with TF-JS libraries or we
had a mechanism to create action packages larger than 48MB. This might
include creating actions from packages at an external HTTP address or
object storage URI.
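
To illustrate the warm/cold gap, here is a minimal sketch written as a Python action (a `main(args)` function returning a dict). `load_model` is a hypothetical stand-in for an expensive model load, not a real library call; the point is only that state kept at module scope is paid for once per container and reused by warm activations.

```python
import time
from typing import Any, Dict, Optional

_MODEL: Optional[Any] = None  # survives between warm activations of the same container


def load_model() -> Any:
    """Hypothetical stand-in for an expensive model load (e.g. reading weights)."""
    time.sleep(0.05)
    return lambda x: x * 2  # placeholder "inference"


def main(args: Dict[str, Any]) -> Dict[str, Any]:
    """Action entry point: load the model on a cold start, reuse it when warm."""
    global _MODEL
    cold = _MODEL is None
    if cold:
        _MODEL = load_model()
    return {"result": _MODEL(args.get("input", 0)), "cold": cold}
```

A pre-warmed image with the libraries and model already in place would move most of the `load_model` cost out of the first activation as well.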

On 13 August 2018 at 22:17, Dragos Dascalita Haut  wrote:

> Once you've experienced FaaS, you don't wanna go back. This has been my
> experience with AI and FaaS.
>
>
> In particular, running AI inferences in FaaS proved to be a great match:
>
> - Each function processes one request at a time. A model usually takes 1
> data input and produces 1 data output.
>
> - Enough code to fit into a function. An AI action loads a model, runs the
> inference, and returns the result.
>
> - In addition, FaaS provides a model to scale to 0 and scale to millions
> with the traffic.
>
>
> With OpenWhisk I think we're very close to making AI Actions a first-class
> citizen for developers, and I've created a wiki to explore what it would
> take to get there [1].  Coincidentally, James Thomas also published today his
> experience with TensorFlow and OpenWhisk [2].
>
>
> I'm interested in your thoughts, and in seeing whether there's enough interest
> in our community to make this a reality.
>
>
> Feel free to contribute to the wiki with edits, comments, anything you'd
> wanna add.
>
>
> [1] - https://cwiki.apache.org/confluence/display/OPENWHISK/AI+Actions
>
> [2] - https://medium.com/openwhisk/serverless-machine-learning-with-tensorflow-js-4aa24494a9b4
>
>
> Thanks,
>
> dragos
>



-- 
Regards,
James Thomas


Proposal on a future architecture of OpenWhisk

2018-08-14 Thread Markus Thömmes
Hey OpenWhiskers,

I just published a revision on the initial proposal I made. I still owe a
lot of sequence diagrams for the container distribution, sorry for taking
so long on that, I'm working on it.

I did include a clear separation of concerns into the proposal, where
user-facing abstractions and the execution (load balancing, scaling) of
functions are loosely coupled. That enables us to exchange the execution
system while not changing anything in the Controllers at all (to an
extent). The interface to talk to the execution layer is HTTP.

Wanted to get this out as a possible idea on how to incorporate Knative in
the future and what it could look like alongside other implementations.

As always, feedback is very much welcome and appreciated.
https://cwiki.apache.org/confluence/display/OPENWHISK/OpenWhisk+future+architecture

Cheers,
Markus


Re: Reminder: Tech Interchange meeting tomorrow, Wed Aug 15th

2018-08-14 Thread Rodric Rabbah
Looks like a nice agenda. Thanks Ben for hosting this one. I can’t make it 
unfortunately but will catch the replay. 

Dragos’ AI actions might be another topic, although maybe the runway is too short. 

-r

> On Aug 14, 2018, at 9:09 AM, Ben Browning  wrote:
> 
> Greetings!
> 
> Our next tech interchange call is Wednesday, August 15th at 11am US
> Eastern - that's tomorrow! Use the attached .ics file if you'd like to
> add a reminder to your calendar.
> 
> Call details:
> Web Meeting: Tech Interchange (bi-weekly):
> - Day-Time: Wednesdays, 11AM EDT (Eastern US), 5PM CEST (Central Europe),
> 3PM UTC, 11PM CST (Beijing)
> - Zoom: https://zoom.us/my/asfopenwhisk
> 
> Based on recent mailing list and Slack discussions, here are some
> proposed discussion topics. If you'd like to speak about one of these
> or have another topic, please email or message me on Slack before our
> meeting. I'll send out an updated agenda shortly before the meeting
> tomorrow.
> 
> * 0.9.0 release update (Vincent?)
> * website update (Matt & Priti?)
> * knative update (Ben or Markus)
> * system env vars in user containers (Vadim, Markus, Rodric, Chetan,
> Carlos, Tyson, Dragos?)
> * pluggable API gateways (Henry, Rodric, Dragos?)
> * AI actions (Dragos?)
> * BDD function and performance tests (Rahul, Martin, Markus?)
> * Recap of recent notable changes (?)
> * Anything else - let me know!
> 
> 
> Thanks!
> 
> Ben
> 


Reminder: Tech Interchange meeting tomorrow, Wed Aug 15th

2018-08-14 Thread Ben Browning
Greetings!

Our next tech interchange call is Wednesday, August 15th at 11am US
Eastern - that's tomorrow! Use the attached .ics file if you'd like to
add a reminder to your calendar.

Call details:
Web Meeting: Tech Interchange (bi-weekly):
- Day-Time: Wednesdays, 11AM EDT (Eastern US), 5PM CEST (Central Europe),
3PM UTC, 11PM CST (Beijing)
- Zoom: https://zoom.us/my/asfopenwhisk

Based on recent mailing list and Slack discussions, here are some
proposed discussion topics. If you'd like to speak about one of these
or have another topic, please email or message me on Slack before our
meeting. I'll send out an updated agenda shortly before the meeting
tomorrow.

* 0.9.0 release update (Vincent?)
* website update (Matt & Priti?)
* knative update (Ben or Markus)
* system env vars in user containers (Vadim, Markus, Rodric, Chetan,
Carlos, Tyson, Dragos?)
* pluggable API gateways (Henry, Rodric, Dragos?)
* AI actions (Dragos?)
* BDD function and performance tests (Rahul, Martin, Markus?)
* Recap of recent notable changes (?)
* Anything else - let me know!


Thanks!

Ben
BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Apple Inc.//Mac OS X 10.13.5//EN
CALSCALE:GREGORIAN
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
DTSTART:20070311T02
TZNAME:EDT
TZOFFSETTO:-0400
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
DTSTART:20071104T02
TZNAME:EST
TZOFFSETTO:-0500
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
CREATED:20180730T181946Z
UID:6276A5E9-206E-4D72-8AC7-39724322F2E3
DTEND;TZID=America/New_York:20180801T12
TRANSP:OPAQUE
X-APPLE-TRAVEL-ADVISORY-BEHAVIOR:AUTOMATIC
SUMMARY:Apache OpenWhisk Tech Interchange (bi-weekly)
DTSTART;TZID=America/New_York:20180815T11
DTSTAMP:20180730T182915Z
LOCATION:https://zoom.us/my/asfopenwhisk
SEQUENCE:0
URL;VALUE=URI:https://zoom.us/my/asfopenwhisk
END:VEVENT
END:VCALENDAR


Re: BDD Test Cases Contribution to complement existing test cases and coverage

2018-08-14 Thread Martin Gencur

Hi Rahul/all,
this is certainly an interesting approach to testing. Let me mention 
some points.


The test structure is more flat - either all commands at the same level 
or at most one/two calls to "features" in separate files. This gives the 
user more insight into all details of the request/response. However, it 
is questionable whether the test itself is more readable than the Scala code. 
The tests are a mix of Gherkin Feature file commands, embedded Java 
snippets, and JavaScript as opposed to just Scala in the current test suite.


The tests in the Scala test suite are already written in the style 
"Component XY should do this and that", which is close to BDD.


I guess the most important point is whether OpenWhisk does BDD, which is 
really a style of development where behaviour and tests are defined first 
with an exact specification, and coding follows. Does OpenWhisk 
development work this way, or want to work this way? Then these tests 
might be useful, especially for new features and test cases. I don't see 
much value in rewriting the existing tests because the community is 
familiar with them, and I'm not sure how many people are familiar with the 
Karate framework.


Perhaps other questions are:
* is there any code completion for the feature files in some IDEs? 
Especially for the embedded code snippets.

* do the tests run faster or slower than the current Scala test suite?

Cheers,
Martin Gencur
QE, Red Hat

On 13.8.2018 12:00, Rahul Tripathi wrote:

Hi Markus,

I am yet to get the permission to add the proposal to the OW Wiki but thought of 
sharing it here:

What is Karate?
Karate is based on Cucumber which is a BDD framework.

What is BDD Testing?
BDD Testing is a testing approach based on Behaviour-Driven Development. It 
uses Given, When, Then statements to create test scenarios. Example:


TEST CASE-1
Feature:  Get List of actions based on the NameSpace

Scenario: As a user I want to get the list of actions available for the given namespace
 * def path = '/api/v1/namespaces/'+nameSpace+'/actions?limit=30=0'
 Given url BaseUrl+path
 And header Authorization = Auth
 And header Content-Type = 'application/json'
 When method get
 Then status 200
 And def json = response
 
  The above example tests that the List Actions API returns a 200 OK. Many more assertions can be added as required. Karate has some very good built-in functions for asserting on nested JSON.
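
For readers more used to plain HTTP clients, the same check can be sketched in Python with only the standard library. This is an illustration, not part of the pull request: the base URL and credentials are placeholders, Basic authentication with a base64-encoded key is assumed, and only the `limit` query parameter is included.

```python
import base64
import urllib.request


def list_actions_request(base_url: str, namespace: str, auth_key: str,
                         limit: int = 30) -> urllib.request.Request:
    """Build (but do not send) a GET request for the actions in a namespace."""
    token = base64.b64encode(auth_key.encode("utf-8")).decode("ascii")
    url = f"{base_url}/api/v1/namespaces/{namespace}/actions?limit={limit}"
    return urllib.request.Request(
        url,
        method="GET",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )
```

Sending it with `urllib.request.urlopen(request)` and checking that the response `status` is 200 corresponds to the `Then status 200` step above.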
  
  
TEST CASE-2 (This example shows a simple smoke test on all the wsk user functions)


Feature: This feature file will test all the wsk functions

   Background:
 * configure ssl = true
 * def nameSpace = 'guest'
 * def params = '?blocking=true=false'
 * def scriptcode = call read('classpath:com/karate/openwhisk/functions/hello-world.js')
 * def base64encoding = read('classpath:com/karate/openwhisk/utils/base64.js')

   Scenario: TC01-As a user I want to all the wsk functions available to the user and check if they give the proper response
 # Get User Auth
 * def getNSCreds = call read('classpath:com/karate/openwhisk/wskadmin/get-user.feature') {nameSpace:'#(nameSpace)'}
 * def result = getNSCreds.result
 * def Auth = base64encoding(result)
 * print "Got the Creds for the guest user"
 * print Auth
 
 # Create an Action .Create an action for the above defined guest name

 #* def createAction = call read('classpath:com/karate/openwhisk/wskactions/create-action.feature') {script:'#(scriptcode)' ,nameSpace:'#(nameSpace)' ,Auth:'#(Auth)', actionName:'Dammyyy'}
 * def createAction = call read('classpath:com/karate/openwhisk/wskactions/create-action.feature') {script:'#(scriptcode)' ,nameSpace:'#(nameSpace)' ,Auth:'#(Auth)'}
 * def actionName = createAction.actName
 * print actionName
 * print "Successfully Created an action"
 
 # Get Action Details

 * def actionDetails = call read('classpath:com/karate/openwhisk/wskactions/get-action.feature') {nameSpace:'#(nameSpace)' ,Auth:'#(Auth)',actionName:'#(actionName)'}
 * print "Successfully got the action details"
 
 #Invoke Action

 * def invokeAction = call read('classpath:com/karate/openwhisk/wskactions/invoke-action.feature') {params:'#(params)',requestBody:'',nameSpace:'#(nameSpace)' ,Auth:'#(Auth)',actionName:'#(actionName)'}
 * def actID = invokeAction.activationId
 * print "Successfully invoked the action"
 * def webhooks = callonce read('classpath:com/karate/openwhisk/utils/sleep.feature') {sheepCount:'20'}
 
 #Get Activation details

 * def getActivationDetails = call read('classpath:com/karate/openwhisk/wskactions/get-activation-details.feature') { activationId: '#(actID)' ,Auth:'#(Auth)'}
 * print "Successfully pulled the activation details"
 
 # Update Action

 * def updateAction = call 

OpenWhisk Karate Based BDD Functional and Performance Tests

2018-08-14 Thread Rahul Tripathi
Hi All,

To complement the existing OpenWhisk tests, I have raised this pull request, which 
has Karate-based BDD test cases along with Gatling support for performance 
testing. The idea is to enable testers to contribute user workflows in 
the form of BDD scenarios.
I request you to kindly review and provide your valuable feedback.


https://github.com/apache/incubator-openwhisk/pull/3956

Thanks,
Rahul