Re: Submitted PR for performance boost of the Go Runtime

2018-11-30 Thread Michele Sciabarra
The improvements are based on the simple "discovery" that multithreading and
multiprocessing do not play well together, despite the fact that plenty of
examples in the Go documentation use goroutines to interact with subprocesses.

If you fork a process and just "suspend" on the I/O of a pipe, with nothing
else running in your process, the kernel is very fast to switch to the other
process. If you have other "threads" (as implicitly happens when you start
goroutines), then the context switch takes ages (in the relative time of a
process).

On my Mac a single binary with fasthttp can do 1500 req/sec, while two
processes communicating over pipes can do 1200 req/sec.
I removed all the goroutines and shared the file descriptors with the child
process, which also avoids the out-of-sync logs. A simple, good old Unix pipe.

For some reason, with the action loop nodejs looks a (bit) faster, but not by
enough to be worth the switch. But Python is terrible! It needs to be fixed.

-- 
  Michele Sciabarra
  mich...@sciabarra.com

- Original message -
From: Carlos Santana 
To: dev@openwhisk.apache.org
Subject: Re: Submitted PR for performance boost of the Go Runtime
Date: Fri, 30 Nov 2018 07:15:48 -0500

James Dubee and Justin reviewed the PR, and it got merged yesterday. Yay!

I gave it a try last night and was able to test the improvements.

Really like the examples/benchmark +1

-cs
On Fri, Nov 30, 2018 at 4:40 AM Michele Sciabarra 
wrote:

> Actually I was looking into websockets too. I found this project,
> http://websocketd.com/, which works basically like the actionloop... I want
> to experiment to see if a runtime can also be extended to run websockets...
>
> --
>   Michele Sciabarra
>   mich...@sciabarra.com
>
> - Original message -
> From: Rodric Rabbah 
> To: dev@openwhisk.apache.org
> Subject: Re: Submitted PR for performance boost of the Go Runtime
> Date: Thu, 29 Nov 2018 13:30:27 -0500
>
> Michele, some of your work on this was prescient: the just-announced new
> features of AWS Lambda show it's heading in this direction as well (polling
> a socket, suspending and resuming on new messages). I need to dig deeper,
> but this is great validation.
>
> -r
>
> > On Nov 29, 2018, at 5:39 AM, Michele Sciabarra wrote:
> >
> > Another (I would say easy) win for actionloop is Swift.
> > Numbers just tested:
> >
> > Current one:
> >
> > *** Testing openwhisk/action-swift-v4.1 ***
> > Running 10s test @ http://localhost:8080/run
> >  1 threads and 1 connections
> >  Thread Stats   Avg      Stdev     Max   +/- Stdev
> >    Latency    11.93ms    4.42ms  86.72ms   98.50%
> >    Req/Sec    85.72      9.85   101.00     87.00%
> >  859 requests in 10.05s, 116.60KB read
> > Requests/sec:     85.51
> > Transfer/sec:     11.61KB
> >
> > The new one, for which you have the PR to merge:
> >
> > *** Testing msciab/actionloop-swift-v4.2.1 ***
> > Running 10s test @ http://localhost:8080/run
> >  1 threads and 1 connections
> >  Thread Stats   Avg      Stdev     Max   +/- Stdev
> >    Latency    14.15ms   66.82ms 529.16ms   95.77%
> >    Req/Sec     1.04k   151.90     1.21k    88.54%
> >  9935 requests in 10.01s, 1.28MB read
> > Requests/sec:    992.40
> > Transfer/sec:    130.83KB
> >
> >
> > --
> >  Michele Sciabarra
> >  mich...@sciabarra.com
> >
> > - Original message -
> > From: Carlos Santana 
> > To: dev@openwhisk.apache.org
> > Subject: Re: Submitted PR for performance boost of the Go Runtime
> > Date: Wed, 28 Nov 2018 09:20:21 -0500
> >
> > Thanks Michele for looking into this
> >
> > The numbers look great !
> >
> > I will review the PR soon
> >
> > -- Carlos
> >
> > On Wed, Nov 28, 2018 at 5:25 AM Michele Sciabarra  >
> > wrote:
> >
> >> Hello all,
> >>
> >> after the first embarrassing numbers of the Golang runtime, which made me
> >> hurry to fix its performance :)), here are the updated numbers for the
> >> action-loop based runtimes compared with the current ones. The numbers
> >> can vary slightly from one run to another (for example, I discovered it
> >> depends on how charged your Mac is) but are generally proportional when
> >> running in the same environment.
> >>
> >> I just ran the benchmarks, and I have:
> >>
> >> Performance report
> >> *** actionloop-golang-v1.11 (improved)
> >> Requests/sec:   1086.67
> >> Transfer/sec:    154.93KB
> >> *** nodejs6action
> >> Requests/sec:   1006.41
> >> Transfer/sec:    225.07KB
> >> *** 

Re: Submitted PR for performance boost of the Go Runtime

2018-11-30 Thread Carlos Santana
James Dubee and Justin reviewed the PR, and it got merged yesterday. Yay!

I gave it a try last night and was able to test the improvements.

Really like the examples/benchmark +1

-cs
On Fri, Nov 30, 2018 at 4:40 AM Michele Sciabarra 
wrote:

> Actually I was looking into websockets too. I found this project,
> http://websocketd.com/, which works basically like the actionloop... I want
> to experiment to see if a runtime can also be extended to run websockets...
>
> --
>   Michele Sciabarra
>   mich...@sciabarra.com
>
> - Original message -
> From: Rodric Rabbah 
> To: dev@openwhisk.apache.org
> Subject: Re: Submitted PR for performance boost of the Go Runtime
> Date: Thu, 29 Nov 2018 13:30:27 -0500
>
> Michele, some of your work on this was prescient: the just-announced new
> features of AWS Lambda show it's heading in this direction as well (polling
> a socket, suspending and resuming on new messages). I need to dig deeper,
> but this is great validation.
>
> -r
>
> > On Nov 29, 2018, at 5:39 AM, Michele Sciabarra wrote:
> >
> > Another (I would say easy) win for actionloop is Swift.
> > Numbers just tested:
> >
> > Current one:
> >
> > *** Testing openwhisk/action-swift-v4.1 ***
> > Running 10s test @ http://localhost:8080/run
> >  1 threads and 1 connections
> >  Thread Stats   Avg      Stdev     Max   +/- Stdev
> >    Latency    11.93ms    4.42ms  86.72ms   98.50%
> >    Req/Sec    85.72      9.85   101.00     87.00%
> >  859 requests in 10.05s, 116.60KB read
> > Requests/sec:     85.51
> > Transfer/sec:     11.61KB
> >
> > The new one, for which you have the PR to merge:
> >
> > *** Testing msciab/actionloop-swift-v4.2.1 ***
> > Running 10s test @ http://localhost:8080/run
> >  1 threads and 1 connections
> >  Thread Stats   Avg      Stdev     Max   +/- Stdev
> >    Latency    14.15ms   66.82ms 529.16ms   95.77%
> >    Req/Sec     1.04k   151.90     1.21k    88.54%
> >  9935 requests in 10.01s, 1.28MB read
> > Requests/sec:    992.40
> > Transfer/sec:    130.83KB
> >
> >
> > --
> >  Michele Sciabarra
> >  mich...@sciabarra.com
> >
> > - Original message -
> > From: Carlos Santana 
> > To: dev@openwhisk.apache.org
> > Subject: Re: Submitted PR for performance boost of the Go Runtime
> > Date: Wed, 28 Nov 2018 09:20:21 -0500
> >
> > Thanks Michele for looking into this
> >
> > The numbers look great !
> >
> > I will review the PR soon
> >
> > -- Carlos
> >
> > On Wed, Nov 28, 2018 at 5:25 AM Michele Sciabarra  >
> > wrote:
> >
> >> Hello all,
> >>
> >> after the first embarrassing numbers of the Golang runtime, which made me
> >> hurry to fix its performance :)), here are the updated numbers for the
> >> action-loop based runtimes compared with the current ones. The numbers
> >> can vary slightly from one run to another (for example, I discovered it
> >> depends on how charged your Mac is) but are generally proportional when
> >> running in the same environment.
> >>
> >> I just ran the benchmarks, and I have:
> >>
> >> Performance report
> >> *** actionloop-golang-v1.11 (improved)
> >> Requests/sec:   1086.67
> >> Transfer/sec:    154.93KB
> >> *** nodejs6action
> >> Requests/sec:   1006.41
> >> Transfer/sec:    225.07KB
> >> *** actionloop-nodejs6action
> >> Requests/sec:   1230.92
> >> Transfer/sec:    157.09KB
> >> *** python3action
> >> Requests/sec:     20.05
> >> Transfer/sec:      2.62KB
> >> *** actionloop-python
> >> Requests/sec:   1066.84
> >> Transfer/sec:    139.61KB
> >>
> >> Note that the actionloop runtimes are really prototypes; I have not run any
> >> test suite against them.
> >>
> >> --
> >>  Michele Sciabarra
> >>  mich...@sciabarra.com
> >>
> >> - Original message -
> >> From: Michele Sciabarra 
> >> To: dev@openwhisk.apache.org
> >> Subject: Submitted PR for performance boost of the Go Runtime
> >> Date: Tue, 27 Nov 2018 23:38:11 +0100
> >>
> >> I just submitted a PR to improve the performance of the Go/ActionLoop
> >> runtime while being as non-invasive as possible.
> >>
> >> I am keeping the current design and have not changed the approach: there
> >> is still a child process fed by the HTTP server on standard input, with
> >> output on file descriptor 3. I am still using the standard HTTP server
> >> (not the fast HTTP server).
> >> On the other hand, I removed all the goroutines. They do not play well
> >> with external processes. Instead, I am using the classical technique of
> >> suspending on reading the standard input, which is fast to switch to the
> >> child process.
> >>
> >> In my tests I reached around 1150 requests/second (a bit worse than
> >> nodejs, but I think decent).
> >> It is possible to improve performance further, up to 1500, but I would
> >> need to abandon the actionloop model and run an action with an
> >> independent server, so for now I am not doing anything like that.
> >>
> >> --
> >>  Michele Sciabarra
> >>  mich...@sciabarra.com
> >>
> >
> >
> > --
> > Carlos Santana
> > 
>


Re: Submitted PR for performance boost of the Go Runtime

2018-11-30 Thread Michele Sciabarra
Actually I was looking into websockets too. I found this project,
http://websocketd.com/, which works basically like the actionloop... I want to
experiment to see if a runtime can also be extended to run websockets...

-- 
  Michele Sciabarra
  mich...@sciabarra.com

- Original message -
From: Rodric Rabbah 
To: dev@openwhisk.apache.org
Subject: Re: Submitted PR for performance boost of the Go Runtime
Date: Thu, 29 Nov 2018 13:30:27 -0500

Michele, some of your work on this was prescient: the just-announced new
features of AWS Lambda show it's heading in this direction as well (polling a
socket, suspending and resuming on new messages). I need to dig deeper, but
this is great validation.

-r

> On Nov 29, 2018, at 5:39 AM, Michele Sciabarra  wrote:
> 
> Another (I would say easy) win for actionloop is Swift. 
> Numbers just tested:
> 
> Current one:
> 
> *** Testing openwhisk/action-swift-v4.1 ***
> Running 10s test @ http://localhost:8080/run
>  1 threads and 1 connections
>  Thread Stats   Avg      Stdev     Max   +/- Stdev
>    Latency    11.93ms    4.42ms  86.72ms   98.50%
>    Req/Sec    85.72      9.85   101.00     87.00%
>  859 requests in 10.05s, 116.60KB read
> Requests/sec:     85.51
> Transfer/sec:     11.61KB
> 
> The new one, for which you have the PR to merge:
> 
> *** Testing msciab/actionloop-swift-v4.2.1 ***
> Running 10s test @ http://localhost:8080/run
>  1 threads and 1 connections
>  Thread Stats   Avg      Stdev     Max   +/- Stdev
>    Latency    14.15ms   66.82ms 529.16ms   95.77%
>    Req/Sec     1.04k   151.90     1.21k    88.54%
>  9935 requests in 10.01s, 1.28MB read
> Requests/sec:    992.40
> Transfer/sec:    130.83KB
> 
> 
> -- 
>  Michele Sciabarra
>  mich...@sciabarra.com
> 
> ----- Original message -
> From: Carlos Santana 
> To: dev@openwhisk.apache.org
> Subject: Re: Submitted PR for performance boost of the Go Runtime
> Date: Wed, 28 Nov 2018 09:20:21 -0500
> 
> Thanks Michele for looking into this
> 
> The numbers look great !
> 
> I will review the PR soon
> 
> -- Carlos
> 
> On Wed, Nov 28, 2018 at 5:25 AM Michele Sciabarra 
> wrote:
> 
>> Hello all,
>> 
>> after the first embarrassing numbers of the Golang runtime, which made me
>> hurry to fix its performance :)), here are the updated numbers for the
>> action-loop based runtimes compared with the current ones. The numbers
>> can vary slightly from one run to another (for example, I discovered it
>> depends on how charged your Mac is) but are generally proportional when
>> running in the same environment.
>> 
>> I just ran the benchmarks, and I have:
>> 
>> Performance report
>> *** actionloop-golang-v1.11 (improved)
>> Requests/sec:   1086.67
>> Transfer/sec:    154.93KB
>> *** nodejs6action
>> Requests/sec:   1006.41
>> Transfer/sec:    225.07KB
>> *** actionloop-nodejs6action
>> Requests/sec:   1230.92
>> Transfer/sec:    157.09KB
>> *** python3action
>> Requests/sec:     20.05
>> Transfer/sec:      2.62KB
>> *** actionloop-python
>> Requests/sec:   1066.84
>> Transfer/sec:    139.61KB
>> 
>> Note that the actionloop runtimes are really prototypes; I have not run any
>> test suite against them.
>> 
>> --
>>  Michele Sciabarra
>>  mich...@sciabarra.com
>> 
>> - Original message -
>> From: Michele Sciabarra 
>> To: dev@openwhisk.apache.org
>> Subject: Submitted PR for performance boost of the Go Runtime
>> Date: Tue, 27 Nov 2018 23:38:11 +0100
>> 
>> I just submitted a PR to improve the performance of the Go/ActionLoop
>> runtime while being as non-invasive as possible.
>> 
>> I am keeping the current design and have not changed the approach: there is
>> still a child process fed by the HTTP server on standard input, with output
>> on file descriptor 3. I am still using the standard HTTP server (not the
>> fast HTTP server).
>> On the other hand, I removed all the goroutines. They do not play well with
>> external processes. Instead, I am using the classical technique of
>> suspending on reading the standard input, which is fast to switch to the
>> child process.
>> 
>> In my tests I reached around 1150 requests/second (a bit worse than
>> nodejs, but I think decent).
>> It is possible to improve performance further, up to 1500, but I would need
>> to abandon the actionloop model and run an action with an independent
>> server, so for now I am not doing anything like that.
>> 
>> --
>>  Michele Sciabarra
>>  mich...@sciabarra.com
>> 
> 
> 
> -- 
> Carlos Santana
> 


Re: Submitted PR for performance boost of the Go Runtime

2018-11-29 Thread Michele Sciabarra
Another (I would say easy) win for actionloop is Swift. 
Numbers just tested:

Current one:

*** Testing openwhisk/action-swift-v4.1 ***
Running 10s test @ http://localhost:8080/run
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    11.93ms    4.42ms  86.72ms   98.50%
    Req/Sec    85.72      9.85   101.00     87.00%
  859 requests in 10.05s, 116.60KB read
Requests/sec:     85.51
Transfer/sec:     11.61KB

The new one, for which you have the PR to merge:

*** Testing msciab/actionloop-swift-v4.2.1 ***
Running 10s test @ http://localhost:8080/run
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    14.15ms   66.82ms 529.16ms   95.77%
    Req/Sec     1.04k   151.90     1.21k    88.54%
  9935 requests in 10.01s, 1.28MB read
Requests/sec:    992.40
Transfer/sec:    130.83KB


-- 
  Michele Sciabarra
  mich...@sciabarra.com

- Original message -
From: Carlos Santana 
To: dev@openwhisk.apache.org
Subject: Re: Submitted PR for performance boost of the Go Runtime
Date: Wed, 28 Nov 2018 09:20:21 -0500

Thanks Michele for looking into this

The numbers look great !

I will review the PR soon

-- Carlos

On Wed, Nov 28, 2018 at 5:25 AM Michele Sciabarra 
wrote:

> Hello all,
>
> after the first embarrassing numbers of the Golang runtime, which made me
> hurry to fix its performance :)), here are the updated numbers for the
> action-loop based runtimes compared with the current ones. The numbers
> can vary slightly from one run to another (for example, I discovered it
> depends on how charged your Mac is) but are generally proportional when
> running in the same environment.
>
> I just ran the benchmarks, and I have:
>
> Performance report
> *** actionloop-golang-v1.11 (improved)
> Requests/sec:   1086.67
> Transfer/sec:    154.93KB
> *** nodejs6action
> Requests/sec:   1006.41
> Transfer/sec:    225.07KB
> *** actionloop-nodejs6action
> Requests/sec:   1230.92
> Transfer/sec:    157.09KB
> *** python3action
> Requests/sec:     20.05
> Transfer/sec:      2.62KB
> *** actionloop-python
> Requests/sec:   1066.84
> Transfer/sec:    139.61KB
>
> Note that the actionloop runtimes are really prototypes; I have not run any
> test suite against them.
>
> --
>   Michele Sciabarra
>   mich...@sciabarra.com
>
> - Original message -
> From: Michele Sciabarra 
> To: dev@openwhisk.apache.org
> Subject: Submitted PR for performance boost of the Go Runtime
> Date: Tue, 27 Nov 2018 23:38:11 +0100
>
> I just submitted a PR to improve the performance of the Go/ActionLoop
> runtime while being as non-invasive as possible.
>
> I am keeping the current design and have not changed the approach: there is
> still a child process fed by the HTTP server on standard input, with output
> on file descriptor 3. I am still using the standard HTTP server (not the
> fast HTTP server).
> On the other hand, I removed all the goroutines. They do not play well with
> external processes. Instead, I am using the classical technique of
> suspending on reading the standard input, which is fast to switch to the
> child process.
>
> In my tests I reached around 1150 requests/second (a bit worse than
> nodejs, but I think decent).
> It is possible to improve performance further, up to 1500, but I would need
> to abandon the actionloop model and run an action with an independent
> server, so for now I am not doing anything like that.
>
> --
>   Michele Sciabarra
>   mich...@sciabarra.com
>


-- 
Carlos Santana