Re: How to best run non-local tests in ASF (was: Performance tests for OpenWhisk)

2018-04-06 Thread Matt Rutkowski
Hi Michael,

The model that I am familiar with is from Apache SystemML, where Apache 
Infra could not provide hardware compute resources (actual GPUs) that 
could run the tests in a reasonable amount of time. 
Therefore, IBM worked out a deal with Apache in which IBM donated suitable 
compute resources and gave Apache Infra access to manage them.

I reached out this morning to a colleague in my group at IBM, Luciano 
Resende, who worked on negotiating and setting up these testing pipelines 
between IBM and Apache (SystemML), and who also described similar 
arrangements for the Apache Spark and Bahir projects.

Here is what he described to me:
Here are the possibilities for having a heavy CI infrastructure 
for an Apache project (assuming the Apache CI infrastructure does not 
provide you enough resources):

Self-hosted CI infrastructure: This is the scenario that we use in Apache 
Spark, Apache SystemML and some portions of Apache Bahir. A company 
provisions machines (in the case of Spark it's AMPLab and for the others 
it's IBM) and then we configure and manage these machines with the project 
communities, providing public access to build outputs and management 
access per request for committers/PMC members.

Apache-managed donated machines: In this scenario, which was a little more 
popular a few years ago, you can procure a set of machines and donate 
these to Apache through a targeted donation, which in summary means they 
should be used by your project and not shared with the other projects. In 
this case, the management of the infrastructure is done by Apache, and 
these nodes would be added to their Jenkins infrastructure and your jobs 
assigned to run on these machines.


He indicated that if we want to hear more, he can join this thread to 
answer any further or more detailed questions.

Kind regards,
Matt 




From:   Michael Marth 
To: "dev@openwhisk.apache.org" 
Date:   04/06/2018 08:39 AM
Subject: How to best run non-local tests in ASF (was: Performance tests for OpenWhisk)



Hi mentors (and others),



Had an offline discussion this week in which the question came up how an 
ASF project should best go about running 
performance/throughput/scalability tests – i.e. tests that cannot be run 
locally and require a repeatable environment.



Some options:

* interested companies run the tests on their own infra and publish 
results. Pretty lame, especially because typically only that company’s 
engineers can access the env and investigate further.

* interested companies donate cash to sponsor compute resources, 
committers can run and investigate the tests. Ideal from tech perspective, 
but I have no idea how that cash would make its way from the ASF to a 
particular project.

* maybe a middle-ground: interested party that happens to have a public 
cloud offering gives credentials to committers



I am mainly interested to learn if there are other ASF projects (e.g. in 
the Big Data/Hadoop ecosystem) that do something similar. Or if there is 
an ASF-recommended way to do this. Or else, where I could ask this 
question?



Thanks!

Michael



From: Michael Marth
Date: Wednesday 3 May 2017 20:57
To: "dev@openwhisk.apache.org"
Subject: Re: Performance tests for OpenWhisk



Markus,



Quick update: sent the below to users@infra. So far no reaction. The 
archive is here [1] but Bertrand tells me only ASF members have access, 
for whatever reason.



Michael



[1] 
https://lists.apache.org/thread.html/70999f9233dac9b416ef9dedc97c0ef196a938c05d6a407b94ba3479@%3Cusers.infra.apache.org%3E






On Fri, Apr 28, 2017 at 2:23 PM, Michael Marth wrote:

Dear Infra team,

I am enquiring on behalf of the OpenWhisk project (currently in Incubator)
[1].

We would like to periodically run performance tests on a distributed
environment (OpenWhisk typically runs on more than 1 machine). So we are
basically looking for an ability to spin up/tear down a number of (virtual)
machines and exclusively use them for a certain amount of time (so that the
VMs are not shared and the performance test results are comparable over
time).
The order of magnitude would be ~5-10 VMs for 1 hour 3 times a week.

I would like to find out if there is an ASF-supported mechanism to do that.
For example, can Infra provide such infrastructure? Or is there a cloud
provider (like Azure) that might sponsor such efforts with VMs? Or maybe
there is an established way for commercial companies that are interested in
an ASF project to sponsor (fund) such tests?



If none of the above exists, then it would also be helpful for us to get to
know how other projects run such sort of tests.


Re: Please comment on README update to release repo.

2018-04-06 Thread Matt Rutkowski
Hi James,

Thanks! You might see that this is a table that I have had on the CWIKI 
(all repos. including providers) for some time:
https://cwiki.apache.org/confluence/display/OPENWHISK/GitHub+Repository+Status

but I will migrate them here, to a sub-document that will be linked from 
the release repo README...

I am working to reuse the same HTML (port back the generic/clean HTML) as 
a "widget" into Confluence now.

Maybe today, if I have time, I can move the other table over (export from 
Confluence to HTML -> scrub HTML -> fix up for GitHub markdown).

Kind regards,
Matt 



From:   "James W Dubee" 
To: dev@openwhisk.apache.org
Date:   04/05/2018 04:51 PM
Subject: Re: Please comment on README update to release repo.



Hey Matt,
 
Having the Travis statuses for all the repositories in one place is really 
nice for monitoring purposes! I think we could add the provider repos 
(alarms, Cloudant, Kafka) eventually.

Regards,
James Dubee



From: "Matt Rutkowski" 
To: dev@openwhisk.apache.org
Date: 04/05/2018 05:01 PM
Subject: Please comment on README update to release repo.



Whiskers,

Started work towards getting the incubator-openwhisk-release repo. docs in 
"GA" shape with focus on the role of a "Release Manager"...

Added the "repo status" table to the README so Rel. Mgrs. can get a 
snapshot of project health (migrated and updated from our CWIKI).
https://github.com/apache/incubator-openwhisk-release


Also, plan to update scancode to the new "strict" configuration and submit 
PRs to affected repos. as well so we can mark the final column with all 
"yes" values.

Much more work to do in the docs that will promote the actual release 
process steps to the main README, clearly indicate which steps are manual 
and provide links to supporting docs from each step. Will appreciate more 
comments as I submit more PRs over the next few days.

-matt









Re: Please comment on README update to release repo.

2018-04-06 Thread Carlos Santana
James, unfortunately the provider repos are not in a good, stable state since
they don’t have Travis testing their functionality.

But I would be happy if someone took on the task of configuring Travis to
deploy and test the providers.

— Carlos
On Thu, Apr 5, 2018 at 5:44 PM James W Dubee wrote:

> Hey Matt,
>
> Having the Travis statuses for all the repositories in one place is really
> nice for monitoring purposes! I think we could add the provider repos
> (alarms, Cloudant, Kafka) eventually.
>
> Regards,
> James Dubee
>
>
>
>
> From: "Matt Rutkowski" 
> To: dev@openwhisk.apache.org
> Date: 04/05/2018 05:01 PM
> Subject: Please comment on README update to release repo.
> --
>
>
>
>
> Whiskers,
>
> Started work towards getting the incubator-openwhisk-release repo. docs in
> "GA" shape with focus on the role of a "Release Manager"...
>
> Added the "repo status" table to the README so Rel. Mgrs. can get a
> snapshot of project health (migrated and updated from our CWIKI).
>
>
> https://github.com/apache/incubator-openwhisk-release
>
>
>
> Also, plan to update scancode to the new "strict" configuration and submit
> PRs to affected repos. as well so we can mark the final column with all
> "yes" values.
>
> Much more work to do in the docs that will promote the actual release
> process steps to main README, clearly indicate which steps are manual and
> provide links to supporting docs from each step. Will appreciate more
> comments as I submit more PRs over the next few days.
>
> -matt
>
>
>
>
>


Re: How to best run non-local tests in ASF (was: Performance tests for OpenWhisk)

2018-04-06 Thread Carlos Santana
Sebastian Bazley from ASF Infra would be a good resource to contact to drive
this.

In the past, when I brought this up with Infra, I think I understood from him
that companies can donate funds and explicitly state that they are for the
OpenWhisk project; the Infra team will then allocate VMs to OpenWhisk to run
load/performance tests, the same way that the Hadoop project has dedicated
VMs in Jenkins today.

— Carlos

On Fri, Apr 6, 2018 at 9:38 AM Michael Marth wrote:

> Hi mentors (and others),
>
> Had an offline discussion this week in which the question came up how an
> ASF project should best go about running performance/throughput/scalability
> tests – i.e. tests that cannot be run locally and require a repeatable
> environment.
>
> Some options:
> * interested companies run the tests on their own infra and publish
> results. Pretty lame, especially because typically only that company’s
> engineers can access the env and investigate further.
> * interested companies donate cash to sponsor compute resources,
> committers can run and investigate the tests. Ideal from tech perspective,
> but I have no idea how that cash would make its way from the ASF to a
> particular project.
> * maybe a middle-ground: interested party that happens to have a public
> cloud offering gives credentials to committers
>
> I am mainly interested to learn if there are other ASF projects (e.g. in
> the Big Data/Hadoop ecosystem) that do something similar. Or if there is an
> ASF-recommended way to do this. Or else, where I could ask this question?
>
> Thanks!
> Michael
>
> From: Michael Marth 
> Date: Wednesday 3 May 2017 20:57
> To: "dev@openwhisk.apache.org" 
> Subject: Re: Performance tests for OpenWhisk
>
> Markus,
>
> Quick update: sent the below to users@infra. So far no reaction. The
> archive is here [1] but Bertrand tells me only ASF members have access,
> for whatever reason.
>
> Michael
>
> [1]
> https://lists.apache.org/thread.html/70999f9233dac9b416ef9dedc97c0ef196a938c05d6a407b94ba3479@%3Cusers.infra.apache.org%3E
>
>
> On Fri, Apr 28, 2017 at 2:23 PM, Michael Marth wrote:
> Dear Infra team,
>
> I am enquiring on behalf of the OpenWhisk project (currently in Incubator)
> [1].
>
> We would like to periodically run performance tests on a distributed
> environment (OpenWhisk typically runs on more than 1 machine). So we are
> basically looking for an ability to spin up/tear down a number of (virtual)
> machines and exclusively use them for a certain amount of time (so that the
> VMs are not shared and the performance test results are comparable over
> time).
> The order of magnitude would be ~5-10 VMs for 1 hour 3 times a week.
>
> I would like to find out if there is an ASF-supported mechanism to do that.
> For example, can Infra provide such infrastructure? Or is there a cloud
> provider (like Azure) that might sponsor such efforts with VMs? Or maybe
> there is an established way for commercial companies that are interested in
> an ASF project to sponsor (fund) such tests?
>
> If none of the above exists, then it would also be helpful for us to get to
> know how other projects run such sort of tests.
>
> Thanks a lot!
> Michael
>
>
> [1]
>
> https://lists.apache.org/thread.html/b66ab5b438f2db5cdc8c5f5eabece201b4ad090058fa3a9a3bd09d12@%3Cdev.openwhisk.apache.org%3E
>
>
>
>
> From: Markus Thömmes
> Reply-To: "dev@openwhisk.apache.org" <dev@openwhisk.apache.org>
> Date: Wednesday 26 April 2017 12:59
> To: "dev@openwhisk.apache.org" <dev@openwhisk.apache.org>
> Subject: Re: Performance tests for OpenWhisk
>
> Hi Michael,
>
> yeah that sounds pretty much spot on. I'd like to have at least 2 VMs with
> 4+ cores and 8GB memory. One VM would host the management stack while one
> would be dedicated to an Invoker only. That way we could assert
> single-invoker performance the easiest.
>
> Thanks for helping!
>
> Cheers,
> Markus
>
> On 26 April 2017 at 11:36, Michael Marth wrote:
> Markus,
>
> Does what I describe reflect what you are looking for?
> If yes, I am happy to ask on infra.
>
> Let me know
> Michael
>
>
>
> On 26/04/17 07:52, "Bertrand Delacretaz" wrote:
>
>
> Hi Michael,
>
> On Tue, Apr 25, 2017 at 6:52 PM, Michael Marth wrote:
> ...Maybe our mentors can chime in. Has this been discussed in the ASF
> board or so?...
>
> Best would be to ask the ASF infrastructure team via
> us...@infra.apache.org - briefly describe what you need to see what's
> possible.
>
> -Bertrand
>