Re: Announcement & Proposal: HDFS tests on large cluster.

2018-06-11 Thread Kamil Szewczyk
Hi all,

as a positive outcome of extending the Kubernetes cluster (see the bottom of
https://builds.apache.org/view/A-D/view/Beam/job/beam_PerformanceTests_Analysis/37/consoleText
and the dedicated Slack channel
https://apachebeam.slack.com/messages/CAB3W69SS/), we can observe better
stability of the tests after the cluster resize. Most of the execution times
decreased slightly and, finally, all tests were executed and analysed.

Thanks,
Kamil Szewczyk



2018-06-08 13:13 GMT+02:00 Łukasz Gajowy :

> @Pablo this is exactly as Chamikara says. In fact, there is a dedicated
> Gcloud project for the whole testing infrastructure (called
> "apache-beam-testing"). It provides the Kubernetes cluster for the data
> stores as well as BigQuery storage for the test results presented in the
> testing dashboard.
>
> @Alan thanks a lot!
>
> Best regards,
> Łukasz
>
>
>
> On Thu, 7 Jun 2018 at 22:37, Chamikara Jayalath wrote:
>
>> We still use Jenkins machines to execute the tests, but the data stores are
>> hosted in Kubernetes.
>>
>> On Thu, Jun 7, 2018 at 1:35 PM Pablo Estrada  wrote:
>>
>>> Just out of curiosity: This does not use the Jenkins machines then?
>>> -P.
>>>
>>> On Thu, Jun 7, 2018 at 1:33 PM Alan Myrvold  wrote:
>>>
>>>> Done. Changed the size of the io-datastores kubernetes cluster in
>>>> apache-beam-testing to 3 nodes.
>>>>
>>>> On Thu, Jun 7, 2018 at 1:45 AM Kamil Szewczyk 
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> the node pool size of the io-datastores Kubernetes cluster in the
>>>>> apache-beam-testing project needs to be changed from 1 to 3 (or another
>>>>> value). @Alan Myrvold has already been helpful with the Kubernetes cluster
>>>>> settings, but I am not aware of who makes the decisions about this, as
>>>>> it will increase the monthly billing.
>>>>>
>>>>> Kamil Szewczyk
>>>>>
>>>>> 2018-06-07 6:27 GMT+02:00 Kenneth Knowles :
>>>>>
>>>>>> This is rad. Another +1 from me for a bigger cluster. What do you
>>>>>> need to make that happen?
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Wed, Jun 6, 2018 at 10:16 AM Pablo Estrada 
>>>>>> wrote:
>>>>>>
>>>>>>> This is really cool!
>>>>>>>
>>>>>>> +1 for having a cluster with more than one machine run the test.
>>>>>>>
>>>>>>> -P.
>>>>>>>
>>>>>>> On Wed, Jun 6, 2018 at 9:57 AM Chamikara Jayalath <
>>>>>>> chamik...@google.com> wrote:
>>>>>>>
>>>>>>>> On Wed, Jun 6, 2018 at 5:19 AM Łukasz Gajowy <
>>>>>>>> lukasz.gaj...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I'd like to announce that thanks to Kamil Szewczyk, since this PR
>>>>>>>>> <https://github.com/apache/beam/pull/5441> we have 4 file-based
>>>>>>>>> HDFS tests run on a "Large HDFS Cluster"! More specifically I mean:
>>>>>>>>>
>>>>>>>>> - beam_PerformanceTests_TextIOIT_HDFS
>>>>>>>>> - beam_PerformanceTests_Compressed_TextIOIT_HDFS
>>>>>>>>> - beam_PerformanceTests_AvroIOIT_HDFS
>>>>>>>>> - beam_PerformanceTests_XmlIOIT_HDFS
>>>>>>>>>
>>>>>>>>> The "Large HDFS Cluster" (in contrast to the small one, that is
>>>>>>>>> also available) consists of a master node and three data nodes all in
>>>>>>>>> separate pods. Thanks to that we can mimic more real-life scenarios 
>>>>>>>>> on HDFS
>>>>>>>>> (3 distributed nodes) and possibly run bigger tests so there's 
>>>>>>>>> progress! :)
>>>>>>>>>
>>>>>>>>>
>>>>>>>> This is great. Also, looks like results are available in test
>>>>>>>> dashboard:
>>>>>>>> https://apache-beam-testing.appspot.com/explore?dashboard=5755685136498688
>>>>>>>> (BTW we should add information about dashboard to the testing doc:
>>>>>>>> https://beam.apache.org/contribute/testing/)
>>>>>>>>
>>>>>>>> I'm currently working on proper documentation for this so that
>>>>>>>>> everyone can use it in IOITs (stay tuned).
>>>>>>>>>
>>>>>>>>> Regarding the above, I'd like to propose scaling up the
>>>>>>>>> Kubernetes cluster. AFAIK, currently, it consists of 1 node. If we 
>>>>>>>>> scale it
>>>>>>>>> up to eg. 3 nodes, the HDFS' kubernetes pods will distribute 
>>>>>>>>> themselves on
>>>>>>>>> different machines rather than one, making it an even more "real-life"
>>>>>>>>> scenario (possibly more efficient?). Moreover, other Performance Tests
>>>>>>>>> (such as JDBC or mongo) could use more space for their infrastructure 
>>>>>>>>> as
>>>>>>>>> well. Scaling up the cluster could also turn out useful for some 
>>>>>>>>> future
>>>>>>>>> efforts, like BEAM-4508[1] (adapting and running some old IOITs
>>>>>>>>> on Jenkins).
>>>>>>>>>
>>>>>>>>> WDYT? Are there any objections?
>>>>>>>>>
>>>>>>>> +1 for increasing the size of Kubernetes cluster.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> [1] https://issues.apache.org/jira/browse/BEAM-4508
>>>>>>>>>
>>>>>>>>> --
>>>>>>> Got feedback? go/pabloem-feedback
>>>>>>> <https://goto.google.com/pabloem-feedback>
>>>>>>>
>>>>>>
>>>>> --
>>> Got feedback? go/pabloem-feedback
>>> <https://goto.google.com/pabloem-feedback>
>>>
>>


Re: Announcement & Proposal: HDFS tests on large cluster.

2018-06-07 Thread Kamil Szewczyk
Hi,

the node pool size of the io-datastores Kubernetes cluster in the
apache-beam-testing project needs to be changed from 1 to 3 (or another value).
@Alan Myrvold has already been helpful with the Kubernetes cluster settings,
but I am not aware of who makes the decisions about this, as
it will increase the monthly billing.

Kamil Szewczyk

2018-06-07 6:27 GMT+02:00 Kenneth Knowles :

> This is rad. Another +1 from me for a bigger cluster. What do you need to
> make that happen?
>
> Kenn
>
> On Wed, Jun 6, 2018 at 10:16 AM Pablo Estrada  wrote:
>
>> This is really cool!
>>
>> +1 for having a cluster with more than one machine run the test.
>>
>> -P.
>>
>> On Wed, Jun 6, 2018 at 9:57 AM Chamikara Jayalath 
>> wrote:
>>
>>> On Wed, Jun 6, 2018 at 5:19 AM Łukasz Gajowy 
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I'd like to announce that thanks to Kamil Szewczyk, since this PR
>>>> <https://github.com/apache/beam/pull/5441> we have 4 file-based HDFS
>>>> tests run on a "Large HDFS Cluster"! More specifically I mean:
>>>>
>>>> - beam_PerformanceTests_TextIOIT_HDFS
>>>> - beam_PerformanceTests_Compressed_TextIOIT_HDFS
>>>> - beam_PerformanceTests_AvroIOIT_HDFS
>>>> - beam_PerformanceTests_XmlIOIT_HDFS
>>>>
>>>> The "Large HDFS Cluster" (in contrast to the small one, that is also
>>>> available) consists of a master node and three data nodes all in separate
>>>> pods. Thanks to that we can mimic more real-life scenarios on HDFS (3
>>>> distributed nodes) and possibly run bigger tests so there's progress! :)
>>>>
>>>>
>>> This is great. Also, looks like results are available in test dashboard:
>>> https://apache-beam-testing.appspot.com/explore?dashboard=5755685136498688
>>> (BTW we should add information about dashboard to the testing doc:
>>> https://beam.apache.org/contribute/testing/)
>>>
>>> I'm currently working on proper documentation for this so that everyone
>>>> can use it in IOITs (stay tuned).
>>>>
>>>> Regarding the above, I'd like to propose scaling up the
>>>> Kubernetes cluster. AFAIK, currently, it consists of 1 node. If we scale it
>>>> up to eg. 3 nodes, the HDFS' kubernetes pods will distribute themselves on
>>>> different machines rather than one, making it an even more "real-life"
>>>> scenario (possibly more efficient?). Moreover, other Performance Tests
>>>> (such as JDBC or mongo) could use more space for their infrastructure as
>>>> well. Scaling up the cluster could also turn out useful for some future
>>>> efforts, like BEAM-4508[1] (adapting and running some old IOITs on
>>>> Jenkins).
>>>>
>>>> WDYT? Are there any objections?
>>>>
>>> +1 for increasing the size of Kubernetes cluster.
>>>
>>>>
>>>> [1] https://issues.apache.org/jira/browse/BEAM-4508
>>>>
>>>> --
>> Got feedback? go/pabloem-feedback
>> <https://goto.google.com/pabloem-feedback>
>>
>


Kubernetes cluster of apache-beam-testing project

2018-05-23 Thread Kamil Szewczyk
Dear Beam Devs,

we are using Kubernetes to create and tear down resources on demand for
performance testing. This is done automatically by Jenkins jobs using
PerfKit. On the 20th of May there was a blog post about security issues in
Kubernetes
(https://cloud.google.com/kubernetes-engine/docs/security-bulletins), and we
should upgrade to the latest released version. As far as I know, in
apache-beam-testing we are currently using a Kubernetes cluster with
engine version 1.8; can we update it to version 1.9 (1.9.7-gke.1)?
We are also missing some cool new features introduced in Kubernetes 1.9, such
as StatefulSets, which I would like to use to set up the large HDFS cluster
for performance testing; I have already submitted a PR for that. My findings
are described in these two JIRA issues:
 - https://issues.apache.org/jira/browse/BEAM-4362
 - https://issues.apache.org/jira/browse/BEAM-4390

Who is in charge of the Google Cloud Platform project (i.e. has admin access)
and can help with that?
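As a side note, deciding whether such an upgrade is needed comes down to comparing GKE version strings like "1.8" and "1.9.7-gke.1". A minimal, hypothetical Python sketch (not part of any Beam or GKE tooling):

```python
def parse_gke_version(version):
    """Parse a GKE version string such as '1.9.7-gke.1' into a
    comparable tuple of integers, ignoring the '-gke.N' suffix."""
    core = version.split("-")[0]              # '1.9.7-gke.1' -> '1.9.7'
    return tuple(int(part) for part in core.split("."))

def needs_upgrade(current, target):
    """Return True if the cluster's current version is older than the target."""
    return parse_gke_version(current) < parse_gke_version(target)

# The situation described above: cluster on 1.8, target 1.9.7-gke.1.
print(needs_upgrade("1.8", "1.9.7-gke.1"))    # -> True
```

The actual upgrade itself would be done by a project admin through the GKE console or gcloud; the snippet only illustrates the version comparison.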


Re: Apache Beam - jenkins question

2018-05-08 Thread Kamil Szewczyk
Hi Jason,

Sorry for the late response; I was on vacation. I would like to send messages
to Slack automatically with the Performance Analysis Daily Reports, as
described in https://github.com/apache/beam/pull/5180.
An example report can be found on the old Apache Beam Slack:
https://apachebeam.slack.com/messages/CAB3W69SS/. Those messages were sent
by me, and the missing piece is adding SLACK_WEBHOOK_URL, a token that
allows posting messages to Slack. I will send it to you in a separate message.
So far I have only generated this token for the old Apache Beam Slack, but
in order to migrate to the new Slack, only the credential in the Jenkins UI
will need to be replaced. We can do that later, as I don't know who is
responsible for managing the-asf.slack.com and can help me with that.
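For context, posting such a report through a Slack incoming webhook is just an HTTP POST of a small JSON body, with the secret URL read from the environment rather than stored in the repo. A hedged sketch (the report text and helper names are illustrative, not the actual PR code):

```python
import json
import os

def build_slack_payload(report_text, channel=None):
    """Build the JSON body for a Slack incoming-webhook POST."""
    payload = {"text": report_text}
    if channel:
        payload["channel"] = channel     # optional channel override
    return json.dumps(payload)

# The webhook URL is a secret, so it is injected by Jenkins as an
# environment variable (SLACK_WEBHOOK_URL) instead of living in the repo.
webhook_url = os.environ.get("SLACK_WEBHOOK_URL")
body = build_slack_payload("Performance Analysis Daily Report: all tests analysed.")
# An actual send would look roughly like:
#   req = urllib.request.Request(webhook_url, data=body.encode("utf-8"),
#                                headers={"Content-Type": "application/json"})
#   urllib.request.urlopen(req)
```

Keeping the URL out of source control is the whole point of the Jenkins credential discussed below in the thread.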


2018-04-28 1:31 GMT+02:00 Jason Kuster <jasonkus...@google.com>:

> Thanks for the heads-up regarding the permissions. At this point I need
> more information about the credentials we want to use -- Kamil, can you
> provide more info? What is the purpose of the credentials you want to use
> here?
>
> On Fri, Apr 27, 2018 at 3:50 PM Davor Bonaci <da...@apache.org> wrote:
>
>> Jason, you should now have all the permissions needed. (You should,
>> however, evaluate whether this is a good place for it. Executors
>> themselves, for example, might be an alternative.)
>>
>> On Fri, Apr 27, 2018 at 7:42 PM, Jason Kuster <jasonkus...@google.com>
>> wrote:
>>
>>> See
>>> https://github.com/apache/beam/blob/master/.test-infra/jenkins/common_job_properties.groovy#L119
>>> for an example of this being done in practice to add the coveralls repo
>>> token as an environment variable.
>>>
>>> On Fri, Apr 27, 2018 at 12:41 PM Jason Kuster <jasonkus...@google.com>
>>> wrote:
>>>
>>>> Hi Kamil, Davor,
>>>>
>>>> I think what you want is the Jenkins secrets feature (see
>>>> https://support.cloudbees.com/hc/en-us/articles/203802500-Injecting-Secrets-into-Jenkins-Build-Jobs).
>>>> Davor, I believe you are the only
>>>> one with enough karma on Jenkins to access the credentials UI; once the
>>>> credential is created in Jenkins it should be able to be set as an
>>>> environment variable through the Jenkins job configuration (groovy files in
>>>> $BEAM_ROOT/.test-infra/jenkins). Hope this helps.
>>>>
>>>> Jason
>>>>
>>>> On Thu, Apr 26, 2018 at 8:43 PM Davor Bonaci <da...@apache.org> wrote:
>>>>
>>>>> Hi Kamil --
>>>>> Thanks for reaching out.
>>>>>
>>>>> This is a great question for the dev@ mailing list. You may want to
>>>>> share a little bit more about why you need it, for how long, the
>>>>> frequency of updates to the secret, etc., so the community is aware of
>>>>> how things work.
>>>>>
>>>>> Hopefully others on the mailing list can help you by manually putting
>>>>> the necessary secret into the cloud settings related to the executors.
>>>>>
>>>>> Davor
>>>>>
>>>>> -- Forwarded message --
>>>>> From: Kamil Szewczyk <szewi...@gmail.com>
>>>>> Date: Tue, Apr 24, 2018 at 12:21 PM
>>>>> Subject: Apache Beam - jenkins question
>>>>> To: da...@apache.org
>>>>>
>>>>>
>>>>> Dear Davor
>>>>>
>>>>> I sent you a message on the ASF Slack; I wasn't sure how I could reach you.
>>>>>
>>>>> Anyway, are you able to add a secret (environment variable) to Jenkins?
>>>>> Or could you point me to a person who would be able to do that?
>>>>>
>>>>> Kind Regards
>>>>> Kamil Szewczyk
>>>>>
>>>>>
>>>>
>>>> --
>>>> ---
>>>> Jason Kuster
>>>> Apache Beam / Google Cloud Dataflow
>>>>
>>>> See something? Say something. go/jasonkuster-feedback
>>>> <https://goto.google.com/jasonkuster-feedback>
>>>>
>>>
>>>
>>> --
>>> ---
>>> Jason Kuster
>>> Apache Beam / Google Cloud Dataflow
>>>
>>> See something? Say something. go/jasonkuster-feedback
>>> <https://goto.google.com/jasonkuster-feedback>
>>>
>>
>>
>
> --
> ---
> Jason Kuster
> Apache Beam / Google Cloud Dataflow
>
> See something? Say something. go/jasonkuster-feedback
>


Re: performance tests of spark fail

2018-04-24 Thread Kamil Szewczyk
Hi Etienne,

I was recently playing a lot with BigQuery while working on the anomaly
detection tool and noticed that in the DB schema the timestamp is defined as
FLOAT. PerfKit also produces it as a float,

'timestamp': 1524485484.41655,

so the upload passes.

It was probably defined as FLOAT from the beginning because of how PerfKit
produces it.
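To make the mismatch concrete: PerfKit emits the timestamp as a Unix epoch float, while a BigQuery TIMESTAMP column expects a value it can parse as a point in time. Converting between the two representations is straightforward (illustrative sketch only, not PerfKit or Beam code):

```python
from datetime import datetime, timezone

def epoch_float_to_iso(epoch_seconds):
    """Convert a PerfKit-style epoch float into an ISO-8601 UTC string,
    the kind of value a TIMESTAMP column could ingest."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()

# The sample value from the PerfKit output above:
print(epoch_float_to_iso(1524485484.41655))
```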

Are you using PerfKit when running the performance test job for Spark?

Kind Regards
Kamil Szewczyk

2018-04-23 10:17 GMT+02:00 Etienne Chauchot <echauc...@apache.org>:

> Hi guys,
>
> I noticed a failure in the performance tests job for spark (I did not take
> a look at the others): it seems to be related to a schema update in the
> bigQuery output.
>
> BigQuery error in load operation: Error processing job
> 'apache-beam-testing:bqjob_r2527a0e444514f2b_0162f128db2b_1': Invalid
> schema
> update. Field timestamp has changed type from TIMESTAMP to FLOAT
>
> I opened a ticket to track the issue:
> https://issues.apache.org/jira/browse/BEAM-4153
>
> Best
>
> Etienne
>
>


Re: Build failed in Jenkins: beam_SeedJob #1522

2018-04-18 Thread Kamil Szewczyk
FYI:
I was adding a new job to Jenkins and got a Groovy parsing error:
https://github.com/apache/beam/pull/5170.
No worries, I will take a look at this later.

2018-04-18 21:36 GMT+02:00 Apache Jenkins Server 
:

> See 
>
> --
> GitHub pull request #5170 of commit f39ac0bf0f392c9ed42a43e655698b55d10c2b54,
> no merge conflicts.
> Setting status of f39ac0bf0f392c9ed42a43e655698b55d10c2b54 to PENDING
> with url https://builds.apache.org/job/beam_SeedJob/1522/ and message:
> 'Build started sha1 is merged.'
> Using context: Jenkins: Seed Job
> [EnvInject] - Loading node environment variables.
> Building remotely on beam6 (beam) in workspace  job/beam_SeedJob/ws/>
>  > git rev-parse --is-inside-work-tree # timeout=10
> Fetching changes from the remote Git repository
>  > git config remote.origin.url https://github.com/apache/beam.git #
> timeout=10
> Fetching upstream changes from https://github.com/apache/beam.git
>  > git --version # timeout=10
>  > git fetch --tags --progress https://github.com/apache/beam.git
> +refs/heads/*:refs/remotes/origin/* +refs/pull/5170/*:refs/
> remotes/origin/pr/5170/*
>  > git rev-parse refs/remotes/origin/pr/5170/merge^{commit} # timeout=10
>  > git rev-parse refs/remotes/origin/origin/pr/5170/merge^{commit} #
> timeout=10
> Checking out Revision d0370d9d6420108198473e726ca3b035fae06441
> (refs/remotes/origin/pr/5170/merge)
>  > git config core.sparsecheckout # timeout=10
>  > git checkout -f d0370d9d6420108198473e726ca3b035fae06441
> Commit message: "Merge f39ac0bf0f392c9ed42a43e655698b55d10c2b54 into
> c743632e47f92719913a5104cd2c356c8b02f36b"
> First time build. Skipping changelog.
> Cleaning workspace
>  > git rev-parse --verify HEAD # timeout=10
> Resetting working tree
>  > git reset --hard # timeout=10
>  > git clean -fdx # timeout=10
> Processing DSL script job_00_seed.groovy
> Processing DSL script job_beam_Inventory.groovy
> Processing DSL script job_beam_PerformanceTests_Analysis.groovy
> ERROR: (job_beam_PerformanceTests_Analysis.groovy, line 78) No signature
> of method: java.lang.String.positive() is applicable for argument types: ()
> values: []
> Possible solutions: notify(), tokenize(), size(), size()
>
>


Re: Performance tests status and anomaly detection proposal

2018-04-17 Thread Kamil Szewczyk
Thanks for reviewing it. When operating on an average, the standard deviation
should also be presented. A higher deviation means that the results are more
spread out and extreme, while a lower deviation means that the results are
closer to the average. A shrinking deviation will indicate that the tests are
becoming more stable, with values closer to the trend, and that any detected
anomaly is less likely to be a false positive. However, it would be really
nice to run some queries over real data exported from BigQuery and see which
other statistical parameters could be useful.
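The average-plus-deviation idea above can be sketched as a simple z-score check: flag a new runtime that lands more than a few standard deviations away from the mean of recent runs (illustrative only; the runtimes and threshold are made up, and the real analysis job reads its history from BigQuery):

```python
from statistics import mean, stdev

def is_anomaly(history, new_value, threshold=3.0):
    """Flag new_value if it deviates from the mean of historical
    runtimes by more than `threshold` standard deviations."""
    avg = mean(history)
    sd = stdev(history)
    if sd == 0:                      # all historical runs identical
        return new_value != avg
    return abs(new_value - avg) > threshold * sd

# Hypothetical runtimes (seconds) for the last six daily runs:
runtimes = [102.0, 98.5, 101.2, 99.8, 100.4, 97.9]
print(is_anomaly(runtimes, 100.9))   # close to the trend -> False
print(is_anomaly(runtimes, 140.0))   # far from the trend  -> True
```

A lower standard deviation in the history tightens the band, which is exactly why stabler tests make a flagged anomaly more trustworthy.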


2018-04-16 19:38 GMT+02:00 Jason Kuster <jasonkus...@google.com>:

> Great suggestions -- added some comments. Do you have plans to add more
> sophisticated analysis past just analyzing runtime relative to the 6d
> average?
>
> On Mon, Apr 16, 2018 at 10:21 AM Łukasz Gajowy <lukasz.gaj...@gmail.com>
> wrote:
>
>> That is correct - for now, we're running tests on Dataflow only. There
>> were plans to run them on Spark and Flink (and possibly more runners) but
>> we had some difficulties on the way. We decided to focus on Dataflow, at
>> least for now. Currently the tests are quite flaky, so the priority is to
>> make them more stable. Meanwhile we're providing all the necessary
>> "infrastructure" (hence the anomaly detection proposal).
>>
>> If anyone is willing to contribute in this area, those seem to be the
>> biggest blockers for Spark and Flink:
>> https://issues.apache.org/jira/browse/BEAM-3370
>> https://issues.apache.org/jira/browse/BEAM-3371
>>
>> Best regards,
>> Łukasz
>>
>> 2018-04-16 18:28 GMT+02:00 Pablo Estrada <pabl...@google.com>:
>>
>>> This is very cool!
>>> Are these dashboards for tests running on Dataflow only? Are there plans
>>> for other runners? : )
>>> -P.
>>>
>>> On Mon, Apr 16, 2018 at 9:23 AM Chamikara Jayalath <chamik...@google.com>
>>> wrote:
>>>
>>>> Thanks Dariusz. This sounds great. Added some comments.
>>>> Also, +Jeff Gardner <gardn...@google.com>  who has experience on
>>>> performance regression analysis of integration tests.
>>>>
>>>> Thanks,
>>>> Cham
>>>>
>>>>
>>>> On Mon, Apr 16, 2018 at 2:58 AM Łukasz Gajowy <lukasz.gaj...@gmail.com>
>>>> wrote:
>>>>
>>>>> @Etienne +1 to doing that! :) if we have both results (Nexmark and
>>>>> IOITs) in BQ we could use the same (similar?) tools to detect anomalies
>>>>> captured by Nexmark (if there's need for doing that).
>>>>>
>>>>> 2018-04-16 11:17 GMT+02:00 Etienne Chauchot <echauc...@apache.org>:
>>>>>
>>>>>> Very nice to see the dashboards !
>>>>>>
>>>>>> Regarding Kenn's comment: Nexmark supports outputting the results to
>>>>>> BigQuery, so it could be easily integrated into the dashboards. With
>>>>>> Kenn, we're scheduling Nexmark runs. We could configure the output to
>>>>>> the BigQuery dashboard tables?
>>>>>> WDYT?
>>>>>>
>>>>>> Etienne
>>>>>> On Saturday, 14 April 2018 at 23:20, Kenneth Knowles wrote:
>>>>>>
>>>>>> This is very cool. So is it easy for someone to integrate the
>>>>>> proposal to regularly run Nexmark benchmarks and get those on the
>>>>>> dashboard? (or a separate one to keep IOs in their own page)
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Fri, Apr 13, 2018 at 9:02 AM Dariusz Aniszewski <
>>>>>> dariusz.aniszew...@polidea.com> wrote:
>>>>>>
>>>>>>
>>>>>> Hello Beam devs! As you might have already noticed, together with Łukasz
>>>>>> Gajowy, Kamil Szewczyk and Katarzyna Kucharczyk (all directly cc’d here)
>>>>>> we’re working on adding some performance tests to the project. We were
>>>>>> following directions from the Testing I/O Transforms in Apache Beam
>>>>>> <https://beam.apache.org/documentation/

Looking for sb to do review of kubernetes scripts with HDFS datastore

2017-12-18 Thread Kamil Szewczyk
Hi all,

I recently submitted PR https://github.com/apache/beam/pull/4261, which
allows setting up a small Kubernetes HDFS cluster and running file-based IO
tests on it using the Direct and Dataflow runners.
This is basically an enabler for performance testing on HDFS. Is there
anyone who can review the Kubernetes scripts?

I will be happy to answer any questions.

Thanks!
Kind Regards
Kamil Szewczyk


Re: [Proposal] Add performance tests for commonly used file-based I/O PTransforms

2017-11-10 Thread Kamil Szewczyk
We updated Step #2 in our proposal.
Comments and suggestions are highly appreciated.

Thanks

2017-10-31 15:42 GMT+01:00 Łukasz Gajowy :

> We edited the "Roadmap" section a little bit to reflect our state of
> knowledge. As before, all comments are welcome.
>
> Thank you in advance!
>
> 2017-10-27 5:10 GMT+02:00 Kenneth Knowles :
>
> > I am really excited about this development. Glad to have such a detailed
> > document! Thanks for taking the time to write it up thoughtfully.
> >
> > On Wed, Oct 25, 2017 at 10:00 AM, Chamikara Jayalath <
> > chamik...@google.com.invalid> wrote:
> >
> > > Thanks Łukasz and the team for the proposal. I think fixing this JIRA will
> > > allow us to keep track of the performance of widely used
> > > source/sink/runner/file-system combinations of the Beam SDK. As Łukasz
> > > mentioned, all comments are welcome.
> > >
> > > Thanks,
> > > Cham
> > >
> > > On Wed, Oct 25, 2017 at 8:08 AM Łukasz Gajowy  >
> > > wrote:
> > >
> > > > Hello Beam Community!
> > > >
> > > >
> > > > During the last year, many Beam developers have put much effort into
> > > > developing and discussing means of testing Beam transforms. We would like
> > > > to benefit from that and implement performance tests for file-based I/O
> > > > Transforms.
> > > >
> > > >
> > > > This proposal is strictly related to the BEAM-3060 issue. Here’s the
> > > > link to the doc:
> > > >
> > > > https://docs.google.com/document/d/1dA-5s6OHiP_cz-NRAbwapoKF5MEC1wKps4A5tFbIPKE/edit
> > > >
> > > >
> > > > All comments are deeply appreciated.
> > > >
> > > >
> > > > Thanks!
> > > >
> > > > ŁG
> > > >
> > >
> >
>