Etienne, cut some JIRAs for improvements like ValidatesRunner for the
Nexmark suite that you think are worthy. Some of them might be good
'starter' tasks as well.

On Fri, Aug 25, 2017 at 1:43 AM, Etienne Chauchot <[email protected]>
wrote:

> Hi guys,
>
> There are also some points to discuss:
>
> - I think some of the tests in this test suite should be generalized as
> ValidatesRunner tests, as was done, for example, for custom window
> merging
> (https://github.com/apache/beam/blob/5181e619f17e1f69fabe8d5bdfc7a3a6a2142cde/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/windowing/WindowTest.java#L591)
>
> - We have run almost no tests on Dataflow, so if someone could run the
> test suite on Dataflow, they are very welcome. All the needed information
> is still in the README, but I'll move it to the website.
>
> - Other points?
>
> WDYT?
>
> Best,
>
> Etienne
>
>
>
> On 24/08/2017 at 18:35, Lukasz Cwik wrote:
>
>> Yeah, was looking forward to this.
>>
>> On Thu, Aug 24, 2017 at 9:20 AM, Tyler Akidau <[email protected]>
>> wrote:
>>
>>> Awesome news, thank you! :-D
>>>
>>> On Thu, Aug 24, 2017 at 12:40 AM Etienne Chauchot <[email protected]>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I wanted to let you know that the Nexmark PR is merged into master. Feel
>>>> free to use it (e.g. performance testing, release testing ...).
>>>>
>>>> Etienne
>>>>
>>>> On 12/05/2017 at 10:55, Etienne Chauchot wrote:
>>>>
>>>>> Hi guys,
>>>>>
>>>>> I wanted to let you know that I have just submitted a PR around
>>>>> NexMark. This is a port of the NexMark queries to Beam, to be used as
>>>>> integration tests.
>>>>> It can also be used for A-B testing (non-regression or performance
>>>>> comparison between two versions of the same engine or of the same runner).
>>>>>
>>>>> This is a continuation of the previous PR (#99) from Mark Shields.
>>>>> The code has changed quite a bit: some queries have changed to use new
>>>>> Beam APIs and there were some big refactorings. More importantly, we
>>>>> can now run all the queries on all the runners.
>>>>>
>>>>> Nevertheless, there are still some open issues in Nexmark
>>>>> (https://github.com/iemejia/beam/issues) and in Beam upstream (see
>>>>> issue links in https://issues.apache.org/jira/browse/BEAM-160)
>>>>>
>>>>> I wanted to submit the PR before the NexMark talk that Ismaël and I
>>>>> are giving at ApacheCon. The PR is not perfect, but it is in good
>>>>> enough shape to share.
>>>>>
>>>>> Best,
>>>>>
>>>>> Etienne
>>>>>
>>>>>
>>>>>
>>>>> On 22/03/2017 at 04:51, Kenneth Knowles wrote:
>>>>>
>>>>>> This is great! Having a variety of realistic-ish pipelines running on
>>>>>> all runners complements the validation suite and IO IT work.
>>>>>>
>>>>>> If I recall, some of these involve heavy and esoteric uses of state, so
>>>>>> definitely give me a ping if you hit any trouble.
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Tue, Mar 21, 2017 at 9:38 AM, Etienne Chauchot <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Ismael and I are working on upgrading the Nexmark implementation for
>>>>>>> Beam. See https://github.com/iemejia/beam/tree/BEAM-160-nexmark and
>>>>>>> https://issues.apache.org/jira/browse/BEAM-160. We are continuing the
>>>>>>> work done by Mark Shields. See https://github.com/apache/beam/pull/366
>>>>>>> for the original PR.
>>>>>>>
>>>>>>> The PR contains queries that have wide coverage of the Beam model and
>>>>>>> that represent realistic end-user use cases (some come from client
>>>>>>> experience on Google Cloud Dataflow).
>>>>>>>
>>>>>>> So far, we have upgraded the implementation to the latest Beam
>>>>>>> snapshot, and we are able to execute a good subset of the queries on
>>>>>>> the different runners. To do so, we upgraded the Nexmark drivers: the
>>>>>>> direct driver (upgraded from inProcessDriver) and the flink driver,
>>>>>>> and we added a new one for spark.
>>>>>>>
>>>>>>> There is still a good amount of work to do, and we would like to know
>>>>>>> if you think that this contribution could eventually have its place
>>>>>>> in Beam.
>>>>>>>
>>>>>>> The benefits of having Nexmark on Beam that we have seen so far are:
>>>>>>>
>>>>>>> - Rich batch/streaming test
>>>>>>>
>>>>>>> - A-B testing of runners or runtimes (non-regression, performance
>>>>>>> comparison between versions ...)
>>>>>>>
>>>>>>> - Integration testing (sdk/runners, runner/runtime, ...)
>>>>>>>
>>>>>>> - Validate the Beam capability matrix
>>>>>>>
>>>>>>> - It can be used as part of the ongoing PerfKit work (if there is any
>>>>>>> interest).
>>>>>>>
>>>>>>> As a final note, we are tracking the issues in the same repo. If
>>>>>>> anyone is interested in contributing or has more ideas, you are
>>>>>>> welcome :)
>>>>>>>
>>>>>>> Etienne
>>>>>>>
>>>>>>>
>>>>>>>
>>>>
>
