Interesting view, Thomas, and it makes a lot of sense. Would you rather see
3 modules: embedded-runner, portable-runner and direct-runner (with
inheritance in between)? That would work for me.


Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>

2018-03-05 19:43 GMT+01:00 Thomas Groh <tg...@google.com>:

> The portable java 'DirectRunner' is already in-progress, and has been for
> several months - it's tracked by
> https://issues.apache.org/jira/browse/BEAM-2899
>
> My expectation is that the actual portability augmentations are unlikely to
> require significant changes to the DirectRunner implementations. I'd prefer
> to avoid any major refactors while that effort is underway - it's likely to
> add a significant amount of overhead, and I don't think that this
> refactoring will improve the velocity for the portability changes. The
> checking modes (immutability, encodability) can be disabled with flags for
> the time being.
>
> After the portability runner goes in, I'm not opposed to considering a
> refactoring - but I think that splitting "Model Enforcements" into separate
> modules might be overkill for things of that scope.
>
>
> On Mon, Mar 5, 2018 at 10:25 AM Romain Manni-Bucau <rmannibu...@gmail.com>
> wrote:
>
>> Hi Lukasz,
>>
>> Concretely it is pretty simple. If not, let me know and I'll try to gist
>> some code, but I don't think we need it.
>>
>> (I'll use module names; let's not discuss them, they are just there to
>> share the idea.) I see it as follows:
>>
>> 1. beam-java-runner - bare API impl (extracted from the direct runner, so
>> this is not a new impl; the advantage is that the new portable Java runner
>> and the direct runner converge)
>> 2. beam-java-runner-immutability-extension: adds the option
>> EnforceImmutability
>> 3. beam-java-runner-encodability: adds the option EnforceEncodability
>> 4. beam-java-runner-portableapi: adds ProtoTranslation (+ a few other
>> parts probably), this one will lead more or less to the portable one
>> 5. beam-java-direct-runner (current one)
>>
>> The idea is to have a *unique*, production-proof embedded Java runner
>> with composable extensions. The full-blown flavor (with all extensions)
>> is the direct runner; an intermediate flavor is the portable runner.
>> The advantage is being able to keep adding validations and harnessing to
>> the direct runner without degrading all the other use cases.
>> This leads to keeping a light embedded runner as a Beam reference
>> implementation that is usable in prod until the volumes require more.
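
A minimal sketch of the "bare runner plus composable extensions" idea, in plain Java with no Beam dependency. Every name here (RunnerExtension, ImmutabilityExtension, EmbeddedRunner, directFlavor) is hypothetical and only illustrates the shape of the proposal; the immutability check is a toy that only looks at List elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class ComposableRunnerSketch {

    /** Hypothetical extension point, called around the processing of each element. */
    interface RunnerExtension {
        default void beforeProcess(Object element) {}
        default void afterProcess(Object element) {}
    }

    /** Hypothetical check: fails if the user function mutates a List input element. */
    static class ImmutabilityExtension implements RunnerExtension {
        private List<?> snapshot;

        @Override
        public void beforeProcess(Object element) {
            // Copy the element so it can be compared once the function has run.
            snapshot = element instanceof List ? new ArrayList<Object>((List<?>) element) : null;
        }

        @Override
        public void afterProcess(Object element) {
            if (snapshot != null && !snapshot.equals(element)) {
                throw new IllegalStateException("input element was mutated");
            }
        }
    }

    /** Bare runner: maps a function over the input and calls the installed extensions. */
    static class EmbeddedRunner {
        private final List<RunnerExtension> extensions;

        EmbeddedRunner(RunnerExtension... extensions) {
            this.extensions = Arrays.asList(extensions);
        }

        <T, R> List<R> run(List<T> input, Function<T, R> fn) {
            List<R> output = new ArrayList<>();
            for (T element : input) {
                extensions.forEach(e -> e.beforeProcess(element));
                output.add(fn.apply(element));
                extensions.forEach(e -> e.afterProcess(element));
            }
            return output;
        }
    }

    /** The full-blown flavor is the bare runner with every check installed. */
    static EmbeddedRunner directFlavor() {
        return new EmbeddedRunner(new ImmutabilityExtension());
    }

    public static void main(String[] args) {
        List<List<Integer>> input = new ArrayList<>();
        input.add(new ArrayList<>(Arrays.asList(1, 2)));

        // The light flavor tolerates a function that mutates its input...
        System.out.println(new EmbeddedRunner().run(input, l -> { l.add(3); return l.size(); }));
        // ...while the checked flavor rejects it.
        try {
            directFlavor().run(input, l -> { l.add(4); return l.size(); });
        } catch (IllegalStateException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```

The point of the sketch is that the checks live outside the bare runner, so adding a new validation to the direct flavor does not touch the code path used by the light one.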
>>
>> If we don't go that way, we should think about what the reference
>> implementation is, and maybe just drop some usages of the direct runner and
>> enhance another runner that supports embedded runs so that it supports the
>> whole Beam API (for instance the Flink runner).
>>
>> Does that make it clearer?
>>
>>
>>
>> 2018-03-04 20:15 GMT+01:00 Lukasz Cwik <lc...@google.com>:
>>
>>> Feel free to document what you would like the extension mechanism to do
>>> and provide some skeleton interfaces for APIs that you would like to
>>> support.
>>>
>>> On Fri, Mar 2, 2018 at 2:33 PM, Romain Manni-Bucau <
>>> rmannibu...@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On 2 March 2018 at 22:22, "Lukasz Cwik" <lc...@google.com> wrote:
>>>>
>>>> To my knowledge, no one has discussed an extension mechanism for the
>>>> direct runner but the difficulty is in how to get extensions to interact
>>>> with the internals of the direct runner cleanly.
>>>> Note that the direct runner currently accepts a set of flags which
>>>> enable/disable validation and control how it runs like
>>>> "--enforceImmutability":
>>>> https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectOptions.java#L49
>>>> Would it be easier to just add more flags which control how the direct
>>>> runner works?
>>>>
>>>>
>>>> No, the idea is to guarantee a behavior and prevent regressions,
>>>> whatever is added for other purposes.
>>>>
>>>>
>>>>
>>>> As for having a direct runner using portability to be able to execute
>>>> Python / Go / Java SDKs, you should look at
>>>> https://issues.apache.org/jira/browse/BEAM-2899
>>>>
>>>> On Fri, Mar 2, 2018 at 12:53 PM, Romain Manni-Bucau <
>>>> rmannibu...@gmail.com> wrote:
>>>>
>>>>> Hi guys,
>>>>>
>>>>> I wonder if you have discussed or thought about breaking down what is
>>>>> called the direct runner today into an embedded runner which would be
>>>>> modular and extensible.
>>>>>
>>>>> What I have in mind is the following:
>>>>>
>>>>> 1. have a strong embedded runner implementing the whole beam API but
>>>>> limited to a single JVM
>>>>> 2. keep a strong test-oriented runner (what we call direct runner
>>>>> today)
>>>>>
>>>>> The overall design would be to ensure 1 and 2 share the common code
>>>>> and avoid writing yet another runner. This means several extension
>>>>> points should be defined to:
>>>>>
>>>>> 1. add the serialization validation
>>>>> 2. add the portability validation
>>>>> 3. add the execution randomization
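
To make the first extension point concrete, here is a rough sketch of a serialization (encodability) validation: each value is round-tripped through an encoder and rejected if it cannot be encoded or does not come back equal. Java serialization is used purely as a stand-in for whatever encoding the runner really uses, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

class EncodabilityCheckSketch {

    /** Encodes a value to bytes (stand-in for the runner's real encoding). */
    static byte[] encode(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    /** Decodes bytes produced by encode(). */
    static Object decode(byte[] encoded) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(encoded))) {
            return in.readObject();
        }
    }

    /** Returns the decoded copy, or fails if the round trip is impossible or lossy. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T enforceEncodable(T value) {
        try {
            Object copy = decode(encode(value));
            if (!value.equals(copy)) {
                throw new IllegalArgumentException("value changed after an encoding round trip: " + value);
            }
            return (T) copy;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalArgumentException("value cannot be encoded: " + value, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(enforceEncodable("hello"));

        ArrayList<Object> bad = new ArrayList<>();
        bad.add(new Object()); // java.lang.Object is not Serializable
        try {
            enforceEncodable(bad);
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```

Handing the decoded copy downstream, rather than the original, is what makes such a check useful in tests: code that only works because everything shares one JVM heap fails early.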
>>>>>
>>>>> I haven't yet thought about what the extension points would be (they
>>>>> can probably just be replacements, or specific extension points, which
>>>>> would be less elegant but would reach the same goal).
>>>>>
>>>>> The base runner (let's call it "EmbeddedRunner" to name it here) would
>>>>> have its EmbeddedRunnerOptions, which would have a --modules option to
>>>>> activate all potential extension points (in
>>>>> META-INF/org/apache/beam/embeddedrunner/extensions/xxx,
>>>>> xxx being the extension name to use in --modules, for instance).
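
A small sketch of how a --modules option could be resolved, under stated assumptions: the option syntax (--modules=a,b), the module names and the ModulesOptionSketch class are all hypothetical, and the set of known extensions is hard-coded here, whereas the proposal would discover it from the META-INF location mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ModulesOptionSketch {

    // Hypothetical extension names; a real runner would discover these at startup.
    static final Set<String> KNOWN =
            new LinkedHashSet<>(Arrays.asList("immutability", "encodability", "portability"));

    private static final String PREFIX = "--modules=";

    /** Returns the requested extension names in order, without duplicates. */
    static List<String> parseModules(String[] args) {
        List<String> modules = new ArrayList<>();
        for (String arg : args) {
            if (!arg.startsWith(PREFIX)) {
                continue; // not the option this sketch cares about
            }
            for (String name : arg.substring(PREFIX.length()).split(",")) {
                String trimmed = name.trim();
                if (trimmed.isEmpty()) {
                    continue;
                }
                if (!KNOWN.contains(trimmed)) {
                    throw new IllegalArgumentException("unknown module: " + trimmed);
                }
                if (!modules.contains(trimmed)) {
                    modules.add(trimmed);
                }
            }
        }
        return modules;
    }

    public static void main(String[] args) {
        System.out.println(parseModules(new String[] {"--modules=immutability,encodability"}));
    }
}
```

With no --modules argument the list is empty, which corresponds to the bare embedded runner; listing every extension would correspond to today's direct runner.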
>>>>>
>>>>> This would make the embedded runner more usable for light/small but
>>>>> production-oriented environments, and would also start to align with
>>>>> the work done for portability (thinking of the recent Python
>>>>> enhancements in runners) without losing the strong validation done in
>>>>> tests or preprod envs.
>>>>>
>>>>> Was it already mentioned or thought about? If not, wdyt?
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
