Hi Luke

Will take a look at this as soon as possible and get back to you.

Best Regards,
Pulasthi

On Tue, Aug 18, 2020 at 2:30 PM Luke Cwik <lc...@google.com> wrote:

> I have made some good progress here and have gotten to the following state
> for non-portable runners:
>
> DirectRunner[1]: Merged. Supports Read.Bounded and Read.Unbounded.
> Twister2[2]: Ready for review. Supports Read.Bounded, the current runner
> doesn't support unbounded pipelines.
> Spark[3]: WIP. Supports Read.Bounded, Nexmark suite passes. Not certain
> about level of unbounded pipeline support coverage since Spark uses its own
> tiny suite of tests to get unbounded pipeline coverage instead of the
> validates runner set.
> Jet[4]: WIP. Supports Read.Bounded. Read.Unbounded definitely needs
> additional work.
> Sazma[5]: WIP. Supports Read.Bounded. Not certain about level of unbounded
> pipeline support coverage since Spark uses its own tiny suite of tests to
> get unbounded pipeline coverage instead of the validates runner set.
> Flink: Unstarted.
>
> @Pulasthi Supun Wickramasinghe <pulasthi...@gmail.com> , can you help me
> with the Twister2 PR[2]?
> @Ismaël Mejía <ieme...@gmail.com>, is PR[3] the expected level of support
> for unbounded pipelines and hence ready for review?
> @Jozsef Bartok <jo...@hazelcast.com>, can you help me out to get support
> for unbounded splittable DoFn's into Jet[4]?
> @Xinyu Liu <xinyuliu...@gmail.com>, is PR[5] the expected level of
> support for unbounded pipelines and hence ready for review?
>
> 1: https://github.com/apache/beam/pull/12519
> 2: https://github.com/apache/beam/pull/12594
> 3: https://github.com/apache/beam/pull/12603
> 4: https://github.com/apache/beam/pull/12616
> 5: https://github.com/apache/beam/pull/12617
>
> On Tue, Aug 11, 2020 at 10:55 AM Luke Cwik <lc...@google.com> wrote:
>
>> There shouldn't be any changes required since the wrapper will smoothly
>> transition the execution to be run as an SDF. New IOs should strongly
>> prefer to use SDF since it should be simpler to write and will be more
>> flexible but they can use the "*Source"-based APIs. Eventually we'll
>> deprecate the APIs but we will never stop supporting them. Eventually they
>> should all be migrated to use SDF and if there is another major Beam
>> version, we'll finally be able to remove them.
>>
>> On Tue, Aug 11, 2020 at 8:40 AM Alexey Romanenko <
>> aromanenko....@gmail.com> wrote:
>>
>>> Hi Luke,
>>>
>>> Great to hear about such progress on this!
>>>
>>> Talking about opt-out for all runners in the future, will it require any
>>> code changes for current “*Source”-based IOs or the wrappers should
>>> completely smooth this transition?
>>> Do we need to require to create new IOs only based on SDF or again, the
>>> wrappers should help to avoid this?
>>>
>>> On 10 Aug 2020, at 22:59, Luke Cwik <lc...@google.com> wrote:
>>>
>>> In the past couple of months wrappers[1, 2] have been added to the Beam
>>> Java SDK which can execute BoundedSource and UnboundedSource as Splittable
>>> DoFns. These have been opt-out for portable pipelines (e.g. Dataflow runner
>>> v2, XLang pipelines on Flink/Spark) and opt-in using an experiment for all
>>> other pipelines.
>>>
>>> I would like to start making the non-portable pipelines starting with
>>> the DirectRunner[3] to be opt-out with the plan that eventually all runners
>>> will only execute splittable DoFns and the BoundedSource/UnboundedSource
>>> specific execution logic from the runners will be removed.
>>>
>>> Users will be able to opt-in any pipeline using the experiment
>>> 'use_sdf_read' and opt-out with the experiment 'use_deprecated_read'. (For
>>> portable pipelines these experiments were 'beam_fn_api' and
>>> 'beam_fn_api_use_deprecated_read' respectively and I have added these two
>>> additional aliases to make the experience less confusing).
>>>
>>> 1:
>>> https://github.com/apache/beam/blob/af1ce643d8fde5352d4519a558de4a2dfd24721d/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java#L275
>>> 2:
>>> https://github.com/apache/beam/blob/af1ce643d8fde5352d4519a558de4a2dfd24721d/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java#L449
>>> 3: https://github.com/apache/beam/pull/12519
>>>
>>>
>>>

-- 
Pulasthi S. Wickramasinghe
PhD Candidate  | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
cell: 224-386-9035

Reply via email to