Hi Luke Will take a look at this as soon as possible and get back to you.
Best Regards, Pulasthi On Tue, Aug 18, 2020 at 2:30 PM Luke Cwik <lc...@google.com> wrote: > I have made some good progress here and have gotten to the following state > for non-portable runners: > > DirectRunner[1]: Merged. Supports Read.Bounded and Read.Unbounded. > Twister2[2]: Ready for review. Supports Read.Bounded, the current runner > doesn't support unbounded pipelines. > Spark[3]: WIP. Supports Read.Bounded, Nexmark suite passes. Not certain > about level of unbounded pipeline support coverage since Spark uses its own > tiny suite of tests to get unbounded pipeline coverage instead of the > validates runner set. > Jet[4]: WIP. Supports Read.Bounded. Read.Unbounded definitely needs > additional work. > Sazma[5]: WIP. Supports Read.Bounded. Not certain about level of unbounded > pipeline support coverage since Spark uses its own tiny suite of tests to > get unbounded pipeline coverage instead of the validates runner set. > Flink: Unstarted. > > @Pulasthi Supun Wickramasinghe <pulasthi...@gmail.com> , can you help me > with the Twister2 PR[2]? > @Ismaël Mejía <ieme...@gmail.com>, is PR[3] the expected level of support > for unbounded pipelines and hence ready for review? > @Jozsef Bartok <jo...@hazelcast.com>, can you help me out to get support > for unbounded splittable DoFn's into Jet[4]? > @Xinyu Liu <xinyuliu...@gmail.com>, is PR[5] the expected level of > support for unbounded pipelines and hence ready for review? > > 1: https://github.com/apache/beam/pull/12519 > 2: https://github.com/apache/beam/pull/12594 > 3: https://github.com/apache/beam/pull/12603 > 4: https://github.com/apache/beam/pull/12616 > 5: https://github.com/apache/beam/pull/12617 > > On Tue, Aug 11, 2020 at 10:55 AM Luke Cwik <lc...@google.com> wrote: > >> There shouldn't be any changes required since the wrapper will smoothly >> transition the execution to be run as an SDF. New IOs should strongly >> prefer to use SDF since it should be simpler to write and will be more >> flexible but they can use the "*Source"-based APIs. Eventually we'll >> deprecate the APIs but we will never stop supporting them. Eventually they >> should all be migrated to use SDF and if there is another major Beam >> version, we'll finally be able to remove them. >> >> On Tue, Aug 11, 2020 at 8:40 AM Alexey Romanenko < >> aromanenko....@gmail.com> wrote: >> >>> Hi Luke, >>> >>> Great to hear about such progress on this! >>> >>> Talking about opt-out for all runners in the future, will it require any >>> code changes for current “*Source”-based IOs or the wrappers should >>> completely smooth this transition? >>> Do we need to require to create new IOs only based on SDF or again, the >>> wrappers should help to avoid this? >>> >>> On 10 Aug 2020, at 22:59, Luke Cwik <lc...@google.com> wrote: >>> >>> In the past couple of months wrappers[1, 2] have been added to the Beam >>> Java SDK which can execute BoundedSource and UnboundedSource as Splittable >>> DoFns. These have been opt-out for portable pipelines (e.g. Dataflow runner >>> v2, XLang pipelines on Flink/Spark) and opt-in using an experiment for all >>> other pipelines. >>> >>> I would like to start making the non-portable pipelines starting with >>> the DirectRunner[3] to be opt-out with the plan that eventually all runners >>> will only execute splittable DoFns and the BoundedSource/UnboundedSource >>> specific execution logic from the runners will be removed. >>> >>> Users will be able to opt-in any pipeline using the experiment >>> 'use_sdf_read' and opt-out with the experiment 'use_deprecated_read'. (For >>> portable pipelines these experiments were 'beam_fn_api' and >>> 'beam_fn_api_use_deprecated_read' respectively and I have added these two >>> additional aliases to make the experience less confusing). >>> >>> 1: >>> https://github.com/apache/beam/blob/af1ce643d8fde5352d4519a558de4a2dfd24721d/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java#L275 >>> 2: >>> https://github.com/apache/beam/blob/af1ce643d8fde5352d4519a558de4a2dfd24721d/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java#L449 >>> 3: https://github.com/apache/beam/pull/12519 >>> >>> >>> -- Pulasthi S. Wickramasinghe PhD Candidate | Research Assistant School of Informatics and Computing | Digital Science Center Indiana University, Bloomington cell: 224-386-9035