I also think that at a high level the success of Beam as a
project/community and as a piece of software depends on having
multiple viable runners with healthy set of users and
contributors. The pieces that are missing to me:
*User-focused comparison of runners (and IOs)*
+1 to Jesse's points. Automated capability tests don't really help this.
Benchmarks will be part of the story but are worth very little on
their own. Focusing on these is just choosing to measure things
that are easy to measure instead of addressing what is important,
which is in the end almost always qualitative.
*Automated integration tests on clusters*
We do need to know that runners and IOs "work" in a basic yes/no
manner on every commit/release, beyond unit tests. I am not
really willing to strongly claim to a potential user that
something "works" without this level of automation.
*More uniform operational experiences*
Setting up your Spark/Flink/Apex deployment should be different.
Launching a Beam pipeline on it should not be.
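For concreteness, a minimal sketch of what that uniformity already looks like at the code level with the Java SDK, where only the --runner option changes between engines (the transform here is just a placeholder):

    // Minimal sketch: the same pipeline, submitted to different engines purely
    // by changing --runner (e.g. --runner=SparkRunner or --runner=FlinkRunner).
    // Only the options change; the pipeline code does not.
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.options.PipelineOptions;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.transforms.Create;
    import org.apache.beam.sdk.transforms.MapElements;
    import org.apache.beam.sdk.values.TypeDescriptors;

    public class LaunchAnywhere {
      public static void main(String[] args) {
        // e.g. args = {"--runner=FlinkRunner"} or {"--runner=SparkRunner"}
        PipelineOptions options =
            PipelineOptionsFactory.fromArgs(args).withValidation().create();
        Pipeline p = Pipeline.create(options);
        p.apply(Create.of("beam", "on", "any", "runner"))
         .apply(MapElements.into(TypeDescriptors.strings())
             .via((String s) -> s.toUpperCase()));
        p.run().waitUntilFinish();
      }
    }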
*Portability: Any SDK on any runner*
We have now one SDK on master and one SDK on a dev branch that
both support portable execution somewhat. Unfortunately we have
no major open source runner that supports portability*. "Java on
any runner" is not compelling enough any more, if it ever was.
----
Reviews: I agree our response latency is too slow. I do not agree
that our quality bar is too high; I think we should raise it
*significantly*. Our codebase fails tests for long periods. Our
tests need to be green enough that we are comfortable blocking
merges *even for unrelated failures*. We should be able to cut a
release any time, modulo known blocker-level bugs.
Runner dev: I think Etienne's point about making it more uniform
to add features to all runners actually is quite important, since
the portability framework is a lot harder than "translate a Beam
ParDo to XYZ's FlatMap" where they are both Java. And even the
support code we've been building is not obvious to use and
probably won't be for the foreseeable future. This fits well into
the "Ben thread" on technical ideas so I'll comment there.
Kenn
*We do have a local batch-only portable runner in Python
On Fri, Jan 26, 2018 at 10:09 AM, Lukasz Cwik <lc...@google.com> wrote:
Etienne, regarding cross-runner coherence: the portability framework is attempting to create an API across all runners for job management and job execution. A lot of work still needs to be done to define and implement these APIs and migrate runners and SDKs to support them, since the current set of Java APIs is ad hoc in usage and purpose. In my opinion, development should really be focused on migrating runners and SDKs to use these APIs to get developer coherence. Work is slowly progressing on integrating them into the Java, Python, and Go SDKs and there are several JIRA issues in this regard, but involvement from more people could help.
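To make the "API across all runners for job management" idea concrete, a purely hypothetical Java sketch of the kind of surface such an API could expose; the names below are illustrative only and are not the actual Runner API / Fn API definitions (see the pointers below for those):

    // Hypothetical sketch only: illustrative names, not the real Beam portability APIs.
    // The idea is that every runner implements the same narrow job-management
    // surface, so SDKs and tools can target one contract instead of per-runner code.
    public interface JobService {
      // Validate and stage a portable pipeline; returns a token for the prepared job.
      String prepare(byte[] portablePipeline, java.util.Map<String, String> pipelineOptions);

      // Start execution of a previously prepared job and return its id.
      String run(String preparationToken);

      // Query and control a running job.
      JobState getState(String jobId);
      void cancel(String jobId);

      enum JobState { STARTING, RUNNING, DONE, FAILED, CANCELLED }
    }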
Some helpful pointers are:
https://s.apache.org/beam-runner-api
https://s.apache.org/beam-fn-api
https://issues.apache.org/jira/browse/BEAM-3515?jql=project%20%3D%20BEAM%20AND%20labels%20%3D%20portability
On Fri, Jan 26, 2018 at 7:21 AM, Etienne Chauchot <echauc...@apache.org> wrote:
Hi all,
Does anyone have comments about my point about dev coherence across the runners?
Thanks
Etienne
On 22/01/2018 at 16:16, Etienne Chauchot wrote:
Thanks Davor for bringing this discussion up!
I particularly like that you listed the different areas of improvement and proposed to assign people based on their tastes.
I wanted to add a point about consistency across runners, but from the dev point of view: I've been working on a cross-runner feature lately (metrics push, agnostic of the runners) for which I compared the behavior of the runners and wired up this feature into the Flink and Spark runners themselves. I must admit that I had a hard time figuring out how to wire it up in the different runners and that it was completely different between the runners. Also, their use (or non-use) of runner-core facilities varies. Even the architecture of the tests differs: some runners, like Spark, own their validates-runner tests in the runner module, while other runners run validates-runner tests that are owned by the sdk-core module. I also noticed some differences in the way to do streaming tests: for some runners, triggering streaming mode requires using an equivalent of the direct runner's TestStream in the pipeline, but for others putting streaming=true in the PipelineOptions is enough.
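For concreteness, a minimal sketch of the two mechanisms being contrasted (Java SDK; the element values are placeholders):

    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.coders.StringUtf8Coder;
    import org.apache.beam.sdk.options.PipelineOptions;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.options.StreamingOptions;
    import org.apache.beam.sdk.testing.TestStream;

    public class StreamingModeExamples {
      // (1) Use a TestStream source: the unbounded test input itself puts the
      //     pipeline into streaming semantics, the way the direct runner does it.
      public static Pipeline withTestStream(PipelineOptions options) {
        Pipeline p = Pipeline.create(options);
        p.apply(TestStream.create(StringUtf8Coder.of())
            .addElements("a", "b")
            .advanceWatermarkToInfinity());
        return p;
      }

      // (2) Set the streaming flag in PipelineOptions: for some runners,
      //     --streaming=true (StreamingOptions#setStreaming) alone is enough.
      public static PipelineOptions streamingOptions() {
        PipelineOptions options = PipelineOptionsFactory.create();
        options.as(StreamingOptions.class).setStreaming(true);
        return options;
      }
    }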
=> Long story short, IMHO it could be interesting to enhance the runner API to contain more than run(). This could have the benefit of increasing the coherence between runners. That said, we would need to find the correct balance between too many methods in the runner API, which would reduce the flexibility of the runner implementations, and too few methods, which would reduce the coherence between the runners.
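For reference, the contract runners implement today is essentially a single method; the extra hooks sketched after it are purely hypothetical and only illustrate the kind of surface that could be added:

    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.PipelineResult;

    // Roughly the contract the Java SDK asks runners to implement today:
    // a single run() method (the real PipelineRunner also has a static
    // fromOptions factory).
    abstract class RunnerContractToday<ResultT extends PipelineResult> {
      public abstract ResultT run(Pipeline pipeline);
    }

    // Hypothetical, illustrative-only extension in the spirit of the suggestion
    // above; none of these methods exist in Beam. They only sketch where shared
    // wiring (metrics push, streaming-mode behavior, ...) could be declared
    // uniformly across runners.
    interface HypotheticalRunnerHooks {
      // Would let shared infrastructure push metrics the same way for every runner.
      void configureMetricsSink(String sinkUrl, long periodMillis);
      // Would let test infrastructure know how to force streaming mode.
      boolean supportsStreamingViaOptionsOnly();
    }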
=> In addition, to enhance the coherence (dev point of view) between the runners, having all the runners run the exact same validates-runner tests in both batch and streaming modes would be awesome!
Another thing: big +1 to have a programmatic way of defining the capability matrix, as Romain suggested. Also agree on Ismaël's point about too-flexible concepts across runners (termination, bundling, ...).
Also big +1 to what Jesse wrote. I was myself in the user/architect position in the past, and I can confirm that all the points he mentioned are accurate.
Best,
Etienne
On 16/01/2018 at 17:39, Ismaël Mejía wrote:
Thanks Davor for opening this discussion and HUGE +1 to do this every year or in cycles. I will fork this thread into a new one for the Culture / Project management issues as suggested.
About the diversity of users across runners, I think this requires more attention to unification and implies work in at least these areas:
* Automated validation and consistent semantics among runners
Users should be confident that moving their code from one runner to the other just works, and the only way to ensure this is by having a runner pass ValidatesRunner/TCK tests and with this 'graduate' such support, as Romain suggested. The capability matrix is really nice but it is not a programmatic way to do this. Also, individual features usually do work, but feature combinations produce issues, so we need more exact semantics to avoid these.
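To make "passing ValidatesRunner" concrete, a minimal sketch in the style of the Java SDK's runner-validation tests (the transform and expected values are placeholders, not an actual test from the codebase):

    import org.apache.beam.sdk.testing.PAssert;
    import org.apache.beam.sdk.testing.TestPipeline;
    import org.apache.beam.sdk.testing.ValidatesRunner;
    import org.apache.beam.sdk.transforms.Count;
    import org.apache.beam.sdk.transforms.Create;
    import org.apache.beam.sdk.values.KV;
    import org.apache.beam.sdk.values.PCollection;
    import org.junit.Rule;
    import org.junit.Test;
    import org.junit.experimental.categories.Category;

    public class ExampleValidatesRunnerTest {
      @Rule public final transient TestPipeline p = TestPipeline.create();

      // Tests in this JUnit category are meant to be executed against every
      // runner; a runner would "graduate" support by running and passing them all.
      @Test
      @Category(ValidatesRunner.class)
      public void testCountPerElement() {
        PCollection<KV<String, Long>> counts =
            p.apply(Create.of("a", "b", "a")).apply(Count.perElement());
        PAssert.that(counts).containsInAnyOrder(KV.of("a", 2L), KV.of("b", 1L));
        p.run();
      }
    }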
Some parts of Beam's semantics are loose (e.g. bundle partitioning, pipeline termination, etc.). I suppose this has been a design decision to allow flexibility in the runners' implementations, but it becomes inconvenient when users move among runners and get different results. I am not sure the current tradeoff is worth the usability sacrifice for the end user.
* Make user experience across runners a priority
Today runners not only behave in different ways, but the way users publish and package their applications also differs. Of course this is not a trivial problem because deployment is normally an end-user problem, but we can improve in this area, e.g. guaranteeing a consistent deployment mechanism across runners, and making IO integration easier. For example, when using multiple IOs and switching runners it is easy to run into conflicts; we should try to minimize this for the end users.
* Simplify operational tasks among runners
We need to add a minimum degree of consistent observability across runners. Of course Beam has metrics to do this, but it is not enough; an end user that starts on one runner and moves to another has to deal with a totally different set of logs and operational issues. We can try to improve this too, of course without trying to cover the full spectrum, but at least bringing some minimum level of observability. I hope that the current work on portability will bring some improvements in this area. This is crucial for users, who probably spend more time running (and dealing with) issues in their jobs than writing them.
We also need integration tests that simulate common user scenarios and some distribution use cases. For example, probably the most common data store used for streaming is Kafka (at least in open source). We should have an IT that tests common issues that can arise when you use Kafka: what happens if a Kafka broker goes down, does Beam continue to read without issue? What about a new leader election, do we continue to work as expected, etc.? Few projects have something like this, and it would send a clear message that Beam cares about reliability too.
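As a rough illustration, the kind of pipeline such an IT would exercise might look like the minimal KafkaIO read sketch below (the broker address and topic are placeholders; the harness that kills brokers or forces leader election is not shown):

    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.io.kafka.KafkaIO;
    import org.apache.beam.sdk.options.PipelineOptions;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.values.KV;
    import org.apache.beam.sdk.values.PCollection;
    import org.apache.kafka.common.serialization.LongDeserializer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class KafkaResiliencePipeline {
      public static Pipeline build(String bootstrapServers, String topic) {
        PipelineOptions options = PipelineOptionsFactory.create();
        Pipeline p = Pipeline.create(options);
        // The IT would run this unbounded read while the harness disrupts the
        // cluster (broker restart, leader election) and then assert no data loss.
        PCollection<KV<Long, String>> records =
            p.apply(KafkaIO.<Long, String>read()
                .withBootstrapServers(bootstrapServers) // e.g. "broker-1:9092" (placeholder)
                .withTopic(topic)                       // placeholder topic name
                .withKeyDeserializer(LongDeserializer.class)
                .withValueDeserializer(StringDeserializer.class)
                .withoutMetadata());
        return p;
      }
    }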
Apart from these, I think we also need to work on:
* Simpler APIs + User friendly libraries.
I want to add a big thanks to Jesse for his list of criteria that people look at when they choose a framework for data processing. The first point, 'Will this dramatically improve the problems I'm trying to solve?', is super important. Of course Beam has portability and a rich model as its biggest assets, but I have been consistently asked at conferences whether Beam has libraries for graph processing, CEP, Machine Learning, or a Scala API.
Of course we have had some progress with the recent addition of SQL, and hopefully schema-aware PCollections will help there too, but there is still some way to go. This may not be crucial considering the portability goals of Beam, but these libraries are sometimes what makes users decide whether they adopt a tool or not, so better to have them than not.
These are the most important issues from my point of view. My excuses for the long email, but this was the perfect moment to discuss these.
One extra point: I think we should write and agree on a concise roadmap, and take a look at our progress on it at the middle and the end of the year, as other communities do.
Regards,
Ismaël
On Mon, Jan 15, 2018 at 7:49 PM, Jesse Anderson <je...@bigdatainstitute.io> wrote:
I think a focus on the runners is what's key to Beam's adoption. The runners are the foundation on which Beam sits. If the runners don't work properly, Beam won't work.
A focus on improved unit tests is a good start, but isn't what's needed. Compatibility matrices will help you see how your runner of choice stacks up, but that requires too much knowledge of Beam's internals to be interpretable.
Imagine you're an (enterprise) architect looking at adopting Beam. What do you look at or what do you look for before going deeper? What would make you stick your neck out to adopt Beam? In my experience, there are several pass/fails along the way.
Here are a few of the common ones I've seen:
* Will this dramatically improve the problems I'm trying to solve? (not writing APIs/better programming model/Beam's better handling of windowing)
* Can I get commercial support for Beam? (This is changing soon)
* Are other people using Beam with the same configuration and use case as me? (e.g. I'm using Spark with Beam to process imagery. Are others doing this in production?)
* Is there good documentation and books on the subject? (Tyler's and others' book will improve this)
* Can I get my team trained on this new technology? (I have Beam training and Google has some cursory training)
I think the one the community can improve on the most is the social proof of Beam. I've tried to do this (http://www.jesse-anderson.com/2017/06/beam-2-0-q-and-a/ and http://www.jesse-anderson.com/2016/07/question-and-answers-with-the-apache-beam-team/). We need to get the message out more about people using Beam in production, which configuration they have, and what their results were. I think we have the social proof on Dataflow, but not as much on Spark/Flink/Apex.
I think it's important to note that these checks don't look at the hardcore language or API semantics that we're working on. These are much later stage issues, if they're ever used at all.
In my experience with other open source adoption at enterprises, it starts with architects and works its way around the organization from there.
Thanks,
Jesse
On Mon, Jan 15, 2018 at 8:14 AM Ted Yu <yuzhih...@gmail.com> wrote:
bq. are hard to detect in our unit-test framework
Looks like more integration tests would help discover bugs / regressions more quickly. If the committer reviewing the PR has concerns in this regard, the concerns should be stated on the PR so that the contributor (and reviewer) can spend more time in solidifying the solution.
bq. I've gone and fixed these issues myself when merging
We can make stricter checkstyle rules so that the code wouldn't pass the build without addressing commonly known issues.
Cheers
On Sun, Jan 14, 2018 at 12:37 PM, Reuven Lax <re...@google.com> wrote:
I agree with the sentiment, but I don't completely agree with the criteria.
I think we need to be much better about reviewing PRs. Some PRs languish for too long before the reviewer gets to them (and I've been guilty of this too), which does not send a good message. Also, new PRs sometimes languish because there is no reviewer assigned; maybe we could write a gitbot to automatically assign a reviewer to every new PR?
Also, I think that the bar for merging a PR from a contributor should not be "the PR is perfect." It's perfectly fine to merge a PR that still has some issues (especially if the issues are stylistic). In the past when I've done this, I've gone and fixed these issues myself when merging. It was a bit more work for me to fix these things myself, but it was a small price to pay in order to portray Beam as a welcoming place for contributions.
On the other hand, "the build does
not break" is - in my opinion - too
weak of a criterion for merging. A
few reasons for this:
* Beam is a data-processing
framework, and data integrity is
paramount.
If a reviewer sees an issue that
could lead to data loss (or
duplication, or
corruption), I don't think that PR
should be merged. Historically many such
issues only actually manifest at
scale, and are hard to detect in our
unit-test framework. (we also need to
invest in more at-scale tests to catch
such issues).
* Beam guarantees backwards
compatibility for users (except across
major versions). If a bad API gets
merged and released (and the chances of
"forgetting" about it before the
release is cut is unfortunately high), we
are stuck with it. This is less of an
issue for many other open-source
projects that do not make such a
compatibility guarantee, as they are able
to simply remove or fix the API in
the next version.
I think we still need honest review
of PRs, with the criteria being
stronger than "the build doesn't
break." However reviewers also need to be
reasonable about what they ask for.
Reuven
On Sun, Jan 14, 2018 at 11:19 AM, Ted Yu <yuzhih...@gmail.com> wrote:
bq. if a PR is basically right (it does what it should) without breaking the build, then it has to be merged fast
+1 on above.
This would give contributors positive feedback.
On Sun, Jan 14, 2018 at 8:13 AM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
Hi Davor,
Thanks a lot for this e-mail.
I would like to emphasize two areas where we have to improve:
1. Apache way and community. We still have to focus on and be dedicated to our communities (both user & dev). Helping, encouraging, and growing our communities is key for the project. Building bridges between communities is also very important. We have to be more "accessible": sometimes simplifying our discussions and showing more interest and open-mindedness toward proposals would help as well. I think we do a good job already: we just have to improve.
2. Execution: a successful project is a project with regular activity in terms of releases, fixes, and improvements.
Regarding PRs, I think today we have PRs open for too long, and I think for three reasons:
- some are not ready or not good enough; no question on these ones
- some need a reviewer and speeding up: we have to keep an eye on the open PRs and review them asap
- some are under review but we have a lot of "ping pong" and long discussion, not always justified. I already said this on the mailing list, but, as for other Apache projects, if a PR is basically right (it does what it should) without breaking the build, then it has to be merged fast. If it requires additional changes (tests, polishing, improvements, ...), then they can be addressed in new PRs.
As already mentioned in the Beam 2.3.0 thread, we have to adopt a regular schedule for releases. It's a best effort to have a release every 2 months, whatever the release will contain. That's essential to maintain good activity in the project and for the third-party projects using Beam.
Again, don't get me wrong: we already do a good job! These are just areas where I think we have to improve.
Anyway, thanks for all the hard work we are doing all together!
Regards
JB
On 13/01/2018 05:12, Davor Bonaci wrote:
Hi everyone --
Apache Beam was established as a top-level project a year ago (on December 21, to be exact). This first anniversary is a great opportunity for us to look back at the past year, celebrate its successes, learn from any mistakes we have made, and plan for the next 1+ years.
I’d like to invite everyone in the community, particularly users and observers on this mailing list, to participate in this discussion. Apache Beam is your project and I, for one, would much appreciate your candid thoughts and comments. Just as some other projects do, I’d like to make this “state of the project” discussion an annual tradition in this community.
In terms of successes, the availability of the first stable release, version 2.0.0, was the biggest and most important milestone last year. Additionally, we have expanded the project’s breadth with new components, including several new runners, SDKs, and DSLs, and interconnected a large number of storage/messaging systems with new Beam IOs. In terms of community growth, crossing 200 lifetime individual contributors and achieving 76 contributors to a single release were other highlights. We have doubled the number of committers, and invited a handful of new PMC members. Thanks to each and every one of you for making all of this possible in our first year.
On the other hand, in such a young project as Beam, there are naturally many areas for improvement. This is the principal purpose of this thread (and any of its forks). To organize the separate discussions, I’d suggest to fork separate threads for different discussion areas:
* Culture and governance (anything related to people and their behavior)
* Community growth (what can we do to further grow a diverse and vibrant community)
* Technical execution (anything related to releases, their frequency, website, infrastructure)
* Feature roadmap for 2018 (what can we do to make the project more attractive to users, Beam 3.0, etc.).
I know many passionate folks who particularly care about each of these areas, but let me call on some folks from the community to get things started: Ismael for culture, Gris for community, JB for technical execution, and Ben for feature roadmap.
Perhaps we can use this thread to discuss project-wide vision. To seed that discussion, I’d start somewhat provocatively -- we aren’t doing so well on the diversity of users across runners, which is very important to the realization of the project’s vision. Would you agree, and would you be willing to make it the project’s #1 priority for the next 1-2 years?
Thanks -- and please join us in what would hopefully be a productive and informative discussion that shapes the future of this project!
Davor