Hi Kenn, thank you for your answers!
We'll update the Capability Matrix UI according to your design-related suggestions. It will be implemented in a way that lets you easily change its text content, add or remove columns and rows, etc. :)

Kind regards,
Agnieszka

On Wed, Jan 6, 2021 at 6:48 PM Kenneth Knowles <k...@apache.org> wrote:

> Very good questions. Answers inline.
>
> On Wed, Jan 6, 2021 at 8:16 AM Agnieszka Sell <agnieszka.s...@polidea.com>
> wrote:
>
>> Hi Kenneth,
>>
>> Thank you for your feedback about the Capability Matrix! I have several
>> questions about it:
>>
>> *Feedback: I think we can also remove rows that are not started or not
>> complete in the Beam Model, and remove the Beam Model column.*
>> Question: If we remove the Beam model column the whole point of making it
>> static and showing the capabilities would be lost. Isn't the point to show
>> capabilities of Beam vs. other tools?
>>
>
> To clarify the purpose of the capability matrix: it is not comparing Beam
> vs other tools. It is comparing adapters that run a Beam pipeline on top of
> other tools. For example the "Apache Spark" column describes the
> capabilities of Beam's "SparkRunner", not Spark itself. Maybe we need to
> adjust the wording above the matrix to make this clear.
>
> So the column with the title "What is being computed?" is already a full
> list of the features of the Beam Model. The rows where "Beam Model" has an
> "X" or "~" are just ideas for future work, or features still in progress.
>
>> *Feedback: I think Splittable DoFn really just deserves one row for bounded,
>> one for unbounded, and any caveats go in the details.*
>> Question: What would it look like? All this in one matrix or separate?
>>
>
> I suggest adding it as a row in "What is being computed?" like ParDo,
> GroupByKey, ..., Stateful Processing, Splittable DoFn.
>
>>
>> *Feedback: All the windowing rows can be condensed into "Basic windowing
>> support" and "Merging windowing support" and any runner that can only run a
>> couple WindowFns can have details in the caveats. At this point any runner
>> that doesn't do Windowing by invoking a user's WindowFn simply doesn't
>> really support windowing in the model.*
>> Suggestion: Do we still have a separate matrix for only two(?) rows?
>>
>
> My opinion may be controversial... I don't care that much about splitting
> What/Where/When/How. It is especially confusing to use "Where" to talk
> about event time.
>
> Personally, I would just make all the last three tables into a single
> table "Windowing and Triggering" and the rows "Basic windowing support",
> "Merging windowing support", "Configurable triggering", "Allowed lateness",
> "Discarding mode", "Accumulating mode". I would remove Timers from that
> table and rename "Stateful processing" in the table above to "State &
> timers" since these are really one feature taken together.
>
> Many of those decisions are not really part of the redesign, but just
> ideas to save space. If you need more space savings, I can find more... for
> example there is no value to ParDo, GroupByKey, and Flatten being separate,
> really. If you don't have those all implemented, you don't have a Beam
> runner at all, so they will never be different. This could be omitted. Or
> it could be a single "Baseline runner" row to add caveats. For example the
> existing caveats are unnecessary: Spark has a caveat on GroupByKey that is
> really about triggers. Structured streaming has "~" but the details are not
> actually caveats.
>
> Kenn
>
>> Kind regards,
>>
>> Agnieszka
>>
>> On Mon, Dec 21, 2020 at 7:49 PM Griselda Cuevas <g...@apache.org> wrote:
>>
>>> Thanks Kenn, this is super helpful.
>>>
>>> On Mon, 21 Dec 2020 at 09:57, Kenneth Knowles <k...@apache.org> wrote:
>>>
>>>> For the capability matrix, part of the problem is that the rows don't
>>>> all make that much sense, as we've discussed a couple times.
>>>>
>>>> But assuming we keep the content identical, maybe we could just have
>>>> the collapsed view and make the table selectable where *just* the selected
>>>> cell controls content below? You won't be able to do side-by-side
>>>> comparisons of the full text of things, but you will be able to keep the
>>>> overview and drill in one at a time quickly. Just one idea.
>>>>
>>>> A couple ways to save space without rearchitecting it:
>>>>
>>>> - Apache Hadoop MapReduce and JStorm can be removed as they are on
>>>>   branches, not released.
>>>> - I think we can also remove rows that are not started or not complete
>>>>   in the Beam Model, and remove the Beam Model column.
>>>> - I think Splittable DoFn really just deserves one row for bounded,
>>>>   one for unbounded, and any caveats go in the details.
>>>> - All the windowing rows can be condensed into "Basic windowing
>>>>   support" and "Merging windowing support" and any runner that can only run a
>>>>   couple WindowFns can have details in the caveats. At this point any runner
>>>>   that doesn't do Windowing by invoking a user's WindowFn simply doesn't
>>>>   really support windowing in the model.
>>>> - "Configurable triggering" can absorb "Event-time triggers",
>>>>   "Processing-time triggers", "Count triggers", and "Composite triggers".
>>>>   Same. At this point any runner that doesn't support the whole triggering
>>>>   language doesn't really support triggers fully.
>>>>
>>>> Kenn
>>>>
>>>> On Mon, Dec 14, 2020 at 7:39 PM Griselda Cuevas <g...@apache.org>
>>>> wrote:
>>>>
>>>>> Hi folks, another page that's getting a refresh this time around is
>>>>> the Capability Matrix, which is one of the most critical pages for users
>>>>> as they evaluate the current support for each of the Beam runners.
>>>>>
>>>>> The situation we'd like to get your input on is: How do we optimize
>>>>> the expanded version of the capability matrix, which explains the level of
>>>>> support in each of the functions?
>>>>>
>>>>> Right now the text gets in the way of analyzing the table and makes
>>>>> reading hard. You can see a screenshot in the Beam wiki here [1], the file
>>>>> is titled current_CapMatExt.
>>>>>
>>>>> One of the proposed solutions is that after clicking the link "(click
>>>>> to expand details)", we load a new page that has the corresponding table
>>>>> to the click (what, where, when, how) at the top, and all the content of each
>>>>> runner/function gets displayed at the bottom of the page, the file with
>>>>> the proposed design is also in the Beam wiki here [1] and the file's name is
>>>>> proposed_CapMatExt. This solution isn't perfect either, since we'd need to
>>>>> move too much text under the table and reading isn't much easier.
>>>>>
>>>>> Do you have suggestions/ideas in how to condense the extended version?
>>>>>
>>>>> Share with us your feedback through this week,
>>>>> Thanks!
>>>>> G
>>>>>
>>>>>
>>>>> [1]
>>>>> https://cwiki.apache.org/confluence/display/BEAM/Website+Redesign+Files
>>>>>
>>>>
>>
>> --
>>
>> Agnieszka Sell
>> Polidea <https://www.polidea.com/> | Project Manager
>>
>> M: +48 504 901 334
>> E: agnieszka.s...@polidea.com