Re: [Input needed] Capability Matrix Visual Redesign for extended version

2021-01-12 Thread Agnieszka Sell
Hi Kenn,

thank you for your answers!

We'll update the Capability Matrix UI according to your design-related
suggestions. It will be implemented so that you can easily change its text
content and add or remove columns and rows. :)

Kind regards,

Agnieszka

On Wed, Jan 6, 2021 at 6:48 PM Kenneth Knowles  wrote:

> Very good questions. Answers inline.
>
> On Wed, Jan 6, 2021 at 8:16 AM Agnieszka Sell 
> wrote:
>
>> Hi Kenneth,
>>
>> Thank you for your feedback about the Capability Matrix! I have several
>> questions about it:
>>
>> *Feedback: I think we can also remove rows that are not started or not 
>> complete in the Beam Model, and remove the Beam Model column.*
>> Question:  If we remove the Beam model column the whole point of making it 
>> static and showing the capabilities would be lost. Isn't the point to show 
>> capabilities of Beam vs. other tools?
>>
>>
> To clarify the purpose of the capability matrix: it is not comparing Beam
> vs other tools. It is comparing adapters that run a Beam pipeline on top of
> other tools. For example the "Apache Spark" column describes the
> capabilities of Beam's "SparkRunner", not Spark itself. Maybe we need to
> adjust the wording above the matrix to make this clear.
>
> So the column with the title "What is being computed?" is already a full
> list of the features of the Beam Model. The rows where "Beam Model" has an
> "X" or "~" are just ideas for future work, or features still in progress.
>
>> *Feedback: I think Splittable DoFn really just deserves one row for bounded,
>> one for unbounded, and any caveats go in the details.*
>> Question: What would it look like? All this in one matrix or separate?
>>
>>
> I suggest adding it as a row in "What is being computed?" like ParDo,
> GroupByKey, ..., Stateful Processing, Splittable DoFn.
>
>
>>
>> *Feedback: All the windowing rows can be condensed into "Basic windowing 
>> support" and "Merging windowing support" and any runner that can only run a 
>> couple WindowFns can have details in the caveats. At this point any runner 
>> that doesn't do Windowing by invoking a user's WindowFn simply doesn't 
>> really support windowing in the model.*
>> Suggestion: Do we still have a separate matrix for only two(?) rows?
>>
>>
> My opinion may be controversial... I don't care that much about splitting
> What/Where/When/How. It is especially confusing to use "Where" to talk
> about event time.
>
> Personally, I would just merge the last three tables into a single
> table "Windowing and Triggering" with the rows "Basic windowing support",
> "Merging windowing support", "Configurable triggering", "Allowed lateness",
> "Discarding mode", "Accumulating mode". I would remove Timers from that
> table and rename "Stateful processing" in the table above to "State &
> timers" since these are really one feature taken together.
>
> Many of those decisions are not really part of the redesign, but just
> ideas to save space. If you need more space savings, I can find more... for
> example there is no value to ParDo, GroupByKey, and Flatten being separate,
> really. If you don't have those all implemented, you don't have a Beam
> runner at all, so they will never be different. This could be omitted. Or
> it could be a single "Baseline runner" row to add caveats. For example the
> existing caveats are unnecessary: Spark has a caveat on GroupByKey that is
> really about triggers. Structured streaming has "~" but the details are not
> actually caveats.
>
> Kenn
>
>
>> Kind regards,
>>
>> Agnieszka
>>
>> On Mon, Dec 21, 2020 at 7:49 PM Griselda Cuevas  wrote:
>>
>>> Thanks Kenn, this is super helpful.
>>>
>>>
>>>
>>> On Mon, 21 Dec 2020 at 09:57, Kenneth Knowles  wrote:
>>>
>>>> For the capability matrix, part of the problem is that the rows don't
>>>> all make that much sense, as we've discussed a couple times.
>>>>
>>>> But assuming we keep the content identical, maybe we could just have
>>>> the collapsed view and make the table selectable where *just* the selected
>>>> cell controls content below? You won't be able to do side-by-side
>>>> comparisons of the full text of things, but you will be able to keep the
>>>> overview and drill in one at a time quickly. Just one idea.
>>>>
>>>> A couple ways to save space without rearchitecting it:
>>>>
>>>>  - Apache Hadoop MapReduce and JStorm can be removed as they are on
>>>> branches, not released.
>>>>  - I think we can also remove rows that are not started or not complete
>>>> in the Beam Model, and remove the Beam Model column.
>>>>  - I think Splittable DoFn really just deserves one row for bounded,
>>>> one for unbounded, and any caveats go in the details.
>>>>  - All the windowing rows can be condensed into "Basic windowing
>>>> support" and "Merging windowing support" and any runner that can only run a
>>>> couple WindowFns can have details in the caveats. At this point any runner
>>>> that doesn't do Windowing by invoking a user's WindowFn simply doesn't
>>>> really support windowing in the model.

Re: [Input needed] Capability Matrix Visual Redesign for extended version

2021-01-06 Thread Kenneth Knowles
Very good questions. Answers inline.

On Wed, Jan 6, 2021 at 8:16 AM Agnieszka Sell 
wrote:

> Hi Kenneth,
>
> Thank you for your feedback about the Capability Matrix! I have several
> questions about it:
>
> *Feedback: I think we can also remove rows that are not started or not 
> complete in the Beam Model, and remove the Beam Model column.*
> Question:  If we remove the Beam model column the whole point of making it 
> static and showing the capabilities would be lost. Isn't the point to show 
> capabilities of Beam vs. other tools?
>
>
To clarify the purpose of the capability matrix: it is not comparing Beam
vs other tools. It is comparing adapters that run a Beam pipeline on top of
other tools. For example the "Apache Spark" column describes the
capabilities of Beam's "SparkRunner", not Spark itself. Maybe we need to
adjust the wording above the matrix to make this clear.

So the column with the title "What is being computed?" is already a full
list of the features of the Beam Model. The rows where "Beam Model" has an
"X" or "~" are just ideas for future work, or features still in progress.

> *Feedback: I think Splittable DoFn really just deserves one row for
> bounded, one for unbounded, and any caveats go in the details.*
> Question: What would it look like? All this in one matrix or separate?
>
>
I suggest adding it as a row in "What is being computed?" like ParDo,
GroupByKey, ..., Stateful Processing, Splittable DoFn.


>
> *Feedback: All the windowing rows can be condensed into "Basic windowing 
> support" and "Merging windowing support" and any runner that can only run a 
> couple WindowFns can have details in the caveats. At this point any runner 
> that doesn't do Windowing by invoking a user's WindowFn simply doesn't really 
> support windowing in the model.*
> Suggestion: Do we still have a separate matrix for only two(?) rows?
>
>
My opinion may be controversial... I don't care that much about splitting
What/Where/When/How. It is especially confusing to use "Where" to talk
about event time.

Personally, I would just merge the last three tables into a single table
"Windowing and Triggering" with the rows "Basic windowing support", "Merging
windowing support", "Configurable triggering", "Allowed lateness",
"Discarding mode", "Accumulating mode". I would remove Timers from that
table and rename "Stateful processing" in the table above to "State &
timers" since these are really one feature taken together.

Many of those decisions are not really part of the redesign, but just ideas
to save space. If you need more space savings, I can find more... for
example there is no value to ParDo, GroupByKey, and Flatten being separate,
really. If you don't have those all implemented, you don't have a Beam
runner at all, so they will never be different. This could be omitted. Or
it could be a single "Baseline runner" row to add caveats. For example the
existing caveats are unnecessary: Spark has a caveat on GroupByKey that is
really about triggers. Structured streaming has "~" but the details are not
actually caveats.

Kenn


> Kind regards,
>
> Agnieszka
>
> On Mon, Dec 21, 2020 at 7:49 PM Griselda Cuevas  wrote:
>
>> Thanks Kenn, this is super helpful.
>>
>>
>>
>> On Mon, 21 Dec 2020 at 09:57, Kenneth Knowles  wrote:
>>
>>> For the capability matrix, part of the problem is that the rows don't
>>> all make that much sense, as we've discussed a couple times.
>>>
>>> But assuming we keep the content identical, maybe we could just have the
>>> collapsed view and make the table selectable where *just* the selected cell
>>> controls content below? You won't be able to do side-by-side comparisons of
>>> the full text of things, but you will be able to keep the overview and
>>> drill in one at a time quickly. Just one idea.
>>>
>>> A couple ways to save space without rearchitecting it:
>>>
>>>  - Apache Hadoop MapReduce and JStorm can be removed as they are on
>>> branches, not released.
>>>  - I think we can also remove rows that are not started or not complete
>>> in the Beam Model, and remove the Beam Model column.
>>>  - I think Splittable DoFn really just deserves one row for bounded, one
>>> for unbounded, and any caveats go in the details.
>>>  - All the windowing rows can be condensed into "Basic windowing
>>> support" and "Merging windowing support" and any runner that can only run a
>>> couple WindowFns can have details in the caveats. At this point any runner
>>> that doesn't do Windowing by invoking a user's WindowFn simply doesn't
>>> really support windowing in the model.
>>>  - "Configurable triggering" can absorb "Event-time triggers",
>>> "Processing-time triggers", "Count triggers", and "Composite triggers".
>>> Same. At this point any runner that doesn't support the whole triggering
>>> language doesn't really support triggers fully.
>>>
>>> Kenn
>>>
>>> On Mon, Dec 14, 2020 at 7:39 PM Griselda Cuevas  wrote:
>>>
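>>>> Hi folks, another page that's getting a refresh this time around is the
>>>> Capability Matrix, which is one of the most critical pages for users as
>>>> they evaluate the current support for each of the Beam runners.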

Re: [Input needed] Capability Matrix Visual Redesign for extended version

2021-01-06 Thread Agnieszka Sell
Hi Kenneth,

Thank you for your feedback about the Capability Matrix! I have several
questions about it:

*Feedback: I think we can also remove rows that are not started or not
complete in the Beam Model, and remove the Beam Model column.*
Question: If we remove the Beam Model column the whole point of
making it static and showing the capabilities would be lost. Isn't the
point to show capabilities of Beam vs. other tools?

*Feedback: I think Splittable DoFn really just deserves one row for
bounded, one for unbounded, and any caveats go in the details.*
Question: What would it look like? All this in one matrix or separate?


*Feedback: All the windowing rows can be condensed into "Basic
windowing support" and "Merging windowing support" and any runner that
can only run a couple WindowFns can have details in the caveats. At
this point any runner that doesn't do Windowing by invoking a user's
WindowFn simply doesn't really support windowing in the model.*
Suggestion: Do we still have a separate matrix for only two(?) rows?


Kind regards,

Agnieszka

On Mon, Dec 21, 2020 at 7:49 PM Griselda Cuevas  wrote:

> Thanks Kenn, this is super helpful.
>
>
>
> On Mon, 21 Dec 2020 at 09:57, Kenneth Knowles  wrote:
>
>> For the capability matrix, part of the problem is that the rows don't all
>> make that much sense, as we've discussed a couple times.
>>
>> But assuming we keep the content identical, maybe we could just have the
>> collapsed view and make the table selectable where *just* the selected cell
>> controls content below? You won't be able to do side-by-side comparisons of
>> the full text of things, but you will be able to keep the overview and
>> drill in one at a time quickly. Just one idea.
>>
>> A couple ways to save space without rearchitecting it:
>>
>>  - Apache Hadoop MapReduce and JStorm can be removed as they are on
>> branches, not released.
>>  - I think we can also remove rows that are not started or not complete
>> in the Beam Model, and remove the Beam Model column.
>>  - I think Splittable DoFn really just deserves one row for bounded, one
>> for unbounded, and any caveats go in the details.
>>  - All the windowing rows can be condensed into "Basic windowing support"
>> and "Merging windowing support" and any runner that can only run a couple
>> WindowFns can have details in the caveats. At this point any runner that
>> doesn't do Windowing by invoking a user's WindowFn simply doesn't really
>> support windowing in the model.
>>  - "Configurable triggering" can absorb "Event-time triggers",
>> "Processing-time triggers", "Count triggers", and "Composite triggers".
>> Same. At this point any runner that doesn't support the whole triggering
>> language doesn't really support triggers fully.
>>
>> Kenn
>>
>> On Mon, Dec 14, 2020 at 7:39 PM Griselda Cuevas  wrote:
>>
>>> Hi folks, another page that's getting a refresh this time around is the
>>> Capability Matrix, which is one of the most critical pages for users as
>>> they evaluate the current support for each of the Beam runners.
>>>
>>> The situation we'd like to get your input on is: How do we optimize the
>>> expanded version of the capability matrix, which explains the level of
>>> support in each of the functions?
>>>
>>> Right now the text gets in the way of analyzing the table and makes
>>> reading hard. You can see a screenshot in the Beam wiki here [1], the file
>>> is titled current_CapMatExt.
>>>
>>> One of the proposed solutions is that after clicking the link "(click to
>>> expand details)", we load a new page that has the corresponding table to
>>> the click (what, where, when, how) at the top, and all the content of each
>>> runner/function gets displayed at the bottom of the page, the file with the
>>> proposed design is also in the Beam wiki here [1] and the file's name is
>>> proposed_CapMatExt. This solution isn't perfect either, since we'd need to
>>> move too much text under the table and reading isn't much easier.
>>>
>>> Do you have suggestions/ideas on how to condense the extended version?
>>>
>>> Please share your feedback with us this week.
>>> Thanks!
>>> G
>>>
>>>
>>> [1]
>>> https://cwiki.apache.org/confluence/display/BEAM/Website+Redesign+Files
>>>
>>

-- 

Agnieszka Sell
Polidea | Project Manager

M: +48 504 901 334
E: agnieszka.s...@polidea.com


Re: [Input needed] Capability Matrix Visual Redesign for extended version

2020-12-21 Thread Griselda Cuevas
Thanks Kenn, this is super helpful.



On Mon, 21 Dec 2020 at 09:57, Kenneth Knowles  wrote:

> For the capability matrix, part of the problem is that the rows don't all
> make that much sense, as we've discussed a couple times.
>
> But assuming we keep the content identical, maybe we could just have the
> collapsed view and make the table selectable where *just* the selected cell
> controls content below? You won't be able to do side-by-side comparisons of
> the full text of things, but you will be able to keep the overview and
> drill in one at a time quickly. Just one idea.
>
> A couple ways to save space without rearchitecting it:
>
>  - Apache Hadoop MapReduce and JStorm can be removed as they are on
> branches, not released.
>  - I think we can also remove rows that are not started or not complete in
> the Beam Model, and remove the Beam Model column.
>  - I think Splittable DoFn really just deserves one row for bounded, one
> for unbounded, and any caveats go in the details.
>  - All the windowing rows can be condensed into "Basic windowing support"
> and "Merging windowing support" and any runner that can only run a couple
> WindowFns can have details in the caveats. At this point any runner that
> doesn't do Windowing by invoking a user's WindowFn simply doesn't really
> support windowing in the model.
>  - "Configurable triggering" can absorb "Event-time triggers",
> "Processing-time triggers", "Count triggers", and "Composite triggers".
> Same. At this point any runner that doesn't support the whole triggering
> language doesn't really support triggers fully.
>
> Kenn
>
> On Mon, Dec 14, 2020 at 7:39 PM Griselda Cuevas  wrote:
>
>> Hi folks, another page that's getting a refresh this time around is the
>> Capability Matrix, which is one of the most critical pages for users as
>> they evaluate the current support for each of the Beam runners.
>>
>> The situation we'd like to get your input on is: How do we optimize the
>> expanded version of the capability matrix, which explains the level of
>> support in each of the functions?
>>
>> Right now the text gets in the way of analyzing the table and makes
>> reading hard. You can see a screenshot in the Beam wiki here [1], the file
>> is titled current_CapMatExt.
>>
>> One of the proposed solutions is that after clicking the link "(click to
>> expand details)", we load a new page that has the corresponding table to
>> the click (what, where, when, how) at the top, and all the content of each
>> runner/function gets displayed at the bottom of the page, the file with the
>> proposed design is also in the Beam wiki here [1] and the file's name is
>> proposed_CapMatExt. This solution isn't perfect either, since we'd need to
>> move too much text under the table and reading isn't much easier.
>>
>> Do you have suggestions/ideas on how to condense the extended version?
>>
>> Please share your feedback with us this week.
>> Thanks!
>> G
>>
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/BEAM/Website+Redesign+Files
>>
>


Re: [Input needed] Capability Matrix Visual Redesign for extended version

2020-12-21 Thread Kenneth Knowles
For the capability matrix, part of the problem is that the rows don't all
make that much sense, as we've discussed a couple times.

But assuming we keep the content identical, maybe we could just have the
collapsed view and make the table selectable where *just* the selected cell
controls content below? You won't be able to do side-by-side comparisons of
the full text of things, but you will be able to keep the overview and
drill in one at a time quickly. Just one idea.

A couple ways to save space without rearchitecting it:

 - Apache Hadoop MapReduce and JStorm can be removed as they are on
branches, not released.
 - I think we can also remove rows that are not started or not complete in
the Beam Model, and remove the Beam Model column.
 - I think Splittable DoFn really just deserves one row for bounded, one
for unbounded, and any caveats go in the details.
 - All the windowing rows can be condensed into "Basic windowing support"
and "Merging windowing support" and any runner that can only run a couple
WindowFns can have details in the caveats. At this point any runner that
doesn't do Windowing by invoking a user's WindowFn simply doesn't really
support windowing in the model.
 - "Configurable triggering" can absorb "Event-time triggers",
"Processing-time triggers", "Count triggers", and "Composite triggers".
Same. At this point any runner that doesn't support the whole triggering
language doesn't really support triggers fully.

Kenn
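
The row condensing suggested above can be sketched in code. This is only an
illustrative sketch, not how the Beam website builds the matrix: the names
`Support`, `ABSORBED_ROWS`, `condense`, and `example_runner` are hypothetical,
the "✓" marker and the rule for combining statuses (full support only if every
absorbed row is fully supported, otherwise partial unless nothing is supported)
are assumptions, and the runner data is made up. Only the row names come from
the message.

```python
from enum import Enum


class Support(Enum):
    """Cell markers; "~" and "X" appear in the thread, "✓" is assumed."""
    YES = "✓"
    PARTIAL = "~"
    NO = "X"


# Condensed row -> detailed rows it would absorb, as named in the message.
ABSORBED_ROWS = {
    "Configurable triggering": [
        "Event-time triggers",
        "Processing-time triggers",
        "Count triggers",
        "Composite triggers",
    ],
}


def condense(detailed, absorbed=ABSORBED_ROWS):
    """Collapse detailed row statuses into one status per condensed row.

    Assumed rule: YES only if every absorbed row is YES, NO only if every
    absorbed row is NO, PARTIAL in all other cases.
    """
    condensed = {}
    for new_row, old_rows in absorbed.items():
        statuses = [detailed[row] for row in old_rows]
        if all(status is Support.YES for status in statuses):
            condensed[new_row] = Support.YES
        elif all(status is Support.NO for status in statuses):
            condensed[new_row] = Support.NO
        else:
            condensed[new_row] = Support.PARTIAL
    return condensed


# Hypothetical runner column, not taken from the real capability matrix.
example_runner = {
    "Event-time triggers": Support.YES,
    "Processing-time triggers": Support.YES,
    "Count triggers": Support.YES,
    "Composite triggers": Support.PARTIAL,
}

print(condense(example_runner))
```

Under this rule a runner missing any part of the triggering language shows up
as partial in the single condensed row, and the specifics would move into that
cell's caveat text.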

On Mon, Dec 14, 2020 at 7:39 PM Griselda Cuevas  wrote:

> Hi folks, another page that's getting a refresh this time around is the
> Capability Matrix, which is one of the most critical pages for users as
> they evaluate the current support for each of the Beam runners.
>
> The situation we'd like to get your input on is: How do we optimize the
> expanded version of the capability matrix, which explains the level of
> support in each of the functions?
>
> Right now the text gets in the way of analyzing the table and makes
> reading hard. You can see a screenshot in the Beam wiki here [1], the file
> is titled current_CapMatExt.
>
> One of the proposed solutions is that after clicking the link "(click to
> expand details)", we load a new page that has the corresponding table to
> the click (what, where, when, how) at the top, and all the content of each
> runner/function gets displayed at the bottom of the page, the file with the
> proposed design is also in the Beam wiki here [1] and the file's name is
> proposed_CapMatExt. This solution isn't perfect either, since we'd need to
> move too much text under the table and reading isn't much easier.
>
> Do you have suggestions/ideas on how to condense the extended version?
>
> Please share your feedback with us this week.
> Thanks!
> G
>
>
> [1]
> https://cwiki.apache.org/confluence/display/BEAM/Website+Redesign+Files
>


[Input needed] Capability Matrix Visual Redesign for extended version

2020-12-14 Thread Griselda Cuevas
Hi folks, another page that's getting a refresh this time around is the
Capability Matrix, which is one of the most critical pages for users as
they evaluate the current support for each of the Beam runners.

The situation we'd like to get your input on is: How do we optimize the
expanded version of the capability matrix, which explains the level of
support in each of the functions?

Right now the text gets in the way of analyzing the table and makes reading
hard. You can see a screenshot in the Beam wiki here [1]; the file is
titled current_CapMatExt.

One of the proposed solutions is that after clicking the link "(click to
expand details)", we load a new page that has the table corresponding to
the click (what, where, when, how) at the top, with all the content for each
runner/function displayed at the bottom of the page. The file with the
proposed design is also in the Beam wiki here [1], and its name is
proposed_CapMatExt. This solution isn't perfect either, since we'd need to
move too much text under the table and reading isn't much easier.

Do you have suggestions/ideas on how to condense the extended version?

Please share your feedback with us this week.
Thanks!
G


[1] https://cwiki.apache.org/confluence/display/BEAM/Website+Redesign+Files