Re: Renaming SideOutput
All outputs are logically the same "type" of output. Within the SDKs, the "main" output is just the one that is written to if no output tag is specified, and the one that matches the output type parameter of the DoFn. However, because there are multiple methods, one of which is a reasonable default, I think it's reasonable to distinguish between the two "methods" of outputting, while still just calling everything an "output".

Having a main output quite significantly reduces the amount of code required for a DoFn that produces only a single output, which is why it's still around.

Within the language-independent representations and most runners, there's no actual concept of main-vs-side outputs, except to support the default output(OutputT) method.

On Wed, Apr 12, 2017 at 6:53 PM, Ankur Chauhan wrote:
> This question may be obvious to others, but why is there a distinction
> between main output and additional outputs? Why not just have a simple list
> of outputs where the first one is the main one?
>
> -- AC
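To make the point about the main output concrete, here is a minimal sketch in plain Python. It is a toy model, not the Beam API: ToyContext, MAIN_TAG and split_even_odd are hypothetical names invented for illustration. It assumes only what the message above says, namely that the main output is the one used when no tag is given, and that it is otherwise stored like any other output.

```python
from collections import defaultdict

MAIN_TAG = "main"  # hypothetical name for the default tag; not a Beam constant


class ToyContext:
    """Toy stand-in for a DoFn context; not the Beam API."""

    def __init__(self):
        # Every output, main or additional, lives in the same tag -> values map.
        self.outputs = defaultdict(list)

    def output(self, value, tag=MAIN_TAG):
        # One method for both cases: omitting the tag selects the main output.
        self.outputs[tag].append(value)


def split_even_odd(numbers):
    """Send even numbers to the main output and odd ones to the 'odd' tag."""
    ctx = ToyContext()
    for n in numbers:
        if n % 2 == 0:
            ctx.output(n)             # main output, no tag needed
        else:
            ctx.output(n, tag="odd")  # additional output, same method
    return dict(ctx.outputs)


result = split_even_odd([1, 2, 3, 4])
print(result)
```

In this sketch a function that only ever calls ctx.output(n) never has to mention a tag, which is the convenience the main output is kept for.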
Re: Renaming SideOutput
This question may be obvious to others, but why is there a distinction between main output and additional outputs? Why not just have a simple list of outputs where the first one is the main one?

-- AC

On Apr 12, 2017, at 18:08, Melissa Pashniak wrote:
> I agree, I'll create a PR with the doc changes (the rename + text changes
> to make things more clear). I know of at least 2 places we refer to side
> outputs (programming guide and the "Design your pipeline" page).
Re: Renaming SideOutput
I agree, I'll create a PR with the doc changes (the rename + text changes to make things clearer). I know of at least 2 places where we refer to side outputs (the programming guide and the "Design your pipeline" page).

On Tue, Apr 11, 2017 at 5:34 PM, Thomas Groh wrote:
> I think that's a good idea. I would call the outputs of a ParDo the "Main
> Output" and "Additional Outputs" - it seems like an easy way to make it
> clear that there's one output that is always expected, and there may be
> more.
Re: Renaming SideOutput
Cool! I've filed https://issues.apache.org/jira/browse/BEAM-1949 and authored https://github.com/apache/beam/pull/2512 to make this change.

On Tue, Apr 11, 2017 at 11:33 PM, Ted Yu wrote:
> +1
Re: Renaming SideOutput
+1

On Apr 11, 2017, at 5:34 PM, Thomas Groh wrote:
> I think that's a good idea. I would call the outputs of a ParDo the "Main
> Output" and "Additional Outputs" - it seems like an easy way to make it
> clear that there's one output that is always expected, and there may be
> more.
RE: Renaming SideOutput
+1. SideInput and SideOutput probably confuse new users, since they behave differently.

BTW, would it also be better to change "main output" to "default output", for the case where the user does not explicitly specify an output tag?

Regards
Jian Liu (Basti)

-----Original Message-----
From: Thomas Groh [mailto:tg...@google.com.INVALID]
Sent: Wednesday, April 12, 2017 4:56 AM
To: dev@beam.apache.org
Subject: Renaming SideOutput

Hey everyone:

I'd like to rename DoFn.Context#sideOutput to #output (in the Java SDK).

Having two methods, both named output, one which takes the "main output type" and one that takes a tag to specify the type, more clearly communicates the actual behavior - sideOutput isn't a "special" way to output, it's the same as output(T), just to a specified PCollection. This will help pipeline authors understand the actual behavior of outputting to a tag, and detangle it from "sideInput", which is a special way to receive input. Giving them the same name means that it's not even strange to call output and provide the main output type, which is what we want - it's a more specific way to output, but does not have different restrictions or capabilities.

This is also a pretty small change within the SDK - it touches about 20 files, and the changes are pretty automatic.

Thanks,

Thomas
Re: Renaming SideOutput
+1

On Wed, Apr 12, 2017 at 6:06 AM JingsongLee wrote:
> strong +1
> best,
> JingsongLee
Re: Renaming SideOutput
strong +1

best,
JingsongLee

------------------------------------------------------------------
From: Tang Jijun (Shanghai_Technology Dept_Data Platform_唐觊隽)
Time: 2017 Apr 12 (Wed) 10:39
To: dev@beam.apache.org
Subject: Re: Renaming SideOutput

+1, clearer
Re: Renaming SideOutput
+1, clearer

-----Original Message-----
From: Ankur Chauhan [mailto:an...@malloc64.com]
Sent: April 12, 2017 10:36
To: dev@beam.apache.org
Subject: Re: Renaming SideOutput

+1, this is pretty much the topmost thing that I found odd when starting with the Beam model. It would definitely be more intuitive to have a consistent name.
Re: Renaming SideOutput
+1, this is pretty much the topmost thing that I found odd when starting with the Beam model. It would definitely be more intuitive to have a consistent name.

On Apr 11, 2017, at 18:29, Aljoscha Krettek wrote:
> +1
Re: Renaming SideOutput
+1

On Wed, Apr 12, 2017, at 02:34, Thomas Groh wrote:
> I think that's a good idea. I would call the outputs of a ParDo the "Main
> Output" and "Additional Outputs" - it seems like an easy way to make it
> clear that there's one output that is always expected, and there may be
> more.
Re: Renaming SideOutput
I think that's a good idea. I would call the outputs of a ParDo the "Main Output" and "Additional Outputs" - it seems like an easy way to make it clear that there's one output that is always expected, and there may be more.

On Tue, Apr 11, 2017 at 5:29 PM, Robert Bradshaw <rober...@google.com.invalid> wrote:
> We should do some renaming in Python too. Right now we have
> SideOutputValue which I'd propose naming TaggedOutput or something
> like that.
>
> Should the docs change too?
> https://beam.apache.org/documentation/programming-guide/#transforms-sideio
Re: Renaming SideOutput
We should do some renaming in Python too. Right now we have SideOutputValue which I'd propose naming TaggedOutput or something like that.

Should the docs change too?
https://beam.apache.org/documentation/programming-guide/#transforms-sideio

On Tue, Apr 11, 2017 at 5:25 PM, Kenneth Knowles wrote:
> +1 ditto about sideInput and sideOutput not actually being related
Re: Renaming SideOutput
+1 ditto about sideInput and sideOutput not actually being related

On Tue, Apr 11, 2017 at 3:52 PM, Robert Bradshaw <rober...@google.com.invalid> wrote:
> +1, I think this is a lot clearer.
Re: Renaming SideOutput
+1, I think this is a lot clearer.

On Tue, Apr 11, 2017 at 2:24 PM, Stephen Sisk wrote:
> strong +1 for changing the name away from sideOutput - the fact that
> sideInput and sideOutput are not really related was definitely a source of
> confusion for me when learning beam.
>
> S
Re: Renaming SideOutput
strong +1 for changing the name away from sideOutput - the fact that sideInput and sideOutput are not really related was definitely a source of confusion for me when learning beam.

S

On Tue, Apr 11, 2017 at 1:56 PM Thomas Groh wrote:
> Hey everyone:
>
> I'd like to rename DoFn.Context#sideOutput to #output (in the Java SDK).
>
> Having two methods, both named output, one which takes the "main output
> type" and one that takes a tag to specify the type more clearly
> communicates the actual behavior - sideOutput isn't a "special" way to
> output, it's the same as output(T), just to a specified PCollection. This
> will help pipeline authors understand the actual behavior of outputting to
> a tag, and detangle it from "sideInput", which is a special way to receive
> input. Giving them the same name means that it's not even strange to call
> output and provide the main output type, which is what we want - it's a
> more specific way to output, but does not have different restrictions or
> capabilities.
>
> This is also a pretty small change within the SDK - it touches about 20
> files, and the changes are pretty automatic.
>
> Thanks,
>
> Thomas
Renaming SideOutput
Hey everyone:

I'd like to rename DoFn.Context#sideOutput to #output (in the Java SDK).

Having two methods, both named output, one which takes the "main output type" and one that takes a tag to specify the type more clearly communicates the actual behavior - sideOutput isn't a "special" way to output, it's the same as output(T), just to a specified PCollection. This will help pipeline authors understand the actual behavior of outputting to a tag, and detangle it from "sideInput", which is a special way to receive input. Giving them the same name means that it's not even strange to call output and provide the main output type, which is what we want - it's a more specific way to output, but does not have different restrictions or capabilities.

This is also a pretty small change within the SDK - it touches about 20 files, and the changes are pretty automatic.

Thanks,

Thomas
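
For anyone skimming the thread, here is a rough, self-contained sketch of what the two overloads might look like after the rename. It is a toy model only: the `Tag` and `Context` classes below are hypothetical stand-ins written for illustration, not the Beam SDK's actual `TupleTag` or `DoFn.Context`, and the class name `OutputRenameSketch` is made up. The point it tries to show is that the untagged `output(value)` can be read as shorthand for `output(mainTag, value)`, so both forms write to the same kind of destination.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OutputRenameSketch {

    /** Hypothetical typed output tag (an assumption for this sketch, not Beam's TupleTag). */
    static final class Tag<T> {
        final String id;

        Tag(String id) {
            this.id = id;
        }
    }

    /** Toy context: main and tagged outputs are stored in exactly the same way. */
    static final class Context<OutputT> {
        private final Tag<OutputT> mainTag;
        private final Map<String, List<Object>> outputs = new HashMap<>();

        Context(Tag<OutputT> mainTag) {
            this.mainTag = mainTag;
        }

        /** Default form: no tag given, so the element goes to the main output. */
        void output(OutputT value) {
            output(mainTag, value);
        }

        /** Tagged form: what the thread proposes calling output instead of sideOutput. */
        <T> void output(Tag<T> tag, T value) {
            outputs.computeIfAbsent(tag.id, k -> new ArrayList<>()).add(value);
        }

        /** Returns the elements written to the given tag, in insertion order. */
        List<Object> elementsFor(Tag<?> tag) {
            return outputs.getOrDefault(tag.id, new ArrayList<>());
        }
    }

    public static void main(String[] args) {
        Tag<String> words = new Tag<>("words");      // plays the role of the main output
        Tag<Integer> lengths = new Tag<>("lengths"); // an additional output

        Context<String> c = new Context<>(words);
        c.output("beam");          // main output, no tag
        c.output(lengths, 4);      // additional output, selected by tag
        c.output(words, "runner"); // main output again, this time named explicitly

        System.out.println(c.elementsFor(words));   // prints [beam, runner]
        System.out.println(c.elementsFor(lengths)); // prints [4]
    }
}
```

In this sketch, calling `output` with the main tag is no different from calling it with no tag at all, which is the "not even strange" case described above.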