Re: [Discuss] FLIP-13 Side Outputs in Flink

2016-12-09 Thread Chen Qin
Dear Flink community members,

Please review and comment on https://github.com/apache/flink/pull/2982.

Thanks,
Chen





Re: [Discuss] FLIP-13 Side Outputs in Flink

2016-11-03 Thread Chen Qin
Adding another abstract method to the Collector interface is also considerably
easier from an API backward-compatibility point of view.

The cost would be either

1) many classes with an empty implementation of the *void collect(OutputTag
tag, S value)* method, or

2) splitting the StreamRecord-related classes that implement the Collector
interface from the graph-generator-related classes. For the StreamRecord
ones, we might be able to implement *collect(T out)* by calling *void
collect(OutputTag tag, S value)*. The graph-generator ones would stay as
they are.
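
To make the two options above a bit more concrete, here is a minimal Java
sketch. It is not Flink code: Collector, OutputTag, MainOnlyCollector and
TaggedCollector are local stand-ins written only for illustration, and the
type parameter on OutputTag is an assumption, since the signature quoted in
the message shows no generics.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SideOutputSketch {

    /** Hypothetical tag naming one output; an assumption, not Flink's class. */
    static final class OutputTag<S> {
        final String id;
        OutputTag(String id) { this.id = id; }
    }

    /** Stand-in for a Collector interface with the extra abstract method. */
    interface Collector<T> {
        void collect(T record);
        <S> void collect(OutputTag<S> tag, S value);
    }

    /** Option 1: a collector with no use for side outputs gets an empty method. */
    static final class MainOnlyCollector<T> implements Collector<T> {
        final List<T> main = new ArrayList<>();

        @Override
        public void collect(T record) { main.add(record); }

        @Override
        public <S> void collect(OutputTag<S> tag, S value) {
            // intentionally empty: tagged records are silently dropped
        }
    }

    /** Option 2: collect(T) is implemented by calling the tagged variant. */
    static final class TaggedCollector<T> implements Collector<T> {
        final OutputTag<T> mainTag = new OutputTag<>("main");
        final Map<String, List<Object>> outputs = new HashMap<>();

        @Override
        public void collect(T record) { collect(mainTag, record); }

        @Override
        public <S> void collect(OutputTag<S> tag, S value) {
            outputs.computeIfAbsent(tag.id, k -> new ArrayList<>()).add(value);
        }

        /** Returns the records emitted under the given tag id (empty if none). */
        List<Object> get(String id) {
            return outputs.getOrDefault(id, new ArrayList<>());
        }
    }

    public static void main(String[] args) {
        OutputTag<String> rejected = new OutputTag<>("rejected");
        TaggedCollector<Integer> out = new TaggedCollector<>();
        out.collect(42);                        // routed through the "main" tag
        out.collect(rejected, "not a number");  // routed to the side output
        System.out.println(out.get("main"));
        System.out.println(out.get("rejected"));
    }
}
```

In this sketch the empty method of option 1 drops tagged records without
any signal to the caller, while option 2 keeps a single code path for main
and side records.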





-- 
-Chen Qin


Re: [Discuss] FLIP-13 Side Outputs in Flink

2016-11-02 Thread Chen Qin
Hi Fabian,

Thanks for your feedback, and sorry for the late reply.
Some comments are inline. I will update the FLIP-13 wiki to reflect your
comments.


> - Will multiple side outputs of the same type be supported?

It wasn't implemented in the prototype, but it should be easy to support:
we have a unique id in the stream record.

> - If I got it right, the FLIP proposes to change the signatures of many
> user-defined functions (FlatMapFunction, WindowFunction, ...). Most of
> these interfaces/classes are annotated with @Public, which means we cannot
> change them in the Flink 1.x release line. What would be alternatives? I
> can think of a) casting the Collector into a RichCollector (as you do in
> your prototype) or

This is like a private magic API. It should be 100% compatible, but it is
not a good implementation.

> b) retrieve the RichCollector from the RuntimeContext
> that a RichFunction provides.

This seems the better option, yet many heavily used functions such as
FlatMap would not get support. To support them, we would need to create
some redundant classes that inherit from RichFunction (for example,
implementing RichFlatMap). We might put these in a different package to
isolate the impact of this change.
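
On the first question in this message (several side outputs of the same
type), a small Java sketch of how an id could tell such outputs apart.
Everything here is hypothetical: OutputTag and SideOutputs are local
stand-ins, and using a string id on the tag is only an assumption about how
the "unique id" mentioned in the reply might be used.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TagIdentitySketch {

    /** Hypothetical tag: the id, not the element type, identifies the output. */
    static final class OutputTag<S> {
        final String id;
        final Class<S> type;

        OutputTag(String id, Class<S> type) {
            this.id = id;
            this.type = type;
        }
    }

    /** Stand-in for whatever routes tagged records; buckets are keyed by id. */
    static final class SideOutputs {
        private final Map<String, List<Object>> byId = new HashMap<>();

        <S> void emit(OutputTag<S> tag, S value) {
            byId.computeIfAbsent(tag.id, k -> new ArrayList<>()).add(value);
        }

        /** Returns a typed copy of everything emitted under the tag's id. */
        <S> List<S> read(OutputTag<S> tag) {
            List<S> result = new ArrayList<>();
            for (Object o : byId.getOrDefault(tag.id, new ArrayList<>())) {
                result.add(tag.type.cast(o));
            }
            return result;
        }
    }

    public static void main(String[] args) {
        // Two side outputs with the same element type but different ids.
        OutputTag<String> late = new OutputTag<>("late", String.class);
        OutputTag<String> malformed = new OutputTag<>("malformed", String.class);

        SideOutputs outputs = new SideOutputs();
        outputs.emit(late, "event-17");
        outputs.emit(malformed, "???");
        outputs.emit(late, "event-23");

        System.out.println(outputs.read(late));
        System.out.println(outputs.read(malformed));
    }
}
```

Because the buckets are keyed by id, the two String-typed outputs stay
separate even though their element types are identical.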


> I'm not so familiar with the internals of the DataStream API, so I leave
> comments on that to others.
>
> Best, Fabian




-- 
-Chen Qin


Re: [Discuss] FLIP-13 Side Outputs in Flink

2016-10-27 Thread Fabian Hueske
Hi CPC,

I agree, support for side outputs would be nice for DataSet as well.
However, this is not easily possible because it would require an extensive
rewrite of the DataSet optimizer.
IMO, that's out of scope for this proposal.

Cheers, Fabian

2016-10-27 0:29 GMT+02:00 CPC:

> Is it just related to the stream API? This feature could be really useful for
> ETL scenarios with the DataSet API as well.


Re: [Discuss] FLIP-13 Side Outputs in Flink

2016-10-26 Thread CPC
Is it just related to the stream API? This feature could be really useful for
ETL scenarios with the DataSet API as well.



Re: [Discuss] FLIP-13 Side Outputs in Flink

2016-10-26 Thread Fabian Hueske
Hi Chen,

thanks for this interesting proposal. I think side output would be a very
valuable feature to have!

I went over the FLIP and have a few questions.

- Will multiple side outputs of the same type be supported?
- If I got it right, the FLIP proposes to change the signatures of many
user-defined functions (FlatMapFunction, WindowFunction, ...). Most of
these interfaces/classes are annotated with @Public, which means we cannot
change them in the Flink 1.x release line. What would be alternatives? I
can think of a) casting the Collector into a RichCollector (as you do in
your prototype) or b) retrieve the RichCollector from the RuntimeContext
that a RichFunction provides.
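
The two alternatives a) and b) above can be sketched in Java roughly as
follows. This is only an illustration under stated assumptions: all types
are local stand-ins rather than Flink's classes, and RichCollector, its
collectSide method, and RuntimeContext.getRichCollector() are hypothetical
names that do not come from the FLIP.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RichCollectorSketch {

    // Local stand-ins; the user-facing flatMap signature stays unchanged.
    interface Collector<T> { void collect(T record); }

    interface RichCollector<T> extends Collector<T> {
        void collectSide(String tag, Object value); // hypothetical method
    }

    interface FlatMapFunction<I, O> { void flatMap(I value, Collector<O> out); }

    interface RuntimeContext { RichCollector<?> getRichCollector(); } // hypothetical

    static abstract class RichFunction {
        private RuntimeContext context;
        void setRuntimeContext(RuntimeContext context) { this.context = context; }
        RuntimeContext getRuntimeContext() { return context; }
    }

    /** Option a): cast the Collector handed to the function. */
    static final class CastingSplitter implements FlatMapFunction<Integer, Integer> {
        @Override
        public void flatMap(Integer value, Collector<Integer> out) {
            if (value % 2 == 0) {
                out.collect(value);
            } else {
                // Throws ClassCastException if a plain Collector is passed in.
                ((RichCollector<Integer>) out).collectSide("odd", value);
            }
        }
    }

    /** Option b): fetch the RichCollector from the RuntimeContext. */
    static final class ContextSplitter extends RichFunction
            implements FlatMapFunction<Integer, Integer> {
        @Override
        public void flatMap(Integer value, Collector<Integer> out) {
            if (value % 2 == 0) {
                out.collect(value);
            } else {
                getRuntimeContext().getRichCollector().collectSide("odd", value);
            }
        }
    }

    /** Test double that records main and side records. */
    static final class RecordingCollector implements RichCollector<Integer> {
        final List<Integer> main = new ArrayList<>();
        final Map<String, List<Object>> side = new HashMap<>();

        @Override
        public void collect(Integer record) { main.add(record); }

        @Override
        public void collectSide(String tag, Object value) {
            side.computeIfAbsent(tag, k -> new ArrayList<>()).add(value);
        }
    }

    public static void main(String[] args) {
        RecordingCollector a = new RecordingCollector();
        CastingSplitter casting = new CastingSplitter();
        for (int i = 1; i <= 4; i++) casting.flatMap(i, a);
        System.out.println(a.main + " " + a.side);

        RecordingCollector b = new RecordingCollector();
        ContextSplitter viaContext = new ContextSplitter();
        viaContext.setRuntimeContext(() -> b);
        for (int i = 1; i <= 4; i++) viaContext.flatMap(i, b);
        System.out.println(b.main + " " + b.side);
    }
}
```

Neither variant touches the flatMap signature. Variant a) depends on the
runtime always passing a RichCollector, whereas variant b) is only
available to functions that extend RichFunction.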

I'm not so familiar with the internals of the DataStream API, so I leave
comments on that to others.

Best, Fabian

2016-10-25 18:00 GMT+02:00 Chen Qin :

> Hey folks,
>
> Please give feedback on FLIP-13!
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-13+Side+Outputs+in+Flink
> JIRA task link to google doc
> https://issues.apache.org/jira/browse/FLINK-4460
>
> Thanks,
> Chen Qin
>