[jira] [Commented] (FLINK-4575) DataSet aggregate methods should support POJOs

2017-11-13 Thread Gabor Gevay (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16249444#comment-16249444
 ] 

Gabor Gevay commented on FLINK-4575:


OK, makes sense. Feel free to close this JIRA if you think we shouldn't do it.

> DataSet aggregate methods should support POJOs
> --
>
> Key: FLINK-4575
> URL: https://issues.apache.org/jira/browse/FLINK-4575
> Project: Flink
>  Issue Type: Improvement
>  Components: DataSet API
>Reporter: Gabor Gevay
>Priority: Minor
>  Labels: starter
>
> The aggregate methods of DataSets (aggregate, sum, min, max) currently only 
> support Tuples, with the fields specified by indices. With 
> https://issues.apache.org/jira/browse/FLINK-3702 resolved, adding support for 
> POJOs and field expressions would be easy: {{AggregateOperator}} would create 
> {{FieldAccessors}} instead of just storing field positions, and 
> {{AggregateOperator.AggregatingUdf}} would use these {{FieldAccessors}} 
> instead of the Tuple field access methods.
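
A minimal plain-Java sketch of the idea in the description above, not Flink code. It assumes a field can be read and written by name through a small accessor object, so the aggregation no longer depends on Tuple positions. {{IntFieldAccessor}}, {{PageStats}} and {{sumField}} are hypothetical names made up for this illustration; the real {{FieldAccessor}} and {{AggregateOperator}} classes may look quite different.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.List;

/** Plain-Java illustration only; none of these types are Flink classes. */
final class PojoSumSketch {

    /** Example POJO with public fields, addressed by a field name such as "clicks". */
    public static class PageStats {
        public String page;
        public int clicks;

        public PageStats() {}

        public PageStats(String page, int clicks) {
            this.page = page;
            this.clicks = clicks;
        }
    }

    /** Hypothetical stand-in for a field accessor: gets and sets one public int field by name. */
    static final class IntFieldAccessor<T> {
        private final Field field;

        IntFieldAccessor(Class<T> type, String fieldName) {
            try {
                this.field = type.getField(fieldName);
            } catch (NoSuchFieldException e) {
                throw new IllegalArgumentException(
                        "No public field '" + fieldName + "' in " + type.getName(), e);
            }
            if (field.getType() != int.class) {
                throw new IllegalArgumentException("Field '" + fieldName + "' is not an int");
            }
        }

        int get(T record) {
            try {
                return field.getInt(record);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }

        void set(T record, int value) {
            try {
                field.setInt(record, value);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    /**
     * Sums the accessed field over all records and writes the total into the
     * last record, which is returned. Reusing an input record is a choice of
     * this sketch to keep it short.
     */
    static <T> T sumField(List<T> records, IntFieldAccessor<T> accessor) {
        if (records.isEmpty()) {
            throw new IllegalArgumentException("Cannot aggregate an empty list");
        }
        int total = 0;
        T last = null;
        for (T record : records) {
            total += accessor.get(record);
            last = record;
        }
        accessor.set(last, total);
        return last;
    }

    public static void main(String[] args) {
        List<PageStats> input = Arrays.asList(
                new PageStats("a", 3), new PageStats("a", 4), new PageStats("a", 5));
        PageStats result = sumField(input, new IntFieldAccessor<>(PageStats.class, "clicks"));
        System.out.println(result.page + " " + result.clicks); // prints "a 12"
    }
}
```

The aggregation loop only talks to the accessor, so the same loop would work for any record type for which an accessor can be built.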



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-4575) DataSet aggregate methods should support POJOs

2017-11-13 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16249312#comment-16249312
 ] 

Fabian Hueske commented on FLINK-4575:
--

I'm not sure about extending the DataSet API for such special cases. In fact, 
I'd rather remove the support for built-in aggregation functions on Tuples in 
the future as well.
IMO, the DataSet API is a rather low-level API that should provide the tools to 
implement custom functions based on {{MapFunction}}, {{GroupReduceFunction}}, 
etc.

Built-in aggregation functions are much better covered by the Table API or SQL
support. In fact, it is very easy to convert a {{DataSet}} into a {{Table}} and
vice versa.
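
A minimal plain-Java sketch of the "custom function" route for a POJO, without any built-in aggregate method. It does not use Flink: {{Reducer}} is a hypothetical stand-in with the shape of a two-argument reduce function, and {{Order}}, {{MAX_AMOUNT}} and {{reduceAll}} are names made up for this illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Plain-Java illustration only; none of these types are Flink classes. */
final class CustomReduceSketch {

    /** Example POJO. */
    public static class Order {
        public String customer;
        public double amount;

        public Order() {}

        public Order(String customer, double amount) {
            this.customer = customer;
            this.amount = amount;
        }
    }

    /** Hypothetical stand-in for a two-argument reduce function. */
    interface Reducer<T> {
        T reduce(T a, T b);
    }

    /** User-written "max by amount": keeps whichever order has the larger amount. */
    static final Reducer<Order> MAX_AMOUNT = (a, b) -> a.amount >= b.amount ? a : b;

    /** Folds the reducer over the records from left to right. */
    static <T> T reduceAll(List<T> records, Reducer<T> reducer) {
        if (records.isEmpty()) {
            throw new IllegalArgumentException("Cannot reduce an empty list");
        }
        T acc = records.get(0);
        for (int i = 1; i < records.size(); i++) {
            acc = reducer.reduce(acc, records.get(i));
        }
        return acc;
    }

    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(
                new Order("alice", 10.0), new Order("bob", 25.5), new Order("carol", 7.0));
        Order best = reduceAll(orders, MAX_AMOUNT);
        System.out.println(best.customer + " " + best.amount); // prints "bob 25.5"
    }
}
```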



[jira] [Commented] (FLINK-4575) DataSet aggregate methods should support POJOs

2017-11-12 Thread Gabor Gevay (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248874#comment-16248874
 ] 

Gabor Gevay commented on FLINK-4575:


[~vcycyv], I'm not sure how {{getFlatFields}} would help here. (How would you
convert back to a POJO at the end?)

But if you would like to work on this JIRA, the approach outlined in the
description should work. I think this is the cleanest solution, since
{{FieldAccessor}} is meant exactly for situations like this one, where we have
to get and set a field based on a field expression. However, you would have to
resolve https://issues.apache.org/jira/browse/FLINK-4578 first. I think that
could be resolved by the solution that I wrote in a comment there.



[jira] [Commented] (FLINK-4575) DataSet aggregate methods should support POJOs

2017-11-10 Thread Chuyang Wan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248142#comment-16248142
 ] 

Chuyang Wan commented on FLINK-4575:


[~ggevay], if you are OK with converting the POJO to a Tuple via
{{getFlatFields}}, I can take the task. The idea is to accept a POJO field as
the argument of the aggregation operators, convert it to a tuple field, and
then use the tuple field just as the current code does.



[jira] [Commented] (FLINK-4575) DataSet aggregate methods should support POJOs

2017-11-10 Thread Chuyang Wan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247175#comment-16247175
 ] 

Chuyang Wan commented on FLINK-4575:


How about converting the POJO to a Tuple via getFlatFields? Just a thought...



[jira] [Commented] (FLINK-4575) DataSet aggregate methods should support POJOs

2016-09-04 Thread Gabor Gevay (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15463360#comment-15463360
 ] 

Gabor Gevay commented on FLINK-4575:


This is a bit harder than I thought, because the ForwardedFields property has
to be added when the aggregation field is not the same as the key field.
Note that the old logic for determining this has a bug:
https://issues.apache.org/jira/browse/FLINK-4578
