[ 
https://issues.apache.org/jira/browse/FLINK-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282679#comment-14282679
 ] 

Stephan Ewen commented on FLINK-1098:
-------------------------------------

Can we do this with a special {{FlatMapFunction}}, rather than with extra 
functions on the data set? Something like:
{code}
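import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.util.Collector;

public abstract class FlatMapArrayFunction<I, O> extends RichFlatMapFunction<I, O> {

  // Emits each element of the array returned by flatMapArray() as its own record.
  @Override
  public void flatMap(I in, Collector<O> out) throws Exception {
    O[] result = flatMapArray(in);
    for (O element : result) {
      out.collect(element);
    }
  }

  // Implemented by the user: maps one input to an array of outputs.
  public abstract O[] flatMapArray(I in) throws Exception;
}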
{code}

This could go into the {{lib}} of common utility functions.
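
To illustrate how a user would plug into such a base class, here is a minimal 
sketch that runs without Flink on the classpath. The {{Collector}} interface 
and the base class below are simplified stand-ins (no rich-function base 
class, no checked exceptions), and {{Tokenizer}} and {{tokenize()}} are 
hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;

class FlatMapArraySketch {

  /** Stand-in for Flink's Collector so the sketch has no Flink dependency. */
  interface Collector<T> {
    void collect(T record);
  }

  /** Same shape as the proposed utility, without the Flink base class. */
  abstract static class FlatMapArrayFunction<I, O> {

    // Emits every element of the returned array as a separate record.
    public void flatMap(I in, Collector<O> out) {
      for (O element : flatMapArray(in)) {
        out.collect(element);
      }
    }

    public abstract O[] flatMapArray(I in);
  }

  /** Hypothetical user function: splits a line into lower-case words. */
  static class Tokenizer extends FlatMapArrayFunction<String, String> {
    @Override
    public String[] flatMapArray(String line) {
      return line.toLowerCase().split("\\W+");
    }
  }

  /** Drives the function over a few lines, collecting into a plain list. */
  static List<String> tokenize(List<String> lines) {
    List<String> words = new ArrayList<>();
    Tokenizer tokenizer = new Tokenizer();
    for (String line : lines) {
      tokenizer.flatMap(line, words::add);
    }
    return words;
  }
}
```

With the real Flink classes, the user-side part would presumably shrink to 
the {{Tokenizer}} subclass, passed to the existing {{flatMap()}} on the data 
set.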

For a more functional approach: in functional programming, this function is 
usually called {{flatten()}}, and it should work on both {{DataSet<List<T>>}} 
and {{DataSet<T[]>}}.
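
As a rough sketch of the intended {{flatten()}} semantics, the following uses 
plain Java collections and {{java.util.stream}} as a stand-in for a data set. 
It is not Flink API; {{FlattenSketch}}, {{flattenArrays()}} and 
{{flattenLists()}} are hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

class FlattenSketch {

  /** Flattens a collection of arrays into one list, preserving order. */
  static <T> List<T> flattenArrays(Collection<T[]> nested) {
    return nested.stream()
        .flatMap(array -> Arrays.stream(array))
        .collect(Collectors.toList());
  }

  /** Flattens a collection of lists into one list, preserving order. */
  static <T> List<T> flattenLists(Collection<List<T>> nested) {
    return nested.stream()
        .flatMap(list -> list.stream())
        .collect(Collectors.toList());
  }
}
```

The two variants carry different names here because, in Java, two overloads 
taking {{Collection<T[]>}} and {{Collection<List<T>>}} would have the same 
erasure and could not coexist as a single {{flatten()}}.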

In Scala, this is nice and easy: we can add an implicit conversion for array 
or list data sets, with a guard so that the conversion only applies if the 
element type is a list or an array.

For Java, it is not that nice. Adding the special functions would again bloat 
the API, so we again need to make a careful tradeoff.

> flatArray() operator that converts arrays to elements
> -----------------------------------------------------
>
>                 Key: FLINK-1098
>                 URL: https://issues.apache.org/jira/browse/FLINK-1098
>             Project: Flink
>          Issue Type: New Feature
>            Reporter: Timo Walther
>            Priority: Minor
>
> It would be great to have an operator that converts e.g. from String[] to 
> String. Actually, it is just a flatMap over the elements of an array.
> A typical use case is a WordCount where we then could write:
> {code}
> text
> .map((line) -> line.toLowerCase().split("\\W+"))
> .flatArray()
> .map((word) -> new Tuple2(word, 1))
> .groupBy(0)
> .sum(1);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
