[ https://issues.apache.org/jira/browse/FLINK-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136277#comment-14136277 ]
Fabian Hueske commented on FLINK-1098:
--------------------------------------
The current APIs support many use cases, and recent additions and proposals are
shortcuts rather than new functionality. I think we are now at a point where we
should decide whether we want (1) an API with many built-in features or (2) a
concise set of the most common operations.
Option (1) would mean a very rich feature set that could make many things
comfortable for users, but it has drawbacks such as high maintenance effort
(incl. documentation and porting to other language bindings) and a potentially
bloated API that makes it hard for new users to find their way around.
Option (2), on the other hand, offers less user comfort but is easier to
maintain and easier to become familiar with.
A compromise could be to extract some of the non-fundamental features from
DataSet and put them into some add-on operator package. That way we could
maintain a concise API while having the option to use a rich operator set.
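As a rough sketch of what such an add-on could look like, the snippet below
keeps the flattening logic in a separate helper class instead of adding a method
to the core type. It uses plain Java streams as a stand-in for DataSet, and the
names ArrayOps and flatArray are hypothetical, not part of any existing Flink
API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical add-on helper; java.util.stream.Stream stands in for DataSet. */
final class ArrayOps {
    private ArrayOps() {}

    /** Flattens a stream of arrays into a stream of their elements. */
    static <T> Stream<T> flatArray(Stream<T[]> arrays) {
        // This is exactly a flatMap over the elements of each array.
        return arrays.flatMap(Arrays::stream);
    }

    public static void main(String[] args) {
        // Split each line into words, then flatten the resulting String[] values.
        List<String> words = flatArray(
                Stream.of("To be", "or not to be")
                      .map(line -> line.toLowerCase().split("\\W+")))
            .collect(Collectors.toList());
        System.out.println(words); // prints [to, be, or, not, to, be]
    }
}
```

The core API stays unchanged in this arrangement: users who want the shortcut
call the helper, and everyone else can write the flatMap directly.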
I am not strictly against adding new operators to the APIs, but I think we
should have a discussion about this issue.
I tend to go with the second option (concise API). If we find a way to go with
the add-on operator package, even better.
What do you think?
> flatArray() operator that converts arrays to elements
> -----------------------------------------------------
>
> Key: FLINK-1098
> URL: https://issues.apache.org/jira/browse/FLINK-1098
> Project: Flink
> Issue Type: New Feature
> Reporter: Timo Walther
> Priority: Minor
>
> It would be great to have an operator that converts e.g. from String[] to
> String. Actually, it is just a flatMap over the elements of an array.
> A typical use case is a WordCount where we then could write:
> {code}
> text
>   .map((line) -> line.toLowerCase().split("\\W+"))
>   .flatArray()
>   .map((word) -> new Tuple2<>(word, 1))
>   .groupBy(0)
>   .sum(1);
> {code}