Hi,
I did some work recently on adding support for SQL-like queries on top
of DataSets. (This is known as "named datasets" in the jira issue:
https://issues.apache.org/jira/browse/FLINK-947?jql=project%20%3D%20FLINK%20AND%20assignee%20%3D%20currentUser()%20AND%20resolution%20%3D%20Unresolved).

I have support for filter, join, grouping, and aggregation. I think the
basis is quite solid now, but we can still add support for more data
types and more operations in the select expressions.

Please have a look at my branch if you're interested:
https://github.com/aljoscha/flink/tree/linq

You can look at the new Expression ITCases to see what features are
currently available and how the interface is used. There are also two
complete programs: PageRankExpression and TPCHQuery3Expression.

And now at last, a sneak peek at how the new interface is used:

in.group('key).select('key, ('a + 10).avg + " the average", 'a.count)

The 'foo notation is a Scala symbol literal; I use symbols in the DSL
to reference named fields.
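
For readers unfamiliar with this style of DSL, here is a minimal sketch of
how symbols can be turned into an expression tree through implicit
conversions. It is only an illustration and not the code from the branch:
Expr, Field, Literal, Plus, Avg, Count and the conversion functions are all
hypothetical names. It writes Symbol("a") instead of 'a so that it also
compiles on newer Scala versions, where the literal form is deprecated or
removed.

```scala
import scala.language.implicitConversions

object SymbolDslSketch {
  // Hypothetical expression tree; these are NOT the classes from the branch.
  sealed trait Expr {
    def +(other: Expr): Expr = Plus(this, other)
    def avg: Expr = Avg(this)
    def count: Expr = Count(this)
  }
  case class Field(name: String) extends Expr
  case class Literal(value: Any) extends Expr
  case class Plus(left: Expr, right: Expr) extends Expr
  case class Avg(child: Expr) extends Expr
  case class Count(child: Expr) extends Expr

  // Assumed conversions: a symbol becomes a field reference,
  // plain Scala values become literals.
  implicit def symbolToField(s: Symbol): Expr = Field(s.name)
  implicit def intToLiteral(i: Int): Expr = Literal(i)
  implicit def stringToLiteral(s: String): Expr = Literal(s)

  // Mirrors the middle select expression from the example above.
  val example: Expr = (Symbol("a") + 10).avg + " the average"

  // Symbol("a") has no `count` method, so the compiler applies symbolToField.
  val counted: Expr = Symbol("a").count

  def main(args: Array[String]): Unit = {
    println(example)
    println(counted)
  }
}
```

Nothing is evaluated when the expression is written down. The result is a
tree that a planner could later inspect and translate into operations on
the underlying DataSet.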

Cheers,
Aljoscha
