[
https://issues.apache.org/jira/browse/FLINK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367368#comment-14367368
]
Gyula Fora commented on FLINK-1503:
-----------------------------------
Hey,
As you can probably tell from the description, this is a very complex issue
with a lot of possibilities and implications.
You are always welcome to ask anything about this on the dev list, or you can
reach out to me directly:
http://flink.apache.org/community.html#mailing-lists
As a first step I would probably try to understand what's going on under the
hood in the Flink runtime, and try to come up with some use cases that would
need batch-streaming interaction.
Having a use case in mind can be a very good guide.
I also did some prototype implementations a while back here:
https://github.com/mbalassi/flink/tree/batch-integration
The branch is completely outdated, but if you check it out you can run the
"lambda example".
> GSoC project: Batch and Streaming integration through new operators and
> unified API
> -----------------------------------------------------------------------------------
>
> Key: FLINK-1503
> URL: https://issues.apache.org/jira/browse/FLINK-1503
> Project: Flink
> Issue Type: New Feature
> Components: Java API, Scala API, Streaming
> Reporter: Gyula Fora
> Priority: Minor
> Labels: gsoc2015, java, scala
>
> Currently the Flink batch and streaming APIs (Java and Scala) and runtimes
> work independently of each other, without operators to allow interactions
> between DataStreams and DataSets in a fault-tolerant manner.
> The goal is to modify the execution environments and the runtime layer to
> allow these interactions.
> Possible runtime changes to add:
> -Interaction through intermediate files
> -Interaction by connecting the execution graphs
> Possible new operators to implement:
> -Converting a dataset to a datastream (either by directly streaming in the
> results or periodically executing the dataset transformations)
> -Hash joining a datastream with a dataset by key
> -Other binary operators with streams and sets
> The implementations should work with the fault tolerance mechanism provided
> by then (exactly-once or at-least-once guarantees).
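To make the "hash joining a datastream with a dataset by key" operator from the
description more concrete, here is a minimal sketch in plain Java. It does not
use any Flink API: the class name StreamSetHashJoin and the methods buildTable
and probe are hypothetical, records are assumed to be simple (key, value)
string pairs, and fault tolerance is not addressed. The idea is the usual
build/probe split: the bounded data set is indexed once by key, and each
arriving stream element is then looked up in that index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Flink.
public class StreamSetHashJoin {

    // Build phase: index the bounded data set by key.
    // Each record is a two-element array: {key, value}.
    static Map<String, List<String>> buildTable(List<String[]> dataSet) {
        Map<String, List<String>> table = new HashMap<>();
        for (String[] record : dataSet) {
            table.computeIfAbsent(record[0], k -> new ArrayList<>()).add(record[1]);
        }
        return table;
    }

    // Probe phase: called once per stream element as it arrives.
    // Emits one joined record per matching data set entry (inner join).
    static List<String> probe(Map<String, List<String>> table, String key, String value) {
        List<String> joined = new ArrayList<>();
        List<String> matches = table.getOrDefault(key, Collections.<String>emptyList());
        for (String match : matches) {
            joined.add(key + ":" + value + "," + match);
        }
        return joined;
    }

    public static void main(String[] args) {
        // The "data set" side: bounded, fully available before the stream starts.
        List<String[]> users = Arrays.asList(
                new String[] {"u1", "Alice"},
                new String[] {"u2", "Bob"});
        Map<String, List<String>> table = buildTable(users);

        // The "data stream" side: elements are processed one at a time.
        List<String[]> events = Arrays.asList(
                new String[] {"u1", "click"},
                new String[] {"u3", "view"},
                new String[] {"u2", "buy"});
        for (String[] event : events) {
            for (String result : probe(table, event[0], event[1])) {
                System.out.println(result);
            }
        }
    }
}
```

In this sketch, stream elements whose key has no match in the data set are
dropped; an outer-join variant would emit them with an empty right side
instead. A real operator would also have to decide how the hash table is
distributed across parallel instances and how it is restored after a failure.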