Hi Diego,

You can also broadcast a changelog stream:

DataStream<X> mainStream = ...
DataStream<Y> changeStream = ...

mainStream
    .connect(changeStream.broadcast())
    .flatMap(new YourCoFlatMapFunction());

All records of the changeStream will be forwarded to each instance of the
flatmap operator.
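
Since every instance sees all change records, each one can keep the full range table locally, which should also cover lookups that match IP ranges rather than exact keys. Below is a minimal plain-Java sketch of such a table (no Flink API involved). The class IpRangeTable and its methods are hypothetical names for illustration, and the sketch assumes IPv4 addresses and non-overlapping ranges.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical lookup table that one flatmap instance could hold in memory. */
final class IpRangeTable {

    /** Upper bound and payload of one range (assumption: ranges do not overlap). */
    private static final class Range {
        final long end;
        final String label;

        Range(long end, String label) {
            this.end = end;
            this.label = label;
        }
    }

    // Ranges sorted by their first address, so floorEntry() finds the candidate range.
    private final TreeMap<Long, Range> rangesByStart = new TreeMap<>();

    /** Converts a dotted IPv4 string into its unsigned 32-bit value, stored in a long. */
    static long toLong(String ip) {
        String[] parts = ip.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ip);
        }
        long value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("Not an IPv4 address: " + ip);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    /** Would be called for each record of the broadcast change stream. */
    void addRange(String firstIp, String lastIp, String label) {
        rangesByStart.put(toLong(firstIp), new Range(toLong(lastIp), label));
    }

    /** Would be called for each record of the main stream; null means "no range known yet". */
    String lookup(String ip) {
        long address = toLong(ip);
        Map.Entry<Long, Range> candidate = rangesByStart.floorEntry(address);
        if (candidate == null || address > candidate.getValue().end) {
            return null;
        }
        return candidate.getValue().label;
    }

    public static void main(String[] args) {
        IpRangeTable table = new IpRangeTable();
        table.addRange("10.0.0.0", "10.0.0.255", "office");
        table.addRange("192.168.1.0", "192.168.1.127", "lab");
        System.out.println(table.lookup("10.0.0.42"));     // prints "office"
        System.out.println(table.lookup("192.168.1.200")); // prints "null"
    }
}
```

In a CoFlatMapFunction, the change-stream side would call something like addRange and the main-stream side something like lookup. A null result covers the case where the matching range has not arrived yet; how to handle such records (drop, buffer, emit unenriched) is up to the job.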

Best, Fabian

2017-01-31 8:12 GMT+01:00 Diego Fustes Villadóniga <dfus...@oesia.com>:

> Hi Stephan,
>
>
>
> Thanks a lot for your response. I’ll study the options that you mention.
> I’m not sure the “changelog stream” will be easy to implement, since the
> lookup is based on matching IP ranges and not just keys.
>
>
>
> Regards,
>
>
>
> Diego
>
>
>
> *From:* Stephan Ewen [mailto:se...@apache.org]
> *Sent:* Monday, January 30, 2017 17:39
> *To:* user@flink.apache.org
> *Subject:* Re: Calling external services/databases from DataStream API
>
>
>
> Hi!
>
>
>
> The distributed cache would indeed be nice to add to the DataStream API.
> Since the runtime parts for that are all in place, the work would be
> mainly on the "client" side that sets up the JobGraph to be submitted and
> executed.
>
>
>
> For the problem of scaling this, there are two solutions that I can see:
>
>
>
> (1) Simpler: Use the new asynchronous I/O operator to talk to the
> external database asynchronously (that should help to get higher
> throughput):
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/stream/asyncio.html
>
>
>
> (2) More elaborate: Convert the lookup database into a "changelog stream"
> and make the enrichment operation a "stream-to-stream" join.
>
>
>
> Greetings,
>
> Stephan
>
>
>
>
>
> On Mon, Jan 30, 2017 at 1:36 PM, Jonas <jo...@huntun.de> wrote:
>
> I have a similar use case where I (for the purposes of this discussion)
> have a GeoIP database that is not fully available from the start but will
> eventually be "full". The GeoIP tuples come in one after another.
> After ~4M tuples the GeoIP database is complete.
>
> I also need to do the same query.
>
>
>
>
>
>
