Hi Diego,

you can also broadcast a changelog stream:

    DataStream<X> mainStream = ...
    DataStream<Y> changeStream = ...
    mainStream.connect(changeStream.broadcast()).flatMap(new YourCoFlatMapFunction());

All records of the changeStream will be forwarded to each instance of the flatMap operator.

Best,
Fabian

2017-01-31 8:12 GMT+01:00 Diego Fustes Villadóniga <dfus...@oesia.com>:

> Hi Stephan,
>
> Thanks a lot for your response. I'll study the options that you mention.
> I'm not sure whether the "changelog stream" will be easy to implement, since
> the lookup is based on matching IP ranges and not just keys.
>
> Regards,
>
> Diego
>
> *From:* Stephan Ewen [mailto:se...@apache.org]
> *Sent:* Monday, January 30, 2017 17:39
> *To:* user@flink.apache.org
> *Subject:* Re: Calling external services/databases from DataStream API
>
> Hi!
>
> The distributed cache would indeed be nice to add to the DataStream API.
> Since the runtime parts for that are all in place, the code would be mainly
> on the "client" side that sets up the JobGraph to be submitted and executed.
>
> For the problem of scaling this, there are two solutions that I can see:
>
> (1) Simpler: Use the new asynchronous I/O operator to talk with the
> external database in an asynchronous fashion (that should help to get
> higher throughput)
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/stream/asyncio.html
>
> (2) More elaborate: Convert the lookup database into a "changelog stream"
> and make the enrichment operation a "stream-to-stream" join.
>
> Greetings,
> Stephan
>
> On Mon, Jan 30, 2017 at 1:36 PM, Jonas <jo...@huntun.de> wrote:
>
> I have a similar use case where I (for the purposes of this discussion)
> have a GeoIP database that is not fully available from the start but will
> eventually be "full". The GeoIP tuples are coming in one after another.
> After ~4M tuples the GeoIP database is complete.
>
> I also need to do the same query.
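
A note on the IP-range concern raised in the thread: because broadcast() delivers every changelog record to every parallel instance, each instance can hold the complete range table locally, so the lookup does not have to be key-based. Below is a minimal sketch of such a table in plain Java. The class IpRangeTable, its method names, and the sample values are hypothetical, and the sketch assumes IPv4 addresses and non-overlapping ranges. Inside a CoFlatMapFunction, put would presumably be called from flatMap2 (the changelog side) and lookup from flatMap1 (the main stream).

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: the range table each parallel instance would keep. */
class IpRangeTable {

    /** Inclusive upper bound plus payload for one range. */
    private static final class Range {
        final long end;
        final String value;

        Range(long end, String value) {
            this.end = end;
            this.value = value;
        }
    }

    // Ranges keyed by their inclusive start address.
    // Assumption: ranges do not overlap.
    private final TreeMap<Long, Range> ranges = new TreeMap<>();

    /** Applies one changelog record; a later record with the same start replaces the earlier one. */
    void put(long start, long end, String value) {
        ranges.put(start, new Range(end, value));
    }

    /** Returns the payload of the range containing ip, or null if no range matches. */
    String lookup(long ip) {
        // Greatest start address <= ip, then check the upper bound.
        Map.Entry<Long, Range> e = ranges.floorEntry(ip);
        return (e != null && ip <= e.getValue().end) ? e.getValue().value : null;
    }

    /** Converts a dotted IPv4 string into an unsigned 32-bit value held in a long. */
    static long toLong(String ipv4) {
        long result = 0;
        for (String part : ipv4.split("\\.")) {
            result = (result << 8) | Integer.parseInt(part);
        }
        return result;
    }

    public static void main(String[] args) {
        IpRangeTable table = new IpRangeTable();
        table.put(toLong("10.0.0.0"), toLong("10.0.0.255"), "range-A");
        System.out.println(table.lookup(toLong("10.0.0.42")));
        System.out.println(table.lookup(toLong("10.0.1.1")));
    }
}
```

The sorted map makes each lookup a single floorEntry call, and records on the main stream that arrive before the matching range has been broadcast simply get no match. The sketch keeps the table on the heap only and does not address fault tolerance.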