A quick note on this: the side-input API is still ongoing work and it turns out
it’s more complicated (obviously … 😳) and we will need quite a bit more work on
other parts of Flink before we can provide a good built-in solution.
In the meantime, you can check out the Async I/O operator [1]. I th
Hi Josh,
I have a use-case similar to yours. I need to join a stream with data from a
database to which I have access via a REST API. Since the Side inputs API
continues begin and ongoing work. I am wondering how did you approached it,
Did you use the rich function updating it periodically?
Thank
Hi Aljoscha,
That sounds exactly like the kind of feature I was looking for, since my
use-case fits the "Join stream with slowly evolving data" example. For now,
I will do an implementation similar to Max's suggestion. Of course it's not
as nice as the proposed feature, as there will be a delay in
Hi Josh,
for the first part of your question you might be interested in our ongoing
work of adding side inputs to Flink. I started this design doc:
https://docs.google.com/document/d/1hIgxi2Zchww_5fWUHLoYiXwSBXjv-M5eOv-MKQYN3m4/edit?usp=sharing
It's still somewhat rough around the edges but could
Hi Josh,
You can trigger an occasional refresh, e.g. on every 100 elements
received. Or, you could start a thread that does that every 100
seconds (possible with a lock involved to prevent processing in the
meantime).
Cheers,
Max
On Mon, May 23, 2016 at 7:36 PM, Josh wrote:
>
> Hi Max,
>
> Than
Hi Max,
Thanks, that's very helpful re the REST API sink. For now I don't need
exactly once guarantees for the sink, so I'll just write a simple HTTP sink
implementation. But may need to move to the idempotent version in future!
For 1), that sounds like a simple/easy solution, but how would I han
Hi Josh,
1) Use a RichFunction which has an `open()` method to load data (e.g. from
a database) at runtime before the processing starts.
2) No that's fine. If you want your Rest API Sink to interplay with
checkpointing (for fault-tolerance), this is a bit tricky though depending
on the guarantees
Hi,
1. I have no experience in broadcast variables, I suggest you give it a try.
2. I misunderstood you, I thought you were calling for Flink to serve the
results and become REST API provider, where others can call those API. What you
are saying now is that you want a sink that does HTTP calls
Hi Rami,
Thanks for the fast reply.
1. In your solution, would I need to create a new stream for 'item
updates', and add it as a source of my Flink job? Then I would need to
ensure item updates get broadcast to all nodes that are running my job and
use them to update the in-memory ite
Hi Josh,
I am no expert in Flink yet, but here are my thoughts on this:
1. what about you stream an event to flink everytime the DB of items have an
update? then in some background thread you get the new data from the DB let it
be through REST (if it is only few updates a day) then load the res
10 matches
Mail list logo