Hi Boris,

Joining across two different data streams is not really something NiFi
is aiming to solve.

Generally, I think we'd suggest using one of the stream processing
systems for this, such as Flink, Spark, or Storm.

Another possible option might be to pull the data and land it in a
common location like Hive, then you can run a single query against
Hive that joins the tables.
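
As a rough sketch of that second option (not something NiFi provides
out of the box), the idea could look like the following. It uses Python's
built-in sqlite3 purely as a stand-in for Hive, and the table names
(source_a, source_b), the columns, and the batch_id tag are all
hypothetical. The batch_id is an assumption on my part: if each scheduled
pull stamps its rows with the same batch identifier, the join can be
limited to rows from the same run, so an older pull is never combined
with a newer one.

```python
import sqlite3


def join_landed_batches(rows_a, rows_b, batch_id):
    """Land two pulls in staging tables and join them with one query.

    rows_a: iterable of (id, batch_id, name) tuples   (hypothetical schema)
    rows_b: iterable of (id, batch_id, amount) tuples (hypothetical schema)
    """
    conn = sqlite3.connect(":memory:")  # in-memory stand-in for Hive
    conn.execute("CREATE TABLE source_a (id INTEGER, batch_id TEXT, name TEXT)")
    conn.execute("CREATE TABLE source_b (id INTEGER, batch_id TEXT, amount REAL)")
    conn.executemany("INSERT INTO source_a VALUES (?, ?, ?)", rows_a)
    conn.executemany("INSERT INTO source_b VALUES (?, ?, ?)", rows_b)

    # Join on the shared ID, but only within the same batch, so rows
    # from different scheduled runs are never mixed.
    query = """
        SELECT a.id, a.name, b.amount
        FROM source_a a
        JOIN source_b b
          ON a.id = b.id AND a.batch_id = b.batch_id
        WHERE a.batch_id = ?
        ORDER BY a.id
    """
    result = conn.execute(query, (batch_id,)).fetchall()
    conn.close()
    return result
```

If one of the two pulls fails for a given run, its rows for that batch_id
are simply missing and the join returns nothing for them, rather than
matching against leftovers from an earlier run. The joined rows could then
be published to Kafka by whatever runs the query.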

Others may have more experience with solving this than I do, so I'm
curious to hear what other approaches people have taken.

-Bryan

On Fri, Feb 22, 2019 at 9:08 AM Boris Tyukin <[email protected]> wrote:
>
> Hi guys,
>
> I pull two datasets from two different databases on schedule and need to join 
> both on some ID and then publish combined dataset to Kafka.
>
> What is the best way to do this? Puzzled how I would synchronize two data 
> pulls so data is joined for exact flowfiles I need, i.e. if there are errors 
> anywhere, I do not want to join an older flowfile with a newer one.
>
> Thanks!
> Boris
