Hi Boris,

Joining across two different data streams is not really something NiFi is aiming to solve.
Generally, I think we'd say that you'd use one of the stream processing systems like Flink, Spark, Storm, etc.

Another possible option might be to pull the data and land it in a common location like Hive; then you can run a single query against Hive that joins the tables.

Others may have more experience with solving this than I do, so I'm curious to hear other approaches people have taken.

-Bryan

On Fri, Feb 22, 2019 at 9:08 AM Boris Tyukin <[email protected]> wrote:
>
> Hi guys,
>
> I pull two datasets from two different databases on a schedule and need to join
> both on some ID and then publish the combined dataset to Kafka.
>
> What is the best way to do this? I'm puzzled how I would synchronize the two data
> pulls so data is joined for the exact flowfiles I need, i.e. if there are errors
> anywhere, I do not want to join an older flowfile with a newer one.
>
> Thanks!
> Boris
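
A minimal sketch of the "land it in a common location, then join" option from the reply, in case it helps make the idea concrete. Everything here is an assumption for illustration: an in-memory SQLite database stands in for Hive, the table names `source_a` and `source_b`, the `batch_id` column, and the helper functions are hypothetical, and publishing the result to Kafka is left out. Tagging each pull with a shared batch ID is one possible way to avoid joining an older pull with a newer one; it is not something NiFi or Hive does for you.

```python
import sqlite3

ALLOWED_TABLES = ("source_a", "source_b")  # hypothetical landing tables


def open_store():
    """Create the common landing area (SQLite here, standing in for Hive)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE source_a (batch_id TEXT, id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE source_b (batch_id TEXT, id INTEGER, amount REAL)")
    return conn


def land(conn, table, batch_id, rows):
    """Land one pull, tagging every row with the batch it belongs to."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unknown table: {table}")
    conn.executemany(
        f"INSERT INTO {table} VALUES (?, ?, ?)",
        [(batch_id, *row) for row in rows],
    )


def join_batch(conn, batch_id):
    """Join the two sources on id, restricted to rows from the same batch."""
    query = """
        SELECT a.id, a.name, b.amount
        FROM source_a AS a
        JOIN source_b AS b
          ON a.id = b.id AND a.batch_id = b.batch_id
        WHERE a.batch_id = ?
        ORDER BY a.id
    """
    return conn.execute(query, (batch_id,)).fetchall()
```

Because the join condition includes `batch_id`, a pull that failed on one side simply produces no joined rows for that batch instead of silently pairing with leftover data from an earlier run.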
