You do not need recent versions of Spark, Kafka, or Structured Streaming in order to do this. Plain DStreams are sufficient.
You can parallelize your static data from the database to an RDD, and there's a join method available on RDDs. Transforming a single incoming timestamp line into multiple lines with modified timestamps can be done using flatMap.

On Tue, Dec 6, 2016 at 11:11 AM, Burak Yavuz <brk...@gmail.com> wrote:
> Hi Daniela,
>
> This is trivial with Structured Streaming. If your Kafka cluster is 0.10.0
> or above, you may use Spark 2.0.2 to create a streaming DataFrame from
> Kafka, and then also create a DataFrame using the JDBC connection, and you
> may join those. In Spark 2.1, there's support for a function called
> "from_json", which should also help you easily parse the messages coming
> in from Kafka.
>
> Best,
> Burak
>
> On Tue, Dec 6, 2016 at 2:16 AM, Daniela S <daniela_4...@gmx.at> wrote:
>>
>> Hi,
>>
>> I have some questions regarding Spark Streaming.
>>
>> I receive a stream of JSON messages from Kafka. Each message consists of
>> a timestamp and an ID.
>>
>> timestamp          ID
>> 2016-12-06 13:00   1
>> 2016-12-06 13:40   5
>> ...
>>
>> In a database I have a value for each (ID, minute) pair:
>>
>> ID  minute  value
>> 1   0       3
>> 1   1       5
>> 1   2       7
>> 1   3       8
>> 5   0       6
>> 5   1       6
>> 5   2       8
>> 5   3       5
>> 5   4       6
>>
>> I would like to join each incoming JSON message with the corresponding
>> values. The result should look as follows:
>>
>> timestamp          ID  minute  value
>> 2016-12-06 13:00   1   0       3
>> 2016-12-06 13:00   1   1       5
>> 2016-12-06 13:00   1   2       7
>> 2016-12-06 13:00   1   3       8
>> 2016-12-06 13:40   5   0       6
>> 2016-12-06 13:40   5   1       6
>> 2016-12-06 13:40   5   2       8
>> 2016-12-06 13:40   5   3       5
>> 2016-12-06 13:40   5   4       6
>> ...
>>
>> Then I would like to add the minute values to the timestamp. I only need
>> the computed timestamp and the value, so the result should look as follows:
>>
>> timestamp          value
>> 2016-12-06 13:00   3
>> 2016-12-06 13:01   5
>> 2016-12-06 13:02   7
>> 2016-12-06 13:03   8
>> 2016-12-06 13:40   6
>> 2016-12-06 13:41   6
>> 2016-12-06 13:42   8
>> 2016-12-06 13:43   5
>> 2016-12-06 13:44   6
>> ...
>>
>> Is this a possible use case for Spark Streaming? I thought I could join
>> the streaming data with the static data, but I am not sure how to add the
>> minute values to the timestamp. Is this possible with Spark Streaming?
>>
>> Thank you in advance.
>>
>> Best regards,
>> Daniela
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
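
To make the flatMap suggestion concrete, here is a minimal sketch (plain Python, no Spark dependency) of the per-record function you would hand to flatMap on the stream. The lookup table VALUES_BY_ID stands in for the data loaded from the database, and the record shape (timestamp, ID) is an assumption based on the example rows:

```python
from datetime import datetime, timedelta

# Hypothetical static lookup standing in for the database table:
# ID -> list of (minute offset, value) pairs.
VALUES_BY_ID = {
    1: [(0, 3), (1, 5), (2, 7), (3, 8)],
    5: [(0, 6), (1, 6), (2, 8), (3, 5), (4, 6)],
}

FMT = "%Y-%m-%d %H:%M"

def expand(record):
    """Per-record function suitable for flatMap: one (timestamp, id)
    pair becomes one (timestamp + minute, value) row per minute offset
    found for that id."""
    ts, rec_id = record
    base = datetime.strptime(ts, FMT)
    return [
        ((base + timedelta(minutes=m)).strftime(FMT), v)
        for m, v in VALUES_BY_ID.get(rec_id, [])
    ]
```

In a streaming job, the same function would be applied as something like `stream.flatMap(expand)`; with the database side held as a keyed RDD instead of a dict, the equivalent shape is a join on ID followed by the same timestamp arithmetic.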