Thanks a lot for the help! I'll definitely check out
KafkaCluster.scala. I'll probably first try to use that API from Java, and
later try to build the subproject.
Thanks,
Charles
On Fri, Jan 22, 2016 at 12:26 PM, Cody Koeninger wrote:
Yes, you should query Kafka if you want to know the latest available
offsets.
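To illustrate why the latest offsets matter for an hourly batch job, here is a minimal sketch, not code from Spark or Kafka. It assumes you have already fetched the latest available offset per partition from Kafka and have saved the next offset to read from the previous run. BatchOffsets, Range and nextBatch are hypothetical names made up for this example, and defaulting an unseen partition to offset 0 is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of Spark or Kafka.
final class BatchOffsets {

    // One partition's half-open slice [fromOffset, untilOffset).
    static final class Range {
        final int partition;
        final long fromOffset;
        final long untilOffset;

        Range(int partition, long fromOffset, long untilOffset) {
            this.partition = partition;
            this.fromOffset = fromOffset;
            this.untilOffset = untilOffset;
        }

        // Number of messages covered by this slice.
        long count() {
            return untilOffset - fromOffset;
        }
    }

    // nextToRead: per partition, the next offset to read, saved by the previous batch.
    // latest:     per partition, the latest available offset as reported by Kafka.
    // Returns one range per partition that has new messages, ordered by partition id.
    static List<Range> nextBatch(Map<Integer, Long> nextToRead, Map<Integer, Long> latest) {
        List<Range> ranges = new ArrayList<>();
        for (Map.Entry<Integer, Long> e : new TreeMap<>(latest).entrySet()) {
            // Assumption: a partition we have never read starts at offset 0.
            // With retention enabled, the earliest available offset may be higher.
            long from = nextToRead.getOrDefault(e.getKey(), 0L);
            long until = e.getValue();
            if (until > from) {
                ranges.add(new Range(e.getKey(), from, until));
            }
        }
        return ranges;
    }
}
```

Each run would then read exactly these ranges and, once the output is written, save every untilOffset as the starting point for the next run.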
There's code to make this straightforward in KafkaCluster.scala, but the
interface isn't public. There's an outstanding pull request to expose the
API at
https://issues.apache.org/jira/browse/SPARK-10963
but frankly it
Hi,
I have been using DirectKafkaInputDStream in Spark Streaming to consume Kafka
messages and it's been working very well. Now I need to batch process
messages from Kafka: for example, retrieve all messages every hour, process
them, and write the output to destinations like Hive or HDFS. I woul