KafkaRDD uses the simple consumer API, and I think you need to handle the
offsets yourself, unless things have changed since I last looked.

I would go with the second approach.
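
Roughly, something like this (a sketch from memory of the 1.3 createRDD API
in spark-streaming-kafka; the broker list, topic, partition offsets, and
output path are all placeholders, and the hardcoded offsets stand in for
values you would load from wherever you persist them):

import kafka.serializer.StringDecoder
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.kafka.{KafkaUtils, OffsetRange}

object KafkaBatchJob {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("kafka-batch-read"))

    // Placeholder broker list for the simple (direct) consumer.
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")

    // One range per partition: [fromOffset, untilOffset). Hardcoded here;
    // a real job would read the untilOffsets persisted by the previous run.
    val offsetRanges = Array(
      OffsetRange("mytopic", partition = 0, fromOffset = 0L, untilOffset = 1000L),
      OffsetRange("mytopic", partition = 1, fromOffset = 0L, untilOffset = 1000L))

    val rdd = KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder](
      sc, kafkaParams, offsetRanges)

    // Process the batch; here we just dump the message values.
    rdd.map { case (_, value) => value }
      .saveAsTextFile("hdfs:///tmp/kafka-batch-output")

    // Persist each untilOffset so the next 2-hourly run knows where to resume.
    offsetRanges.foreach(r => println(s"${r.topic}/${r.partition} -> ${r.untilOffset}"))

    sc.stop()
  }
}

At the end of each successful run you would write the untilOffsets somewhere
durable (zookeeper, a db, a file on hdfs) and read them back as the
fromOffsets of the next run. That is the offset bookkeeping you have to do
yourself with the simple API.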

On Sat, Apr 18, 2015 at 2:42 PM, Shushant Arora <shushantaror...@gmail.com>
wrote:

> Thanks!
> I have a few more doubts:
>
> Does KafkaRDD use the simple API for the Kafka consumer or the high-level
> API? That is, do I need to handle partition offsets myself, or will that
> be taken care of by KafkaRDD? Also, which one is better for batch
> programming? I have a requirement to read Kafka messages with a Spark job
> at a two-hour interval.
>
> 1. One approach is to use Spark Streaming (with a stream duration of 2
> hours) + Kafka. My doubt is: is Spark Streaming stable enough to handle a
> cluster outage? If the Spark cluster gets restarted, will the streaming
> application be able to handle it, or do I need to restart the streaming
> application and pass in the last offsets? How is that going to work? Also,
> will the executor nodes be different in each run of the stream interval,
> or, once decided, will the same nodes be used throughout the application's
> life? Does Spark Streaming use the high-level API for Kafka integration?
>
> 2. The second approach is to use a Spark batch job and fire a new job at
> every 2-hour interval, using KafkaRDD to read from Kafka. My doubt here is
> who will maintain the offsets of the last-read messages: does my
> application need to maintain them, or can I use the high-level API here
> somehow?
>
> Thanks
> Shushant
>
>
>
> On Sat, Apr 18, 2015 at 9:09 PM, Ilya Ganelin <ilgan...@gmail.com> wrote:
>
>> That's a much better idea :)
>>
>> On Sat, Apr 18, 2015 at 11:22 AM Koert Kuipers <ko...@tresata.com> wrote:
>>
>>> Use KafkaRDD directly. It is in the spark-streaming-kafka package.
>>>
>>> On Sat, Apr 18, 2015 at 6:43 AM, Shushant Arora <
>>> shushantaror...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I want to consume messages from a Kafka queue using a Spark batch
>>>> program rather than Spark Streaming. Is there any way to achieve this,
>>>> other than using the low-level (simple) API of the Kafka consumer?
>>>>
>>>> Thanks
>>>>
>>>
>>>
>
