Hi,

I wanted to pick Cody's brain on having
DirectKafkaInputDStream/KafkaRDD use the new Kafka consumer API
internally. I know the new API is documented as beta quality, but I still
wanted to know whether he sees any blockers to moving there soon. My
motivation is that Kafka 0.9.0.0 introduced authentication and encryption
(also beta) between clients and brokers, but these are available only in
the new consumer API and not in the older low-level/high-level APIs.

From briefly studying the implementation of
DirectKafkaInputDStream/KafkaRDD and the new consumer API, my thinking is
that it is possible to support the exact current implementation you have
using the new API.
 One area that isn't so straightforward: the KafkaRDD constructor fixes
the offsetRange (I did read about the deterministic behaviour you were
after), and I couldn't find a direct method in the new consumer API to get
the current 'latest' offset. However, one can call consumer.seekToEnd()
and then consumer.position().
 Of course, one other benefit is that the new consumer API abstracts away
finding the leader for a partition, so that code could be removed.
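
To make the seekToEnd()/position() idea above concrete, here is a rough,
untested sketch of how I imagine it against the 0.9.0.0 client. The
object name, the helper name latestOffset and its parameters are made up
for illustration and are not anything in Spark today. As far as I
understand, position() returns the offset of the next record that would be
fetched, so it should line up with the exclusive untilOffset of an
OffsetRange.

```scala
import java.util.{Collections, Properties}

import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.TopicPartition

// Hypothetical helper, not part of Spark or Kafka.
object LatestOffsetSketch {

  // Returns the current end-of-log offset for one topic partition.
  def latestOffset(brokers: String, topic: String, partition: Int): Long = {
    val props = new Properties()
    props.setProperty("bootstrap.servers", brokers)
    props.setProperty("enable.auto.commit", "false")
    props.setProperty("key.deserializer",
      "org.apache.kafka.common.serialization.ByteArrayDeserializer")
    props.setProperty("value.deserializer",
      "org.apache.kafka.common.serialization.ByteArrayDeserializer")

    val consumer = new KafkaConsumer[Array[Byte], Array[Byte]](props)
    try {
      val tp = new TopicPartition(topic, partition)
      // Manual assignment, so no consumer-group rebalancing is involved.
      consumer.assign(Collections.singletonList(tp))
      // 0.9.0.0 signature is varargs; later client versions take a
      // java.util.Collection[TopicPartition] instead.
      consumer.seekToEnd(tp)
      // seekToEnd is lazy; position() forces the lookup against the broker.
      consumer.position(tp)
    } finally {
      consumer.close()
    }
  }
}
```

Usage would be something like
LatestOffsetSketch.latestOffset("localhost:9092", "mytopic", 0), where the
broker address and topic name are placeholders.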

Would be great to get your thoughts.

Thanks in advance,
Mario
