If they're files in a file system, and you don't actually need
multiple kinds of consumers, have you considered
streamingContext.fileStream instead of Kafka?
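For context, fileStream (and its textFileStream convenience wrapper) just monitors a directory once per batch interval and picks up files that appear after the stream starts. A minimal stdlib sketch of that polling pattern, independent of Spark (the function name and parameters are illustrative, not part of any Spark API):

```python
import os
import time


def poll_new_files(directory, seen, interval=1.0, rounds=3):
    """Repeatedly scan `directory`, yielding paths of files not seen
    before -- roughly what Spark's fileStream does each batch interval.
    `seen` is mutated in place so successive calls skip old files."""
    for _ in range(rounds):
        current = set(os.listdir(directory))
        for name in sorted(current - seen):
            yield os.path.join(directory, name)
        seen |= current
        time.sleep(interval)
```

In actual Spark Streaming code this whole loop is replaced by a single call such as `streamingContext.textFileStream("/path/to/xml/dir")`, which hands each new file's contents to the DStream without Kafka in the middle.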
On Wed, Jul 20, 2016 at 5:40 AM, Rabin Banerjee
wrote:
Hi Cody,
Thanks for your reply.
Let me elaborate a bit. We have a directory where small XML files (90 KB)
are continuously arriving (pushed from another node). Each file has an ID
and timestamp in its name and also inside the record. Data arriving in the
directory has to be pushed to Kafka to finally get into Spark.
Unless you're using only 1 partition per topic, there's no reasonable
way of doing this. Offsets for one topicpartition do not necessarily
have anything to do with offsets for another topicpartition. You
could do the last (200 / number of partitions) messages per
topicpartition, but you have no guarantee those are the most recent
200 messages overall.
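A sketch of the per-partition arithmetic described above, assuming you have already fetched the latest (log-end) offset for each partition from Kafka; the partition numbers and offsets below are made up for illustration:

```python
def last_n_offset_ranges(latest_offsets, n):
    """Split a budget of `n` messages evenly across partitions and
    compute a (fromOffset, untilOffset) range per partition.

    `latest_offsets` maps partition id -> latest offset. As noted in
    the thread, there is no guarantee these ranges cover the globally
    newest `n` messages, since offsets across partitions are unrelated.
    """
    per_part = max(1, n // len(latest_offsets))
    return {
        partition: (max(0, until - per_part), until)
        for partition, until in latest_offsets.items()
    }
```

In the direct-stream API, each resulting `(fromOffset, untilOffset)` pair would correspond to one `OffsetRange(topic, partition, fromOffset, untilOffset)` passed to `KafkaUtils.createRDD` for batch processing.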
Just to add:
I want to read the MAX_OFFSET of a topic, then read from MAX_OFFSET-200,
every time.
Also I want to know: if I want to fetch a specific offset range for batch
processing, is there any option for doing that?
On Sat, Jul 16, 2016 at 9:08 PM, Rabin Banerjee <
dev.rabin.baner...@gm
Hi all,
I have 1000 Kafka topics, each storing messages for different devices. I
want to use the direct approach for connecting to Kafka from Spark, in
which I am only interested in the latest 200 messages in Kafka.
How do I do that?
Thanks.