Using bootstrap streams

Garry Turkington Mon, 10 Feb 2014 03:20:08 -0800

Hi,

I am building a task to do some sentiment analysis on incoming data. The task 
needs access to two corpora, one of positive words and one of negative words. 
This seemed like a good fit for bootstrap streams, but I can't get them to 
work.


I have my job configured with the 3 Kafka topics in task.inputs, and that part 
seems to work: data sent to any of the topics reaches the task.

But setting up the 2 reference streams as bootstrap streams doesn't seem to be 
working. Here's the relevant part of the config; I want to read the entire 
message history each time:

systems.kafka.streams.positive-words.samza.bootstrap=true
systems.kafka.streams.positive-words.samza.reset.offset=true

systems.kafka.streams.negative-words.samza.bootstrap=true
systems.kafka.streams.negative-words.samza.reset.offset=true

Do bootstrap streams get handled in any special way? I'm assuming that the 
messages will arrive in the process method on StreamTask just like any other, 
and that I can handle them differently by switching on 
envelope.getSystemStreamPartition().getSystemStream().getStream(). Looking at 
the code, this appears to be the case: the BootstrappingChooser does its magic 
to determine which message is delivered to the task, but the actual delivery 
seems the same.
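
To make the intended dispatch concrete, here is a minimal sketch of the 
per-stream handling, written without any Samza dependency so it can be run on 
its own. SentimentRouter and handle() are hypothetical names used only for 
illustration. In the real task, the stream argument is assumed to come from 
envelope.getSystemStreamPartition().getSystemStream().getStream() inside 
process, and any stream name other than the two reference topics is treated 
as data to be scored.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: SentimentRouter and handle() are hypothetical names,
// not part of the Samza API.
class SentimentRouter {
    private final Set<String> positive = new HashSet<>();
    private final Set<String> negative = new HashSet<>();

    // Routes one message by stream name. Reference-stream messages are
    // stored and yield 0; any other stream is scored against the word sets.
    int handle(String stream, String message) {
        switch (stream) {
            case "positive-words":
                positive.add(message.trim().toLowerCase());
                return 0;
            case "negative-words":
                negative.add(message.trim().toLowerCase());
                return 0;
            default:
                return score(message);
        }
    }

    // +1 for each positive word, -1 for each negative word.
    private int score(String text) {
        int total = 0;
        for (String word : text.toLowerCase().split("\\s+")) {
            if (positive.contains(word)) {
                total++;
            }
            if (negative.contains(word)) {
                total--;
            }
        }
        return total;
    }
}
```

This only works as intended if the two reference streams have been fully read 
before the first data message arrives, which is the ordering I am expecting 
bootstrap streams to provide.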

What am I missing?

Thanks,
Garry
