Hello,

I'm new to Storm, and I'm trying to build a Trident topology that exports rows
from an Oracle DB to HDFS. I'm using an existing implementation of
HdfsState.

I wrote my own Trident spout that emits tuples in 1-hour time slices (so
the coordinator metadata are timestamps). For instance, if the time now
is 5:30pm and the current position of the spout coordinator is 12pm, the
spout will emit 5 batches of tuples.
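For context, the slicing logic in my coordinator is roughly the following (a simplified, self-contained sketch; the real spout reads the last committed timestamp from the coordinator metadata, and the class and method names here are just for illustration):

```java
import java.time.Instant;

public class BatchSlices {

    static final long HOUR_MILLIS = 3600_000L;

    // Number of complete 1-hour batches between the coordinator's last
    // committed position and "now" (both as epoch milliseconds).
    static long pendingBatches(long lastPositionMillis, long nowMillis) {
        if (nowMillis <= lastPositionMillis) {
            return 0;
        }
        // Integer division: only fully elapsed hours become batches,
        // so the partial 12:00pm-to-5:30pm half hour is not emitted yet.
        return (nowMillis - lastPositionMillis) / HOUR_MILLIS;
    }

    public static void main(String[] args) {
        // Coordinator position 12:00pm, current time 5:30pm -> 5 batches.
        long noon = Instant.parse("2024-01-01T12:00:00Z").toEpochMilli();
        long halfPastFive = Instant.parse("2024-01-01T17:30:00Z").toEpochMilli();
        System.out.println(pendingBatches(noon, halfPastFive));
    }
}
```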

My problem is that I'm not sure whether I'm taking advantage of Storm's
distributed processing: will those 5 batches be processed in parallel? Is
my design optimal? Here's what my topology looks like:

    topology.newStream(tableName + "_AUDIT_STREAM", auditSpout)
            .partitionPersist(factory, tableFields, new HdfsUpdater(), new Fields());


Thanks
