can I partition streams on multiple fields? and what is the result?

magda Thu, 06 Mar 2014 10:39:12 -0800

hello,

i'm having trouble getting the right idea how to parallelize my topology.


I have a bolt subscribing to 3 streams A,B,C. All of the streams have a
field1 which is between 0-n. so it makes sense to fieldgroup on field1
with a prallism.hint of n.

Stream A and B are hashmaps which are emitted every minute or five.

But Stream C is the actual stream to process (the others are
parameters which are needed for my calculation) is huge and can/should
be partitioned by field2 (an id which are longs between 1 and n*1000000)

up until now because n was just 1. i just made an allgrouping for stream
A and B

builder.setBolt("bolt", new PBolt(),4)
 .allGrouping("A").allGrouping("B")
 .fieldsGrouping("C", new Fields("field2") );

but now i'm trying to scale n, since all streams can be partitioned by
field1. i could just take group by field1 but i want to achieve a
higher parallelism then n.

can i do this like this?
builder.setBolt("bolt", new PBolt(),n*4)
 .fieldsGrouping("A", new Fields("field1") )
 .fieldsGrouping("B",new Fields("field1"))
 .fieldsGrouping("C", new Fields("field1","field2"));

Is one tuple from A or B going to be processed by all 4 excecuters
where tuples from C with field1 are processed? (this is what i would
want). if not how could i do this?

thanks for any help,
aky

can I partition streams on multiple fields? and what is the result?

Reply via email to