Thanks Priyank! Is there any way to guarantee that a task receives only tuples with the same field values?
On Wed, Nov 4, 2015 at 11:21 AM, Priyank Shah <[email protected]> wrote:

> Hi Shuo,
>
> Seeing a lot of "group error" messages in the log file is expected. From
> http://storm.apache.org/documentation/Concepts.html the description of
> fields grouping says
>
>
>    1. Fields grouping: The stream is partitioned by the fields specified
>    in the grouping. For example, if the stream is grouped by the "user-id"
>    field, tuples with the same "user-id" will always go to the same task, but
>    tuples with different "user-id"'s may go to different tasks.
>
> It means that tuples with the same values for fields A and B will always go
> to the same task, but it does not mean that tuples with other values for
> fields A and B cannot go to that same task. For example, if your input data
> has the following tuples
>
> 1, 2, 3, 4
> 1, 2, 5, 6
> 3, 4, 5, 6
>
> In the above scenario the first two tuples are guaranteed to go to the
> same task, but the third tuple can also go to that task, especially when
> the parallelism hint is set to 1 for BoltY, because then there is no other
> task. Think about it like the hashCode method in Java. Equal objects always
> have the same hash code, but two different objects can also have the same
> hash code.
>
> From: Shuo Chen
> Reply-To: "[email protected]"
> Date: Tuesday, November 3, 2015 at 6:46 PM
> To: "[email protected]"
> Subject: multiple fields grouping in storm
>
> I have two bolt classes, BoltX and BoltY. BoltY receives tuples from BoltX.
> BoltX declares output with multiple fields; each tuple contains 4 strings:
>
> class BoltX implements IBasicBolt {
>     ...
>     public void declareOutputFields(OutputFieldsDeclarer declarer) {
>         declarer.declare(new Fields("A","B","C","D"));
>     }
> }
>
> In BoltY:
>
> class BoltY implements IBasicBolt {
>     boolean hasReceive = false;
>     String A = null;
>     String B = null;
>     ...
>     public void execute(Tuple input, BasicOutputCollector collector) {
>         // Remember the A and B values of the first tuple this task sees.
>         if (!hasReceive) {
>             hasReceive = true;
>             A = input.getString(0);
>             B = input.getString(1);
>         }
>
>         // Any later tuple with a different A or B is reported as an error.
>         if (!input.getString(0).equals(A) || !input.getString(1).equals(B)) {
>             LOG.error("group error");
>             return;
>         }
>         ...
>     }
>     ...
> }
>
> In Topology:
>
> ...
> builder.setBolt("x", new BoltX(), 3);
> builder.setBolt("y", new BoltY(), 3).fieldsGrouping("x", new Fields("A", "B"));
> ...
>
> I think that the output from x with the same values for fields "A" and "B"
> will go to the same task of BoltY.
>
> However, the topology log shows lots of "group error" messages.
>
> So how can I group outputs with the same values for fields "A" and "B" to
> the same task of BoltY?
>
> The question is also asked at
> http://stackoverflow.com/questions/33512554/multiple-fields-grouping-in-storm
>
> --
> *Shuo Chen*
> [email protected]
> [email protected]
>

--
*陈硕*
*Shuo Chen*
[email protected]
[email protected]
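
The hashCode analogy in the quoted reply can be illustrated with a small standalone sketch. It is only a simplified model: it assumes a task is picked by hashing the grouping-field values and taking the result modulo the number of tasks, which may differ from what Storm actually does internally. The class FieldsGroupingSketch and the helper chooseTask are hypothetical names made up for this illustration and are not part of the Storm API.

```java
import java.util.Arrays;
import java.util.List;

public class FieldsGroupingSketch {

    // Hypothetical helper (not a Storm API): maps the grouping-field values
    // to a task index in the range [0, numTasks).
    static int chooseTask(List<String> groupingValues, int numTasks) {
        // floorMod keeps the index non-negative even for negative hash codes.
        return Math.floorMod(groupingValues.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        // The three example tuples from the thread, with fields A, B, C, D.
        List<List<String>> tuples = Arrays.asList(
                Arrays.asList("1", "2", "3", "4"),
                Arrays.asList("1", "2", "5", "6"),
                Arrays.asList("3", "4", "5", "6"));

        // Try several task counts; with a single task every key collides.
        for (int numTasks : new int[] {1, 2, 3}) {
            System.out.println("numTasks = " + numTasks);
            for (List<String> tuple : tuples) {
                // Only fields A and B take part in the grouping.
                List<String> key = tuple.subList(0, 2);
                System.out.println("  " + tuple + " -> task "
                        + chooseTask(key, numTasks));
            }
        }
    }
}
```

In this model, tuples with equal (A, B) values always map to the same index, while distinct (A, B) values share an index whenever their hashes collide modulo the task count, and always do so when there is only one task. A task that must treat each (A, B) pair separately would therefore need to key its state by those values, for example in a map, rather than hold them in single instance fields.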
