How can I find out why workers stop receiving tuples after they have successfully processed a few thousand?
I have also tested the *allGrouping* to ensure that each Bolt must receive tuples. But two workers including two Bolts stop receiving tuples after a few seconds.

I would appreciate any help!

2015-02-25 17:40 GMT+01:00 Harsha <[email protected]>:

> My bad was looking at another supervisor.log. There are no errors in supervisor and worker logs.
>
> -Harsha
>
> On Wed, Feb 25, 2015, at 08:29 AM, Martin Illecker wrote:
>
> Hi Harsha,
>
> I'm using three c3.4xlarge EC2 instances:
> 1) Nimbus, WebUI, Zookeeper, Supervisor
> 2) Zookeeper, Supervisor
> 3) Zookeeper, Supervisor
>
> I cannot find this error message in my attached supervisor log?
> By the way, I'm running on Ubuntu EC2 nodes and there is no path C:\.
>
> I have not made any changes in these timeout values. Should be the default values:
> storm.zookeeper.session.timeout: 20000
> storm.zookeeper.connection.timeout: 15000
> supervisor.worker.timeout.secs: 30
>
> Thanks!
> Best regards
> Martin
>
> 2015-02-25 17:03 GMT+01:00 Harsha <[email protected]>:
>
> Hi Martin,
> Can you share your storm.zookeeper.session.timeout and storm.zookeeper.connection.timeout and supervisor.worker.timeout.secs. By looking at the supervisor logs I see
> Error when processing event
> java.io.FileNotFoundException: File 'c:\hdistorm\workers\f3e70029-c5c8-4f55-a4a1-396096b37509\heartbeats\1417082031858'
>
> you might be running into https://issues.apache.org/jira/browse/STORM-682
> Is your zookeeper cluster on a different set of nodes and can you check you are able to connect to it without any issues
> -Harsha
>
> On Wed, Feb 25, 2015, at 03:49 AM, Martin Illecker wrote:
>
> Hi,
>
> I'm still observing this strange issue.
> Two of three workers stop processing after a few seconds. (each worker is running on one dedicated EC2 node)
>
> My guess would be that the output stream of one spout is not properly distributed over all three workers.
> Or somehow directed to one worker only?
> But *shuffleGrouping* should guarantee equal distribution among multiple bolts right?
>
> I'm using the following topology:
>
> TopologyBuilder builder = new TopologyBuilder();
> builder.setSpout("dataset-spout", spout);
> builder.setBolt("tokenizer-bolt", tokenizerBolt, 3).shuffleGrouping("dataset-spout");
> builder.setBolt("preprocessor-bolt", preprocessorBolt, 3).shuffleGrouping("tokenizer-bolt");
> conf.setMaxSpoutPending(2000);
> conf.setNumWorkers(3);
> StormSubmitter.submitTopology(TOPOLOGY_NAME, conf, builder.createTopology());
>
> I have attached the screenshots of the topology and the truncated worker and supervisor log of one idle worker.
>
> The supervisor log includes a few interesting lines, but I think they are normal?
>
> supervisor [INFO] e76bc338-2ba5-444b-9854-bca94f9587b7 still hasn't started
>
> I hope, someone can help me with this issue!
>
> Thanks
> Best regards
> Martin
>
> 2015-02-24 20:37 GMT+01:00 Martin Illecker <[email protected]>:
>
> Hi,
>
> I'm trying to run a topology on EC2, but I'm observing the following strange issue:
>
> Some workers stop processing after a few seconds, without any error in the worker log.
>
> For example, my topology consists of 3 workers and each worker is running on its own EC2 node.
> Two of them stop processing after a few seconds. But they have already processed several tuples successfully.
>
> I'm using only one spout and shuffleGrouping at all bolts.
> If I add more spouts then all workers keep processing, but the performance is very bad.
>
> Does anyone have a guess why this happens?
>
> The topology is currently running at:
> http://54.155.156.203:8080
>
> Thanks!
>
> Martin
>
> Email had 4 attachments:
>
> - topology.jpeg 161k (image/jpeg)
> - component.jpeg 183k (image/jpeg)
> - supervisor.log 7k (application/octet-stream)
> - worker.log 37k (application/octet-stream)
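
One possible way to narrow down which bolt tasks go idle, and when, is to record per-task tuple counts and last-received timestamps inside each worker and log them periodically. The following is only a rough sketch under assumptions: `TaskActivityTracker` is a hypothetical helper and not part of Storm, it only sees the tasks running in its own worker JVM, and it assumes the bolt calls `record(...)` from `execute()` with a task id obtained in `prepare()` (for example via `TopologyContext.getThisTaskId()`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical diagnostic helper (not a Storm class): remembers how many
// tuples each task has received and when it received the most recent one.
class TaskActivityTracker {
    private final Map<Integer, Long> lastSeenMillis = new TreeMap<>();
    private final Map<Integer, Long> tupleCounts = new TreeMap<>();

    // Intended to be called once per received tuple, e.g. from a bolt's execute().
    synchronized void record(int taskId, long nowMillis) {
        lastSeenMillis.put(taskId, nowMillis);
        tupleCounts.merge(taskId, 1L, Long::sum);
    }

    // Number of tuples recorded for the given task (0 if it was never seen).
    synchronized long count(int taskId) {
        return tupleCounts.getOrDefault(taskId, 0L);
    }

    // Ids of known tasks whose last tuple is older than maxIdleMillis,
    // in ascending task-id order.
    synchronized List<Integer> idleTasks(long nowMillis, long maxIdleMillis) {
        List<Integer> idle = new ArrayList<>();
        for (Map.Entry<Integer, Long> entry : lastSeenMillis.entrySet()) {
            if (nowMillis - entry.getValue() > maxIdleMillis) {
                idle.add(entry.getKey());
            }
        }
        return idle;
    }
}
```

Logging `count(taskId)` and `idleTasks(System.currentTimeMillis(), 5000)` every few seconds in each worker would show whether the idle bolts stop at roughly the same tuple count, which could then be compared with the spout's emitted and acked counts in the Storm UI.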
