Why would each bolt need to wait for the others to complete? Are you using table locks while inserting the data into MySQL? If so, is that really intended?
Thanks,
Satish.

On Wed, Jul 27, 2016 at 11:54 AM, Erik Weathers <[email protected]> wrote:

> The heartbeating is done in separate threads from the work execution
> threads, so the reason for your need to increase the timeouts isn't as
> straightforward as it may appear. In order for the
> nimbus.task.timeout.secs & supervisor.worker.timeout.secs timeouts to be
> exceeded, the worker process (normally [1]) needs to have completely locked
> up. The most common reason for that would be garbage collection, and with
> the values you are talking about I wouldn't be surprised to learn you are
> incurring large GC pauses. I would look into that rather than just bumping
> up the timeouts.
>
> - Erik
>
> [1] since the heartbeating is done via disk writes (worker to supervisor)
> and ZooKeeper (worker to nimbus), it is *possible* that enough delay is
> being introduced in those channels to cause the 30 second timeouts to
> expire, but it seems pretty unlikely.
>
> On Tue, Jul 26, 2016 at 11:12 PM, Navin Ipe <
> [email protected]> wrote:
>
>> Hi,
>>
>> Recently, I needed to increase the ZooKeeper timeout to:
>>
>>    - tickTime=20000
>>    - initLimit=10
>>    - syncLimit=15
>>
>> and the storm.yaml defaults to:
>>
>>    - supervisor.worker.timeout.secs: 600
>>    - nimbus.task.timeout.secs: 600
>>    - nimbus.supervisor.timeout.secs: 600
>>
>> I did this because each of my bolts had to write
>> <http://programmers.stackexchange.com/questions/325681/concurrent-inserts-to-mysql-or-write-to-separate-tables-and-consolidate-it>
>> at least 100000 rows in batches of 1000 to the same table in MySQL, and
>> each bolt took time to ack because it couldn't insert to SQL until another
>> bolt had finished inserting. This caused ZooKeeper to not receive a
>> heartbeat and time out.
>>
>> My supervisor advised that the better way to tackle this problem would be
>> to:
>>
>>    - Leave the ZooKeeper and Storm defaults as they are, because the
>>    creators of Storm had designed the defaults for a reason.
>>    - Also because every time we upgrade Storm to a newer version, we'd
>>    have to remember to change those parameters.
>>    - Perhaps the design of the topology could be changed to have the
>>    bolts write fewer rows and ack more quickly so that there is no timeout.
>>
>> *My questions:*
>> Are the Storm and ZooKeeper defaults better left alone for the above
>> reasons?
>>
>> --
>> Regards,
>> Navin
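
For reference, the last suggestion in the quoted message (have the bolts write fewer rows at a time and ack more quickly) could look roughly like the sketch below. It is plain Python with no Storm or MySQL dependency, just to show the shape of the idea. The names `chunked`, `write_in_batches`, `insert_batch` and `on_batch_done` are hypothetical and come from neither Storm nor any MySQL driver. In a real bolt, `insert_batch` would presumably issue one multi-row INSERT and commit, and `on_batch_done` would ack the tuples covered by that batch. Note that this only shortens the time each write holds up an ack; it does nothing about GC pauses, which Erik points to as the more likely cause of the heartbeat timeouts.

```python
from typing import Callable, Iterable, Iterator, List, Sequence

Row = Sequence[object]


def chunked(rows: Iterable[Row], batch_size: int) -> Iterator[List[Row]]:
    """Yield consecutive lists of at most batch_size rows."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly smaller, batch
        yield batch


def write_in_batches(
    rows: Iterable[Row],
    batch_size: int,
    insert_batch: Callable[[List[Row]], None],  # hypothetical: one INSERT + commit
    on_batch_done: Callable[[int], None],       # hypothetical: ack covered tuples
) -> int:
    """Insert rows one batch at a time and report progress after each batch.

    Returns the number of batches written.
    """
    batches_written = 0
    for batch in chunked(rows, batch_size):
        insert_batch(batch)        # keep each write (and any lock it takes) short
        on_batch_done(len(batch))  # signal progress right away, not at the very end
        batches_written += 1
    return batches_written
```

With 2500 rows and a batch size of 1000, the two callbacks are each invoked three times, for batches of 1000, 1000 and 500 rows.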
