As I said, IMHO you should figure out *why* your processes are locking up.
Changing the timeouts simply covers up the behavior.
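
One rough way to check whether the worker JVM really is stalling (GC or otherwise) is to run a small watchdog thread inside it. The sketch below is only an illustration, not anything Storm provides: the class name StallDetector, the 100 ms interval and the 1000 ms threshold are all made up for the example. It sleeps for a fixed interval, measures how much longer the sleep actually took, and reports how much the JVM's cumulative GC time (read from the standard GarbageCollectorMXBean) grew over the same period.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper for illustration; not part of Storm or ZooKeeper.
public class StallDetector {

    /** Total milliseconds spent in GC so far, summed over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report time
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** How much longer than requested a sleep took, in ms (never negative). */
    static long overshootMillis(long requestedMs, long startNanos, long endNanos) {
        long elapsedMs = (endNanos - startNanos) / 1_000_000L;
        return Math.max(0, elapsedMs - requestedMs);
    }

    /** Starts a daemon thread that logs whenever a sleep overshoots by more than thresholdMs. */
    static Thread start(long intervalMs, long thresholdMs) {
        Thread t = new Thread(() -> {
            long lastGc = totalGcMillis();
            while (!Thread.currentThread().isInterrupted()) {
                long begin = System.nanoTime();
                try {
                    Thread.sleep(intervalMs);
                } catch (InterruptedException e) {
                    return; // stop quietly when interrupted
                }
                long stall = overshootMillis(intervalMs, begin, System.nanoTime());
                long gcNow = totalGcMillis();
                if (stall > thresholdMs) {
                    System.err.printf("JVM stalled ~%d ms; GC time grew by %d ms%n",
                            stall, gcNow - lastGc);
                }
                lastGc = gcNow;
            }
        }, "stall-detector");
        t.setDaemon(true); // do not keep the JVM alive just for the watchdog
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        start(100, 1000);  // assumed values: check every 100 ms, report stalls over 1 s
        Thread.sleep(500); // stand-in for the real work of the process
    }
}
```

If the reported stalls line up with growth in GC time, that points at GC pauses; if the process stalls without GC time growing, the cause is somewhere else (swapping, I/O, etc.). Turning on the JVM's GC logging would give the same information with more detail.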

On Wed, Jul 27, 2016 at 12:00 AM, Navin Ipe <[email protected]
> wrote:

> @Satish: I'm not using table locks, but concurrent inserts to the same
> table get queued up, right?
> http://stackoverflow.com/questions/32087233/concurrent-insert-with-mysql/32288484#32288484
>
> The original question is still open. Are the defaults best left alone?
>
> On Wed, Jul 27, 2016 at 12:15 PM, Satish Duggana <[email protected]
> > wrote:
>
>> Why would each bolt need to wait till others to complete? Are you using
>> table locks while inserting the data to mysql, is it really intended?
>>
>> Thanks,
>> Satish.
>>
>> On Wed, Jul 27, 2016 at 11:54 AM, Erik Weathers <[email protected]>
>> wrote:
>>
>>> The heartbeating is done in separate threads from the work execution
>>> threads, so the reason for your need to increase the timeouts isn't as
>>> straightforward as it may appear.  In order for the
>>> nimbus.task.timeout.secs & supervisor.worker.timeout.secs timeouts to be
>>> exceeded, the worker process (normally [1]) needs to have completely locked
>>> up.  The most common reason for that would be garbage collection, and with
>>> these values you are talking about I wouldn't be surprised to learn you are
>>> incurring large GC pauses.  I would look into that rather than just bumping
>>> up the timeouts.
>>>
>>> - Erik
>>>
>>> [1] since the heartbeating is done via disk writes (worker to
>>> supervisor) and ZooKeeper (worker to nimbus), it is *possible* that enough
>>> delay is being introduced in those channels to cause the 30 second timeouts
>>> to expire, but it seems pretty unlikely.
>>>
>>> On Tue, Jul 26, 2016 at 11:12 PM, Navin Ipe <
>>> [email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> Recently, I needed to increase the zookeeper timeout to:
>>>>
>>>>    - tickTime=20000
>>>>    - initLimit=10
>>>>    - syncLimit=15
>>>>
>>>>
>>>> and raise the storm.yaml defaults to:
>>>>
>>>>    - supervisor.worker.timeout.secs: 600
>>>>    - nimbus.task.timeout.secs: 600
>>>>    - nimbus.supervisor.timeout.secs: 600
>>>>
>>>>
>>>> I did this because each of my bolts had to write
>>>> <http://programmers.stackexchange.com/questions/325681/concurrent-inserts-to-mysql-or-write-to-separate-tables-and-consolidate-it>
>>>> at least 100000 rows in batches of 1000 to the same table in MySQL, and
>>>> each bolt took time to ack because it couldn't insert to SQL until another
>>>> bolt had finished inserting. This caused ZooKeeper to not receive a
>>>> heartbeat and time out.
>>>>
>>>> My supervisor advised that the better way to tackle this problem would
>>>> be to:
>>>>
>>>>    - Leave the zookeeper and storm defaults as they are because the
>>>>    creators of Storm had designed the defaults for a reason.
>>>>    - Also because every time we upgrade Storm to a newer version, we'd
>>>>    have to remember to change those parameters.
>>>>    - Perhaps the design of the topology could be changed to have the
>>>>    bolts write fewer rows and ack more quickly so that there is no timeout.
>>>>
>>>> *My questions:*
>>>> Are the storm and zookeeper defaults better left alone for the above
>>>> reasons?
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Navin
>>>>
>>>
>>>
>>
>
>
> --
> Regards,
> Navin
>
