Re: Are storm yaml and zookeeper defaults sacrosanct?

Navin Ipe Wed, 27 Jul 2016 00:01:50 -0700

@Satish: I'm not using table locks, but concurrent inserts to the same
table get queued up, right?
http://stackoverflow.com/questions/32087233/concurrent-insert-with-mysql/32288484#32288484
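
For concreteness, the write pattern from my original mail quoted below (at least 100000 rows in batches of 1000) looks roughly like this. It is only a minimal sketch, not the actual bolt code: sqlite3 stands in for MySQL so that the example runs on its own, and the table name "results", its columns, and the helper names batches and insert_in_batches are all made up for illustration. Each batch is committed in its own short transaction, so a writer never holds the table for longer than one batch.

```python
import sqlite3
from itertools import islice


def batches(rows, size):
    """Yield successive lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def insert_in_batches(conn, rows, size=1000):
    """Insert rows one batch per transaction; return the number of batches."""
    count = 0
    for chunk in batches(rows, size):
        with conn:  # commits on success, rolls back if the batch fails
            conn.executemany(
                "INSERT INTO results (id, value) VALUES (?, ?)", chunk
            )
        count += 1
    return count


# Stand-in database (assumption: the real target is a MySQL table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, value TEXT)")

rows = ((i, "row-%d" % i) for i in range(100000))
n_batches = insert_in_batches(conn, rows)
total = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
print(n_batches, total)
```

With several bolts running this loop against one table, each bolt's total time depends on how long it waits behind the others, which is the delay I was seeing.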


The original question is still open. Are the defaults best left alone?

On Wed, Jul 27, 2016 at 12:15 PM, Satish Duggana <[email protected]>
wrote:

> Why would each bolt need to wait for the others to complete? Are you using
> table locks while inserting the data into MySQL? Is that really intended?
>
> Thanks,
> Satish.
>
> On Wed, Jul 27, 2016 at 11:54 AM, Erik Weathers <[email protected]>
> wrote:
>
>> The heartbeating is done in separate threads from the work execution
>> threads, so the reason you need to increase the timeouts isn't as
>> straightforward as it may appear.  In order for the
>> nimbus.task.timeout.secs & supervisor.worker.timeout.secs timeouts to be
>> exceeded, the worker process (normally [1]) needs to have completely locked
>> up.  The most common reason for that would be garbage collection, and with
>> the values you are talking about I wouldn't be surprised to learn you are
>> incurring long GC pauses.  I would look into that rather than just bumping
>> up the timeouts.
>>
>> - Erik
>>
>> [1] Since the heartbeating is done via disk writes (worker to supervisor)
>> and ZooKeeper (worker to nimbus), it is *possible* that enough delay is
>> being introduced in those channels to cause the 30 second timeouts to
>> expire, but it seems pretty unlikely.
>>
>> On Tue, Jul 26, 2016 at 11:12 PM, Navin Ipe <
>> [email protected]> wrote:
>>
>>> Hi,
>>>
>>> Recently, I needed to increase the ZooKeeper timeouts to:
>>>
>>>    - tickTime=20000
>>>    - initLimit=10
>>>    - syncLimit=15
>>>
>>>
>>> and change the storm.yaml defaults to:
>>>
>>>    - supervisor.worker.timeout.secs: 600
>>>    - nimbus.task.timeout.secs: 600
>>>    - nimbus.supervisor.timeout.secs: 600
>>>
>>>
>>> I did this because each of my bolts had to write
>>> <http://programmers.stackexchange.com/questions/325681/concurrent-inserts-to-mysql-or-write-to-separate-tables-and-consolidate-it>
>>> at least 100000 rows in batches of 1000 to the same table in MySQL, and
>>> each bolt took time to ack because it couldn't insert into MySQL until
>>> another bolt had finished inserting. This caused ZooKeeper to not receive
>>> a heartbeat and time out.
>>>
>>> My supervisor advised that the better way to tackle this problem would
>>> be to:
>>>
>>>    - Leave the ZooKeeper and Storm defaults as they are because the
>>>    creators of Storm had designed the defaults for a reason.
>>>    - Also because every time we upgrade Storm to a newer version, we'd
>>>    have to remember to change those parameters.
>>>    - Perhaps the design of the topology could be changed to have the
>>>    bolts write fewer rows and ack more quickly so that there is no timeout.
>>>
>>> *My question:*
>>> Are the storm and zookeeper defaults better left alone for the above
>>> reasons?
>>>
>>>
>>> --
>>> Regards,
>>> Navin
>>>
>>
>>
>
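
Erik's point above, that heartbeats come from their own threads and so keep flowing while a work thread is stuck, can be illustrated with a small sketch. This is plain Python threading, not Storm code: run_worker, the 10 ms heartbeat interval, and the use of time.sleep to stand in for a bolt blocked on a slow INSERT are all assumptions made for the illustration.

```python
import threading
import time


def run_worker(block_secs, heartbeat_interval=0.01):
    """Block the calling thread while a separate thread records heartbeats."""
    beats = []
    stop = threading.Event()

    def heartbeat():
        # Record a timestamp, then wait for the interval or the stop signal.
        while not stop.is_set():
            beats.append(time.monotonic())
            stop.wait(heartbeat_interval)

    t = threading.Thread(target=heartbeat, daemon=True)
    t.start()
    time.sleep(block_secs)  # stands in for a bolt waiting on a slow INSERT
    stop.set()
    t.join()
    return beats


beats = run_worker(0.3)
print("heartbeats recorded while the work thread was blocked:", len(beats))
```

A slow insert on its own therefore should not stop the heartbeats. Something that freezes every thread in the process at once, such as a long stop-the-world GC pause, would.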


-- 
Regards,
Navin
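
To put the ZooKeeper values quoted above in perspective, here is a small worked calculation. initLimit and syncLimit are counted in ticks, and by ZooKeeper's documented defaults a negotiated session timeout is bounded between 2 and 20 ticks. The function name zk_timing is made up for the example, and the 2000 ms baseline is only an assumed comparison point (it is not stated anywhere in this thread).

```python
def zk_timing(tick_ms, init_limit, sync_limit):
    """Convert ZooKeeper tick-based settings into seconds."""
    return {
        "init_window_s": tick_ms * init_limit / 1000,
        "sync_window_s": tick_ms * sync_limit / 1000,
        # Default session timeout bounds: 2 and 20 ticks.
        "min_session_timeout_s": 2 * tick_ms / 1000,
        "max_session_timeout_s": 20 * tick_ms / 1000,
    }


# Values from the original message.
changed = zk_timing(tick_ms=20000, init_limit=10, sync_limit=15)

# Assumed baseline with a 2000 ms tick, for comparison only.
baseline = zk_timing(tick_ms=2000, init_limit=10, sync_limit=15)

for name in changed:
    print(name, baseline[name], "->", changed[name])
```

With tickTime=20000 the smallest session timeout ZooKeeper will grant is 40 seconds, and a follower gets 300 seconds to sync, so a genuinely dead client or server takes much longer to be noticed.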
