[ 
https://issues.apache.org/jira/browse/STORM-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16745154#comment-16745154
 ] 

Stig Rohde Døssing commented on STORM-2359:
-------------------------------------------

Thanks, responded there.

Just to give an update on where I am with this, I took another look at whether 
we can avoid all the extra tracking of anchors in the critical path. It turns 
out we can, if the JCTools queues are updated to allow another thread to peek 
at the queue contents. I have a local branch of JCTools that seems to be able 
to do this, so I'll try proposing the method to the JCTools maintainers.

If we can peek at the queue contents, we can do resets for all queued tuples 
without adding any extra logic to the critical path. The timeout resetter 
thread can just look at the queue contents, without having to involve the 
critical path code in maintaining a ConcurrentHashMap of anchors. 
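
As a rough illustration of the idea above (not the actual Storm or JCTools 
code), here is a minimal sketch of a resetter that only reads the queue. It 
uses java.util.concurrent.ConcurrentLinkedQueue as a stand-in, because its 
iterator is weakly consistent and can be used from another thread; the 
proposed JCTools peek method is not shown. The names QueuedTimeoutResetter, 
QueuedTuple, collectResets and the slowThresholdMs parameter are all 
hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: a resetter that inspects queued tuples without
// removing them and without any bookkeeping on the producer/consumer side.
class QueuedTimeoutResetter {

    // Stand-in for a queued tuple; only the anchor id and enqueue time matter here.
    static final class QueuedTuple {
        final long anchorId;
        final long enqueuedAtMs;

        QueuedTuple(long anchorId, long enqueuedAtMs) {
            this.anchorId = anchorId;
            this.enqueuedAtMs = enqueuedAtMs;
        }
    }

    private final Queue<QueuedTuple> receiveQueue;
    private final long slowThresholdMs; // assumed knob: how long a tuple may wait before a reset is sent

    QueuedTimeoutResetter(Queue<QueuedTuple> receiveQueue, long slowThresholdMs) {
        this.receiveQueue = receiveQueue;
        this.slowThresholdMs = slowThresholdMs;
    }

    // Meant to be called periodically from the resetter thread.
    // Returns the anchor ids of tuples that have been queued for at least slowThresholdMs.
    List<Long> collectResets(long nowMs) {
        List<Long> resets = new ArrayList<>();
        // Weakly consistent iteration: reads the queue without blocking or dequeuing.
        for (QueuedTuple t : receiveQueue) {
            if (nowMs - t.enqueuedAtMs >= slowThresholdMs) {
                resets.add(t.anchorId);
            }
        }
        return resets;
    }

    public static void main(String[] args) {
        Queue<QueuedTuple> q = new ConcurrentLinkedQueue<>();
        q.add(new QueuedTuple(1L, 0L));
        q.add(new QueuedTuple(2L, 900L));
        QueuedTimeoutResetter resetter = new QueuedTimeoutResetter(q, 500L);
        System.out.println(resetter.collectResets(1000L));
    }
}
```

Nothing is produced while the queue drains quickly, so resets are only 
generated when tuples actually sit in the queue for a while.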

Resetting timeouts for queued tuples will let users reset timeouts for tuples 
that have been delivered to a bolt's execute() without risking queued tuples 
timing out (collector.resetTimeout is effectively useless right now). 

As a convenience, we can add a component-level configuration that will do 
automatic timeout reset for tuples that have been delivered to execute() but 
not acked or failed, using the same technique (a ConcurrentHashMap) as I 
proposed above for the worker.
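
To make the opt-in tracking concrete, here is a minimal sketch of the 
ConcurrentHashMap bookkeeping, under my own assumptions about its shape. The 
class AutoResetTracker and its methods onExecute, onAckOrFail and 
anchorsToReset are hypothetical names, and tuples are reduced to a single 
anchor id for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: tracks anchors of tuples handed to execute() that
// have not yet been acked or failed, so a resetter thread can reset them.
class AutoResetTracker {

    // anchor id -> number of in-flight tuples carrying that anchor
    private final ConcurrentHashMap<Long, Integer> inFlight = new ConcurrentHashMap<>();

    // Called on the executor thread just before execute() is invoked.
    void onExecute(long anchorId) {
        inFlight.merge(anchorId, 1, Integer::sum);
    }

    // Called when the bolt acks or fails the tuple; drops the entry at zero.
    void onAckOrFail(long anchorId) {
        inFlight.computeIfPresent(anchorId, (id, count) -> count == 1 ? null : count - 1);
    }

    // Called from the resetter thread; returns a snapshot of anchors still in flight.
    List<Long> anchorsToReset() {
        return new ArrayList<>(inFlight.keySet());
    }

    public static void main(String[] args) {
        AutoResetTracker tracker = new AutoResetTracker();
        tracker.onExecute(7L);
        tracker.onExecute(9L);
        tracker.onAckOrFail(9L);
        System.out.println(tracker.anchorsToReset());
    }
}
```

The merge and computeIfPresent calls are the per-tuple cost on the critical 
path, which is why this would be enabled per component rather than globally.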

This solution would be much better than what I suggested before, because it 
means users can choose to enable the expensive tracking only for specific 
problem bolts, and only if they don't want to/can't manually call 
collector.resetTimeout. Resetting for tuples that are queued will be pretty 
cheap, because it doesn't add any slowdown to the critical path, and only 
produces reset tuples when processing is actually slow.

> Revising Message Timeouts
> -------------------------
>
>                 Key: STORM-2359
>                 URL: https://issues.apache.org/jira/browse/STORM-2359
>             Project: Apache Storm
>          Issue Type: Sub-task
>          Components: storm-core
>    Affects Versions: 2.0.0
>            Reporter: Roshan Naik
>            Assignee: Stig Rohde Døssing
>            Priority: Major
>         Attachments: STORM-2359-with-auto-reset.ods, STORM-2359.ods
>
>
> A revised strategy for message timeouts is proposed here.
> Design Doc:
>  
> https://docs.google.com/document/d/1am1kO7Wmf17U_Vz5_uyBB2OuSsc4TZQWRvbRhX52n5w/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
