[ https://issues.apache.org/jira/browse/STORM-1742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15350457#comment-15350457 ]
ASF GitHub Bot commented on STORM-1742:
---------------------------------------

Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1379

    Also created pull request against master.

> More accurate 'complete latency'
> --------------------------------
>
>                 Key: STORM-1742
>                 URL: https://issues.apache.org/jira/browse/STORM-1742
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-core
>            Reporter: Jungtaek Lim
>            Assignee: Jungtaek Lim
>
> I have already started a discussion thread on the dev@ list. Below is a copy
> of the content of my mail.
> http://mail-archives.apache.org/mod_mbox/storm-dev/201604.mbox/%3CCAF5108gn=rskundfs7-sgy_pd-_prgj2hf2t5e5zppp-knd...@mail.gmail.com%3E
> While thinking about metrics improvements, I doubt many users know exactly
> what 'complete latency' is. In fact, it is somewhat complicated: extra
> waiting time can be added to complete latency because the spout runs a
> single-threaded event loop.
> Long-running nextTuple() / ack() / fail() calls can affect complete latency,
> but this happens behind the scenes. No latency information is provided for
> them, and some users don't even know about this characteristic. Moreover,
> calls to nextTuple() can be skipped due to max spout waiting, which makes
> the impact hard to guess even when the avg. latency of nextTuple() is
> provided.
> I think separating threads (moving the tuple handler to a separate thread,
> as JStorm does) would close the gap, but it requires our spout logic to be
> thread-safe, so I'd like to find a workaround first.
> My rough idea is to let the Acker decide the end time for the root tuple.
> There are then two ways to decide the start time for the root tuple:
> 1. When the Spout is about to emit ACK_INIT to the Acker (in other words,
>    keep it as it is)
>   - The Acker sends the ack / fail message to the Spout with a timestamp,
>     and the Spout calculates the time delta.
>   - Pros: It's the most accurate way, since it respects the definition of
>     'complete latency'.
>   - Cons: Clock synchronization between machines becomes very important.
>     Sub-millisecond precision would be required.
> 2. When the Acker receives ACK_INIT from the Spout
>   - The Acker calculates the time delta itself, and sends the ack / fail
>     message to the Spout with the time delta.
>   - Pros: No need to sync the time between servers so strictly.
>   - Cons: It doesn't include the latency of sending / receiving ACK_INIT
>     between the Spout and the Acker.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
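
The trade-off between the two proposed ways of choosing the start time can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Storm code: `Clock`, `option1_latency`, `option2_latency`, the 3 ms clock skew and the 2 ms ACK_INIT transfer time are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Clock:
    """Hypothetical per-machine clock: true time plus a fixed skew (ms)."""
    skew_ms: float = 0.0

    def now(self, true_time_ms: float) -> float:
        return true_time_ms + self.skew_ms


def option1_latency(spout_clock, acker_clock, t_emit, t_done):
    """Option 1: start is taken on the spout when ACK_INIT is emitted;
    the acker sends its end timestamp and the spout computes the delta."""
    start = spout_clock.now(t_emit)   # read on the spout machine
    end = acker_clock.now(t_done)     # read on the acker machine
    return end - start                # clock skew leaks into the result


def option2_latency(acker_clock, t_emit, t_done, init_transfer_ms):
    """Option 2: start is taken on the acker when ACK_INIT arrives;
    the acker computes the delta itself and sends it to the spout."""
    start = acker_clock.now(t_emit + init_transfer_ms)
    end = acker_clock.now(t_done)
    return end - start                # skew cancels, transfer time is lost


# Assumed scenario: tuple emitted at t=10 ms, fully acked at t=50 ms.
spout_clock = Clock(skew_ms=0.0)
acker_clock = Clock(skew_ms=3.0)      # acker clock runs 3 ms ahead (assumed)

true_latency = 50.0 - 10.0
opt1 = option1_latency(spout_clock, acker_clock, t_emit=10.0, t_done=50.0)
opt2 = option2_latency(acker_clock, t_emit=10.0, t_done=50.0,
                       init_transfer_ms=2.0)
print(true_latency, opt1, opt2)       # prints: 40.0 43.0 38.0
```

In this scenario option 1 is off by exactly the clock skew between the two machines, while option 2 is off by exactly the ACK_INIT transfer time, which matches the pros and cons listed in the issue description.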