GitHub user d2r commented on the pull request:

https://github.com/apache/storm/pull/392#issuecomment-71714975

> Not to derail the discussion, but personally I would much rather not store errors in zk at all if it's just for rendering the errors in the UI. If the spouts/bolts could just store this in memory with some expiration, that should suffice, and we could expose an API at the worker layer to get this information directly from it. If the host dies you lose some errors, but that does not seem like a big deal. The only downside is that the UI would now have to make requests against worker hosts to get errors, but that seems OK to me; you would also get parallelism, as all these worker calls can be made in parallel. I haven't thought this through completely and it's probably much more work, but I would love to hear your opinion.

Yeah, we were thinking about distributing things this way too. We figured that the bigger problem is the heartbeats, and if we could get an improvement with less effort here, it would be worth it. It would be a much bigger change to distribute the errors out of ZK, yet maybe it is not a bad idea. (Also, I think it is good to persist the errors anyway, not just in memory. Users would like to see errors on the UI even if there was some issue that brought the supervisor down, like a rolling upgrade of the cluster.)

Maybe we could file a JIRA for better gathering of errors. This change was intended to be small in scope and just give a way to get errors more efficiently when a topology has many, many components. It was prompted by seeing topology page load times of minutes for one of our customers.

Plus, this may be less of a problem once heartbeats (and their metrics) are no longer getting sent around, but it still may not be a bad idea to use a more distributed model like you suggest.
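For illustration only, the in-memory approach proposed in the quoted comment could be sketched roughly as below. This is not Storm code: `InMemoryErrorStore`, `gather_errors`, the TTL value, and the idea of passing plain callables in place of real per-worker API requests are all assumptions made for the sketch. It shows only the two ideas from the discussion: errors held in worker memory with an expiration, and the UI querying all workers in parallel.

```python
import threading
import time
from collections import deque
from concurrent.futures import ThreadPoolExecutor


class InMemoryErrorStore:
    """Hypothetical per-worker store that keeps errors until they expire."""

    def __init__(self, ttl_secs=300.0, clock=time.monotonic):
        self._ttl = ttl_secs
        self._clock = clock  # injectable so expiry can be tested
        self._lock = threading.Lock()
        self._errors = deque()  # (timestamp, component_id, message), oldest first

    def report(self, component_id, message):
        """Record an error raised by a spout or bolt."""
        with self._lock:
            self._errors.append((self._clock(), component_id, message))

    def recent(self):
        """Drop expired entries and return the remaining (component, message) pairs."""
        with self._lock:
            cutoff = self._clock() - self._ttl
            while self._errors and self._errors[0][0] < cutoff:
                self._errors.popleft()
            return [(component, msg) for _, component, msg in self._errors]


def gather_errors(fetchers, max_workers=8):
    """Call each worker's fetch function in parallel and merge the results.

    Each fetcher stands in for a request to one worker's (assumed) error API.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda fetch: fetch(), fetchers))
    return [err for worker_errors in results for err in worker_errors]
```

As the reply points out, a store like this loses its contents when the worker process or host goes away, which is the trade-off against persisting errors in ZK.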