Thanks so much. It's good to hear there is a team intact, as we have just 
migrated to Storm on a big project; my boss would tar and feather me if it 
went south. I will look into this and post back what I find.


> On Oct 29, 2020, at 8:07 AM, Kishor Patil <[email protected]> wrote:
> 
> Hello Thomas,
> 
> Apologies for the delay in responding here. I tested the topology code 
> provided in the storm-issue repo. 
> *Only one machine gets pegged*: although it may appear to be one, this is 
> not a bug. It is related to Locality Awareness. Please refer to 
> https://github.com/apache/storm/blob/master/docs/LocalityAwareness.md
> It appears the spout-to-bolt ratio is 200, so if there are enough bolts on 
> a single node to handle the events generated by the spout, Storm won't send 
> events to another node until it runs out of capacity on that node. If you 
> do not want this behavior and would rather distribute events evenly, you 
> can disable the feature: turn off LoadAwareShuffleGrouping by setting 
> topology.disable.loadaware.messaging to true, as in the sketch below.
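> 
> A minimal sketch in Java of how that might look when submitting the 
> topology (the topology name and builder variable here are placeholders, 
> not taken from your code):
> 
>     import org.apache.storm.Config;
>     import org.apache.storm.StormSubmitter;
>     import org.apache.storm.topology.TopologyBuilder;
> 
>     TopologyBuilder builder = new TopologyBuilder();
>     // ... set up spouts and bolts as usual ...
> 
>     // Disable load-aware messaging so shuffle grouping distributes
>     // tuples across all workers instead of preferring local executors.
>     Config conf = new Config();
>     conf.put("topology.disable.loadaware.messaging", true);
> 
>     StormSubmitter.submitTopology("demo-topology", conf,
>         builder.createTopology());
> 
> The same setting can also be applied cluster-wide via storm.yaml if you 
> want it as a default rather than per topology.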
> -Kishor
> 
> On 2020/10/28 15:21:54, "Thomas L. Redman" <[email protected]> wrote: 
>> What’s the word on this? I sent this out some time ago, including a GitHub 
>> project that clearly demonstrates the brokenness, yet I have not heard a 
>> word. Is there anybody supporting Storm?
>> 
>>> On Sep 30, 2020, at 9:03 AM, Thomas L. Redman <[email protected]> wrote:
>>> 
>>> I believe I have encountered a significant bug. It seems topologies 
>>> employing anchored tuples do not distribute across multiple nodes, 
>>> regardless of the computational demands of the bolts. Such a topology 
>>> works fine on a single node, but when we throw multiple nodes into the 
>>> mix, only one machine gets pegged. When we disable anchoring, the load 
>>> distributes across all nodes just fine, pegging each machine 
>>> appropriately. (See the sketch below for the anchored versus unanchored 
>>> emit in question.)
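>>> 
>>> For reference, a minimal sketch of the difference inside a bolt's 
>>> execute method (variable names are placeholders, not from the repo):
>>> 
>>>     import org.apache.storm.task.OutputCollector;
>>>     import org.apache.storm.tuple.Tuple;
>>>     import org.apache.storm.tuple.Values;
>>> 
>>>     // Anchored: the output tuple is tied to the input tuple, so a
>>>     // failure anywhere downstream makes the spout replay the input.
>>>     collector.emit(input, new Values(result));
>>>     collector.ack(input);
>>> 
>>>     // Unanchored: no lineage is tracked; failures are not replayed.
>>>     collector.emit(new Values(result));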
>>> 
>>> This bug manifests from version 2.1 forward. I first encountered the 
>>> issue on my own production cluster, in an app that does significant NLP 
>>> computation across hundreds of millions of documents. That topology is 
>>> fairly complex, so I developed a very simple exemplar that demonstrates 
>>> the issue with only one spout and one bolt. I pushed this demonstration 
>>> up to GitHub to give the developers a mechanism to easily isolate the 
>>> bug, and maybe provide some workaround. I used Gradle to build and 
>>> package this simple topology. The code is well documented, so it should 
>>> be fairly simple to reproduce the issue. I first encountered the issue 
>>> on three 32-core nodes, but when I started experimenting I set up a test 
>>> cluster of 8-core nodes, and then increased each node to 16 cores, with 
>>> plenty of memory in every case.
>>> 
>>> The topology can be accessed on GitHub at 
>>> https://github.com/cowchipkid/storm-issue.git. Please feel free to 
>>> respond to me directly if you have any questions that are beyond the 
>>> scope of this mailing list.
>> 
>> 
