You need to create a unit test that shows this happening, so people can run it:
http://docs.jboss.org/drools/release/5.4.0.Final/droolsjbpm-introduction-docs/html/gettingstarted.html
Mark

On 11 Jan 2013, at 14:28, Wolfgang Laun <wolfgang.l...@gmail.com> wrote:

> Any takers?
>
> -W
>
> ---------- Forwarded message ----------
> From: Svenja Brunstein <svenja.brunst...@gmail.com>
> Date: Fri, 11 Jan 2013 15:00:07 +0100
> Subject: Re: [rules-users] 14GB of NotNodeLeftTuples produced by one rule?
> To: Rules Users List <rules-us...@lists.jboss.org>
>
>>
>> As you can see, I insert 300,000 facts, and there wasn't a problem
>> with memory, so I didn't check the node count. Are there any other
>> rules in the rule base?
>
> Thanks! We have six other rules in the rule base, but none of them has a
> NOT in it. So I hope all tuples belong to that rule, and as I tested
> without that specific rule there were no NotNodeLeftTuples.
> In the case where the 14GB of tuples were produced our distribution
> actually was *not* randomly distributed, but still with a random
> distribution we have a lot of NOT node tuples, e.g., 6 million NOT node
> tuples for 128*128=16,384 possible nodes.
>
>> Indeed, I'd expect the
>> NotNodeLeftTuples to have the same order of magnitude.
>
> Can you explain when a NotNodeLeftTuple is created exactly, and when it is
> removed/not created? Unfortunately, I did not find any information or
> documentation on this. We definitely want to understand in which case so
> many tuples are created to be able to avoid this behavior or construct
> rules in other ways.
>
> 2013/1/7 Wolfgang Laun <wolfgang.l...@gmail.com>
>
>> On 07/01/2013, Svenja Brunstein <svenja.brunst...@gmail.com> wrote:
>>>>
>>>> The system will create network nodes even when only one pattern matches.
>>>> 150,000/50,000 = 3 exactly, or average?
>>>
>>> 3 exactly.
>>
>> I was running your rule with this data using Drools 5.2/3/4/5.0:
>>
>> WorkingMemoryEntryPoint ep =
>>     kSession.getWorkingMemoryEntryPoint("internalstream");
>> Event e;
>> Random rand = new Random();
>> for( int id = 1; id <= 100000; id++ ){
>>     int iName = rand.nextInt( 1000 ) + 1;
>>     e = new Event( "a", "Joe" + iName );
>>     e.put( "id", "id"+id );
>>     ep.insert( e );
>>     e = new Event( "a", "Jack" + iName );
>>     e.put( "id", "id"+id );
>>     ep.insert( e );
>>     e = new Event( "a", "Jill" + iName );
>>     e.put( "id", "id"+id );
>>     ep.insert( e );
>> }
>> As you can see, I insert 300,000 facts, and there wasn't a problem
>> with memory, so I didn't check the node count. Are there any other
>> rules in the rule base?
>>
>> As you have written: you'll get 50,000 x 6 = 300,000 fact reference
>> pairs to represent matching EventA pairs. Indeed, I'd expect the
>> NotNodeLeftTuples to have the same order of magnitude.
>>
>>>> Is the distribution of id/user combinations realistic?
>>>
>>> What do you mean by realistic?
>>
>> Is this as it will be in production?
>>
>>>
>>>
>>>> What else do
>>>> you need to do with Event type "a"? Similar? Completely different? -
>>>> There would be a simple solution to significantly reduce the memory
>>>> requirements, but it may not be feasible due to these answers.
>>>
>>> At the moment we are just designing a generic solution, which might be
>>> extended by rules afterwards, so that "old" events might need to be
>>> reused.
>>> In a real environment, of course, we would retract some events not needed
>>> any longer. But for now we are doing some performance testing and were
>>> surprised that we could "crash" the system with one single rule. Of
>>> course, with a lot of events ;-)
>>
>> You can crash with even simpler rules :-)
>>
>> -W
>>
>>>
>>> 2013/1/7 Wolfgang Laun <wolfgang.l...@gmail.com>
>>>
>>>> On 07/01/2013, Svenja Brunstein <svenja.brunst...@gmail.com> wrote:
>>>>> Thanks for the input.
>>>>> For 150,000 type "a" events we had about 50,000
>>>>> different ids and 1,000 user values.
>>>>> After all, combinations possible for type "b" were only 1,000,000
>>>>> (1,000 users * 1,000 users), which is why I am surprised to have
>>>>> 88 million instances.
>>>>
>>>> The system will create network nodes even when only one pattern matches.
>>>> 150,000/50,000 = 3 exactly, or average?
>>>>
>>>> If you have 3 events A, B, C with identical ids and different users,
>>>> you'll get the following candidates for an activation: (A,B), (B,A),
>>>> (A,C), (C,A), (B,C), (C,B)
>>>> and this increases O(n^2). - Since you know the exact distribution of
>>>> your data, you might compute this precisely.
>>>>
>>>> Is the distribution of id/user combinations realistic? What else do
>>>> you need to do with Event type "a"? Similar? Completely different? -
>>>> There would be a simple solution to significantly reduce the memory
>>>> requirements, but it may not be feasible due to these answers.
>>>>
>>>>>
>>>>> Yes, it is intentional to have the rule fire twice for each
>>>>> combination :-)
>>>>> Unfortunately, retracting events is not an option right now.
>>>>
>>>> Then, at least, generate both in a single rule.
>>>>
>>>>>
>>>>> I started another round, where I ensured to insert a lot more "b"
>>>>> events:
>>>>> The memory used by NotNodeLeftTuples is a lot less, even though these
>>>>> nodes still use most of the memory.
>>>>> Concluding from all that, I guess it is possible that the nodes take
>>>>> that much space (up to many GB), and the more events are inserted which
>>>>> invalidate the NOT nodes, the less memory is used by them?
>>>>
>>>> Well, you don't need the NOT node, and their number depends on the
>>>> distribution of your data.
>>>>
>>>> -W
>>>>
>>>>>
>>>>> 2013/1/7 Wolfgang Laun <wolfgang.l...@gmail.com>
>>>>>
>>>>>> The amount of memory required for 150K type "a" depends on the actual
>>>>>> distribution of this data w.r.t.
>>>>>> fields id and user, and other
>>>>>> circumstances; it is not only the rule that is to blame.
>>>>>>
>>>>>> There is one flaw, though: The rule would fire twice for a matching
>>>>>> pair of events of type "a". It's possible that you do want to have a
>>>>>> type "b" for both combinations of user and friendid, but you could
>>>>>> create both in a single rule, which should halve your memory
>>>>>> requirements. If there is no ordered attribute, use the timestamp to
>>>>>> restrict a pair to only one combination (hint: "after").
>>>>>>
>>>>>> This will still generate a lot of network nodes.
>>>>>>
>>>>>> Other ideas for reduction may have to take the entire application
>>>>>> scenario into account, e.g., can you retract events after they have
>>>>>> been paired, or how do you do inserts and calls to fireAllRules, etc.
>>>>>> Most importantly, however, is the actual frequency of id and user
>>>>>> values in relation to type "a" events.
>>>>>>
>>>>>> -W
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 07/01/2013, Svenja Brunstein <svenja.brunst...@gmail.com> wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> we observe a strange behavior with one of our rules. After
>>>>>>> deployment and sending lots of events (~150,000 of type "a"), the
>>>>>>> server slows down rapidly until it runs out of memory.
>>>>>>> We checked with VisualVM which objects are filling the memory: In
>>>>>>> one moment there were almost 14GB of NotNodeLeftTuples (88,933,186
>>>>>>> Instances)!
>>>>>>>
>>>>>>> This is our rule:
>>>>>>>
>>>>>>> rule "example"
>>>>>>> when
>>>>>>>     $evt1:EventObject(type=='a', $id:data['id'], $user:user)
>>>>>>>         from entry-point internalstream
>>>>>>>     $evt2:EventObject(type=='a', data['id']==$id, user!=$user, $user2:user)
>>>>>>>         from entry-point internalstream
>>>>>>>     not(EventObject(type=='b', user==$user, data['friendid']==$user2)
>>>>>>>         from entry-point internalstream)
>>>>>>> then
>>>>>>>     EventObject evt = new EventObject();
>>>>>>>     evt.setType('b');
>>>>>>>     evt.setUser($evt1.getUser());
>>>>>>>     evt.put('friendid', $evt2.getUser());
>>>>>>>     entryPoints['internalstream'].insert(evt);
>>>>>>> end
>>>>>>>
>>>>>>> Is that behavior correct for such a size of event combinations when
>>>>>>> using a NOT in the rule?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Svenja
>>>>>>>
>>>>>> _______________________________________________
>>>>>> rules-users mailing list
>>>>>> rules-us...@lists.jboss.org
>>>>>> https://lists.jboss.org/mailman/listinfo/rules-users
>>>>>>
> _______________________________________________
> rules-dev mailing list
> rules-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/rules-dev
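
The O(n^2) growth described in the thread for type "a" events that share an id can be checked with a small stand-alone count. What follows is only a sketch and not Drools code: the class name PairCount and the method orderedPairs are hypothetical, and each event is reduced to an {id, user} pair of strings. It counts the ordered pairs (e1, e2) with the same id and different users, i.e. the combinations that the first two patterns of the rule would hand on to the NOT. It does not model how Drools allocates NotNodeLeftTuples internally.

```java
import java.util.HashMap;
import java.util.Map;

public class PairCount {

    /**
     * Counts ordered pairs (e1, e2) of events that have the same id but
     * different users. Each event is given as {id, user}.
     */
    public static long orderedPairs(String[][] events) {
        // id -> (user -> number of events with that id and that user)
        Map<String, Map<String, Long>> byId = new HashMap<>();
        for (String[] e : events) {
            byId.computeIfAbsent(e[0], k -> new HashMap<>())
                .merge(e[1], 1L, Long::sum);
        }

        long total = 0;
        for (Map<String, Long> users : byId.values()) {
            long n = 0;         // all events with this id
            long sameUser = 0;  // ordered pairs (incl. self-pairs) with equal user
            for (long c : users.values()) {
                n += c;
                sameUser += c * c;
            }
            // all ordered pairs minus those whose users are equal
            total += n * n - sameUser;
        }
        return total;
    }

    public static void main(String[] args) {
        // Three events with one id and three different users, as in the thread.
        String[][] events = {
            {"id1", "Joe"}, {"id1", "Jack"}, {"id1", "Jill"}
        };
        System.out.println(orderedPairs(events)); // prints 6
    }
}
```

For three events per id with distinct users this gives 6 pairs per id, which is consistent with the 50,000 x 6 = 300,000 figure quoted above. Feeding it the real id/user distribution would give the exact number of pairs for a given data set.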