@Mark: Note that I've been counting the FactClass::hashCode calls only, not hashCode calls of all Objects. So you're hunting 5000 calls but that extra one won't spoil your day ;-)
-W On 23/02/2014, Mark Proctor <mproc...@codehaus.org> wrote: > We need to check where the 4999 are coming from. It makes sense that there > are 5000 hashCode calls on object insertion, but unless i'm missing > something there should be just one hashCode call for the accumulate node. > > Line 692 should be called just once, per left input propagation for the > accumulate per fireAllRules call: > https://github.com/droolsjbpm/drools/blob/master/drools-core/src/main/java/org/drools/core/phreak/PhreakAccumulateNode.java#L692 > > If line 692 is called just once, a profiler will highlight where else > hashCode is being called. We can then investigate if it's necessary, or if > it can be avoided. > > Mark > > On 23 Feb 2014, at 15:03, Wolfgang Laun <wolfgang.l...@gmail.com> wrote: > >> Using OPs rule: >> >> 5.5.0 >> 5000 facts, hashCode computed 12507500 times >> >> 6.0.0 >> 5000 facts, hashCode computed 10000 times >> >> After 5000 insertions and before fireAllRules, there were 5000 >> computations of hashCode. After the Engine is started, another 5000 >> computations occur. >> This might bear investigation, although O(n) ist much better than O(n^2). >> >> -W >> >> >> On 23/02/2014, Mark Proctor <mproc...@codehaus.org> wrote: >>> Please do report back to the list if 6.0 alleviates this problem, >>> without >>> needing a blocking fact. >>> >>> Mark >>> On 23 Feb 2014, at 13:30, Mark Proctor <mproc...@codehaus.org> wrote: >>> >>>> "I have seen a few related posts on what I have found, the #hashCode of >>>> my >>>> ComplexClass is taking nearly all of the time. >>>> Upon further investigation, I found that this is happening because >>>> every >>>> insert to the session is causing the `result` portion of the accumulate >>>> to >>>> "recalculate". " >>>> >>>> There isn't much we can do about your has code performance time. But if >>>> you upgrade to 6.0, the batch oriented propagation will reduce how >>>> often >>>> this is done, and hopefully minimise the difference. You won't then need >>>> a >>>> blocking fact. >>>> >>>> Mark >>>> On 23 Feb 2014, at 03:38, mikerod <mjr4...@gmail.com> wrote: >>>> >>>>> @laune >>>>> >>>>> You are correct that I actually put an incorrect time up before. >>>>> Thanks >>>>> for >>>>> pointing that out and sorry for the confusion. >>>>> The behavioral difference I have found was actually much large between >>>>> the 2 >>>>> classes, SimpleClass and ComplexClass, than I originally >>>>> thought. >>>>> >>>>> The SimpleClass accumulate is very quick, around ~300 ms. The >>>>> ComplexClass >>>>> accumulate (with the exact same rule beyond the object type) >>>>> spikes to around ~140 *seconds*. In both cases this is with either 5K >>>>> objects of one type inserted. >>>>> >>>>> @Mark >>>>> >>>>> I am sure that it is not the object creation time. I create all of >>>>> the >>>>> objects before the timers and they are not lazy in any initialization. >>>>> However, you were right that I needed to run some profiling on this to >>>>> dig >>>>> into the real issue. >>>>> >>>>> To start off, the culprit for this issue is the accumulate. A rule >>>>> without >>>>> it like: >>>>> ``` >>>>> rule "not collecting" >>>>> when >>>>> ComplexClass() ; swap for SimpleClass on another run >>>>> then >>>>> System.out.println("Done"); >>>>> end >>>>> >>>>> ``` >>>>> runs about the same, no matter if it has a SimpleClass or >>>>> ComplexClass. >>>>> >>>>> Also, I'd like to just clarify, a SimpleClass here is just a class >>>>> with >>>>> 2 >>>>> Integer fields (for this example). >>>>> However, the ComplexClass has around 15 fields and about half of these >>>>> are >>>>> Collections (aggregate) types with more nested classes underneath. >>>>> This is the difference I mean between "simple" and "complex" in a >>>>> class; >>>>> if >>>>> that wasn't clear before. >>>>> >>>>> Furthermore, there is only a single rule with only this very simple, >>>>> contrived LHS logic in my example. Drools is not needing >>>>> to traverse any of the objects and no additional work is done. This >>>>> is >>>>> purely just a single rule being evaluated during an insert. >>>>> This is Drools v5.5.0.Final in this specific example (sorry for not >>>>> mentioning that before). >>>>> >>>>> --- >>>>> >>>>> I have seen a few related posts on what I have found, the #hashCode of >>>>> my >>>>> ComplexClass is taking nearly all of the time. >>>>> Upon further investigation, I found that this is happening because >>>>> every >>>>> insert to the session is causing the `result` portion of the >>>>> accumulate >>>>> to >>>>> "recalculate". During this step, the AccumulateContext `result` >>>>> RightTuple >>>>> is having its FactHandle reset to the newly calculated result. >>>>> This calls the #hashCode of the Collection that is holding all of the >>>>> current ComplexClass object instances; and Collection calls the >>>>> #hashCode >>>>> of >>>>> each of these (in j.u.Collection impl's such as j.u.AbstractList and >>>>> j.u.AbstractSet). >>>>> >>>>> So, I have a Collection, that is increasingly growing with >>>>> ComplexClass >>>>> object instances, and each time it grows by one, the #hashCode of the >>>>> entire >>>>> Collection of ComplexClass objects is being calculated. >>>>> >>>>> The ComplexClass #hashCode is an aggregate of a recursive walk along >>>>> across >>>>> all of the objects' #hashCode it reaches through its fields, just like >>>>> many >>>>> aggregate types. I think I can see that this could be expensive if >>>>> this >>>>> is >>>>> being calculated for nearly 5K objects as each of the final objects >>>>> are >>>>> inserted causing the `result` recalculation. >>>>> >>>>> --- >>>>> >>>>> I do realize that one potential workaround would be to put a blocking >>>>> constraint above the accumulate: >>>>> ``` >>>>> rule "collecting" >>>>> when >>>>> BlockingFactClass() >>>>> $res : Collection() from >>>>> accumulate( $s : ComplexClass(), >>>>> init( Collection c = new ArrayList(); ), >>>>> action( c.add($s); ), >>>>> result( c )) >>>>> >>>>> then >>>>> System.out.println("Done"); >>>>> end >>>>> ``` >>>>> where the BlockingFactClass is not inserted until *after* all of the >>>>> ComplexClass objects. This speeds up the performance significantly; >>>>> the >>>>> time is nearly the same as the SimpleClass run actually. >>>>> >>>>> --- >>>>> >>>>> I found that this was an interesting discovery and I did not expect >>>>> this >>>>> behavior. >>>>> >>>>> So @Mark it does seem (to me) that a deeply nested ComplexClass can >>>>> hurt >>>>> performance on an AccumulateNode when the `result` can be repeatedly >>>>> calculated; even when the `result` is not "doing" anything besides >>>>> returning >>>>> what has been accumulated/collected. I understand this is probably >>>>> just >>>>> a >>>>> "gotcha" that I have to deal with. This behavior is also the same for >>>>> the >>>>> Drools `collect` functionality, which I think just uses accumulate in >>>>> the >>>>> impl anyways (perhaps I'm incorrect). Also, I note that this isn't >>>>> necessary a direct "Object size affecting session insertion >>>>> performance", >>>>> as >>>>> I originally titled this thread. >>>>> >>>>> I also think that the new Phreak-based impl for Drools in v6.x may not >>>>> behave like this anymore, since it is more lazy and delays work more >>>>> until >>>>> firing rules (an assumption here; haven't tested that). >>>>> >>>>> With that said, I'm open to anymore suggestions about how to avoid >>>>> this >>>>> issue in pre-Phreak Drools (v6.x) >>>>> (I am not sure how long until I am able to make that jump in >>>>> version.). >>>>> >>>>> >>>>> Also, I'm open to be corrected if my findings are >>>>> incorrect/incomplete. >>>>> :) >>>>> >>>>> Thanks again for the feedback! It is helpful. >>>>> >>>>> >>>>> >>>>> -- >>>>> View this message in context: >>>>> http://drools.46999.n3.nabble.com/Object-size-impact-on-session-insertion-performance-tp4028244p4028251.html >>>>> Sent from the Drools: User forum mailing list archive at Nabble.com. >>>>> _______________________________________________ >>>>> rules-users mailing list >>>>> rules-users@lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/rules-users >>>> >>> >>> >>> _______________________________________________ >>> rules-users mailing list >>> rules-users@lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/rules-users >>> >> _______________________________________________ >> rules-users mailing list >> rules-users@lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/rules-users > > > _______________________________________________ > rules-users mailing list > rules-users@lists.jboss.org > https://lists.jboss.org/mailman/listinfo/rules-users > _______________________________________________ rules-users mailing list rules-users@lists.jboss.org https://lists.jboss.org/mailman/listinfo/rules-users