Please do report back to the list if 6.0 alleviates this problem, without 
needing a blocking fact.

Mark
On 23 Feb 2014, at 13:30, Mark Proctor <mproc...@codehaus.org> wrote:

> "I have seen a few related posts on what I have found, the #hashCode of my
> ComplexClass is taking nearly all of the time.
> Upon further investigation, I found that this is happening because every
> insert to the session is causing the `result` portion of the accumulate to
> "recalculate"."
> 
> There isn’t much we can do about your hashCode performance time. But if you 
> upgrade to 6.0, the batch oriented propagation will reduce how often this is 
> done, and hopefully minimise the difference. You won’t then need a blocking 
> fact.
> 
> Mark
> On 23 Feb 2014, at 03:38, mikerod <mjr4...@gmail.com> wrote:
> 
>> @laune
>> 
>> You are correct that I actually put an incorrect time up before.  Thanks for
>> pointing that out and sorry for the confusion.
>> The behavioral difference I have found was actually much larger between the 2
>> classes, SimpleClass and ComplexClass, than I originally thought.
>> 
>> The SimpleClass accumulate is very quick, around 300 ms.  The ComplexClass
>> accumulate (with the exact same rule beyond the object type)
>> spikes to around 140 *seconds*.  In both cases this is with 5K objects of
>> a single type inserted.
>> 
>> @Mark 
>> 
>> I am sure that it is not the object creation time.  I create all of the
>> objects before the timers and they are not lazy in any initialization.
>> However, you were right that I needed to run some profiling on this to dig
>> into the real issue.
>> 
>> To start off, the culprit for this issue is the accumulate.  A rule without
>> it like:
>> ```
>> rule "not collecting"
>>      when
>>      ComplexClass() // swap for SimpleClass on another run
>>      then
>>      System.out.println("Done");
>> end
>> 
>> ```
>> runs about the same, no matter if it has a SimpleClass or ComplexClass.
>> 
>> Also, I'd like to just clarify, a SimpleClass here is just a class with 2
>> Integer fields (for this example). 
>> However, the ComplexClass has around 15 fields and about half of these are
>> Collection (aggregate) types with more nested classes underneath.
>> This is the difference I mean between "simple" and "complex" in a class; if
>> that wasn't clear before.
>> 
>> Furthermore, there is only a single rule with only this very simple,
>> contrived LHS logic in my example.  Drools does not need
>> to traverse any of the objects and no additional work is done.  This is
>> purely just a single rule being evaluated during an insert.
>> This is Drools v5.5.0.Final in this specific example (sorry for not
>> mentioning that before).
>> 
>> --- 
>> 
>> I have seen a few related posts on what I have found, the #hashCode of my
>> ComplexClass is taking nearly all of the time.
>> Upon further investigation, I found that this is happening because every
>> insert to the session is causing the `result` portion of the accumulate to
>> "recalculate".  During this step, the AccumulateContext `result` RightTuple
>> is having its FactHandle reset to the newly calculated result.
>> This calls the #hashCode of the Collection that is holding all of the
>> current ComplexClass object instances; and Collection calls the #hashCode of
>> each of these (in j.u.Collection impl's such as j.u.AbstractList and
>> j.u.AbstractSet).
>> 
>> So, I have a Collection that keeps growing with ComplexClass
>> object instances, and each time it grows by one, the #hashCode of the entire
>> Collection of ComplexClass objects is being calculated.
>> 
>> The ComplexClass #hashCode is an aggregate of a recursive walk across
>> the #hashCode of every object it reaches through its fields, just like many
>> aggregate types.  I think I can see that this could be expensive if this is
>> being calculated for nearly 5K objects as each of the final objects is
>> inserted, causing the `result` recalculation.
>> 
>> ---
>> 
>> I do realize that one potential workaround would be to put a blocking
>> constraint above the accumulate:
>> ```
>> rule "collecting"
>>      when
>>      BlockingFactClass()
>>      $res : Collection() from
>>                      accumulate( $s : ComplexClass(),
>>                              init( Collection c = new ArrayList(); ),
>>                              action( c.add($s); ),
>>                              result( c ))
>>                              
>>      then
>>      System.out.println("Done");
>> end
>> ```
>> where the BlockingFactClass is not inserted until *after* all of the
>> ComplexClass objects.  This speeds up the performance significantly; the
>> time is nearly the same as the SimpleClass run actually.
>> 
>> ---
>> 
>> I found that this was an interesting discovery and I did not expect this
>> behavior.  
>> 
>> So @Mark it does seem (to me) that a deeply nested ComplexClass can hurt
>> performance on an AccumulateNode when the `result` can be repeatedly
>> calculated; even when the `result` is not "doing" anything besides returning
>> what has been accumulated/collected.  I understand this is probably just a
>> "gotcha" that I have to deal with.  This behavior is also the same for the
>> Drools `collect` functionality, which I think just uses accumulate in the
>> impl anyways (perhaps I'm incorrect).  Also, I note that this isn't
>> necessarily a direct "Object size affecting session insertion performance", as
>> I originally titled this thread.
>> 
>> I also think that the new Phreak-based impl for Drools in v6.x may not
>> behave like this anymore, since it is more lazy and delays work more until
>> firing rules (an assumption here; haven't tested that).
>> 
>> With that said, I'm open to any more suggestions about how to avoid this
>> issue in pre-Phreak Drools (before v6.x)
>> (I am not sure how long until I am able to make that jump in version).
>> 
>> Also, I'm open to being corrected if my findings are incorrect/incomplete. :)  
>> 
>> Thanks again for the feedback!  It is helpful.
>> 
>> 
>> 
>> --
>> View this message in context: 
>> http://drools.46999.n3.nabble.com/Object-size-impact-on-session-insertion-performance-tp4028244p4028251.html
>> Sent from the Drools: User forum mailing list archive at Nabble.com.
>> _______________________________________________
>> rules-users mailing list
>> rules-users@lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/rules-users
> 
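
For anyone reading this thread later: the cost pattern described in the quoted analysis can be reproduced in plain Java, outside Drools. The following is only a minimal sketch under stated assumptions. `HashCostSketch` and `CountingFact` are hypothetical names (a stand-in for ComplexClass), and the code merely models a list whose #hashCode is taken after every insert versus once at the end. It does not use or reproduce AccumulateNode or any other Drools internals.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Drools code: counts element hashCode() calls.
class HashCostSketch {

    // Hypothetical stand-in for ComplexClass; only counts hashCode() calls.
    static final class CountingFact {
        static long hashCalls = 0;
        private final int id;

        CountingFact(int id) { this.id = id; }

        @Override
        public int hashCode() {
            hashCalls++; // a real ComplexClass would walk its nested fields here
            return Integer.hashCode(id);
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof CountingFact && ((CountingFact) o).id == id;
        }
    }

    // Models the collected result being re-hashed after every single insert.
    static long rehashOnEveryInsert(int n) {
        CountingFact.hashCalls = 0;
        List<CountingFact> collected = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            collected.add(new CountingFact(i));
            collected.hashCode(); // List.hashCode() hashes every element
        }
        return CountingFact.hashCalls;
    }

    // Models the blocking-fact idea: the result is hashed once, after all inserts.
    static long hashOnceAtEnd(int n) {
        CountingFact.hashCalls = 0;
        List<CountingFact> collected = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            collected.add(new CountingFact(i));
        }
        collected.hashCode();
        return CountingFact.hashCalls;
    }

    public static void main(String[] args) {
        int n = 5000;
        System.out.println("re-hash on every insert: " + rehashOnEveryInsert(n));
        System.out.println("hash once at the end:    " + hashOnceAtEnd(n));
    }
}
```

With n = 5000 the first count is n(n+1)/2 = 12502500 element #hashCode calls and the second is 5000. If each element #hashCode is itself a deep recursive walk, that quadratic factor would be consistent with the gap reported between the two runs, though the sketch says nothing about where else Drools 5.x spends time.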

