Hi Wolfgang and Mark,

Thank you for your replies! You were correct: my eval() functions could generally be rewritten into Drools directly.

I had one function, "connectsDetail", that was constraining unidirectional edges, and could be rewritten from:

    detail : DetailWire ( )
    eval ( functions.connectsDetail(detail, source, target) )

to:

    detail : DetailWire ( from == source, to == target )

Another function, "connects", was constraining bidirectional edges, and could be rewritten from:

    sync : SyncWire( )
    eval ( functions.connects(sync, source, target) )

to:

    sync : SyncWire( (from == source && to == target) || (from == target && to == source) )

Finally, the "veto" function could be rewritten from:

    detail : DetailWire ( )
    eval ( handler.veto(detail) )

to:

    detail : DetailWire ( overridden == false )

I took each of these three changes and evaluated them separately [1]. I found that:

1. Inlining 'connectsDetail' made a huge difference - 10-30% faster execution and 50-60% less allocated heap.
2. Inlining 'connects' made very little difference - 10-30% faster execution, but 0-20% more allocated heap.
3. Inlining 'veto' made no difference - no significant change in execution speed or allocated heap.

I think I understand why inlining 'connects' would improve heap usage - because the rules essentially have more conditionals? I also understand why 'veto' made no difference - for most of my test models, "overridden" was never true, so adding this conditional was not making the cross product set any smaller.

Finally, I also tested simply joining all of the rules together into one file. This happily made no difference at all (although it made the rules more difficult to edit).

So I think I can safely conclude that eval() should be used as little as possible. However, this means that the final rules become more complicated and less human-readable, so a DSL may be best for my common rule patterns in the future.

Thanks again!
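
As a rough illustration of the cross-product effect discussed in this thread, the following Java sketch compares the number of candidate tuples formed when the edge test is applied only after all (source, target, wire) combinations exist, as with an eval(), against the number formed when the wire is looked up by its 'from' field, as with an inlined constraint. It is a simplified model under stated assumptions and not a description of how Drools builds or indexes its Rete network; JoinSketch, Wire, chain, lateFilter and indexedJoin are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class JoinSketch {
    /** Hypothetical stand-in for a DetailWire fact: a directed edge between node ids. */
    static final class Wire {
        final int from;
        final int to;
        Wire(int from, int to) { this.from = from; this.to = to; }
    }

    /** Builds a chain of wires 0->1, 1->2, ..., (nodes-2)->(nodes-1). */
    static List<Wire> chain(int nodes) {
        List<Wire> wires = new ArrayList<>();
        for (int i = 0; i + 1 < nodes; i++) {
            wires.add(new Wire(i, i + 1));
        }
        return wires;
    }

    /** eval()-style: form every (source, target, wire) tuple, then test it. Returns {tuples, matches}. */
    static long[] lateFilter(int nodes, List<Wire> wires) {
        long tuples = 0, matches = 0;
        for (int source = 0; source < nodes; source++) {
            for (int target = 0; target < nodes; target++) {
                for (Wire w : wires) {
                    tuples++; // the tuple exists before the test runs
                    if (w.from == source && w.to == target) {
                        matches++;
                    }
                }
            }
        }
        return new long[] {tuples, matches};
    }

    /** Inlined-constraint style: look wires up by 'from', so only matching tuples are formed. */
    static long[] indexedJoin(int nodes, List<Wire> wires) {
        Map<Integer, List<Wire>> byFrom = new HashMap<>();
        for (Wire w : wires) {
            byFrom.computeIfAbsent(w.from, k -> new ArrayList<>()).add(w);
        }
        long tuples = 0, matches = 0;
        for (int source = 0; source < nodes; source++) {
            for (Wire w : byFrom.getOrDefault(source, new ArrayList<>())) {
                // 'to == target' selects exactly one target node, if it is in range
                if (w.to >= 0 && w.to < nodes) {
                    tuples++;
                    matches++;
                }
            }
        }
        return new long[] {tuples, matches};
    }

    public static void main(String[] args) {
        List<Wire> wires = chain(100);
        long[] late = lateFilter(100, wires);
        long[] indexed = indexedJoin(100, wires);
        System.out.println("late filter:  tuples=" + late[0] + " matches=" + late[1]);
        System.out.println("indexed join: tuples=" + indexed[0] + " matches=" + indexed[1]);
    }
}
```

With 100 nodes and 99 wires, the late filter forms 990000 tuples to find 99 matches, while the indexed join forms only the 99 matching tuples. The real saving in a rule engine depends on the fact model and on which constraints the engine can index.
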

Jevon

[1]: http://www.jevon.org/wiki/Improving_Drools_Memory_Performance

On Sat, Jul 10, 2010 at 12:28 AM, Wolfgang Laun <[email protected]> wrote:
> On 9 July 2010 14:14, Mark Proctor <[email protected]> wrote:
>> You have many objects there that are not constrained;
>
> I have an inkling that the functions.*() are hiding just these constraints,
> It's certainly the wrong way, starting with oodles of node pairs, just to
> pick out connected ones by fishing for the connecting edge. And this
> is worsened by trying to find two such pairs which meet at some
> DomainSource
>
> Guesswork, hopefully educated ;-)
>
> -W
>
>
>> if there are
>> multiple versions of those objects you are going to get massive amounts
>> of cross products. Think in terms of SQL, each pattern you add is like
>> an SQL join.
>>
>> Mark
>> On 09/07/2010 09:20, Jevon Wright wrote:
>>> Hi everyone,
>>>
>>> I am working on what appears to be a fairly complex rule base based on
>>> EMF. The rules aren't operating over a huge number of facts (less than
>>> 10,000 EObjects) and there aren't too many rules (less than 300), but
>>> I am having a problem with running out of Java heap space (set at ~400
>>> MB).
>>>
>>> Through investigation, I came to the conclusion that this is due to
>>> the design of the rules, rather than the number of facts. The engine
>>> uses less memory inserting many facts that use simple rules, compared
>>> with inserting few facts that use many rules.
>>>
>>> Can anybody suggest some tips for reducing heap memory usage in
>>> Drools? I don't have a time constraint, only a heap/memory constraint.
>>> A sample rule in my project looks like this:
>>>
>>>   rule "Create QueryParameter for target container of DetailWire"
>>>     when
>>>       container : Frame( )
>>>       schema : DomainSchema ( )
>>>       domainSource : DomainSource ( )
>>>       instance : DomainIterator( )
>>>       selectEdge : SelectEdge ( eval ( functions.connectsSelect(selectEdge, instance, domainSource )) )
>>>       schemaEdge : SchemaEdge ( eval ( functions.connectsSchema(schemaEdge, domainSource, schema )) )
>>>       source : VisibleThing ( eContainer == container )
>>>       target : Frame ( )
>>>       instanceSet : SetWire ( eval(functions.connectsSet(instanceSet, instance, source )) )
>>>       detail : DetailWire ( )
>>>       eval ( functions.connectsDetail(detail, source, target ))
>>>       pk : DomainAttribute ( eContainer == schema, primaryKey == true )
>>>       not ( queryPk : QueryParameter ( eContainer == target, name == pk.name ) )
>>>       eval ( handler.veto( detail ))
>>>
>>>     then
>>>       QueryParameter qp = handler.generatedQueryParameter(detail, target);
>>>       handler.setName(qp, pk.getName());
>>>       queue.add(qp, drools); // wraps insert(...)
>>>
>>>   end
>>>
>>> I try to order the select statements in an order that will reduce the
>>> size of the cross-product (in theory), but I also try and keep the
>>> rules fairly human readable. I try to avoid comparison operators like
>>> < and >. Analysing a heap dump shows that most of the memory is being
>>> used in StatefulSession.nodeMemories > PrimitiveLongMap.
>>>
>>> I am using a StatefulSession; if I understand correctly, I can't use a
>>> StatelessSession with sequential mode since I am inserting facts as
>>> part of the rules. If I also understand correctly, I'd like the Rete
>>> graph to be tall, rather than wide.
>>>
>>> Some ideas I have thought of include the following:
>>> 1. Creating a separate intermediary meta-model to split up the sizes
>>> of the rules. e.g. instead of (if A and B and C then insert D), using
>>> (if A and B then insert E; if E and C then insert D).
>>> 2. Moving eval() statements directly into the Type(...) selectors.
>>> 3. Removing eval() statements. Would this allow for better indexing by
>>> the Rete algorithm?
>>> 4. Reducing the height, or the width, of the class hierarchy of the
>>> facts. e.g. Removing interfaces or abstract classes to reduce the
>>> possible matches. Would this make a difference?
>>> 5. Conversely, increasing the height, or the width, of the class
>>> hierarchy. e.g. Adding interfaces or abstract classes to reduce field
>>> accessors.
>>> 6. Instead of using EObject.eContainer, creating an explicit
>>> containment property in all of my EObjects.
>>> 7. Creating a DSL that is human-readable, but allows for the
>>> automation of some of these approaches.
>>> 8. Moving all rules into one rule file, or splitting up rules into
>>> smaller files.
>>>
>>> Is there any kind of profiler for Drools that will let me see the size (or
>>> the memory usage) of particular rules, or of the memory used after
>>> inference? Ideally I'd use this to profile any changes.
>>>
>>> Thanks for any thoughts or tips! :-)
>>>
>>> Jevon
>>> _______________________________________________
>>> rules-users mailing list
>>> [email protected]
>>> https://lists.jboss.org/mailman/listinfo/rules-users
