Re: [rules-users] Improving Drools Memory Performance

Mark Proctor Thu, 15 Jul 2010 23:48:16 -0700


detail : DetailWire ( (from == source && to == target) || (from == target && to == source) )
The above is effectively turned into an MVEL statement; you might get better 
performance with a ConditionalElement 'or', as long as the
two are mutually exclusive:

 ( DetailWire (from == source, to == target ) or
   DetailWire (from == target, to == source) )
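
Why mutual exclusivity matters: as far as I understand it, the 'or' conditional element is expanded into one sub-rule per branch, and each sub-rule matches and fires independently. The sketch below is a plain-Java counting model of that behaviour, not Drools API; the Wire class and both method names are hypothetical stand-ins for illustration only.

```java
import java.util.Arrays;
import java.util.List;

class OrBranchDemo {
    // Hypothetical stand-in for the DetailWire fact: an edge between two named nodes.
    static final class Wire {
        final String from;
        final String to;
        Wire(String from, String to) { this.from = from; this.to = to; }
    }

    // Models a single pattern using '||': each wire is tested once,
    // so it contributes at most one match.
    static int matchesWithInlineOr(List<Wire> wires, String source, String target) {
        int matches = 0;
        for (Wire w : wires) {
            boolean forward = w.from.equals(source) && w.to.equals(target);
            boolean backward = w.from.equals(target) && w.to.equals(source);
            if (forward || backward) matches++;
        }
        return matches;
    }

    // Models two patterns combined with the 'or' conditional element:
    // each branch is checked on its own, so a wire satisfying both counts twice.
    static int matchesWithOrBranches(List<Wire> wires, String source, String target) {
        int matches = 0;
        for (Wire w : wires) {
            if (w.from.equals(source) && w.to.equals(target)) matches++; // first branch
            if (w.from.equals(target) && w.to.equals(source)) matches++; // second branch
        }
        return matches;
    }

    // Small sample: a forward edge, a backward edge, and a self-loop.
    static List<Wire> sample() {
        return Arrays.asList(new Wire("a", "b"), new Wire("b", "a"), new Wire("a", "a"));
    }

    public static void main(String[] args) {
        List<Wire> wires = sample();
        // Distinct endpoints: the branches are mutually exclusive, both counts agree.
        System.out.println(matchesWithInlineOr(wires, "a", "b"));
        System.out.println(matchesWithOrBranches(wires, "a", "b"));
        // source == target: the self-loop satisfies both branches.
        System.out.println(matchesWithInlineOr(wires, "a", "a"));
        System.out.println(matchesWithOrBranches(wires, "a", "a"));
    }
}
```

In this model the two forms only disagree when a wire satisfies both branches (here, a self-loop with source equal to target), which is the case the "mutually exclusive" condition rules out.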

I saw you did this:
not ( form : InputForm ( eContainer == container, name == iterator.name ) )

The 'form' is not accessible outside the 'not', and that rule does not need it.

Is this not a bug? You bind "text", and then I'm not sure what it is you are 
doing in the second two rules, but it looks wrong.
text : InputTextField ( eContainer == form, eval (functions.getAutocompleteInputName(attribute).equals(name)) )
onInput : EventTrigger ( text.onInput == onInput
currentInput : Property ( text.currentInput == currentInput )

It doesn't look like you are updating the session with facts, i.e. it's a 
stateless session. See if this helps:

KnowledgeBaseConfiguration kconf = KnowledgeBaseFactory.newKnowledgeBaseConfiguration();
kconf.setOption( SequentialOption.YES );

KnowledgeBase kbase = KnowledgeBaseFactory.newKnowledgeBase( kconf );
final StatelessKnowledgeSession ksession = kbase.newStatelessKnowledgeSession();
ksession.execute(....);

In the execute call you can provide it with a batch of commands to execute, or 
just a list of objects; it's up to you. See the stateless session documentation 
for more details.

The SequentialOption may help memory, by a small amount, if you aren't doing any 
working memory modifications (insert/modify/update/retract).

Mark


On 16/07/2010 04:16, Jevon Wright wrote:

Hi again,

By removing all of the simple eval()s from my rules, I have cut heap usage by at least an order of magnitude. However this still isn't enough.

Since I am trying to reduce the cross-product size (as in SQL), I recall that most SQL implementations have a "DESCRIBE SELECT" query which provides real-time information about the complexity of a given SQL query - i.e. the size of the tables, indexes used, and so on. Is there any such tool available for Drools? Are there any tools which can provide clues as to which rules are using the most memory?
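
To make the cross-product concern concrete, here is a rough counting model in plain Java. It is not Drools API and does not reproduce how the engine actually stores node memories; the class, the method names, and the idea of counting "tuples" are assumptions made purely for illustration. It compares a rule shape that joins two unconstrained node patterns and filters with an eval() at the end against a shape that starts from the edge and constrains each node pattern on a field.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CrossProductDemo {
    // Shape 1: source : Node(), target : Node(), edge : Edge(), eval(connects(...)).
    // Every (source, target) pair and every (source, target, edge) triple is built
    // before the eval gets a chance to reject it.
    static long tuplesWithEval(List<Integer> nodes, List<int[]> edges) {
        long tuples = 0;
        for (int source : nodes) {
            for (int target : nodes) {
                tuples++; // pair is kept whether or not the nodes are connected
                for (int[] edge : edges) {
                    tuples++; // triple is built, then handed to the eval
                }
            }
        }
        return tuples;
    }

    // Shape 2: edge : Edge(), source : Node(id == edge.from), target : Node(id == edge.to).
    // The field constraints are modelled as hashed lookups, so only matching
    // combinations are ever built.
    static long tuplesWithConstraints(List<Integer> nodes, List<int[]> edges) {
        Set<Integer> ids = new HashSet<>(nodes);
        long tuples = 0;
        for (int[] edge : edges) {
            if (ids.contains(edge[0])) {
                tuples++; // (edge, source)
                if (ids.contains(edge[1])) {
                    tuples++; // (edge, source, target)
                }
            }
        }
        return tuples;
    }

    // Node ids 0..n-1.
    static List<Integer> nodes(int n) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < n; i++) result.add(i);
        return result;
    }

    // n edges forming a chain: 0->1, 1->2, ...
    static List<int[]> chain(int n) {
        List<int[]> result = new ArrayList<>();
        for (int i = 0; i < n; i++) result.add(new int[] { i, i + 1 });
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tuplesWithEval(nodes(100), chain(50)));
        System.out.println(tuplesWithConstraints(nodes(100), chain(50)));
    }
}
```

With 100 nodes and 50 edges, the first shape builds 510,000 intermediate combinations in this model and the second builds 100, which matches the direction of the heap savings I saw after removing the eval()s.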

Alternatively, I am wondering what kind of benefit I could expect from using materialized views to create summary tables; that is, deriving and inserting additional facts. This would allow Drools to rewrite queries that currently use eval(), but would increase the size of working memory, so would this actually save heap size?

To what extent does Drools rewrite queries? Is there any documentation describing the approaches used?

Any other ideas on how to reduce heap memory usage? I'd appreciate any ideas :)


Thanks
Jevon

On Mon, Jul 12, 2010 at 5:56 PM, Jevon Wright <je...@jevon.org> wrote:


    Hi Wolfgang and Mark,

    Thank you for your replies! You were correct: my eval() functions
    could generally be rewritten into Drools directly.

    I had one function "connectsDetail" that was constraining
    unidirectional edges, and could be rewritten from:
     detail : DetailWire ( )
     eval ( functions.connectsDetail(detail, source, target) )

    to:
     detail : DetailWire ( from == source, to == target )

    Another function, "connects", was constraining bidirectional edges,
    and could be rewritten from:
     sync : SyncWire( )
     eval ( functions.connects(sync, source, target) )

    to:
     sync : SyncWire( (from == source && to == target) || (from == target && to == source) )

    Finally, the "veto" function could be rewritten from:
     detail : DetailWire ( )
     eval ( handler.veto(detail) )

    to:
     detail : DetailWire ( overridden == false )

    I took each of these three changes, and evaluated them separately [1].
    I found that:

    1. Inlining 'connectsDetail' made a huge difference - 10-30% faster
    execution and 50-60% less allocated heap.
    2. Inlining 'connects' made very little difference - 10-30% faster
    execution, but 0-20% more allocated heap.
    3. Inlining 'veto' made no difference - no significant change in
    execution speed or allocated heap.

    I think I understand why inlining 'connects' would improve heap usage
    - because the rules essentially have more conditionals?

    I also understand why 'veto' made no difference - for most of my test
    models, "overridden" was never true, so adding this conditional was
    not making the cross product set any smaller.

    Finally, I also tested simply joining all of the rules together into
    one file. This happily made no difference at all (although made it
    more difficult to edit).

    So I think I can safely conclude that eval() should be used as little
    as possible - however, this means that the final rules are made more
    complicated and less human-readable, so a DSL may be best for my
    common rule patterns in the future.

    Thanks again!
    Jevon

    [1]: http://www.jevon.org/wiki/Improving_Drools_Memory_Performance

    On Sat, Jul 10, 2010 at 12:28 AM, Wolfgang Laun <wolfgang.l...@gmail.com> wrote:
    > On 9 July 2010 14:14, Mark Proctor <mproc...@codehaus.org> wrote:
    >>  You have many objects there that are not constrained;
    >
    > I have an inkling that the functions.*() are hiding just these constraints.
    > It's certainly the wrong way, starting with oodles of node pairs, just to
    > pick out connected ones by fishing for the connecting edge. And this
    > is worsened by trying to find two such pairs which meet at some
    > DomainSource.
    >
    > Guesswork, hopefully educated ;-)
    >
    > -W
    >
    >
    >> if there are multiple versions of those objects you are going to get
    >> massive amounts of cross products. Think in terms of SQL, each pattern
    >> you add is like an SQL join.
    >>
    >> Mark
    >> On 09/07/2010 09:20, Jevon Wright wrote:
    >>> Hi everyone,
    >>>
    >>> I am working on what appears to be a fairly complex rule base based on
    >>> EMF. The rules aren't operating over a huge number of facts (less than
    >>> 10,000 EObjects) and there aren't too many rules (less than 300), but
    >>> I am having a problem with running out of Java heap space (set at ~400
    >>> MB).
    >>>
    >>> Through investigation, I came to the conclusion that this is due to
    >>> the design of the rules, rather than the number of facts. The engine
    >>> uses less memory inserting many facts that use simple rules, compared
    >>> with inserting few facts that use many rules.
    >>>
    >>> Can anybody suggest some tips for reducing heap memory usage in
    >>> Drools? I don't have a time constraint, only a heap/memory constraint.
    >>> A sample rule in my project looks like this:
    >>>
    >>>    rule "Create QueryParameter for target container of DetailWire"
    >>>      when
    >>>        container : Frame( )
    >>>        schema : DomainSchema ( )
    >>>        domainSource : DomainSource ( )
    >>>        instance : DomainIterator( )
    >>>        selectEdge : SelectEdge ( eval ( functions.connectsSelect(selectEdge, instance, domainSource )) )
    >>>        schemaEdge : SchemaEdge ( eval ( functions.connectsSchema(schemaEdge, domainSource, schema )) )
    >>>        source : VisibleThing ( eContainer == container )
    >>>        target : Frame ( )
    >>>        instanceSet : SetWire ( eval(functions.connectsSet(instanceSet, instance, source )) )
    >>>        detail : DetailWire ( )
    >>>        eval ( functions.connectsDetail(detail, source, target ))
    >>>        pk : DomainAttribute ( eContainer == schema, primaryKey == true )
    >>>        not ( queryPk : QueryParameter ( eContainer == target, name == pk.name ) )
    >>>        eval ( handler.veto( detail ))
    >>>
    >>>      then
    >>>        QueryParameter qp = handler.generatedQueryParameter(detail, target);
    >>>        handler.setName(qp, pk.getName());
    >>>        queue.add(qp, drools); // wraps insert(...)
    >>>
    >>>    end
    >>>
    >>> I try to order the select statements in an order that will reduce the
    >>> size of the cross-product (in theory), but I also try and keep the
    >>> rules fairly human readable. I try to avoid comparison operators like
    >>> < and >. Analysing a heap dump shows that most of the memory is being
    >>> used in StatefulSession.nodeMemories > PrimitiveLongMap.
    >>>
    >>> I am using a StatefulSession; if I understand correctly, I can't use a
    >>> StatelessSession with sequential mode since I am inserting facts as
    >>> part of the rules. If I also understand correctly, I'd like the Rete
    >>> graph to be tall, rather than wide.
    >>>
    >>> Some ideas I have thought of include the following:
    >>> 1. Creating a separate intermediary meta-model to split up the sizes
    >>> of the rules. e.g. instead of (if A and B and C then insert D), using
    >>> (if A and B then insert E; if E and C then insert D).
    >>> 2. Moving eval() statements directly into the Type(...) selectors.
    >>> 3. Removing eval() statements. Would this allow for better indexing by
    >>> the Rete algorithm?
    >>> 4. Reducing the height, or the width, of the class hierarchy of the
    >>> facts. e.g. Removing interfaces or abstract classes to reduce the
    >>> possible matches. Would this make a difference?
    >>> 5. Conversely, increasing the height, or the width, of the class
    >>> hierarchy. e.g. Adding interfaces or abstract classes to reduce field
    >>> accessors.
    >>> 6. Instead of using EObject.eContainer, creating an explicit
    >>> containment property in all of my EObjects.
    >>> 7. Creating a DSL that is human-readable, but allows for the
    >>> automation of some of these approaches.
    >>> 8. Moving all rules into one rule file, or splitting up rules into
    >>> smaller files.
    >>>
    >>> Is there any kind of profiler for Drools that will let me see the size
    >>> (or the memory usage) of particular rules, or of the memory used after
    >>> inference? Ideally I'd use this to profile any changes.
    >>>
    >>> Thanks for any thoughts or tips! :-)
    >>>
    >>> Jevon
    >>> _______________________________________________
    >>> rules-users mailing list
    >>> rules-users@lists.jboss.org <mailto:rules-users@lists.jboss.org>
    >>> https://lists.jboss.org/mailman/listinfo/rules-users
