Sebastian,
The bottom line is that matching is NP-Hard. It reduces to the subGraph isomporphism problem. So in general, there is no solution.
In your case, you might get better results by moving the highly constrained condition down the set of conditions to improve sharing. This might penalize an individual production, while improving the overall performance.
There has been work done on creating a better ordering (I've summarized what limited amount I know below). However, there is no perfect solution.
Gary
---- Re-ordering conditions
However, I know of several solution attempts for re-ordering conditions that have been tried in the research/development commumunity.
1) Automatic ordering of conditions (CMU does this) but the orderings I've seen have always been done based upon a static analysis of the conditions. I've never seen re-ordering to maximize sharing. Given re-ordered conditions, and a few rete tricks Bob Doorenbos had a production system with a million productions, that didn't slow down when more productions were added. However, if the problem was like your's (with a growing number of constants), he could only achieve a linear slowdown as more productions were added.
2) Growing the Rete as a binary tree. Given conditions a b c d and e the rete would look like
a b c d e
\ / \ / /
J J /
\ / /
J /
\ /
J
Inference used to do this. If you have lots of common sub-expressions this might improve sharing. Inference thought so. I've never seen any results though, and I wonder if the beta memories (at the J nodes) have unrelated alpha memories then that might result in large cross products.
-----Original Message-----
From: Sebastian Varges [mailto:[EMAIL PROTECTED]]
Sent: Monday, October 01, 2001 1:54 PM
To: [EMAIL PROTECTED]; Pelton, Gary
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: JESS: Rete
Hi Gary and Ernest,
let me answer your emails together. I see why sharing partial matches
among rules is the big plus in Rete. My simplified network without any
join nodes was a bit provocative, in fact. It'll place a heavy burden
in terms of bookkeeping on the final node responsible for deciding
about firing a rule.
In my application I have about 400 rules. The Rete network shares lots
of one-input nodes, but no two input nodes. I think this is because
I'm placing a condition that only matches a very specific fact at the
beginning of each rule. A simplified example:
(defrule rule1
(a1)
(b)
(c)
(d)
(e)
=>
)
(defrule rule2
(a2)
(b)
(c)
(d)
=>
)
My idea was to use specific facts like a1 and a2 to switch individual
rules on and off. Unfortunately, it looks like this makes sharing join
nodes impossible. In fact, the Rete network for the rules above has
four join nodes for rule1, and three join nodes for rule2 - and no
sharing between them.
My question would be if there is another way to somehow blocking
unwanted rules and share more join nodes? If I place those specific
conditions last, there is more sharing (only 5 versus 7 join nodes)
but many join nodes might be activated unnecessarily without any
chance of firing a rule in many cases.
I suspect the lack of sharing is due to the fact that join nodes
always get their right input from a one-input node, if I'm not
mistaken. This is probably a good idea, but couldn't one try to
optimize the sharing of join nodes, I wonder? In cases of shared
variables on a rule LHS there is no other way than building a join
node to check consistency. But couldn't one try some compile-time
optimization for all the other possible join nodes to minimize their
number and maximise sharing? In the above case, join nodes b-c and c-d
could be shared and then further linked to the more specific
nodes. The disadvantage I would see is that the rule developer has no
direct influence on the network anymore (and he might know what a good
rule ordering would be in the light of application data). Any comments
on this?
thanks for the nice discussion!
--Sebastian
BTW: 1) the Jess website: http://herzberg.ca.sandia.gov/jess/
appears without any letters in Netscape 4.76 recently (black
letters on black background, it seems). In IE it's okay.
2) I'll check the (matches) function but the licence agreement
seems to have changed, so maybe it takes a while. Basically,
the (matches) function in Jess 6.0a2 only gives me information
one join node even if there are many in the network.
-------------------------------------------------------
>>>>> In article <[EMAIL PROTECTED]>, "Pelton, Gary" <[EMAIL PROTECTED]> writes:
> The performance improvement comes from sharing among different rules. If a
> different rule has a partial match then the rete saves doing the partial
> match twice in some cases. In your example if rule 2 matched a b c and d
> then it could share all the effort (which could grow quadratically at each
> join node) with rule 1. Typically, the most computationally expensive part
> of the match is in the join nodes, because join nodes have to check all the
> previous partial matches, and the number of partial matches can grow to be
> the complete cross product of the nodes above.
> I don't know exactly how Jess does the matching, there have been Rete's that
> maintain beta-memories (the cross-products above) as cross-products rather
> than instantiated cross-products. But it really doesn't matter about the
> details of the matching. The performance improvement from sharing amongst
> multiple rules is a really big WIN in non-trivial applications.
> Gary
>>>>> In article <[EMAIL PROTECTED]>, [EMAIL PROTECTED] (friedman_hill ernest j) writes:
> I think Sebastian Varges wrote:
>> Hi Ejfried,
> Hi [EMAIL PROTECTED],
>>
>> I'm trying to understand why the Rete network looks the way it
>> does... Now, I understand that join nodes are necessary in cases where
>> variables are shared between patterns in a rule. But this is not the
>> case here. So why not just something like the following for the simple
>> rule above:
>> start
>> / / | \ \
>> a b c d e
>> \ \ | / /
>> FIRE: rule-4
>>
>> The single nodes would have a memory, and that's it.
> Well, even for your very simple example, your "rule-4" node above has
> to do some bookkeeping. It has to know has to know it has five inputs,
> and when a fact has been asserted on each of its five inputs, and it
> it has to somehow be able to tell the inputs apart. It also has to do
> this all rather efficiently. It's not an impossible job, but think
> about what a generic version of this would look like -- remember that
> it has to be created by a rule compiler, not by a human.
> Now, make your sample problem just a little bit more complex: one or
> more of the input facts can look like (a 1) or (a 2) instead of just
> (a). Now in your proposed scheme the rule node would have to keep
> track of all the permutations of possible inputs, because it needs to
> know which permuations have been fired already, and which haven't. It
> would have to be able to decide if a particular permutation was
> already in the list, and find permutations if it was necesasary to
> retract them. These lookups need to be very fast, so each of the five
> inputs would need a sophisticated indexer, and then of course there
> would need to be structures to hold and index the
> permutations... very quickly, the rule node would need to contain all
> of the internal structure that is built up as the Rete network.
>> (I found the
>> mention of `alpha nodes' for one-input nodes, but does JESS have any?)
> Not sure what the term "alpha nodes" refers to; there are alpha and
> beta memories in the join nodes.
>> BTW: the useful (matches) function does not seem to work properly in
>> Jess 6.0a2 what I'm still using; does it in 6.0b1?
> Well, I guess I'm not sure -- that depends on what you perceive as
> being wrong with in Jess 6.0a2. Please let me know.
> ---------------------------------------------------------
> Ernest Friedman-Hill
> Distributed Systems Research Phone: (925) 294-2154
> Sandia National Labs FAX: (925) 294-2234
> Org. 8920, MS 9012 [EMAIL PROTECTED]
> PO Box 969 http://herzberg.ca.sandia.gov
> Livermore, CA 94550
---------------------------------------------------------------------
To unsubscribe, send the words 'unsubscribe jess-users [EMAIL PROTECTED]'
in the BODY of a message to [EMAIL PROTECTED], NOT to the
list (use your own address!) List problems? Notify [EMAIL PROTECTED]
---------------------------------------------------------------------
