Re: JESS: On the Performance of Logical Retractions

2011-06-10 Thread Md Oliya
@Peter: I wasn't interested in plugging into Rete in the first place, nor
did I have "should I use RETE?" or "how does RETE perform?" in mind. Rather,
I was trying to find a solution for my problem at hand, and the more I
developed my own solution, the more I found it resembled Rete. So I decided
not to reinvent the wheel and to tap into the existing implementations
instead. By "performance of RETE" I mean the cost of building and
maintaining the network, not the data storage and retrieval costs.

@Ernest: I understand your point, and I think the main problem would be the
cascading effect incurred by liberal use of the logical keyword, as you
mentioned.

As I said before, I am using OpenRuleBench
(http://rulebench.projects.semwebcentral.org/), which is a set of
test cases for a number of rule engines such as XSB, Jess,
and Jena. It is perfectly self-contained, and you can set it up and
test Jess within 15 minutes.

But I still have a question: what type of truth maintenance method is
implemented in Jess? Do you rely solely on the Rete memory nodes and tokens
for this purpose?
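For concreteness, this is the kind of cascade I have in mind; a minimal,
self-contained sketch (the fact and rule names are invented, not taken from
the benchmark):

```lisp
;; (b) is asserted while a fact matching (logical (a)) provides support,
;; and (c) in turn depends on (b).
(defrule infer-b
  (logical (a))
  =>
  (assert (b)))

(defrule infer-c
  (logical (b))
  =>
  (assert (c)))

(bind ?f (assert (a)))
(run)          ;; working memory now holds (a), (b), and (c)
(retract ?f)   ;; (b) loses its logical support and goes; then (c) follows
```

Retracting one base fact triggers a chain of dependent retractions, which is
exactly the cost I am worried about when 10% or more of the assertions
change.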


On Fri, Jun 10, 2011 at 1:21 AM, Peter Lin wool...@gmail.com wrote:

 By "performance of RETE," what are you referring to?

 There are many aspects of RETE, which one must study carefully. It's
 good that you're translating RDF to OWL, but the larger question is
 why use OWL/RDF in the first place? Unless the knowledge easily fits
 into axioms like "the sky is blue" or the typical RDF examples, there's no
 benefit to storing or using RDF. That's my own biased perspective on RDF/OWL.

 The real question isn't "should I use RETE?" or "how does RETE perform?"
 The real question is "how do I solve the problem efficiently?"

 I've built compliance engines for trading systems using JESS. I can
 say from first-hand experience that how you use the engine is the
 biggest factor. I've done things like load 500K records to check
 compliance across a portfolio set with minimal latency for nightly
 batch processes. The key, though, is taking the time to study the existing
 literature and understand things before jumping to a solution.

 Providing concrete examples of what you're doing will likely get you better
 advice than making general statements.


 On Thu, Jun 9, 2011 at 12:17 PM, Md Oliya md.ol...@gmail.com wrote:
  Thank you very much, Peter, for the useful information. I will definitely
  look into that.
  But in the context of this message, I am not loading a huge (a subjective
  interpretation?) knowledge base. It's 100k assertions, with the operations
  taking around 400 MB.
  Secondly, in my experiments, I subtracted the loading time of the
  assertions/retractions in Jess, as I'm focusing on the performance of the
  Rete.
  Lastly, I am not doing an RDF-based mapping; rather, I follow the method
  of Description Logic Programs for translating each class/property of OWL
  into its corresponding template.
 
 
  --Oli.
 
 
  On Fri, Jun 10, 2011 at 12:03 AM, Peter Lin wool...@gmail.com wrote:
 
  Although it may be obvious to some people, I thought I'd mention
  this well-known lesson.

  Do not load a huge knowledge base into memory. This lesson is well
  documented in the existing literature on knowledge-base systems. It's also
  been discussed on the JESS mailing list numerous times over the years, so
  I would suggest searching the JESS mailing list to learn from other
  people's experience.

  It's better to intelligently load the knowledge base into memory as
  needed, rather than blindly load everything. Even in the case where
  someone has 256 GB of memory, one should ask why load all that into
  memory up front.
 
  If the test is using RDF triples, it's well known that RDF triples
  produce excessive partial matches and often result in an
  OutOfMemoryError. The real issue isn't JESS; it's how one tries to
  solve a problem. I would recommend reading Gary Riley's book on expert
  systems to avoid repeating a lot of mistakes that others have already
  documented.
 
 
  On Thu, Jun 9, 2011 at 11:41 AM, Md Oliya md.ol...@gmail.com wrote:
   Thank you, Ernest.
   I am experimenting with the Lehigh University benchmark, where I
   translate the OWL TBox into equivalent rules in Jess, with the logical
   construct.
   Specifically, I am using the dataset and transformations as used in the
   OpenRuleBench.
   As for the runtimes, I missed a point about the retractions. The fact
   is, even if the session does not contain any rules (no defrules, just
   assertions), loading the same set of retractions takes a considerable
   time. This indicates that the high runtime is mostly incurred by Jess's
   internal operations.
   But still, when the number of changes grows large (say, more than 10%),
   the runtime is not acceptable, and rerunning with the retracted KB would
   be faster.
   I have another question as well: what type of truth maintenance method
   is implemented in Jess? Do you rely solely on the Rete memory nodes and
   tokens for this purpose?
  

Re: JESS: On the Performance of Logical Retractions

2011-06-10 Thread Peter Lin
I've looked at OpenRuleBench in the past, and I just took another quick
look at it.

The way the test was done is the wrong way to use a production rule
engine. That's my biased opinion. I understand the intent was to measure
performance with the same data and similar rules. The point I'm
trying to make is that encoding knowledge as triples is pointless and
useless for practical applications. Many researchers have extended
triples to quads, and others convert complex object models to triples and
back. If knowledge naturally fits in a complex object, why
decompose it into triples or quads?
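To put that in Jess terms, here is a rough sketch (the template and slot
names are invented for illustration): knowledge that fits one template
matches with a single pattern, while the same knowledge flattened into
triples costs one join per attribute, and every extra join is another place
for partial matches to accumulate.

```lisp
;; A structured fact: one pattern, no joins.
(deftemplate account (slot id) (slot owner) (slot balance))

(defrule flag-large
  (account (owner ?o) (balance ?b&:(> ?b 1000000)))
  =>
  (printout t ?o " holds a large account" crlf))

;; The same knowledge decomposed into triples: one pattern per slot,
;; joined on the subject ?a.
(deftemplate triple (slot subj) (slot pred) (slot obj))

(defrule flag-large-triples
  (triple (subj ?a) (pred owner)   (obj ?o))
  (triple (subj ?a) (pred balance) (obj ?b&:(> ?b 1000000)))
  =>
  (printout t ?o " holds a large account" crlf))
```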

To draw an absurd analogy: would you dismantle your car every night to
store it away and then re-assemble it every morning?

Think of it this way: say we want to use Lego bricks to capture
knowledge. If the subject happens to work well with a 1x3 brick, then
all you need is 1x3 bricks. If the subject is complex, just a 1x3 brick
probably isn't going to work. In the real world, there are a lot more
bricks than the 1x3, and the things we want to capture usually require a
wide variety of bricks.

If you need to assert a bunch of facts and then retract 50% of those
facts, the first questions should be "why am I doing that?" and "is that
a pointless exercise?" The next question I would ask is, "can I use a
backward chaining or query approach instead?"
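In Jess that usually means reaching for a defquery, or backward chaining
via do-backward-chaining, instead of asserting everything and retracting
half of it. A rough sketch with invented names:

```lisp
;; Pull the matching facts on demand instead of materializing
;; intermediate facts that will only be retracted later.
(defquery find-ancestors
  (declare (variables ?person))
  (ancestor ?person ?anc))

;; At the point where the answer is actually needed:
(bind ?result (run-query* find-ancestors joe))
(while (?result next)
  (printout t (?result getString anc) crlf))
```

Whether that fits depends on the problem, but it avoids paying the
truth-maintenance cost for facts nobody ever reads.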


On Fri, Jun 10, 2011 at 12:58 AM, Md Oliya md.ol...@gmail.com wrote:

Re: JESS: On the Performance of Logical Retractions

2011-06-10 Thread Ernest Friedman-Hill
Yeah, I just had a look too, and I think the report on their site says
it all. Jess and Drools are at the bottom of their performance results
for a reason: they're being misapplied. If your problem looks like the
kinds of problems they're benchmarking, then by all means use one of
the tools that scored well on their tests. Use the proper tool for the
job at hand.



On Jun 10, 2011, at 8:33 AM, Peter Lin wrote:

