Re: JESS: On the Performance of Logical Retractions
@Peter: I wasn't trying to plug into Rete in the first place, nor did I have "should I use Rete?" or "how does Rete perform?" in mind. Rather, I was trying to find a solution to the problem at hand, and the further I developed my own solution, the more it came to resemble Rete. So, rather than reinvent the wheel, I decided to tap into the existing implementations. By "performance of Rete" I mean the cost of building and maintaining the network, not the data storage and retrieval costs.

@Ernest: I understand your point, and I think the main problem would be the cascading effect incurred by liberal use of the logical keyword, as you mentioned. As I said before, I am using OpenRuleBench (http://rulebench.projects.semwebcentral.org/), which is a set of test cases for a number of rule engines such as XSB, Jess, and Jena. It is entirely self-contained, and you can set it up and test Jess within 15 minutes. But I still have a question: what type of truth maintenance method is implemented in Jess? Do you rely solely on the Rete memory nodes and tokens for this purpose?

On Fri, Jun 10, 2011 at 1:21 AM, Peter Lin wool...@gmail.com wrote:

By "performance of Rete," what are you referring to? There are many aspects of Rete, each of which must be studied carefully. It's good that you're translating RDF to OWL, but the larger question is why use OWL/RDF in the first place? Unless the knowledge fits easily into axioms like "the sky is blue" or the typical RDF examples, there's no benefit to storing or using RDF. That's my own biased perspective on RDF/OWL. The real question isn't "should I use Rete?" or "how does Rete perform?" The real question is "how do I solve the problem efficiently?" I've built compliance engines for trading systems using Jess, and I can say from first-hand experience that how you use the engine is the biggest factor. I've done things like load 500K records to check compliance across a portfolio set with minimal latency for nightly batch processes.
The key, though, is taking the time to study the existing literature and understand things before jumping to a solution. Providing concrete examples of what you're doing will likely get you better advice than making general statements.

On Thu, Jun 9, 2011 at 12:17 PM, Md Oliya md.ol...@gmail.com wrote:

Thank you very much, Peter, for the useful information. I will definitely look into that. But in the context of this message, I am not loading a huge (a subjective term?) knowledge base: it's 100k assertions, with the operations taking around 400 MB. Secondly, in my experiments I subtracted the loading time of the assertions/retractions in Jess, as I'm focusing on the performance of the Rete network. Lastly, I am not doing an RDF-based mapping; rather, I follow the Description Logic Programs method of translating each OWL class/property into a corresponding template. --Oli.

On Fri, Jun 10, 2011 at 12:03 AM, Peter Lin wool...@gmail.com wrote:

Although it may be obvious to some people, I thought I'd mention this well-known lesson: do not load a huge knowledge base into memory. This lesson is well documented in the existing literature on knowledge-base systems. It has also been discussed on the Jess mailing list numerous times over the years, so I would suggest searching the list archives to learn from other people's experience. It's better to load the knowledge base into memory intelligently, as needed, rather than blindly load everything. Even if someone has 256 GB of memory, one should ask why load all of that into memory up front. If the test uses RDF triples, it's well known that RDF triples produce excessive partial matches and often result in an OutOfMemoryError. The real issue isn't Jess; it's how one tries to solve the problem. I would recommend reading Gary Riley's book on expert systems to avoid repeating many mistakes that others have already documented.

On Thu, Jun 9, 2011 at 11:41 AM, Md Oliya md.ol...@gmail.com wrote:

Thank you, Ernest.
I am experimenting with the Lehigh University Benchmark, transferring the OWL TBox into equivalent Jess rules using the logical construct. Specifically, I am using the dataset and transformations as used in OpenRuleBench. As for the runtimes, I missed a point about the retractions: even if the session contains no rules (no defrules, just assertions), loading the same set of retractions takes considerable time. This indicates that the high runtime is mostly incurred by Jess's internal operations. Still, when the number of changes grows large (say, more than 10%), the runtime is not acceptable, and rerunning from scratch with the retracted facts already removed would be faster. I have another question as well: what type of truth maintenance method is implemented in Jess? Do you rely solely on the Rete memory nodes and tokens for this purpose?
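For reference, the truth-maintenance behavior the question is about is exposed through Jess's (logical ...) conditional element. The sketch below uses made-up template and rule names, not anything from the benchmark, and simply illustrates a dependent fact being retracted automatically when its logical support disappears:

```jess
;; Sketch of Jess's truth maintenance via the (logical ...) CE.
;; Template and rule names are illustrative only.
(deftemplate person (slot name) (slot age))
(deftemplate adult  (slot name))

;; A fact asserted on the RHS receives logical support from the
;; facts matched inside (logical ...).
(defrule mark-adults
  (logical (person (name ?n) (age ?a&:(>= ?a 18))))
  =>
  (assert (adult (name ?n))))

(reset)
(bind ?f (assert (person (name "Alice") (age 30))))
(run)
;; working memory now contains (adult (name "Alice"))

;; Retracting the supporting fact causes Jess to retract the
;; dependent (adult ...) fact as well -- no explicit bookkeeping.
(retract ?f)
```

This logical dependency bookkeeping is exactly where the cascading cost mentioned above comes from: each retraction of a supporting fact can trigger a chain of dependent retractions.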
Re: JESS: On the Performance of Logical Retractions
I've looked at OpenRuleBench in the past, and I just looked at it again quickly. The way the test was done is the wrong way to use a production rule engine; that's my biased opinion. I understand the intent was to measure performance with the same data and similar rules. The point I'm trying to make is that encoding knowledge as triples is pointless for practical applications. Many researchers have extended triples to quads, and others convert complex object models to triples and back. If knowledge naturally fits in a complex object, why decompose it into triples or quads? To draw an absurd analogy: would you dismantle your car every night to store it away, and then reassemble it every morning?

Think of it this way: say we want to use Lego bricks to capture knowledge. If the subject happens to work well with a 1x3 brick, then all you need is 1x3 bricks. If the subject is complex, a 1x3 brick alone probably isn't going to work. In the real world there's a lot more than the 1x3 brick, and the things we want to capture usually require a wide variety of bricks.

If you need to assert a bunch of facts and then retract 50% of them, the first questions should be "why am I doing that?" and "is this a pointless exercise?" The first thing I would ask is: can I use backward chaining or a query approach instead?
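Peter's suggestion of querying on demand, rather than materializing everything up front, maps naturally onto Jess's defquery facility. A hypothetical sketch, with invented template and query names:

```jess
;; Sketch: query working memory on demand with a defquery instead of
;; asserting derived facts that later have to be retracted.
(deftemplate triple (slot subj) (slot pred) (slot obj))

(defquery objects-of
  "All objects reachable from subject ?s via predicate ?p."
  (declare (variables ?s ?p))
  (triple (subj ?s) (pred ?p) (obj ?o)))

(assert (triple (subj "car") (pred "hasPart") (obj "engine")))
(assert (triple (subj "car") (pred "hasPart") (obj "wheel")))

;; run-query* returns a QueryResult; iterate and read the bindings.
;; Nothing is asserted by the query, so nothing needs retracting.
(bind ?result (run-query* objects-of "car" "hasPart"))
(while (?result next)
  (printout t (?result getString "o") crlf))
```

The design point is that a query leaves working memory untouched, so the truth-maintenance cascade that makes bulk retraction expensive never comes into play.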
Re: JESS: On the Performance of Logical Retractions
Yeah, I just had a look too, and I think the report on their site says it all. Jess and Drools are at the bottom of their performance results for a reason: they're being misapplied. If your problem looks like the kinds of problems they're benchmarking, then by all means use one of the tools that scored well on their tests. Use the proper tool for the job at hand.
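For concreteness, the Description Logic Programs translation Oli describes earlier in the thread might look like the following, where each OWL class becomes a template and a TBox axiom such as "GraduateStudent subClassOf Student" becomes a rule over those templates. The names are illustrative, not taken from the actual LUBM transformation:

```jess
;; Sketch of a DLP-style translation: one template per OWL class,
;; one rule per subclass axiom. Illustrative names only.
(deftemplate GraduateStudent (slot id))
(deftemplate Student         (slot id))

;; TBox axiom: GraduateStudent subClassOf Student.
;; Wrapping the LHS in (logical ...) means that retracting the
;; ABox assertion also retracts the derived Student fact.
(defrule GraduateStudent-subClassOf-Student
  (logical (GraduateStudent (id ?x)))
  =>
  (assert (Student (id ?x))))

;; An OWL object property such as advisor becomes a binary template:
(deftemplate advisor (slot domain) (slot range))
```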