Hello Dave and Alexis,

Thanks for your advice, I will give this a try (especially the GenericRuleReasoner). This plus removal of all non-whitelisted sameAs links could be a good combination.

Best Regards
Andreas

>>> Alexis Armin Huf <alexis...@gmail.com> 20.02.2018 13:43 >>>
Hi Andreas,

I had a similar scenario, but as Dave said, there is no choose-a-enum-value reasoner for that. What I did was pick the RDFS and owl:sameAs rules from the rule files in the Jena source and instantiate a GenericRuleReasoner with my custom rule file. Docs for how to do this are here: https://jena.apache.org/documentation/inference/index.html#rules. Below is a short walkthrough of how I set this up.

First, build your rules file. Look at the files in jena-core/src/main/resources/etc <https://github.com/apache/jena/tree/master/jena-core/src/main/resources/etc>, especially rdfs-fb.rules and owl-fb-mini.rules. Your rule file will look like this:

    -> tableAll().
    [rdfs7b: (?a rdf:type rdfs:Class) -> (?a rdfs:subClassOf rdfs:Resource)]
    [rdfs2:  (?p rdfs:domain ?c) -> [(?x rdf:type ?c) <- (?x ?p ?y)] ]
    [rdfs3:  (?p rdfs:range ?c) -> [(?y rdf:type ?c) <- (?x ?p ?y)] ]
    [rdfs5a: (?a rdfs:subPropertyOf ?b), (?b rdfs:subPropertyOf ?c) -> (?a rdfs:subPropertyOf ?c)]
    [rdfs5b: (?a rdf:type rdf:Property) -> (?a rdfs:subPropertyOf ?a)]
    # ... and this goes on ...

    # There are a lot of details around owl:sameAs, but you probably will need these:
    [sameAs1: (?A owl:sameAs ?B) -> (?B owl:sameAs ?A) ]
    [sameAs2: (?A owl:sameAs ?B) (?B owl:sameAs ?C) -> (?A owl:sameAs ?C) ]
    [equality1: (?X owl:sameAs ?Y), notEqual(?X,?Y) ->
        [(?X ?P ?V) <- (?Y ?P ?V)]
        [(?V ?P ?X) <- (?V ?P ?Y)] ]

Save this file as a resource of your application, parse it and create a GenericRuleReasoner, like this:

    // Imports needed by the snippet:
    // java.io.BufferedReader, java.io.InputStreamReader, java.util.List,
    // org.apache.jena.rdf.model.ModelFactory,
    // org.apache.jena.reasoner.rulesys.GenericRuleReasoner,
    // org.apache.jena.reasoner.rulesys.Rule

    // Inside a method that returns a Model:
    ClassLoader loader = SomeClass.class.getClassLoader();
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(
            loader.getResourceAsStream("rules/rdfs+sameAs.rules")))) {
        // Parse the custom rule file and build a reasoner from it
        List<Rule> rules = Rule.parseRules(Rule.rulesParserFromReader(reader));
        GenericRuleReasoner reasoner = new GenericRuleReasoner(rules);
        // Wrap the raw graph in an inference graph and expose it as a Model
        return ModelFactory.createModelForGraph(
                reasoner.bind(modelThatNeedsReasoning.getGraph()));
    }

Hope that helps!

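To make the effect of the three sameAs rules above concrete, here is a toy, Jena-free sketch that applies sameAs1, sameAs2 and equality1 by naive forward chaining until nothing new appears. This is not how Jena evaluates them (in the rule file equality1 produces backward rules); the class name SameAsClosure, the list-of-three-strings statement encoding and the ex: names are all assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class, not part of Jena.
final class SameAsClosure {
    static final String SAME_AS = "owl:sameAs";

    // Each statement is a 3-element list: subject, predicate, object.
    static Set<List<String>> closure(Set<List<String>> input) {
        Set<List<String>> facts = new HashSet<>(input);
        boolean changed = true;
        while (changed) {
            List<List<String>> derived = new ArrayList<>();
            for (List<String> t : facts) {
                if (!t.get(1).equals(SAME_AS)) continue;
                String a = t.get(0);
                String b = t.get(2);
                // sameAs1: (a sameAs b) -> (b sameAs a)
                derived.add(List.of(b, SAME_AS, a));
                for (List<String> u : facts) {
                    // sameAs2: (a sameAs b), (b sameAs c) -> (a sameAs c)
                    if (u.get(1).equals(SAME_AS) && u.get(0).equals(b)) {
                        derived.add(List.of(a, SAME_AS, u.get(2)));
                    }
                    // equality1 (only when a and b differ): copy b's statements to a
                    if (!a.equals(b)) {
                        if (u.get(0).equals(b)) derived.add(List.of(a, u.get(1), u.get(2)));
                        if (u.get(2).equals(b)) derived.add(List.of(u.get(0), u.get(1), a));
                    }
                }
            }
            // addAll returns true only if at least one new statement was added
            changed = facts.addAll(derived);
        }
        return facts;
    }

    public static void main(String[] args) {
        Set<List<String>> input = new HashSet<>();
        input.add(List.of("ex:a", SAME_AS, "ex:b"));
        input.add(List.of("ex:b", SAME_AS, "ex:c"));
        input.add(List.of("ex:c", "ex:name", "X"));
        Set<List<String>> all = closure(input);
        System.out.println(input.size() + " asserted, " + all.size() + " after closure");
    }
}
```

With three aliases, the three asserted statements grow to twelve: nine sameAs statements (every ordered pair, including the reflexive ones) plus one copy of ex:name per alias. The sameAs part alone grows quadratically with the size of an alias set, which is the kind of growth Dave describes in the quoted message below.
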
Dave Reynolds <dave.e.reyno...@gmail.com> wrote on Tue, 20 Feb 2018 at 06:03:

> Hi Andreas,
>
> Jena does not currently have any alternative built-in reasoner for RDFS +
> owl:sameAs and I'm not aware of any such "equality reasoner" being in
> development. You could try Pellet, which may offer better performance.
>
> In fact, equality reasoning is notoriously expensive in the general case;
> the logic is indeed simple, but the cost can blow up easily because it
> leads to a combinatorial number of deductions.
>
> Depending on what problem you are trying to solve, your best bet may be
> to avoid using owl:sameAs reasoning at run time altogether. For example,
> in some cases it may be possible to do a pass over the data at ingest
> time to identify all aliases and to only assert in the model some
> "canonical" URI for each alias equivalence set.
>
> Dave
>
> On 20/02/18 07:17, Andreas Kahl wrote:
> > Hello everyone,
> >
> > I am currently developing a little Jena Model that should be able to do
> > RDFS inferencing plus owl:sameAs. From the documentation I learned that
> > the minimal Reasoner for that is OWLmini. During development I
> > experienced some severe performance bottlenecks if a runtime model
> > contains too many owl:sameAs links, and generally for nearly all models
> > exceeding 1000 Statements. Most of the tests simply freeze at some point
> > if those performance bottlenecks occur; sometimes selecting a Statement
> > with a SimpleSelector consisting of a subject URI, a predicate URI and a
> > null Object takes 20 seconds.
> > There should be no problems with blocking of threads, as I run my
> > integration tests single-threaded, especially if I am experiencing
> > failures.
> >
> > I could confine this by using models without inferencing while
> > collecting and adding data spidered from the web, and especially by
> > adding Ontologies last, only where absolutely needed. Also, I use a
> > whitelist internally for domains my spider is allowed to fetch data
> > from; therefore I remove all owl:sameAs Statements containing object
> > URIs not in this whitelist. In the end, in my querying methods, I clone
> > that basic model with the collected data and add it to an InfModel:
> >
> >     protected static Model getInfModelFrom(Model model) {
> >         final long size = model.size();
> >         LOG.debug("getInfModelFrom: Input size: " + Long.toString(size));
> >         final Model copy = ModelFactory.createDefaultModel();
> >         copy.add(model instanceof InfModel
> >                 ? ((InfModel) model).getRawModel() : model);
> >         final InfModel infModel = ModelFactory.createInfModel(
> >                 ReasonerRegistry.getOWLMiniReasoner(), copy);
> >         return infModel;
> >     }
> >
> > The only Ontology I am using is
> > http://d-nb.info/standards/elementset/gnd# .
> >
> > I suppose that the Reasoner I use is much too mighty for the seemingly
> > simple owl:sameAs. Is there any more basic option understanding
> > owl:sameAs besides RDFS? All other OWL Axioms are not needed.
> > Are there any best practices for dealing with Inferencing for relatively
> > small in-memory models of <10,000 Statements (most <5,000 Statements)? I
> > found some information on the web that a simple 'Equality Reasoner' is
> > in the works. Would that be a good choice? Will it be available any time
> > soon?
> >
> > Thanks for any hints
> > Andreas
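
Dave's suggestion of an ingest-time pass that keeps only one canonical URI per alias equivalence set could be sketched roughly as below. This is a plain-Java illustration under stated assumptions, not Jena code: the class AliasCanonicalizer, the list-of-three-strings statement encoding, the example URIs and the choice of the lexicographically smallest URI as the canonical one are all made up for this sketch. It uses a union-find structure to group aliases.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class, not part of Jena.
final class AliasCanonicalizer {
    static final String SAME_AS = "http://www.w3.org/2002/07/owl#sameAs";

    // Union-find forest: each node points towards the root of its alias set.
    private final Map<String, String> parent = new HashMap<>();

    // Returns the canonical representative of the alias set containing node.
    String canonical(String node) {
        parent.putIfAbsent(node, node);
        String root = node;
        while (!parent.get(root).equals(root)) root = parent.get(root);
        // Path compression: point every node on the path directly at the root.
        String cur = node;
        while (!cur.equals(root)) {
            String next = parent.get(cur);
            parent.put(cur, root);
            cur = next;
        }
        return root;
    }

    // Merges two alias sets; the lexicographically smaller root stays canonical
    // (an arbitrary but deterministic choice made for this sketch).
    void merge(String a, String b) {
        String ra = canonical(a);
        String rb = canonical(b);
        if (ra.equals(rb)) return;
        if (ra.compareTo(rb) <= 0) parent.put(rb, ra); else parent.put(ra, rb);
    }

    // Pass 1 collects alias sets from owl:sameAs statements; pass 2 drops those
    // statements and rewrites every subject and object to its canonical form.
    Set<List<String>> rewrite(List<List<String>> statements) {
        for (List<String> s : statements) {
            if (SAME_AS.equals(s.get(1))) merge(s.get(0), s.get(2));
        }
        Set<List<String>> out = new LinkedHashSet<>();
        for (List<String> s : statements) {
            if (SAME_AS.equals(s.get(1))) continue;
            out.add(List.of(canonical(s.get(0)), s.get(1), canonical(s.get(2))));
        }
        return out;
    }

    public static void main(String[] args) {
        AliasCanonicalizer aliases = new AliasCanonicalizer();
        Set<List<String>> result = aliases.rewrite(List.of(
                List.of("http://example.org/b", SAME_AS, "http://example.org/a"),
                List.of("http://example.org/b", SAME_AS, "http://example.org/c"),
                List.of("http://example.org/c", "http://example.org/name", "X")));
        System.out.println(result);
    }
}
```

The rewritten statements can then go into a model that needs only RDFS rules, with no owl:sameAs reasoning at run time. Queries would have to map incoming URIs through the same canonical() lookup first, so the alias map has to be kept alongside the model.
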