Dave, I will set up a complete repro and push it to GitHub.
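
In the meantime, a rough back-of-envelope sketch of the owl:sameAs cost point raised in the quoted thread. This is plain Java with no Jena involved, and it is not a claim about what the RDFS reasoner actually does. The class and method names (SameAsClosure, addSameAs, entailedPairs, starCluster) are made up for illustration. It assumes sameAs is treated only as symmetric and transitive, and counts how many ordered sameAs pairs that closure would entail for a cluster shaped like the typical item quoted below (one root with nine owl:sameAs links).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class SameAsClosure {
    // Undirected adjacency built from asserted sameAs statements.
    private final Map<String, Set<String>> adj = new HashMap<>();

    // Record one asserted "a owl:sameAs b" statement (hypothetical helper).
    public void addSameAs(String a, String b) {
        adj.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        adj.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    // Count ordered pairs (x, y), x != y, that symmetry plus transitivity
    // would entail: each connected cluster of size n contributes n * (n - 1).
    public long entailedPairs() {
        Set<String> seen = new HashSet<>();
        long total = 0;
        for (String start : adj.keySet()) {
            if (!seen.add(start)) {
                continue; // already counted as part of an earlier cluster
            }
            long size = 0;
            Deque<String> stack = new ArrayDeque<>();
            stack.push(start);
            while (!stack.isEmpty()) {
                String node = stack.pop();
                size++;
                for (String next : adj.get(node)) {
                    if (seen.add(next)) {
                        stack.push(next);
                    }
                }
            }
            total += size * (size - 1);
        }
        return total;
    }

    // Build one root with `links` sameAs statements and count entailed pairs.
    public static long starCluster(int links) {
        SameAsClosure closure = new SameAsClosure();
        for (int i = 0; i < links; i++) {
            closure.addSameAs("root", "e" + i);
        }
        return closure.entailedPairs();
    }

    public static void main(String[] args) {
        System.out.println(starCluster(9));    // cluster the size of the sample item
        System.out.println(starCluster(1000)); // a much larger hypothetical cluster
    }
}
```

Under those assumptions the growth per cluster is quadratic in cluster size: nine asserted links become 90 entailed pairs. That only matters if something is actually computing the sameAs closure, which is exactly what is in question here.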
--benson

On Sat, Jan 1, 2011 at 5:58 AM, Dave Reynolds <[email protected]> wrote:
> On Fri, 2010-12-31 at 11:38 -0500, Benson Margulies wrote:
>> On Fri, Dec 31, 2010 at 11:27 AM, Dave Reynolds
>> <[email protected]> wrote:
>> > On Fri, 2010-12-31 at 08:57 -0500, Benson Margulies wrote:
>> >> Step 1:
>> >>
>> >> Model schema = ModelFactory.createDefaultModel();
>> >> schema.read(RdfUtils.getJugOntology(),
>> >>     RdfUtils.getJugOntologyUri(), "RDF/XML");
>> >> return ModelFactory.createRDFSModel(schema, data);
>> >
>> > What's in the data?
>>
>> typical item:
>>
>> <uri:jug:0618936a7a03bf236a291bcddbfde63b#e11>
>>     <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
>>         rex:Person ;
>>     rex:hasEntityDetectionSource
>>         "statistical" ;
>>     rex:hasNormalizedText
>>         "Obama" ;
>>     rex:hasOriginalText
>>         "Obama" ;
>>     rex:root "true" ;
>>     owl:sameAs <uri:jug:0618936a7a03bf236a291bcddbfde63b#e43> ;
>>     owl:sameAs <uri:jug:0618936a7a03bf236a291bcddbfde63b#e25> ;
>>     owl:sameAs <uri:jug:0618936a7a03bf236a291bcddbfde63b#e8> ;
>>     owl:sameAs <uri:jug:0618936a7a03bf236a291bcddbfde63b#e104> ;
>>     owl:sameAs <uri:jug:0618936a7a03bf236a291bcddbfde63b#e4> ;
>>     owl:sameAs <uri:jug:0618936a7a03bf236a291bcddbfde63b#e18> ;
>>     owl:sameAs <uri:jug:0618936a7a03bf236a291bcddbfde63b#e56> ;
>>     owl:sameAs <uri:jug:0618936a7a03bf236a291bcddbfde63b#e115> ;
>>     owl:sameAs <uri:jug:0618936a7a03bf236a291bcddbfde63b#e100> .
>>
>> >
>> >> Step 2: about 50k tuples, many of them owl:sameAs
>> >
>> > What is 50k tuples, the data, the schema, both, something else?
>>
>> data. Schema is tiny.
>
> Can you show us the schema?
>
>> >
>> >> Step 3:
>> >>
>> >> NodeIterator sameAsItems = model.listObjectsOfProperty(root,
>> >>     relatingProp); // prop is in fact owl:sameAs
>> >> while (sameAsItems.hasNext()) {
>> >>     ...
>> >> }
>> >>
>> >> Runs for a very long time, using a very large amount of memory.
>> >> Eventually runs out of memory.
>> >
>> > Strange. The owl:sameAs reasoning can be hugely expensive (it is
>> > fundamentally exponential) but the RDFS reasoner knows nothing about
>> > owl:sameAs so isn't doing any of that reasoning.
>>
>> Interrupting it in Eclipse, it is definitely deep in the reasoner all
>> the time until it runs out of memory and dies.
>
> Is there definitely not an outer loop running?
>
> I can imagine a space leak so repeated calls to the reasoner will use up
> memory but find it hard to see how RDFS reasoning with a tiny schema
> could blow up so badly.
>
> Do you have a complete minimal example we could take a look at?
>
> [I realize you've switched approach but I'd like to understand why RDFS
> reasoning might blow up in this case.]
>
> Dave
