On Fri, 2010-12-31 at 11:38 -0500, Benson Margulies wrote: 
> On Fri, Dec 31, 2010 at 11:27 AM, Dave Reynolds
> <[email protected]> wrote:
> > On Fri, 2010-12-31 at 08:57 -0500, Benson Margulies wrote:
> >> Step 1:
> >>
> >>  Model schema = ModelFactory.createDefaultModel();
> >>  schema.read(RdfUtils.getJugOntology(),
> >>              RdfUtils.getJugOntologyUri(), "RDF/XML");
> >>  return ModelFactory.createRDFSModel(schema, data);
> >
> > What's in the data?
> 
> typical item:
> 
> <uri:jug:0618936a7a03bf236a291bcddbfde63b#e11>
>       <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
>                     rex:Person ;
>       rex:hasEntityDetectionSource
>                     "statistical" ;
>       rex:hasNormalizedText
>                     "Obama" ;
>       rex:hasOriginalText
>                     "Obama" ;
>       rex:root      "true" ;
>       owl:sameAs    <uri:jug:0618936a7a03bf236a291bcddbfde63b#e43> ;
>       owl:sameAs    <uri:jug:0618936a7a03bf236a291bcddbfde63b#e25> ;
>       owl:sameAs    <uri:jug:0618936a7a03bf236a291bcddbfde63b#e8> ;
>       owl:sameAs    <uri:jug:0618936a7a03bf236a291bcddbfde63b#e104> ;
>       owl:sameAs    <uri:jug:0618936a7a03bf236a291bcddbfde63b#e4> ;
>       owl:sameAs    <uri:jug:0618936a7a03bf236a291bcddbfde63b#e18> ;
>       owl:sameAs    <uri:jug:0618936a7a03bf236a291bcddbfde63b#e56> ;
>       owl:sameAs    <uri:jug:0618936a7a03bf236a291bcddbfde63b#e115> ;
>       owl:sameAs    <uri:jug:0618936a7a03bf236a291bcddbfde63b#e100> .
> 
> 
> >
> >> Step 2: about 50k tuples, many of them owl:sameAs
> >
> > What is 50k tuples: the data, the schema, both, or something else?
> 
> data. Schema is tiny.

Can you show us the schema? 

> >
> >> Step 3:
> >>
> >> NodeIterator sameAsItems = model.listObjectsOfProperty(root,
> >>         relatingProp); // relatingProp is in fact owl:sameAs
> >> while (sameAsItems.hasNext()) {
> >>     ...
> >> }
> >>
> >> Runs for a very long time, using a very large amount of memory.
> >> Eventually runs out of memory.
> >
> > Strange.  The owl:sameAs reasoning can be hugely expensive (it is
> > fundamentally exponential) but the RDFS reasoner knows nothing about
> > owl:sameAs so isn't doing any of that reasoning.
> 
> Interrupting it in Eclipse, it is definitely deep in the reasoner all
> the time until it runs out of memory and dies.

Is there definitely not an outer loop running?

I can imagine a space leak, so that repeated calls to the reasoner would
use up memory, but I find it hard to see how RDFS reasoning with a tiny
schema could blow up so badly.
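
To put a rough number on the owl:sameAs cost mentioned above, here is a
minimal sketch in plain Java. It is not Jena code, and the class and
method names are made up for illustration. It assumes a reasoner that
materialises the full reflexive, symmetric and transitive closure of
owl:sameAs, which the RDFS reasoner should not be doing. It only counts
the sameAs statements themselves and ignores the copying of every other
property across aliases, which would add more.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SameAsClosureSketch {

    // Hypothetical helper: given one root and its asserted owl:sameAs
    // targets, return every ordered pair that a full sameAs closure
    // (reflexive, symmetric, transitive) would contain.
    static Set<String> closurePairs(String root, List<String> aliases) {
        List<String> members = new ArrayList<>(aliases);
        members.add(root);
        Set<String> pairs = new HashSet<>();
        for (String a : members) {
            for (String b : members) {
                pairs.add(a + " owl:sameAs " + b);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Nine aliases, matching the nine owl:sameAs lines in the item above.
        List<String> aliases = new ArrayList<>();
        for (int i = 0; i < 9; i++) {
            aliases.add("e" + i);
        }
        Set<String> pairs = closurePairs("root", aliases);
        // Prints: 9 asserted, 100 after closure
        System.out.println(aliases.size() + " asserted, " + pairs.size() + " after closure");
    }
}
```

So a group of n mutually equal resources goes from n - 1 asserted
statements to n * n inferred ones. That would matter if something in the
pipeline were applying OWL rules, but it should not apply to a pure RDFS
model.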

Do you have a complete minimal example we could take a look at?

[I realize you've switched approach but I'd like to understand why RDFS
reasoning might blow up in this case.]

Dave


