On Fri, Dec 31, 2010 at 11:27 AM, Dave Reynolds
<[email protected]> wrote:
> On Fri, 2010-12-31 at 08:57 -0500, Benson Margulies wrote:
>> Step 1:
>>
>> Model schema = ModelFactory.createDefaultModel();
>> schema.read(RdfUtils.getJugOntology(),
>> RdfUtils.getJugOntologyUri(), "RDF/XML");
>> return ModelFactory.createRDFSModel(schema, data);
>
> What's in the data?
A typical item:
<uri:jug:0618936a7a03bf236a291bcddbfde63b#e11>
    <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> rex:Person ;
    rex:hasEntityDetectionSource "statistical" ;
    rex:hasNormalizedText "Obama" ;
    rex:hasOriginalText "Obama" ;
    rex:root "true" ;
    owl:sameAs <uri:jug:0618936a7a03bf236a291bcddbfde63b#e43> ;
    owl:sameAs <uri:jug:0618936a7a03bf236a291bcddbfde63b#e25> ;
    owl:sameAs <uri:jug:0618936a7a03bf236a291bcddbfde63b#e8> ;
    owl:sameAs <uri:jug:0618936a7a03bf236a291bcddbfde63b#e104> ;
    owl:sameAs <uri:jug:0618936a7a03bf236a291bcddbfde63b#e4> ;
    owl:sameAs <uri:jug:0618936a7a03bf236a291bcddbfde63b#e18> ;
    owl:sameAs <uri:jug:0618936a7a03bf236a291bcddbfde63b#e56> ;
    owl:sameAs <uri:jug:0618936a7a03bf236a291bcddbfde63b#e115> ;
    owl:sameAs <uri:jug:0618936a7a03bf236a291bcddbfde63b#e100> .
>
>> Step 2: about 50k tuples, many of them owl:sameAs
>
> What is 50k tuples, the data, the schema, both, something else?
The data. The schema is tiny.
>
>> Step 3:
>>
>> NodeIterator sameAsItems = model.listObjectsOfProperty(root,
>> relatingProp); // prop is in fact owl:sameAs
>> while (sameAsItems.hasNext()) {
>> ...
>> }
>>
>> Runs for a very long time, using a very large amount of memory.
>> Eventually runs out of memory.
>
> Strange. The owl:sameAs reasoning can be hugely expensive (it is
> fundamentally exponential) but the RDFS reasoner knows nothing about
> owl:sameAs so isn't doing any of that reasoning.
Whenever I interrupt it in Eclipse, it is deep in the reasoner, and it
stays there until it runs out of memory and dies.
I've removed all use of reasoning.
To forestall, I've also concluded that the data model I've got with
items in RDF for all the entity refs is not viable, and I'll be
changing it.
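
For anyone who lands on this thread with the same problem: one way to get at groups of coreferent entity refs without any reasoner is to compute the equivalence classes directly from the sameAs pairs. The following is only a minimal sketch in plain Java, assuming the pairs can be read out of the data as pairs of identifiers. The class SameAsGroups and its methods are hypothetical names made up for illustration; they are not part of Jena, and this is not necessarily what the revised data model will look like.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: groups identifiers linked by sameAs-style pairs (union-find). */
public class SameAsGroups {
    // Maps each identifier to its parent in the union-find forest.
    private final Map<String, String> parent = new HashMap<>();

    /** Returns the representative of id's group, registering id if it is new. */
    public String find(String id) {
        parent.putIfAbsent(id, id);
        String root = id;
        while (!parent.get(root).equals(root)) {
            root = parent.get(root);
        }
        // Path compression: point every node on the walked path at the root.
        String cur = id;
        while (!cur.equals(root)) {
            String next = parent.get(cur);
            parent.put(cur, root);
            cur = next;
        }
        return root;
    }

    /** Records that a and b refer to the same entity. */
    public void sameAs(String a, String b) {
        String ra = find(a);
        String rb = find(b);
        if (!ra.equals(rb)) {
            parent.put(ra, rb);
        }
    }

    /** Returns every group of identifiers, keyed by the group's representative. */
    public Map<String, List<String>> groups() {
        Map<String, List<String>> out = new TreeMap<>();
        // Iterate over a copy of the keys because find() rewrites parent links.
        for (String id : new ArrayList<>(parent.keySet())) {
            out.computeIfAbsent(find(id), k -> new ArrayList<>()).add(id);
        }
        return out;
    }

    public static void main(String[] args) {
        SameAsGroups g = new SameAsGroups();
        // Local names from the sample item; full URIs would work the same way.
        for (String other : new String[] {"e43", "e25", "e8"}) {
            g.sameAs("e11", other);
        }
        g.sameAs("e200", "e201"); // hypothetical, unrelated pair
        System.out.println(g.groups().size() + " groups");
    }
}
```

Each pair is processed once and only one parent link per identifier is stored, so memory grows with the number of distinct identifiers rather than with the number of pairwise sameAs statements a closure would produce.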