RE: processing .rdf files for specific property types only

Gunaratna, Dalkandura Arachchige Kalpa Shashika Silva Sun, 22 Apr 2012 19:19:40 -0700

Hi Stephen,
   Will it increase the efficiency (speed) of processing? In your code,
 
            if (OWL.sameAs.asNode().equals(t.getPredicate()))
            {
                // You can either do something immediately with this triple,
                // or stick it in a HashSet to enforce uniqueness
                sameAsTriples.add(t);
            }


you compare every statement in the model by reading each line in the file, 
which is what I tried to do earlier, as follows:

String predicate = st.getPredicate().getURI().toLowerCase();
if (predicate.contains("owl#sameas"))
{
    // do something to get the list of sameAs links
}
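
For comparison, here is a minimal sketch of how the two predicate tests differ. It uses plain strings in place of Jena objects, so the class and method names (SameAsPredicateCheck, looseMatch, exactMatch) and the example.org URI are made up for illustration. The substring test builds a new lowercased string for every statement and also accepts unrelated URIs that merely contain "owl#sameas"; the exact comparison does neither.

```java
// Sketch only: plain strings stand in for predicate URIs; no Jena classes are used.
final class SameAsPredicateCheck {
    // The owl:sameAs URI, as quoted in the original question.
    static final String OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs";

    // Substring test on the lowercased URI, as in the snippet above.
    // toLowerCase() allocates a new string on every call.
    static boolean looseMatch(String predicateUri) {
        return predicateUri.toLowerCase().contains("owl#sameas");
    }

    // Exact comparison against the full URI: no allocation, no false positives.
    static boolean exactMatch(String predicateUri) {
        return OWL_SAME_AS.equals(predicateUri);
    }

    public static void main(String[] args) {
        // Hypothetical URI that is NOT owl:sameAs but still contains "owl#sameas".
        String lookalike = "http://example.org/myowl#sameAsOld";

        System.out.println("real URI:  loose=" + looseMatch(OWL_SAME_AS)
                + " exact=" + exactMatch(OWL_SAME_AS));
        System.out.println("lookalike: loose=" + looseMatch(lookalike)
                + " exact=" + exactMatch(lookalike));
    }
}
```

Both tests still look at every statement once, so any speed-up from streaming comes from not building the in-memory model, not from the comparison itself.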

Thank you.

________________________________________
From: Stephen Allen [sal...@apache.org]
Sent: Sunday, April 22, 2012 10:05 PM
To: jena-users@incubator.apache.org
Subject: Re: processing .rdf files for specific property types only

On Sun, Apr 22, 2012 at 6:17 PM, Gunaratna, Dalkandura Arachchige
Kalpa Shashika Silva <gunaratn...@wright.edu> wrote:
> Hi,
>   I have a simple requirement: to read the 
> <http://www.w3.org/2002/07/owl#sameAs> object values (sameAs link values) in 
> an RDF file. For that I create an ontology model and read the whole file. 
> The following is a code sample I use for that.
>
> model = ModelFactory.createOntologyModel(OntModelSpec.RDFS_MEM);
> SysRIOT.wireIntoJena();
> model.read(url);
> StmtIterator stmtItr = model.listStatements();
>
> This approach adds a huge processing overhead to my program, since for 
> every RDF file I have to read the whole file just to get the sameAs links. Is 
> there any other way of doing this kind of work, or is reading the whole file 
> the only way to get the specific property type we want?
>
> Also, what happens if we do not call model.close() at the end? Will it 
> cause an out-of-heap-space problem?
>
> Thank you,
> Kalpa

Hi Kalpa,

If you are simply interested in parsing an RDF file in a streaming
fashion, you can do something like below.  If you know that you don't
have any duplicate triples, then you can eliminate the HashSet.

    final Set<Triple> sameAsTriples = new HashSet<Triple>();
    Sink<Triple> sink = new Sink<Triple>()
    {
        @Override
        public void send(Triple t)
        {
            if (OWL.sameAs.asNode().equals(t.getPredicate()))
            {
                // You can either do something immediately with this triple,
                // or stick it in a HashSet to enforce uniqueness
                sameAsTriples.add(t);
            }
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    };

    // To enable RDFS inferencing uncomment the following two lines.
    // You need to have your T-Box (ontology) loaded into some model
    //Model ontologyModel = ...
    //sink = InfFactory.infTriples(sink, ontologyModel);

    String filename = ...
    RiotReader.parseTriples(new FileInputStream(filename),
            Lang.guess(filename), null, sink);

    // Now do something with sameAsTriples
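
The same stream, filter and de-duplicate pattern can be sketched without any Jena classes. The version below is not an RDF parser and is not the approach from the reply above: it assumes simple N-Triples input with one triple per line and no whitespace inside the subject or predicate, and the class and method names (SameAsLineFilter, collectSameAs) are made up for illustration. Lines are read lazily, only owl:sameAs lines are kept, and a set drops exact duplicate lines.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.Set;

// Sketch only: a line-based filter for simple N-Triples text, not a real RDF parser.
final class SameAsLineFilter {
    // owl:sameAs written as an N-Triples IRI term.
    static final String SAME_AS_TERM = "<http://www.w3.org/2002/07/owl#sameAs>";

    // Reads lines lazily and keeps only those whose second term is owl:sameAs.
    // The set removes lines that are character-for-character duplicates.
    static Set<String> collectSameAs(BufferedReader in) {
        Set<String> result = new LinkedHashSet<>();
        in.lines().forEach(raw -> {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                return; // skip blank lines and comments
            }
            // Assumption: subject and predicate contain no whitespace.
            String[] parts = line.split("\\s+", 3);
            if (parts.length == 3 && SAME_AS_TERM.equals(parts[1])) {
                result.add(line);
            }
        });
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data: two identical sameAs triples and one unrelated triple.
        String data =
                "<http://example.org/a> " + SAME_AS_TERM + " <http://example.org/b> .\n"
              + "<http://example.org/a> <http://example.org/p> <http://example.org/c> .\n"
              + "<http://example.org/a> " + SAME_AS_TERM + " <http://example.org/b> .\n";

        Set<String> sameAs = collectSameAs(new BufferedReader(new StringReader(data)));
        sameAs.forEach(System.out::println);
    }
}
```

For RDF/XML or any input that is not line-oriented, a real parser such as the one used in the reply is still needed; this sketch only shows that nothing beyond the matching triples has to be held in memory.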
