Yes, that is true, but I had a problem calling this function:

RiotReader.parseTriples(is, Lang.RDFXML, null, sink);

I passed Lang.RDFXML instead of your suggestion, Lang.guess(filename), and then
the processing seems to take forever. Maybe I have done something wrong here.
The full code I used for testing follows.

        URL dataURL = new URL(url);
        URLConnection conn = dataURL.openConnection();
        InputStream is = conn.getInputStream();
        BufferedReader in = new BufferedReader(new InputStreamReader(is));

        final Set<Triple> sameAsTriples = new HashSet<Triple>();
        Sink<Triple> sink = new Sink<Triple>()
        {
            @Override
            public void send(Triple t)
            {
                System.out.println("##");
                if (OWL.sameAs.asNode().equals(t.getPredicate()))
                {
                    // You can either do something immediately with this
                    // triple, or stick it in a HashSet to enforce uniqueness
                    sameAsTriples.add(t);
                }
            }

            @Override
            public void flush() { }

            @Override
            public void close() { }
        };

        // To enable RDFS inferencing uncomment the following two lines.
        // You need to have your T-Box (ontology) loaded into some model
        //Model ontologyModel = ...
        //sink = InfFactory.infTriples(sink, ontologyModel);

        //String filename = ...
        //RiotReader.parseTriples(new FileInputStream(filename),
        //        Lang.guess(filename), null, sink);

        RiotReader.parseTriples(is, Lang.RDFXML, null, sink);

        in.close();


I did not even see the "##" output that I added to check whether the program is
parsing the RDF file. Is there anything I missed? The program does not seem to
work.
________________________________________
From: Stephen Allen [sal...@apache.org]
Sent: Monday, April 23, 2012 5:38 PM
To: jena-users@incubator.apache.org
Subject: Re: processing .rdf files for specific property types only

You can use any InputStream.  To get one for a URL, try something like this:

   URL url = new URL(urlString);
   InputStream in = url.openConnection().getInputStream();
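
If the read then appears to hang, or nothing reaches the sink, two things may be worth ruling out. These are guesses, not a confirmed diagnosis: a URLConnection has no connect or read timeout by default, and some servers pick the response format from the Accept header, so they may not return RDF/XML unless asked for it. Below is a minimal sketch using only the JDK. The class RdfUrlStream and its method names are made up for illustration and are not part of Jena.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper (not a Jena class): opens a URL for RDF/XML parsing.
public class RdfUrlStream {

    // Configures the connection but does not connect yet.
    public static URLConnection prepare(String urlString, int timeoutMillis)
            throws IOException {
        URLConnection conn = new URL(urlString).openConnection();
        // Ask explicitly for RDF/XML in case the server negotiates content.
        conn.setRequestProperty("Accept", "application/rdf+xml");
        // The default timeout is 0 (wait indefinitely), so set both.
        conn.setConnectTimeout(timeoutMillis);
        conn.setReadTimeout(timeoutMillis);
        return conn;
    }

    // Connects and returns the response body as a stream.
    public static InputStream open(String urlString, int timeoutMillis)
            throws IOException {
        return prepare(urlString, timeoutMillis).getInputStream();
    }
}
```

The stream returned by open(...) can be passed to RiotReader.parseTriples(...) in place of the FileInputStream. With the timeouts set, a stalled server should produce a SocketTimeoutException instead of an indefinite wait.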




On Mon, Apr 23, 2012 at 9:34 AM, Gunaratna, Dalkandura Arachchige
Kalpa Shashika Silva <gunaratn...@wright.edu> wrote:
> Stephen, I have a question about how to run the code.
> When I want to read an RDF file, I just take the URL of the RDF file, create
> an OntModel, and read it. For the model, I only need to give the URL. But
> the code you have suggested needs a (local) filename. In this case, can
> we do the same as I did previously? For example, I just use URL strings such
> as the following in my code.
>
> http://rdf.freebase.com/ns/m/067n4r
> http://rdf.freebase.com/ns/en.mountain_view
> http://dbpedia.org/resource/Mountain_View,_California
>
> I do not download the RDF files in my code as of now, and I do not know
> whether Jena downloads the files when a URL string is given to the model to
> read. Any help will be greatly appreciated. Thank you.
>
>
> ________________________________________
> From: Stephen Allen [sal...@apache.org]
> Sent: Monday, April 23, 2012 2:43 AM
> To: jena-users@incubator.apache.org
> Subject: Re: processing .rdf files for specific property types only
>
> Yes, we moved to the Apache community about a year ago.  The latest
> release version of ARQ is 2.9.0, and the latest of Jena Core is 2.7.0.
>  You can download them from the Apache distribution site [1], which is
> linked to by [2].
>
> -Stephen
>
> [1] http://www.apache.org/dist/incubator/jena/
> [2] http://incubator.apache.org/jena/download/index.html
>
>
> On Sun, Apr 22, 2012 at 10:38 PM, Gunaratna, Dalkandura Arachchige
> Kalpa Shashika Silva <gunaratn...@wright.edu> wrote:
>> One follow-up question. I am using the ARQ 2.8.5 distribution and its
>> contents (Jena packages). The class Triple does not seem to work with that
>> distribution, and I just downloaded 2.8.6 from the SourceForge release dated
>> 2011-04-21. Is there any newer package available for this purpose, or any
>> newer distribution available other than on the SourceForge site? Thank you.
>> ________________________________________
>> From: Stephen Allen [sal...@apache.org]
>> Sent: Sunday, April 22, 2012 10:38 PM
>> To: jena-users@incubator.apache.org
>> Subject: Re: processing .rdf files for specific property types only
>>
>> I'm not sure I understand your question.  The code I posted will read
>> the file in a single pass, and filter it down to only statements that
>> contain the owl:sameAs resource in the predicate position.  This is
>> about the fastest way you can parse your RDF.  It will also use a lot
>> less memory than storing it in an in-memory model, as it works in a
>> streaming fashion.  Also, if you don't need RDFS inferencing don't
>> include it as it adds overhead.
>>
>> Try it out with your code, and see what the performance difference is.
>>
>> As a side note, the comparison in your if statement will be a little
>> slower than mine since you are using String.contains(), and
>> potentially incorrect if some other predicate had the string
>> "owl#sameAs" in it, but wasn't the full
>> "http://www.w3.org/2002/07/owl#sameAs".
>>
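
To make the false-positive risk concrete, here is a small JDK-only sketch comparing the lowercase substring test with an exact URI comparison. The class name PredicateMatch and the example.org predicate are invented for illustration. The real code compares Jena nodes, not strings.

```java
// Hypothetical illustration (plain strings stand in for Jena nodes).
public class PredicateMatch {

    public static final String OWL_SAME_AS =
            "http://www.w3.org/2002/07/owl#sameAs";

    // Substring test: lowercases the URI and looks for "owl#sameas".
    public static boolean looseMatch(String predicateUri) {
        return predicateUri.toLowerCase().contains("owl#sameas");
    }

    // Exact test: true only for the full owl:sameAs URI.
    public static boolean exactMatch(String predicateUri) {
        return OWL_SAME_AS.equals(predicateUri);
    }

    public static void main(String[] args) {
        // Made-up predicate that merely contains the substring.
        String other = "http://example.org/my-owl#sameAsBefore";
        System.out.println(looseMatch(other));       // true (false positive)
        System.out.println(exactMatch(other));       // false
        System.out.println(exactMatch(OWL_SAME_AS)); // true
    }
}
```

The exact comparison also skips the toLowerCase() copy and the substring scan for each statement.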
>> -Stephen
>>
>> On Sun, Apr 22, 2012 at 7:19 PM, Gunaratna, Dalkandura Arachchige
>> Kalpa Shashika Silva <gunaratn...@wright.edu> wrote:
>>> Hi Stephen,
>>>   Will it increase the efficiency (speed) of processing? In your code,
>>>
>>>            if (OWL.sameAs.asNode().equals(t.getPredicate()))
>>>            {
>>>                // You can either do something immediately with this
>>>                // triple, or stick it in a HashSet to enforce uniqueness
>>>                sameAsTriples.add(t);
>>>            }
>>>
>>> you compare every statement in the model by reading each line in the file,
>>> just as I tried to do earlier, as follows:
>>>
>>> String predicate = st.getPredicate().getURI().toLowerCase();
>>> if (predicate.contains("owl#sameas"))
>>> {
>>>     // do something to get the list of sameAs links
>>> }
>>>
>>> Thank you.
>>>
>>> ________________________________________
>>> From: Stephen Allen [sal...@apache.org]
>>> Sent: Sunday, April 22, 2012 10:05 PM
>>> To: jena-users@incubator.apache.org
>>> Subject: Re: processing .rdf files for specific property types only
>>>
>>> On Sun, Apr 22, 2012 at 6:17 PM, Gunaratna, Dalkandura Arachchige
>>> Kalpa Shashika Silva <gunaratn...@wright.edu> wrote:
>>>> Hi,
>>>>   I have a simple requirement: to read the
>>>> <http://www.w3.org/2002/07/owl#sameAs> object values (sameAs link values)
>>>> in an RDF file. For that I create an ontology model and read the whole
>>>> file. Following is a code sample I use for that.
>>>>
>>>> model=ModelFactory.createOntologyModel(OntModelSpec.RDFS_MEM);
>>>> SysRIOT.wireIntoJena() ;
>>>> model.read(url);
>>>> StmtIterator stmtItr=model.listStatements();
>>>>
>>>> This way of processing has a huge overhead for my program, since for
>>>> every RDF file I need to read the whole file just to get the sameAs links.
>>>> Is there any other way of doing this kind of work, or is reading the whole
>>>> file the only way to get the specific property type we want?
>>>>
>>>> Also, what happens if we do not call model.close() at the end? Will it
>>>> cause an out-of-heap-space problem?
>>>>
>>>> Thank you,
>>>> Kalpa
>>>
>>> Hi Kalpa,
>>>
>>> If you are simply interested in parsing an RDF file in a streaming
>>> fashion, you can do something like below.  If you know that you don't
>>> have any duplicate triples, then you can eliminate the HashSet.
>>>
>>>    final Set<Triple> sameAsTriples = new HashSet<Triple>();
>>>    Sink<Triple> sink = new Sink<Triple>()
>>>    {
>>>        @Override
>>>        public void send(Triple t)
>>>        {
>>>            if (OWL.sameAs.asNode().equals(t.getPredicate()))
>>>            {
>>>                // You can either do something immediately with this
>>>                // triple, or stick it in a HashSet to enforce uniqueness
>>>                sameAsTriples.add(t);
>>>            }
>>>        }
>>>
>>>        @Override
>>>        public void flush() { }
>>>
>>>        @Override
>>>        public void close() { }
>>>    };
>>>
>>>    // To enable RDFS inferencing uncomment the following two lines.
>>>    // You need to have your T-Box (ontology) loaded into some model
>>>    //Model ontologyModel = ...
>>>    //sink = InfFactory.infTriples(sink, ontologyModel);
>>>
>>>    String filename = ...
>>>    RiotReader.parseTriples(new FileInputStream(filename),
>>>            Lang.guess(filename), null, sink);
>>>
>>>    // Now do something with sameAsTriples
>>>
>>>
>>
>>
>
>

