Re: Streaming a ResultSet as RDF using a custom vocabulary

Enrico Daga (enridaga) Fri, 16 Oct 2015 09:32:19 -0700

Thank you for the insight and the suggestion about compacting the code.

About streaming block formats: keeping track of blocks in memory should not be 
a problem for my use case, as I expect the number of columns in select queries 
to be small.
As far as I know, none of the RDF formats really requires loading *all* the 
data in memory (you can define local prefixes in XML, and repeat them in 
Turtle, if you really want them).
But you are right that in general these are bad formats for large data 
streams, as they end up being very verbose.


However, Jena does not seem to support streaming for some of the RDF 
serializations, namely the XML and JSON formats. Asking for one of them 
results in, for example, 
org.apache.jena.riot.RiotException: No serialization for language 
Lang:rdf/null. Is this right, or am I missing something? 
I would really like this same code to support all available serialisation 
formats!
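
One possible way to cover every format is sketched below: stream when RIOT has a stream writer registered for the requested language, and otherwise buffer the triples in an in-memory graph and write that out at the end. This is only a sketch under assumptions. It uses the Jena 3 package names (in Jena 2.x the graph classes live under com.hp.hpl.jena), and the class name StreamOrBuffer and the example.org URIs are made up for illustration. The fallback branch gives up the memory guarantee for the non-streaming formats.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.util.Collections;
import java.util.Iterator;

import org.apache.jena.graph.Graph;
import org.apache.jena.graph.NodeFactory;
import org.apache.jena.graph.Triple;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.riot.system.StreamRDF;
import org.apache.jena.riot.system.StreamRDFLib;
import org.apache.jena.riot.system.StreamRDFWriter;
import org.apache.jena.sparql.graph.GraphFactory;

// Hypothetical helper, not part of Jena.
public class StreamOrBuffer {

    /**
     * Sends the triples to out in the given language.
     * Returns true if they were streamed, false if they were buffered first.
     */
    public static boolean write(Iterator<Triple> triples, OutputStream out, Lang lang) {
        // True only for languages that have a streaming writer registered.
        boolean streaming = StreamRDFWriter.registered(lang);

        // Fallback: collect into an in-memory graph (this costs memory).
        Graph buffer = streaming ? null : GraphFactory.createDefaultGraph();
        StreamRDF sink = streaming
                ? StreamRDFWriter.getWriterStream(out, lang)
                : StreamRDFLib.graph(buffer);

        sink.start();
        while (triples.hasNext()) {
            sink.triple(triples.next());
        }
        sink.finish();

        if (!streaming) {
            // Serialise the buffered graph with the ordinary (non-streaming) writer.
            RDFDataMgr.write(out, buffer, lang);
        }
        return streaming;
    }

    public static void main(String[] args) {
        // One made-up triple standing in for a row adapted from a ResultSet.
        Triple t = Triple.create(
                NodeFactory.createURI("http://example.org/row/1"),
                NodeFactory.createURI("http://example.org/vocab#name"),
                NodeFactory.createLiteral("Alice"));

        for (Lang lang : new Lang[] { Lang.TURTLE, Lang.RDFXML }) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            boolean streamed = write(Collections.singletonList(t).iterator(), out, lang);
            System.out.println(lang.getName() + " streamed=" + streamed);
        }
    }
}
```

In real use the iterator would wrap the ResultSet and produce the custom-vocabulary triples row by row, so only the fallback formats would hold the whole result in memory.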

Thanks,
Enrico




> On 15 Oct 2015, at 17:34, A. Soroka <[email protected]> wrote:
> 
> I just re-read your message more carefully and realized that you are using a 
> version of Jena <3. In this case, I believe you will want to use, instead of 
> the type Function<>, the older type Map1<> if you want to use my suggestion. 
> I am sorry for any confusion.
> 
> ---
> A. Soroka
> The University of Virginia Library
> 
>> On Oct 15, 2015, at 12:00 PM, Enrico Daga (enridaga) <[email protected]> 
>> wrote:
>> 
>> Thank you for your reply.
>> Actually the problem is not really about the representation - for example I 
>> might use the DataCube vocabulary - but is more about how to use the Jena 
>> serialisers to stream custom triples adapted from a ResultSet efficiently.
>> The ResultSetFormatter.toModel approach is not the one I like, as it 
>> requires the RDF to be generated in memory before serialisation. 
>> I posted my solution to SO: 
>> http://stackoverflow.com/questions/33136916/streaming-a-resultset-as-rdf-using-a-custom-vocabulary/33153024#33153024
>> (Are there better ways of doing that?)
>> 
>> However, it looks like the streaming features do not support all RDF 
>> syntaxes, as I get a RIOT exception when I ask for the RDF/XML or RDF/JSON 
>> formats.
>> So now my problem is how to support all serialisations.
>> Or maybe my version of Jena is outdated (2.12.1) and I should use Jena 3?
>> 
>> Thanks,
>> 
>> Enrico
>> 
>> 
>>> On 14 Oct 2015, at 18:53, A. Soroka <[email protected]> wrote:
>>> 
>>> Perhaps you could say more about the representation you want to use? 
>>> ResultSetFormatter does feature methods that (to my understanding) do 
>>> stream using Jena serialization:
>>> 
>>> https://jena.apache.org/documentation/javadoc/arq/org/apache/jena/query/ResultSetFormatter.html#output-java.io.OutputStream-org.apache.jena.query.ResultSet-org.apache.jena.sparql.resultset.ResultsFormat-
>>> 
>>> ---
>>> A. Soroka
>>> The University of Virginia Library
>>> 
>>>> On Oct 14, 2015, at 6:39 PM, Enrico Daga (enridaga) <[email protected]> 
>>>> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> in my use case I need to stream a ResultSet obtained from a query to a 
>>>> remote endpoint converted into an RDF output format.
>>>> I know Jena provides a ResultSetFormatter.toModel facility for that, 
>>>> however I have the following constraints:
>>>> - I want to use a different representation/vocabulary and not the one 
>>>> provided by Jena, and
>>>> - I don't want to load the data in memory. In other words, I don't want 
>>>> to create a Model and fill it with the ResultSet, but rather stream out 
>>>> the triples while I iterate over it, to control memory consumption.
>>>> - I still want to benefit from the Jena serializers.
>>>> 
>>>> I have seen the StreamRDF interface, but I am not very clear about how to 
>>>> use it effectively.
>>>> What could be a correct approach in this scenario?
>>>> 
>>>> Thank you,
>>>> 
>>>> Enrico
>>>> 
>>>> —
>>>> Enrico Daga (enridaga)
>>>> http://www.enridaga.net
>>>> The Buddha is in the park.
>> 
>
