[ 
https://issues.apache.org/jira/browse/JENA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398891#comment-13398891
 ] 

Andy Seaborne commented on JENA-267:
------------------------------------

Tricky point. 

If the objective of streaming is efficiency/low latency, not handling very large 
result sets (a significant proportion of heap), then stream-if-you-can, 
buffer-if-you-can't is a good solution.

For very large results, to keep a perfect contract, if the header is not 
available first, stream for a bit to see if you see the end, but if it gets too 
big, flush to disk (org.openjena.atlas.data.DataBag) then stream from disk 
after the headers arrive.
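
A minimal sketch of the "buffer for a bit, then flush to disk" idea follows. It does not use the real org.openjena.atlas.data.DataBag API. SpillingRowBuffer, its methods, and the threshold parameter are hypothetical names invented for illustration, and rows are assumed to be single-line strings so that a plain line-oriented temp file is enough.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not part of Jena): keeps rows in memory until a
 * limit is reached, then spills everything to a temporary file.
 * Assumes each row is a single line of text with no embedded newlines.
 */
public class SpillingRowBuffer implements AutoCloseable {
    private final int maxInMemory;
    private final List<String> memory = new ArrayList<>();
    private Path spillFile;           // null until the first spill
    private BufferedWriter spillOut;  // null until the first spill

    public SpillingRowBuffer(int maxInMemory) {
        this.maxInMemory = maxInMemory;
    }

    /** Buffer a row that arrived before the header was seen. */
    public void add(String row) throws IOException {
        if (spillOut == null && memory.size() < maxInMemory) {
            memory.add(row);
            return;
        }
        if (spillOut == null) {
            // Too big for the heap budget: move buffered rows to disk.
            spillFile = Files.createTempFile("rows", ".tmp");
            spillOut = Files.newBufferedWriter(spillFile, StandardCharsets.UTF_8);
            for (String r : memory) {
                writeLine(r);
            }
            memory.clear();
        }
        writeLine(row);
    }

    private void writeLine(String row) throws IOException {
        spillOut.write(row);
        spillOut.newLine();
    }

    public boolean spilled() {
        return spillFile != null;
    }

    /** Once the header is known, replay the rows in arrival order. */
    public void replay(Consumer<String> sink) throws IOException {
        if (spillOut == null) {
            memory.forEach(sink);
            return;
        }
        spillOut.flush();
        try (BufferedReader in = Files.newBufferedReader(spillFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                sink.accept(line);
            }
        }
    }

    @Override
    public void close() throws IOException {
        if (spillOut != null) {
            spillOut.close();
            Files.deleteIfExists(spillFile);
        }
    }
}
```

A parser could call add() for each binding it reads before the header, then call replay() once the header has been parsed. Small results would never touch the disk under this scheme.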

To weaken the contract, ignore the lack of header and stream parse (and lie for 
ResultSet.getResultVars). Even if we had a clean slate, the utility of 
getResultVars for formatting results is quite high so the best weakening of the 
contract I can think of is to add something like "it's the result vars if 
available".  But at such scale, switching to TSV is probably acceptable.

It does suggest JSON results are best for small and mid-range result sets.
                
> SPARQL JSON Parser is not streaming
> -----------------------------------
>
>                 Key: JENA-267
>                 URL: https://issues.apache.org/jira/browse/JENA-267
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: ARQ
>    Affects Versions: ARQ 2.9.1
>            Reporter: Rob Vesse
>            Assignee: Rob Vesse
>              Labels: json, sparql
>
> The current SPARQL JSON parser in ARQ reads the entire response into an 
> in-memory data structure and then iterates over that.  So for large results 
> it is trivial to hit an OutOfMemory exception.
> This task is to reimplement the SPARQL JSON parser to be fully streaming in 
> its implementation, in line with all the other results parsers.


        
