[
https://issues.apache.org/jira/browse/JENA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398987#comment-13398987
]
Rob Vesse commented on JENA-267:
--------------------------------
Using a DataBag is definitely a possibility. Having got as far as looking at
that part of things, the main issue I see is that we may need to keep the
results that we read in sorted order (because they may be the results of an
ordered query).
I guess in theory, even if you haven't seen the header, provided you've seen a
few results you should know what the variables used are (or have a decent guess
at them). So if you don't see the header first, you could just buffer the first
1000 or so results and extrapolate the variables from there. Given this, a
DataBag may actually be overkill.
I'll carry on experimenting with this.
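The buffering idea in the comment above can be sketched roughly as follows. This is only an illustration, written in Python for brevity rather than in ARQ's Java, and it is not the parser's actual code: the `infer_variables` function, the `Row` shape (a dict from variable name to value) and the `sample_size` parameter are all hypothetical names introduced here. It buffers a bounded prefix of the results, guesses the variables from that prefix, and then replays the buffered rows ahead of the rest of the stream so the original order is kept.

```python
from itertools import chain, islice
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical row shape for this sketch: variable name -> bound value.
Row = Dict[str, str]


def infer_variables(
    rows: Iterable[Row], sample_size: int = 1000
) -> Tuple[List[str], Iterator[Row]]:
    """Buffer up to sample_size rows and guess the variable names from them.

    Returns the guessed variables (in first-seen order) and an iterator that
    yields every row, buffered ones first, in the original order.
    """
    it = iter(rows)
    buffered = list(islice(it, sample_size))

    variables: List[str] = []
    for row in buffered:
        for name in row:
            if name not in variables:
                variables.append(name)

    # Replay the buffered prefix, then continue streaming the remainder.
    return variables, chain(buffered, it)


if __name__ == "__main__":
    results = iter([{"s": "a"}, {"s": "b", "p": "c"}, {"s": "d", "o": "e"}])
    variables, replay = infer_variables(results, sample_size=2)
    print(variables)
    for row in replay:
        print(row)
```

Note that this is only a guess at the variables: one that is first bound after the buffered prefix (such as `o` in the usage above) is missed, which is the risk of extrapolating without the header.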
> SPARQL JSON Parser is not streaming
> -----------------------------------
>
> Key: JENA-267
> URL: https://issues.apache.org/jira/browse/JENA-267
> Project: Apache Jena
> Issue Type: Improvement
> Components: ARQ
> Affects Versions: ARQ 2.9.1
> Reporter: Rob Vesse
> Assignee: Rob Vesse
> Labels: json, sparql
>
> The current SPARQL JSON parser in ARQ parses the JSON by reading the entire
> response into an in-memory data structure and then iterating over that, so
> for large results it is trivial to hit an OutOfMemoryError.
> This task is to reimplement the SPARQL JSON parser to be fully streaming in
> its implementation, in line with all the other results parsers.