[ 
https://issues.apache.org/jira/browse/JENA-329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872146#comment-15872146
 ] 

Andy Seaborne edited comment on JENA-329 at 2/17/17 5:22 PM:
-------------------------------------------------------------

The duplicate removal could go into {{QueryExecutionBase.execConstructTriples}}, 
{{execConstructQuads}}.  There is no contract to say duplicates are delivered.
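
As a rough illustration of where such a filter could sit, the sketch below wraps any iterator and drops elements it has already emitted. It is plain Java with no Jena dependency: {{DistinctIterator}} and {{toList}} are hypothetical names, and the unbounded {{HashSet}} is an assumption made for brevity, so memory grows with the number of distinct elements.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;

/** Hypothetical helper, not part of Jena: streams elements, dropping exact duplicates. */
class DistinctIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final Set<T> seen = new HashSet<>(); // assumption: unbounded memory is acceptable
    private T pending;
    private boolean ready = false;

    DistinctIterator(Iterator<T> source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        // Pull from the source until an unseen element turns up or the source ends.
        while (!ready && source.hasNext()) {
            T candidate = source.next();
            if (seen.add(candidate)) { // add() returns false for a duplicate
                pending = candidate;
                ready = true;
            }
        }
        return ready;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        ready = false;
        T result = pending;
        pending = null;
        return result;
    }

    /** Demonstration only: drain an iterator into a list. */
    static <T> List<T> toList(Iterator<T> it) {
        List<T> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }
}
```

With triples, the element type would need value-based {{equals}}/{{hashCode}} for this to work; the sketch only assumes that of {{T}}.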

It could even go into {{TemplateLib.calcTriples}}, {{calcQuads}} (ouch - they 
don't stream), with bonus points for creating the triple/quad.  In the case of 
a template, there is duplication from the row before, let alone a sliding 
window, because not all variables are used in each triple template.  That 
might be more trouble than it's worth.

For Fuseki, reduction in bytes transferred is going to be beneficial.

Slight digression: in {{FactoryRDFCaching}} the code does a similar sliding 
window via a Google Guava cache. The performance impact was more than I 
expected.  I wonder if there is a more efficient way to do a non-thread-safe 
sliding LRU window.
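
One candidate, sketched here only as an assumption to measure against rather than a known improvement, is an access-ordered {{java.util.LinkedHashMap}} with {{removeEldestEntry}} overridden. It is single-threaded by construction. {{LruWindow}} and {{firstSighting}} are hypothetical names, and this is not how {{FactoryRDFCaching}} is implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical single-threaded LRU window; no synchronization of any kind. */
class LruWindow<T> {
    private final LinkedHashMap<T, Boolean> window;

    LruWindow(final int capacity) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.window = new LinkedHashMap<T, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<T, Boolean> eldest) {
                // Called after each insertion; evict once the window is over capacity.
                return size() > capacity;
            }
        };
    }

    /**
     * Records the item and reports whether it was absent from the window.
     * Returns false for a recent duplicate, which also refreshes its position.
     */
    boolean firstSighting(T item) {
        return window.put(item, Boolean.TRUE) == null;
    }
}
```

A duplicate that falls outside the window is reported as new again, so this reduces duplicates rather than guaranteeing distinct output. Whether it actually beats the Guava cache would need benchmarking.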



was (Author: andy.seaborne):
This could go into {{QueryExecutionBase.execConstructTriples}}, 
{{execConstructQuads}}.  There is no contract to say duplicates are delivered.

Even in {{TemplateLib.calcTriples}}, {{calcQuads}} (ouch - they don't stream), 
with bonus points for creating the triple/quad.  In the case of a template, 
there is duplication from the row before, let alone a sliding window, because 
not all variables are used in each triple template.  That might be more trouble 
than it's worth.

Digression: in {{FactoryRDFCaching}} the code does a similar sliding window via 
a Google Guava cache. The performance impact was more than I expected.  I 
wonder if there is a more efficient way to do a non-thread-safe sliding LRU 
window.


> Add streaming CONSTRUCT results to Fuseki
> -----------------------------------------
>
>                 Key: JENA-329
>                 URL: https://issues.apache.org/jira/browse/JENA-329
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: Fuseki
>            Reporter: Stephen Allen
>
> As a result of JENA-205, streaming results are now available for CONSTRUCT 
> queries.  However, there can be duplicate triples in the iterator.  This task 
> is to allow Fuseki to stream back results, while at the same time performing 
> a distinct operation.
> The fix would be to modify SPARQL_Query to use 
> QueryExecution.execConstructTriples() and filter the results through a 
> DistinctDataNet<Triple> as they are being streamed back to the client.
> This also requires RDFWriter implementations that can accept Iterator<Triple> 
> instead of Model.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
