[ 
https://issues.apache.org/jira/browse/JENA-330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506936#comment-13506936
 ] 

Stephen Allen commented on JENA-330:
------------------------------------

My work is now checked in on the streaming-update branch.

I have addressed all of the points in my comment except for point 1 (the 
out-of-order blank nodes and lists for the syntactic shortcuts).  I would argue 
that this is acceptable, since order should not matter in BGPs.  However, it is 
causing a unit test to fail:

   Running com.hp.hpl.jena.sparql.TC_Scripted
   **** Test: syntax-forms-01.rq
   ** reparsed query hashCode does not equal parsed input query
   (com.hp.hpl.jena.sparql.junit.TestSerialization)

This unit test compares a parsed query with a serialized and reparsed version 
of it.  The serialized form is an expanded version without the syntactic 
shortcuts.  The test fails because the blank nodes have different internal ids 
in the two queries, so Query.hashCode() returns different values and 
Query.equals() returns false.  For the example query in syntax-forms-01.rq, 
here is the internal structure of both the originally parsed query and the 
reparsed version.

Parsed Query:
============

Original:
------
PREFIX : <http://example.org/ns#>
SELECT * WHERE { ( [ ?x ?y ] ) :p ( [ ?pa ?b ] 57 ) }

Internal Rep:
------
{ ??1 ?x ?y .
  ??0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> ??1 .
  ??0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .
  ??3 ?pa ?b .
  ??2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> ??3 .
  ??2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> ??4 .
  ??4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> 57 .
  ??4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .
  ??0 <http://example.org/ns#p> ??2
}

Reparsed Query:
============


{ ??0 ?x ?y .
  ??1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> ??0 .
  ??1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .
  ??2 ?pa ?b .
  ??3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> ??2 .
  ??3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> ??4 .
  ??4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> 57 .
  ??4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .
  ??1 <http://example.org/ns#p> ??3
}
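
The two dumps above list the same triple patterns in the same order and differ 
only in which internal id each blank node received.  As a rough illustration 
(this is not Jena code; the class BlankNodeRelabel and its methods are 
hypothetical, and triples are modelled as plain strings), the sketch below 
relabels blank nodes by order of first appearance and shows that the two 
structures are then equal, while a direct comparison is not:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone sketch; it does not use any Jena classes.
public class BlankNodeRelabel {
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String P = "http://example.org/ns#p";

    // One triple pattern as [subject, predicate, object]; "??n" marks a blank node.
    static List<String> t(String s, String p, String o) {
        return List.of(s, p, o);
    }

    // Internal structure of the originally parsed query, copied from the dump above.
    public static List<List<String>> parsed() {
        return List.of(
            t("??1", "?x", "?y"),
            t("??0", RDF + "first", "??1"),
            t("??0", RDF + "rest", RDF + "nil"),
            t("??3", "?pa", "?b"),
            t("??2", RDF + "first", "??3"),
            t("??2", RDF + "rest", "??4"),
            t("??4", RDF + "first", "57"),
            t("??4", RDF + "rest", RDF + "nil"),
            t("??0", P, "??2"));
    }

    // Internal structure of the reparsed query, copied from the dump above.
    public static List<List<String>> reparsed() {
        return List.of(
            t("??0", "?x", "?y"),
            t("??1", RDF + "first", "??0"),
            t("??1", RDF + "rest", RDF + "nil"),
            t("??2", "?pa", "?b"),
            t("??3", RDF + "first", "??2"),
            t("??3", RDF + "rest", "??4"),
            t("??4", RDF + "first", "57"),
            t("??4", RDF + "rest", RDF + "nil"),
            t("??1", P, "??3"));
    }

    // Rename each blank node to _:b0, _:b1, ... in order of first appearance.
    public static List<List<String>> canonicalize(List<List<String>> triples) {
        Map<String, String> labels = new HashMap<>();
        List<List<String>> out = new ArrayList<>();
        for (List<String> triple : triples) {
            List<String> renamed = new ArrayList<>();
            for (String term : triple) {
                if (term.startsWith("??")) {
                    String label = labels.get(term);
                    if (label == null) {
                        label = "_:b" + labels.size();
                        labels.put(term, label);
                    }
                    renamed.add(label);
                } else {
                    renamed.add(term); // variables, IRIs and literals are kept as-is
                }
            }
            out.add(renamed);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("label-sensitive equal: " + parsed().equals(reparsed()));
        System.out.println("equal after relabelling: "
            + canonicalize(parsed()).equals(canonicalize(reparsed())));
    }
}
```

Relabelling by first appearance only works here because both structures list 
their triples in the same order.  A comparison that has to tolerate reordered 
triples would need a real isomorphism check instead.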


[1] http://markmail.org/message/3aw72tcwdmoxa46b


                
> Streaming support for SPARQL Update queries and streaming support for quads 
> in INSERT DATA / DELETE DATA queries
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: JENA-330
>                 URL: https://issues.apache.org/jira/browse/JENA-330
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: ARQ
>            Reporter: Stephen Allen
>            Assignee: Stephen Allen
>            Priority: Minor
>         Attachments: config-null.ttl, JENA-330_20121016.patch, 
> TestLargeUpdates.java
>
>
> The SPARQL Update parser currently parses all update queries into a single 
> UpdateRequest object which holds them in memory.  Instead the parser should 
> insert queries into something like a Sink<Update>.  Additionally it should 
> put the quads from INSERT_DATA and DELETE_DATA into a Sink<Quad> instead of 
> an ArrayList.
> This should allow the creation of a streaming update parser, which could be 
> combined with JENA-309 to have full streaming into an underlying 
> transactional store and the ability to handle arbitrarily large INSERT_DATA 
> or DELETE_DATA queries (to the limits of the transaction system).

