Hi Stephen,
Adding requirements that the official SPARQL grammar be parsable in
particular ways is asking too much. The goal (SPARQL 1.0 and 1.1) is
that it's LL(1), i.e. technically simple, and that it communicates the
language. People can implement the language in different ways; the
grammar defines the language but does not prescribe the implementation.
There may be other ways to do streaming ... at the moment the parser
builds a syntax tree (good for printing out the request) but it could be
event-generating, without a syntax-tree builder built in. That would be
a non-trivial change.
So what is important to stream?
1/ INSERT DATA (and DELETE DATA)
Maybe the data should not end up in the parser tree but in a data bag?
It can be pulled back in for small requests and printing.
2/ Per operation? It could have a mode to emit each operation as it
goes along.
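To make the "emit each operation as it goes along" idea concrete, here is
a rough Python sketch. It is not how the current parser works. The token
kinds (PREFIX, BASE, OP, ";") and the name stream_operations are made up
for illustration, with OP standing for one complete operation, and it
assumes prologue declarations carry forward to later operations.

```python
def stream_operations(tokens):
    """Yield (declarations, operation) as soon as each operation is seen.

    `tokens` is any iterable of (kind, text) pairs from a hypothetical
    lexer. No request-level syntax tree is built.
    """
    declarations = []       # prologue lines seen so far (assumed to carry forward)
    need_separator = False  # True right after an operation has been emitted
    for kind, text in tokens:
        if kind == ";":
            if not need_separator:
                raise ValueError("';' without a preceding operation")
            need_separator = False
        elif need_separator:
            raise ValueError("expected ';' before " + repr(text))
        elif kind in ("PREFIX", "BASE"):
            declarations.append(text)
        elif kind == "OP":
            # Hand the operation to the caller before reading any further.
            yield tuple(declarations), text
            need_separator = True
        else:
            raise ValueError("unexpected token kind " + repr(kind))
```

Because it is a generator, the caller can execute or forward the first
operation before the rest of the request has been read. It is lenient
about a trailing prologue, so it does not settle the grammar question.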
Possible grammar idea below ...
The grammar for Update in the SPARQL 1.1 PR is as follows:
[29] Update ::= Prologue ( Update1 ( ';' Update )? )?
W3C process foo:
The working group ends Dec 31. There is no chance of an extension -
we're under pressure to finish (not unreasonable ...). Revisions, errata
etc. will be noted afterwards.
If the language changes, the spec would have to go back to another Last
Call. Implementations, of which there are many, would be affected.
Just changing the grammar, not the language, could be argued to not
affect implementations because it's the language that matters. But that
argument also argues for no change (because the grammar isn't so
important to need a Last Call). Morally, there would be a strong case
for another Last Call on a grammar change.
This is currently implemented in our JavaCC parser as:
Prologue() (Update1() ( <SEMICOLON> Update() )? )?
Unfortunately, the best non-recursive solution I was able to come up with was:
Prologue() ( Update1() ( LOOKAHEAD(2) <SEMICOLON> Prologue()
Update1() )* ( <SEMICOLON> )? )?
Why not add a final optional prologue if the last SEMI is seen?
.... ( <SEMICOLON> (Prologue())? )? )?
[untested]
This is *almost* equivalent to the grammar in the spec, except for one
detail: it does not allow a trailing Prologue(), which the recursive
definition allows. I can't seem to get any closer, mainly due to that
optional semicolon and optional trailing prologue (although you cannot
have a trailing semicolon if you have a lone trailing prologue).
The more I look at the problem, the more I tend to think that maybe
the spec's Update grammar is faulty. I believe it should not allow
trailing prologues. It also should not allow just a prologue and
nothing else (Query forbids this). Examples of queries that I think
should be invalid (but are not currently):
==========
PREFIX : <http://example.org/>
==========
PREFIX : <http://example.org/>
insert data { } ;
PREFIX : <http://example.org/>
==========
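For what it's worth, the two examples above can be checked against rule
29 as currently written with a small hand-written recogniser. This is
only a sketch in Python over a toy token stream: the token names PREFIX,
BASE and OP and the function accepts_spec_rule29 are made up for
illustration (OP stands for one complete Update1 operation), and none of
it is the JavaCC parser.

```python
def accepts_spec_rule29(tokens):
    """Recognise  Update ::= Prologue ( Update1 ( ';' Update )? )?

    `tokens` is a list of toy tokens: "PREFIX" / "BASE" (one prologue
    declaration each), "OP" (one whole Update1) and ";".
    The recursion in the rule is unrolled into a loop.
    """
    pos = 0
    while True:
        # Prologue: zero or more declarations.
        while pos < len(tokens) and tokens[pos] in ("PREFIX", "BASE"):
            pos += 1
        # Optional Update1.
        if pos < len(tokens) and tokens[pos] == "OP":
            pos += 1
            # Optional ';' followed by another Update: go round again.
            if pos < len(tokens) and tokens[pos] == ";":
                pos += 1
                continue
        break
    # Accept only if the whole input was consumed.
    return pos == len(tokens)


prologue_only = ["PREFIX"]
trailing_prologue = ["PREFIX", "OP", ";", "PREFIX"]
print(accepts_spec_rule29(prologue_only), accepts_spec_rule29(trailing_prologue))
```

On this toy model both examples are accepted, and so is the empty
request, which matches the reading that the current rule allows a
prologue-only request and a trailing prologue.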
Additionally, I would argue that the text of the Update spec [1]
contradicts the existing grammar. Specifically the definition in
section 3:
"A request is a sequence of operations and is terminated by
EOF (End of File). Multiple operations are separated by a ';'
(semicolon) character. A semicolon after the last operation
in a request is optional."
Sequences can be zero length :-)
A prologue by itself is not an operation as defined in section 4.3 [2].
I would propose to the working group that we instead adopt the
following grammar:
[29] Update ::= Prologue Update1 ( ';' Prologue Update1 )* ( ';' )?
This could be easily represented in JavaCC as:
Prologue() Update1() ( LOOKAHEAD(2) <SEMICOLON> Prologue()
Update1() )* ( <SEMICOLON> )?
The trailing semicolon seems to force us into using an LL(2) parser.
I cannot see a way to write this grammar in LL(1).
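One possibility, untested against the real grammar: left-factor the tail
so the decision is made after the semicolon rather than before it,
roughly

Rest ::= ( ';' ( Prologue Update1 Rest )? )?

After a ';' the next token is either EOF (a trailing semicolon) or the
start of Prologue Update1, so one token of lookahead looks sufficient.
Below is a Python sketch of that idea over a toy token stream. The token
names PREFIX, BASE and OP and the function accepts_proposed_rule29 are
made up for illustration (OP stands for one complete Update1 operation).

```python
def accepts_proposed_rule29(tokens):
    """Recognise  Prologue Update1 ( ';' Prologue Update1 )* ( ';' )?
    while only ever inspecting the single next token."""
    stream = iter(tokens)
    look = next(stream, "EOF")  # the one token of lookahead

    def advance():
        nonlocal look
        look = next(stream, "EOF")

    def prologue_update1():
        # Prologue: zero or more declarations, then exactly one operation.
        while look in ("PREFIX", "BASE"):
            advance()
        if look != "OP":
            return False
        advance()
        return True

    if not prologue_update1():
        return False
    while look == ";":
        advance()
        if look == "EOF":
            # Trailing semicolon: decided by the token after ';'.
            return True
        if not prologue_update1():
            return False
    return look == "EOF"
```

With this version a lone prologue and a trailing prologue are both
rejected, while a trailing semicolon is still accepted. Whether the same
factoring goes through cleanly in JavaCC is something I have not checked.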
I have three questions that would be nice to have answered before I
post a comment to the WG:
1) Is there a non-recursive way to write the existing rule 29 that
exactly matches the semantics of the spec?
2) Is there a way to write my proposed rule 29 as LL(1) (even if it has
to use recursion)?
3) Would the RDF WG be open to changing the grammar at this point? I
know it is in PR stage, but this would be feedback from attempting
implementation.
-Stephen
[1] http://www.w3.org/TR/2012/PR-sparql11-update-20121108/#updateLanguage
[2] http://www.w3.org/TR/2012/PR-sparql11-update-20121108/#formalModelGraphUpdate