On 06/06/12 19:01, Stephen Allen wrote:
Hi Andy,

I was curious about a statement you made on the user list yesterday:

On Tue, Jun 5, 2012 at 2:43 PM, Andy Seaborne<[email protected]>  wrote:

Updates don't log. Form submitted updates are buffered - the entire string
is available to be printed but ones sent as "application/sparql-update" are
stream read (e.g. a large INSERT DATA { .... })


I was looking at the parsing code, and it's true that
"application/x-www-form-urlencoded" updates are buffered into a String
early in the process, although it appears to me that for
"application/sparql-update", the ARQParser and SPARQLParser11 also
have to buffer all the update data in UpdateRequest objects (which for
the DATA methods are an in-memory list of Quads).

Yes and no. The input stream is directly parsed to a syntax tree so the string (the body) of the POST is not available to be printed. There is "just" the one copy.

It also means that if there is a parse error, the request is not printed in a normal setup.

This is a balance - HTTP is generally about validate-execute and also it is good to know the operation is valid before starting (not everything is transactional).

Maybe it should be less clever and do string-log-parse-execute.
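
A minimal Java sketch of the string-log-parse-execute order, to make the trade-off concrete. Everything named here (StringLogParseExecute, UpdateStep, handle) is hypothetical and not Fuseki code; the parser and executor are reduced to one callback. The point is only that the body is logged before parsing, so it is visible even when the parse fails, at the cost of holding the whole request as a string.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

final class StringLogParseExecute {
    /** Hypothetical stand-in for "parse, then execute"; not a Fuseki or ARQ type. */
    interface UpdateStep {
        void apply(String updateText) throws Exception;
    }

    /** Buffer the whole request body in memory as a UTF-8 string. */
    static String readBody(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    /** string -> log -> parse -> execute: the log call happens before any parsing. */
    static void handle(InputStream body, Consumer<String> log, UpdateStep parseAndExecute)
            throws Exception {
        String text = readBody(body);
        log.accept(text);            // recorded even if the next step throws a parse error
        parseAndExecute.apply(text);
    }
}
```

For a large INSERT DATA this means a second full copy of the request (the string plus the parsed form), which is what the direct stream parse avoids.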

I have been thinking about how to make this process streaming but I
didn't know whether it made sense to try to modify the JavaCC parsers
to be streaming or try to build a hybrid parser for just SPARQL
Update.  This hybrid would handle INSERT DATA and DELETE DATA in a
streaming manner, and delegate regular updates to the existing parser.
  Do you have any thoughts or advice?

There is a tension between operations of just INSERT/DELETE DATA and combined, complex multi-part operations. The latter leans towards complex parsing of whole sequences of actions before any operation.

So I think a separate, streaming, bulk-focused parser for INSERT DATA and DELETE DATA would be the way to go (and update processor etc).
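
A rough sketch of the routing step such a hybrid would need, in Java. The names (UpdateDispatch, Route, the 256-character peek window) are made up for illustration and are not part of ARQ or Fuseki. It peeks at the start of the request and rewinds, so whichever parser is chosen sees the full input. Two simplifications are assumed: a request with a PREFIX/BASE prologue just goes to the general parser, and a request that starts with INSERT DATA but continues with further operations after a ';' would still have to be handled by the streaming side.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UpdateDispatch {
    enum Route { STREAMING_INSERT_DATA, STREAMING_DELETE_DATA, GENERAL }

    // Assumption: the leading keywords fit in this many characters.
    private static final int PEEK_CHARS = 256;

    // Matches "INSERT DATA" / "DELETE DATA" at the start, but not "DELETE WHERE"
    // or "DELETE { ... } WHERE".
    private static final Pattern DATA_OP =
        Pattern.compile("^\\s*(INSERT|DELETE)\\s+DATA\\b", Pattern.CASE_INSENSITIVE);

    /** Peek at the head of the request, then rewind so no input is consumed. */
    static Route route(BufferedReader in) throws IOException {
        in.mark(PEEK_CHARS);
        char[] head = new char[PEEK_CHARS];
        // A single read may return fewer characters than requested; a short
        // read that hides the keywords falls through to GENERAL, which is safe.
        int n = in.read(head, 0, PEEK_CHARS);
        in.reset();
        if (n <= 0) {
            return Route.GENERAL;
        }
        Matcher m = DATA_OP.matcher(new String(head, 0, n));
        if (!m.find()) {
            return Route.GENERAL;
        }
        boolean insert = m.group(1).toUpperCase(Locale.ROOT).equals("INSERT");
        return insert ? Route.STREAMING_INSERT_DATA : Route.STREAMING_DELETE_DATA;
    }
}
```

Falling back to the existing parser whenever the peek is inconclusive keeps the streaming path an optimisation rather than a second full implementation.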

Sharing a JavaCC grammar is not something I have ever managed to get working: I could not separate the grammar from the actions without distorting the entire thing to be dominated by that design goal. I have tried to remove all code from the parser and just use events, so that the parser is streaming and the superclass code builds the state. It could be redone to pass in a builder rather than use the superclass. SPARQL Update does include the whole of SPARQL Query pattern matching.
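
To illustrate the "pass in a builder" shape, here is a toy Java sketch. It is not the JavaCC grammar and not ARQ code: QuadSink, CollectingSink and parse are hypothetical names, and the "parser" assumes whitespace-separated terms, a " ." after every triple, no literals containing spaces and no GRAPH blocks. The parser only emits events; whether the data is buffered or applied immediately is decided by the sink that is passed in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

final class DataBlockEvents {
    /** Hypothetical event interface; the parser holds no state of its own. */
    interface QuadSink {
        void start(String operation);               // "INSERT" or "DELETE"
        void triple(String s, String p, String o);  // one event per triple
        void finish();
    }

    /** Buffers every triple, much as an in-memory request object would. */
    static final class CollectingSink implements QuadSink {
        String operation;
        boolean finished;
        final List<String[]> triples = new ArrayList<>();

        public void start(String operation) { this.operation = operation; }
        public void triple(String s, String p, String o) { triples.add(new String[] { s, p, o }); }
        public void finish() { finished = true; }
    }

    /** Toy driver for "INSERT DATA { s p o . ... }" that emits events as it reads. */
    static void parse(Readable in, QuadSink sink) {
        Scanner tokens = new Scanner(in);
        String op = tokens.next().toUpperCase(Locale.ROOT);
        if (!op.equals("INSERT") && !op.equals("DELETE")) {
            throw new IllegalArgumentException("expected INSERT or DELETE but got " + op);
        }
        expect(tokens, "DATA");
        expect(tokens, "{");
        sink.start(op);
        while (true) {
            String s = tokens.next();
            if (s.equals("}")) {
                break;
            }
            String p = tokens.next();
            String o = tokens.next();
            expect(tokens, ".");
            sink.triple(s, p, o);   // handed over immediately, nothing kept here
        }
        sink.finish();
    }

    private static void expect(Scanner tokens, String wanted) {
        String got = tokens.next();
        if (!got.equalsIgnoreCase(wanted)) {
            throw new IllegalArgumentException("expected " + wanted + " but got " + got);
        }
    }
}
```

A sink that writes each triple straight to the dataset would give the streaming behaviour with the same parse method; CollectingSink gives the current buffer-then-execute behaviour.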

So it's a bit of a mess from the multi-use point-of-view but the spec is stable so copy is tolerable (if somewhat irritating from an aesthetic POV).

Actually, this looks like the tip of a general need for a non-SPARQL (or SPARQL+ if you prefer) remote interface to Fuseki. See also the users@ question and transactions across several Fuseki operations.

So maybe there is a language lurking around here somewhere. It would stream-execute. More fine-grained than GSP, less than full SPARQL Update.

INSERT DATA, DELETE DATA
BEGIN/COMMIT/ABORT
CLEAR/DROP, LOAD
CREATE DATASET, DROP DATASET
UNMOUNT DATASET
MOUNT DATASET
BACKUP DATASET
...

        Andy


-Stephen
