'Query' started as the AST from parsing, and should have been a fairly
opaque class. But it got out and people started creating queries
programmatically [*]. And some of this predates SPARQL algebra.
The contract should be create-seal-use.
"create-seal" can be single threaded "use" multithreaded.
Andy
[*] Most of the the time it's better to use the algebra these days:
http://jena.apache.org/documentation/query/algebra.html
>> (I can provide samples but they'd just be very mundane select SPARQL
>> stuff, I am not claiming these to be very complex queries...)
Thanks - it is useful to know that it is not in subqueries.
On 04/12/14 17:10, Kristian Rosenvold wrote:
And if the Query object "by design" is supposed to only contain
metadata that should be initialized once, it should be fairly simple
to patch it for full concurrent thread safety, at least for instances
that are constructed before any consuming threads are created....
(Some of the variable names within the "Query" object certainly had me
fooled at first reading; - the list called "resultNodes" certainly
implies request-specific state for the uninitiated reader)
Kristian
2014-12-04 18:02 GMT+01:00 Kristian Rosenvold <[email protected]>:
A single instance of the Query object is definitely not safe for
concurrent access. I assumed it was, and that blew up on my first
attempt at adding any kind of load. The first thing I see within the
Query#setResultVars method is a boolean toggle "resultVarsSet" that
only allows this to be set once per Query instance. If the Query
instance was to be safe for concurrent access, there'd have to be
synchronization around that.
Alternately you could say that a single instance of a Query object is
safe for sequential re-use (and that all the fields are really only
metadata fields), which I did not actually test. Are you saying this
is supposed to be the case ? If so, it looks like it's sequential
re-use on the single thread that created the Query, since that's the
only kind of memory model visibility I see this class handling..?
As for the queries in question they're pretty basic stuff, but we do a
lot of them. Re-parsing the queries is over 50% of our processing time
as measured by YJP. We run the same basic 10 different queries over
nearly a million different graphs, which adds up to a fair bit of
heat.
(I can provide samples but they'd just be very mundane select SPARQL
stuff, I am not claiming these to be very complex queries...)
Kristian
2014-12-04 17:36 GMT+01:00 Andy Seaborne <[email protected]>:
On 04/12/14 15:01, Kristian Rosenvold wrote:
And now I can seemingly use
QueryExecution exec = QueryExecutionFactory.create(copyOf(q1template),
model,params);
and actually get this working concurrently with re-use of the parsed
sparql expression. But of course there's ElementSubQuery pointing back
at the Query object, so a deep copy of the queryPattern Element
structure would probably also be required, which all seems a bit like
barking up the wrong tree. And I'm not really sure I understand all
the cross-references well enough to know that this would even
work....(?)
The QueryExecutionFactory should work - the query parsed object is not
modified. The params are used to seed the query execution. The Query
object is reusable. (It had better be -- Query must be immutable after they
are fully created to be valid as map keys.) A bit like an external VALUES
clause.
Where are the actual costs? Reparsing the string will be expensive at some
level of intense usage; manipulating the AST may not be.
RIOT uses custom tokenizing and parsing for Turtle etc because javacc was
slow for large data. SPARQL is a lot more complicated as a grammar (Turtle
is pretty trivial) and a parser generator is quite important yet it does all
that single char messing around that general tokenizing/parsing has to do.
There is a "query template" system that works by minimally rewriting the AST
-- it has not been incorported into the code base; I wrote it at a time we
were discussing ParameterizedQueryString and security issues of injection.
It has test cases though :-)
Do you have an exmaple of the sort of query you are working with?
Andy
https://github.com/afs/AFS-Dev/tree/master/src/main/java/element