On 10/04/12 17:08, Robert Vesse wrote:
The primary motivation for this is that if, like us, you are
intercepting queries and providing your own processing, you have no
visibility back to the original query string, since at the level of
QueryExecutionFactory and query execution you have only a Query
object and an algebra object.
In our architecture queries may be very long running so we have a
queue into which we give users visibility but right now we can only
show them the serialized form of their parsed query. Because of the
syntax printing and possible optimization ARQ does on the query, that
serialized form may look very different, and users are confused by
this.
The ability to preserve comments is of particular interest because we
may want to use comments as a means to tag queries to indicate where
they originated from. Right now the only other mechanism that would
let us do this would be to define a fake prefix which encodes this
(perhaps with tag URLs) but that only covers one use case and still
doesn't allow us to preserve more free form description of the
queries in the form of comments.
Rob,
That use case makes sense to me and I was just about to reply ... but I
went for a run and something occurred to me.
Query objects provide structural equality. They override .equals(Object)
and .hashCode().
This allows query objects to be used in hash tables for example. You
might have a cache of results by query to avoid re-execution of a query
(picked from a library by two different people?)
I have used this for a query to results cache (see my github and project
LD-Access for example).
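As a minimal sketch of the query-to-results cache idea above: the classes
here (StructuralQuery, ResultCache) are hypothetical stand-ins written for
illustration, not the ARQ Query class or the LD-Access code. They only
assume what is stated above, namely that equals/hashCode depend on the
structure of the query.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class QueryCacheSketch {

    // Hypothetical stand-in for a parsed query: equality depends only on structure.
    static final class StructuralQuery {
        private final List<String> resultVars;
        private final String pattern;

        StructuralQuery(List<String> resultVars, String pattern) {
            this.resultVars = List.copyOf(resultVars);
            this.pattern = pattern;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof StructuralQuery)) return false;
            StructuralQuery q = (StructuralQuery) other;
            return resultVars.equals(q.resultVars) && pattern.equals(q.pattern);
        }

        @Override
        public int hashCode() {
            return Objects.hash(resultVars, pattern);
        }
    }

    // Hypothetical results cache keyed by query; "execution" is simulated.
    static final class ResultCache {
        private final Map<StructuralQuery, String> results = new HashMap<>();
        private int executions = 0;

        String execute(StructuralQuery query) {
            // Only runs the (simulated) query when no equal query is cached.
            return results.computeIfAbsent(query, q -> {
                executions++;
                return "results for " + q.pattern;
            });
        }

        int executions() {
            return executions;
        }
    }

    public static void main(String[] args) {
        ResultCache cache = new ResultCache();
        // Two separate objects, e.g. the same library query picked by two people.
        StructuralQuery first = new StructuralQuery(List.of("s"), "?s a ?type");
        StructuralQuery second = new StructuralQuery(List.of("s"), "?s a ?type");
        cache.execute(first);
        cache.execute(second);
        System.out.println(first.equals(second)); // true
        System.out.println(cache.executions());   // 1
    }
}
```

The second submission is served from the cache only because the two
objects compare equal and hash alike.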
Whether the label is part of the equality contract or not is tricky -
it seems to get into a bit of trouble either way round. If it isn't
(aside from violating the general Java contract), then the object in the
cache/map/set/whatever may not be recognized by the user as the one put
in - the label may change or disappear. If it is, then it is a more
specific instance.
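The two sides of that trade-off can be sketched as follows. Both classes
(LabelIgnored, LabelIncluded) are hypothetical and exist only for this
illustration; neither is how ARQ treats labels.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

class LabelEqualitySketch {

    // Variant A (hypothetical): the label is ignored by equals/hashCode.
    static final class LabelIgnored {
        final String queryText;
        final String label;

        LabelIgnored(String queryText, String label) {
            this.queryText = queryText;
            this.label = label;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof LabelIgnored
                    && queryText.equals(((LabelIgnored) o).queryText);
        }

        @Override
        public int hashCode() {
            return queryText.hashCode();
        }
    }

    // Variant B (hypothetical): the label is part of equals/hashCode.
    static final class LabelIncluded {
        final String queryText;
        final String label;

        LabelIncluded(String queryText, String label) {
            this.queryText = queryText;
            this.label = label;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof LabelIncluded)) return false;
            LabelIncluded q = (LabelIncluded) o;
            return queryText.equals(q.queryText) && label.equals(q.label);
        }

        @Override
        public int hashCode() {
            return Objects.hash(queryText, label);
        }
    }

    public static void main(String[] args) {
        String text = "SELECT * { ?s ?p ?o }";

        // A: the lookup succeeds, but the stored object carries someone else's label.
        Map<LabelIgnored, LabelIgnored> stored = new HashMap<>();
        LabelIgnored fromLibrary = new LabelIgnored(text, "from-library");
        stored.put(fromLibrary, fromLibrary);
        LabelIgnored fromUser = new LabelIgnored(text, "submitted-by-user");
        System.out.println(stored.get(fromUser).label); // from-library

        // B: the same query text with a different label is a different key.
        Set<LabelIncluded> seen = new HashSet<>();
        seen.add(new LabelIncluded(text, "from-library"));
        System.out.println(seen.contains(new LabelIncluded(text, "submitted-by-user"))); // false
    }
}
```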
In your system, what I read as happening is that there is a "Job" - at
the moment the Query is the Job, but a job may have other characteristics
like submission time, who submitted it, priority, etc. Putting a
label on the job seems the right thing because it can carry a lot of
other stuff like the submitter and also return the execution time, the
cost, etc. The Job then has Job.getQuery().
To put it another way, a query is a class - the job is the instance.
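A rough sketch of such a Job, under stated assumptions: every field and
method name other than getQuery() is hypothetical, and a plain String
stands in for the parsed Query object (String has value equality, which
plays the role of the query's structural equality here). Carrying the
original query text on the job is one way it might cover the queue display
use case.

```java
import java.time.Instant;

class JobSketch {

    // Hypothetical Job: per-submission metadata wrapped around a shared query.
    static final class Job {
        private final String query;        // stand-in for the parsed Query object
        private final String originalText; // text as submitted, comments included
        private final String label;
        private final String submitter;
        private final Instant submittedAt;

        Job(String query, String originalText, String label, String submitter) {
            this.query = query;
            this.originalText = originalText;
            this.label = label;
            this.submitter = submitter;
            this.submittedAt = Instant.now();
        }

        String getQuery() { return query; }
        String getOriginalText() { return originalText; }
        String getLabel() { return label; }
        String getSubmitter() { return submitter; }
        Instant getSubmittedAt() { return submittedAt; }

        // equals/hashCode are deliberately not overridden: each Job is a
        // distinct instance even when two jobs wrap an equal query.
    }

    public static void main(String[] args) {
        String parsed = "SELECT ?s WHERE { ?s ?p ?o }";
        Job first = new Job(parsed, "# from the library\nSELECT ?s { ?s ?p ?o }",
                "library", "alice");
        Job second = new Job(parsed, "# ad hoc\nSELECT ?s { ?s ?p ?o }",
                "ad-hoc", "bob");

        // Same query (the "class"), two different jobs (the "instances").
        System.out.println(first.getQuery().equals(second.getQuery())); // true
        System.out.println(first.equals(second));                       // false
    }
}
```

With this split, a results cache can stay keyed on the query while the
queue shows users the label and original text held by the job.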
The tagging is a good example - a query may come from a library of
queries so is it labelled as from the library or the person submitting it?
There has, in the past, been META on queries for stashing away labels
etc. but it gets confusing. Better to put it outside the query, e.g. on a Job.
Andy