On 26/05/2014 09:25, Tommaso Teofili wrote:
> ...
> Also the efficiency is not evaluated on a "cost model", each QueryIndex
> implementation can return an arbitrary different number; on one hand this
> is ok as it allows to take very index specific constraint into account: on
> the other hand if one has to write a new QueryIndex implementation he/she
> will have to look into each other query index implementation to understand
> (and design) if / when its index is picked up; and even with already
> existing indexes it's not easy to say upfront which one will be selected
> (e.g. for debugging purposes).
I don't know whether IndexPlans will make it possible to say beforehand
which index will be picked. It really depends on the query engine's
logic, and if other indexes could perform faster, why not choose
them?
> With the AdvancedQueryIndex, if I understood it correctly (I just had a
> look at it on Friday), a QueryIndex is selected upon its IndexPlan, which
> is supposed to address better both the cost (as it explicitly exposes the
> cost per execution, cost per entry and estimated entry count metrics) and
> the query index capability to handle a certain query (e.g. this is used for
> ordered property index).
> However, at the moment, only the OrderedPropertyIndex is using it so I
> think it'd be good to decide if we want to go further with the
> AdvancedQueryIndex also for the other QueryIndex implementations (and get
> rid for example of the FullTextQueryIndex interface as it seems useless to
> me) or not.
>
> One final question on query index selection, should we always select the
> fastest index ?
> Especially for full text ones this should be in some way configurable.
Yes, we discussed this offline last time. It seems it would be good for
the query engine to expose an API through which the client application
could state: run with this index. Something like
queryengine.forceIndex("path/to/oak:QueryIndexDefinition"). If
specified, the query engine would skip index selection and use the
given index.
This is definitely useful for debugging, as well as in edge cases where
the client application knows that the query has to run with a specific
index for one reason or another.
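
A minimal sketch of how such an override could sit in front of the
cost-based selection. Everything here is hypothetical: forceIndex does
not exist in the query engine today, and IndexCandidate, IndexSelector
and the single "cost" number are illustrative names, not Oak classes.

```java
import java.util.List;

// Hypothetical stand-in for an index and the cost it reports for a query.
final class IndexCandidate {
    final String definitionPath;
    final double cost;

    IndexCandidate(String definitionPath, double cost) {
        this.definitionPath = definitionPath;
        this.cost = cost;
    }
}

// Hypothetical selector: the cheapest index wins unless the client forced one.
final class IndexSelector {
    private String forcedPath; // null means "no override"

    // Assumed API shape, mirroring the forceIndex(...) idea above.
    void forceIndex(String definitionPath) {
        this.forcedPath = definitionPath;
    }

    IndexCandidate select(List<IndexCandidate> candidates) {
        if (forcedPath != null) {
            for (IndexCandidate c : candidates) {
                if (c.definitionPath.equals(forcedPath)) {
                    return c; // cost comparison is skipped entirely
                }
            }
            throw new IllegalArgumentException("No index defined at " + forcedPath);
        }
        IndexCandidate best = null;
        for (IndexCandidate c : candidates) {
            if (best == null || c.cost < best.cost) {
                best = c;
            }
        }
        return best; // null when there are no candidates
    }
}
```

Failing loudly when the forced index is missing is a design choice of
the sketch; silently falling back to normal selection would be the
alternative, but it would hide typos in the path while debugging.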
> ...
> As discussed also offline last week with some other folks maybe one further
> metric to be taken into consideration for the index selection is if the
> index is synchronous or not
>
The current index plan of the AdvancedQueryIndex exposes the sync/async
aspect through IndexPlan.isDelayed(). Whether that is taken into account
yet, I don't know :)
Another possible improvement we were discussing is on the query engine
side. As index selection and planning are expensive, when the query
engine is asked to execute a query with bindings[0] it could cache the
selected index, either for a fixed amount of time or according to some
other policy.
[0] http://goo.gl/LDQw1Y
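
To make the caching idea concrete, here is a rough sketch with a fixed
time-to-live. It assumes the cache is keyed on the statement text (so
different bind values share one entry) and that the expensive planning
step can be passed in as a function; SelectedIndexCache and all its
members are made-up names, and the class is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical cache: remembers which index was selected for a statement
// and reuses it until a fixed time-to-live expires.
final class SelectedIndexCache {
    private static final class Entry {
        final String indexName;
        final long expiresAt;

        Entry(String indexName, long expiresAt) {
            this.indexName = indexName;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so expiry can be tested

    SelectedIndexCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // 'selector' stands in for the expensive index selection and planning.
    String indexFor(String statement, Function<String, String> selector) {
        long now = clock.getAsLong();
        Entry e = entries.get(statement);
        if (e == null || now >= e.expiresAt) {
            e = new Entry(selector.apply(statement), now + ttlMillis);
            entries.put(statement, e);
        }
        return e.indexName;
    }
}
```

In production the clock would be System::currentTimeMillis. A real
implementation would probably also need to drop entries when index
definitions change, which this sketch ignores.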
One last point: as we were discussing the possibility of serving queries
for "all" the properties from Lucene/Solr, one evaluation metric could
be that when the same property is served by both a sync property index
and a Lucene index, the former should be chosen as it would be local.
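
As a sketch of that rule: order the candidate plans so that synchronous
ones come first, and only then compare cost. PlanSketch is a simplified,
hypothetical stand-in for IndexPlan; the way the three cost metrics are
combined into one number is my assumption, not necessarily what the
query engine does.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Simplified, hypothetical stand-in for an index plan.
final class PlanSketch {
    final String indexName;
    final double costPerExecution;
    final double costPerEntry;
    final long estimatedEntryCount;
    final boolean delayed; // true for async indexes

    PlanSketch(String indexName, double costPerExecution, double costPerEntry,
               long estimatedEntryCount, boolean delayed) {
        this.indexName = indexName;
        this.costPerExecution = costPerExecution;
        this.costPerEntry = costPerEntry;
        this.estimatedEntryCount = estimatedEntryCount;
        this.delayed = delayed;
    }

    // Assumed cost formula: fixed cost plus per-entry cost times entries.
    double totalCost() {
        return costPerExecution + costPerEntry * estimatedEntryCount;
    }
}

final class SyncFirstSelection {
    // Boolean false sorts before true, so non-delayed (sync) plans come first;
    // cost only breaks ties between plans with the same sync/async status.
    static final Comparator<PlanSketch> SYNC_THEN_COST =
        Comparator.comparing((PlanSketch p) -> p.delayed)
                  .thenComparingDouble(PlanSketch::totalCost);

    static Optional<PlanSketch> choose(List<PlanSketch> plansForSameProperty) {
        return plansForSameProperty.stream().min(SYNC_THEN_COST);
    }
}
```

With this ordering a sync property index wins over a cheaper async
Lucene plan for the same property. Whether that should be a hard rule
or just a penalty added to the cost is open for discussion.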
D.