[DISCUSS] - QueryIndex selection

Tommaso Teofili Mon, 26 May 2014 01:27:07 -0700

Hi all,

I'd like to start discussing how we might improve / simplify the current way
of selecting a query index to use for a certain query.


In the QueryIndex interface we have the plain old getCost method, where the
index returning the lowest cost for the given query gets selected. Recently,
however, an AdvancedQueryIndex interface has also been introduced which,
if I understood things correctly, uses the IndexPlan(s) returned by each
query index for the given query to select which one has to be used.
So I would like to discuss whether it's possible to clean things up a bit in
order to have a unified query index selection mechanism.

At the moment, in my opinion, one problem with the getCost() method is that
it inherently merges the following topics:
- index capability to handle a certain query (can the QueryIndex handle
that query?)
- index efficiency in handling a certain query (how fast will the
QueryIndex be in handling that query?)
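
To make the two concerns concrete, here is a minimal sketch of what keeping them apart could look like. Everything in it (SplitQueryIndex, KeywordIndex, SplitSelector, and treating a query as a plain String) is hypothetical and only for illustration; it is not the existing QueryIndex API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical interface (not Oak's QueryIndex): capability and efficiency
// are answered by two separate methods instead of one getCost() number.
interface SplitQueryIndex {
    String name();
    boolean canHandle(String query);    // capability: can this index answer the query?
    double estimateCost(String query);  // efficiency: only consulted when canHandle is true
}

// Toy implementation: "handles" a query if it contains a keyword, at a fixed cost.
final class KeywordIndex implements SplitQueryIndex {
    private final String name;
    private final String keyword;
    private final double cost;

    KeywordIndex(String name, String keyword, double cost) {
        this.name = name;
        this.keyword = keyword;
        this.cost = cost;
    }

    public String name() { return name; }
    public boolean canHandle(String query) { return query.contains(keyword); }
    public double estimateCost(String query) { return cost; }
}

final class SplitSelector {
    // Returns the cheapest index among those able to handle the query, if any.
    static Optional<SplitQueryIndex> select(List<SplitQueryIndex> indexes, String query) {
        SplitQueryIndex best = null;
        for (SplitQueryIndex index : indexes) {
            if (!index.canHandle(query)) {
                continue; // incapable indexes never compete on cost
            }
            if (best == null || index.estimateCost(query) < best.estimateCost(query)) {
                best = index;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        List<SplitQueryIndex> indexes = Arrays.asList(
            new KeywordIndex("property", "=", 10.0),
            new KeywordIndex("fulltext", "contains(", 2.0));
        System.out.println(
            select(indexes, "[title] = 'x'").map(SplitQueryIndex::name).orElse("none"));
    }
}
```

With such a split, an index that cannot handle a query does not need to encode that fact as a very high cost; it simply drops out before costs are compared.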

Also, the efficiency is not evaluated against a shared "cost model": each
QueryIndex implementation can return an arbitrary number. On one hand this
is ok, as it allows very index-specific constraints to be taken into account;
on the other hand, whoever writes a new QueryIndex implementation
has to look into every other query index implementation to understand
(and design) if / when the new index is picked up. Even with the already
existing indexes it's not easy to say upfront which one will be selected
(e.g. for debugging purposes).

With the AdvancedQueryIndex, if I understood it correctly (I just had a
look at it on Friday), a QueryIndex is selected based on its IndexPlan, which
is supposed to better address both the cost (as it explicitly exposes the
cost per execution, cost per entry and estimated entry count metrics) and
the query index capability to handle a certain query (e.g. this is used for
the ordered property index).
However, at the moment, only the OrderedPropertyIndex is using it, so I
think it'd be good to decide whether we want to go further with the
AdvancedQueryIndex for the other QueryIndex implementations too (and get
rid, for example, of the FullTextQueryIndex interface, as it seems useless to
me) or not.
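
As an illustration of how the three metrics could be combined into one comparable number, here is a small sketch. PlanEstimate and PlanComparison are hypothetical names, not the IndexPlan API, and the formula (fixed cost per execution plus cost per entry times estimated entry count) is an assumption made for the example.

```java
// Hypothetical value class mirroring the three metrics named above;
// it is not Oak's IndexPlan.
final class PlanEstimate {
    final String indexName;
    final double costPerExecution;
    final double costPerEntry;
    final long estimatedEntryCount;

    PlanEstimate(String indexName, double costPerExecution,
                 double costPerEntry, long estimatedEntryCount) {
        this.indexName = indexName;
        this.costPerExecution = costPerExecution;
        this.costPerEntry = costPerEntry;
        this.estimatedEntryCount = estimatedEntryCount;
    }

    // Assumed cost model: a fixed overhead per execution plus a per-entry
    // charge for every entry the index expects to return.
    double totalCost() {
        return costPerExecution + costPerEntry * estimatedEntryCount;
    }
}

final class PlanComparison {
    // Picks the plan with the lower total cost; the first one wins on a tie.
    static PlanEstimate cheapest(PlanEstimate a, PlanEstimate b) {
        return a.totalCost() <= b.totalCost() ? a : b;
    }

    public static void main(String[] args) {
        PlanEstimate ordered = new PlanEstimate("ordered", 2.0, 0.5, 100);
        PlanEstimate traversal = new PlanEstimate("traversal", 0.0, 1.0, 1000);
        System.out.println(cheapest(ordered, traversal).indexName);
    }
}
```

Since every index would report the same three inputs, the selected index can be predicted from the reported numbers alone, which is what makes debugging easier compared to arbitrary getCost() values.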

One final question on query index selection: should we always select the
fastest index?
Especially for full text ones, this should be configurable in some way.
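
One possible shape for "configurable" is to express the selection policy as a comparator that can be swapped. This is only a sketch under that assumption; Candidate, Policies and the two policies shown are hypothetical and do not exist in the code base.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical summary of what an index reports for one query.
final class Candidate {
    final String name;
    final double cost;
    final boolean fullText;

    Candidate(String name, double cost, boolean fullText) {
        this.name = name;
        this.cost = cost;
        this.fullText = fullText;
    }
}

final class Policies {
    // Default policy: the lowest cost wins.
    static final Comparator<Candidate> CHEAPEST =
        Comparator.comparingDouble(c -> c.cost);

    // Alternative policy: full text indexes come first (false sorts before
    // true, so !fullText puts them ahead), then the lowest cost decides.
    static final Comparator<Candidate> PREFER_FULL_TEXT =
        Comparator.comparing((Candidate c) -> !c.fullText)
                  .thenComparingDouble(c -> c.cost);

    // Picks the best candidate under the given policy.
    // Throws NoSuchElementException if the list is empty.
    static Candidate pick(List<Candidate> candidates, Comparator<Candidate> policy) {
        return Collections.min(candidates, policy);
    }

    public static void main(String[] args) {
        List<Candidate> candidates = Arrays.asList(
            new Candidate("lucene", 20.0, true),
            new Candidate("property", 5.0, false));
        System.out.println(pick(candidates, CHEAPEST).name);
        System.out.println(pick(candidates, PREFER_FULL_TEXT).name);
    }
}
```

The point is only that "fastest" becomes one policy among several, chosen by configuration rather than hard-wired into the selection loop.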

What do others think?
Regards,
Tommaso

p.s.:
As also discussed offline last week with some other folks, maybe one further
metric to take into consideration for index selection is whether the
index is synchronous or not.
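
A rough sketch of how such a metric might enter the selection, assuming it acts as a filter when a caller needs up-to-date results (it could just as well be a tie-breaker). SyncCandidate and SyncAwareSelector are hypothetical names.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical summary of an index: its cost for a query and whether it is
// updated synchronously.
final class SyncCandidate {
    final String name;
    final double cost;
    final boolean synchronous;

    SyncCandidate(String name, double cost, boolean synchronous) {
        this.name = name;
        this.cost = cost;
        this.synchronous = synchronous;
    }
}

final class SyncAwareSelector {
    // When requireSynchronous is true, asynchronous indexes are excluded
    // before the cheapest remaining candidate is chosen.
    static Optional<SyncCandidate> pick(List<SyncCandidate> candidates,
                                        boolean requireSynchronous) {
        return candidates.stream()
            .filter(c -> !requireSynchronous || c.synchronous)
            .min(Comparator.comparingDouble((SyncCandidate c) -> c.cost));
    }

    public static void main(String[] args) {
        List<SyncCandidate> candidates = Arrays.asList(
            new SyncCandidate("async-fulltext", 1.0, false),
            new SyncCandidate("sync-property", 5.0, true));
        System.out.println(pick(candidates, true).map(c -> c.name).orElse("none"));
        System.out.println(pick(candidates, false).map(c -> c.name).orElse("none"));
    }
}
```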
