Hi,

On Wed, Jun 18, 2014 at 11:31 AM, Tommaso Teofili
<tommaso.teof...@gmail.com> wrote:
> 2014-06-18 16:02 GMT+02:00 Jukka Zitting <jukka.zitt...@gmail.com>:
>> On Wed, Jun 18, 2014 at 4:26 AM, Tommaso Teofili
>> <tommaso.teof...@gmail.com> wrote:
>> > should we just return the number of estimated entries for the cost?
>>
>> Yes, that's what I think the contract should be.
>
> ok, that's different from what Thomas suggests, right? Just entry
> estimates, no network roundtrips / asynchronous index penalties, etc.

Right. I don't believe the cost of the index lookup is significant (at
least in the asymptotic sense) compared to the overall cost of
executing a query.

> ok, under such perspective the index is not returning a cost, but how many
> nodes it will provide to the engine, the cost of the query is then a
> function of the number of entries.

Exactly.
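
As a rough sketch of that contract (all names below are hypothetical
and are not the actual index API; the linear cost function and its
constant are assumptions made only for illustration): the index reports
an entry estimate, and the engine derives the query cost from it.

```java
import java.util.List;

// Hypothetical contract: the index only reports how many entries it
// expects to return; it does not try to price its own lookup.
interface EstimatingIndex {
    String name();
    long estimateEntryCount(String query);
}

// Trivial implementation with a fixed estimate, for illustration only.
final class FixedEstimateIndex implements EstimatingIndex {
    private final String name;
    private final long estimate;

    FixedEstimateIndex(String name, long estimate) {
        this.name = name;
        this.estimate = estimate;
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public long estimateEntryCount(String query) {
        return estimate;
    }
}

final class EntryCostModel {
    // Assumed cost of processing one returned entry in the engine.
    static final double COST_PER_ENTRY = 1.0;

    // The engine, not the index, turns an entry estimate into a cost.
    static double queryCost(long estimatedEntries) {
        return COST_PER_ENTRY * estimatedEntries;
    }

    // Returns the index with the lowest derived cost, or null if the
    // list is empty. On a tie the first index in the list wins.
    static EstimatingIndex cheapest(List<? extends EstimatingIndex> indexes,
                                    String query) {
        EstimatingIndex best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (EstimatingIndex index : indexes) {
            double cost = queryCost(index.estimateEntryCount(query));
            if (cost < bestCost) {
                bestCost = cost;
                best = index;
            }
        }
        return best;
    }
}
```

With estimates of 10 and 250 entries, the index estimating 10 is
chosen regardless of how fast either lookup is.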

> At the moment node number estimates and performance of the index aspects
> seem kind of merged into the "getCost".
> Then we should probably decouple (at least) the concepts of:
> 1. how many nodes the index will return for this query (as an estimate)
> 2. how fast the index is at retrieving the estimated nodes

I would further argue that point 2 is mostly irrelevant for any decent
index. The only case where I would expect index performance to show up
as a significant factor is when n is small, but the best way to
optimize such queries is probably to just cache the results per query
instead of trying to make informed guesses about expected index
performance.
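
To illustrate what I mean by caching per query, here is a minimal
sketch. The class name, the use of the query string as the cache key,
and the LRU size bound are all assumptions; invalidation when the
content changes is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical per-query result cache with least-recently-used eviction.
final class QueryResultCache {
    private final Map<String, List<String>> cache;

    QueryResultCache(final int maxEntries) {
        // accessOrder = true makes iteration order least-recently-used first.
        this.cache = new LinkedHashMap<String, List<String>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, List<String>> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Returns the cached result paths for the query, running the
    // executor only on a cache miss.
    synchronized List<String> get(String query,
                                  Function<String, List<String>> executor) {
        List<String> cached = cache.get(query);
        if (cached == null) {
            cached = Collections.unmodifiableList(
                    new ArrayList<>(executor.apply(query)));
            cache.put(query, cached);
        }
        return cached;
    }

    synchronized int size() {
        return cache.size();
    }
}
```

A repeated small query then skips the index lookup entirely, so the
relative speed of the index no longer matters for it.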

> Even with this distinction we would have to make some choices as given two
> indices returning the same number of estimated nodes for the same query, (I
> assume) the fastest should be chosen, but if two indices return two
> different node number estimates (e.g. that's likely if you have two
> different full text indices being able to handle the same query), which one
> should be chosen and why?

Unless there are other contributing factors (like preferring a
synchronous index over an asynchronous one, or an explicit preference
by a client), it shouldn't really matter much which of several equally
costly indexes is selected.
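
A minimal sketch of such a selection, with hypothetical names
throughout. The order of the tie-breakers (client preference first,
then synchronous before asynchronous) is an assumption; it is not
something settled in this thread.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical description of one index that can answer the query.
final class IndexCandidate {
    final String name;
    final long estimatedEntries;
    final boolean asynchronous;
    final boolean preferredByClient;

    IndexCandidate(String name, long estimatedEntries,
                   boolean asynchronous, boolean preferredByClient) {
        this.name = name;
        this.estimatedEntries = estimatedEntries;
        this.asynchronous = asynchronous;
        this.preferredByClient = preferredByClient;
    }
}

final class IndexSelector {
    // Lowest entry estimate wins. The secondary keys only matter among
    // equally costly candidates; Boolean false sorts before true, so
    // client-preferred and synchronous candidates come first.
    static final Comparator<IndexCandidate> ORDER =
            Comparator.comparingLong((IndexCandidate c) -> c.estimatedEntries)
                    .thenComparing((IndexCandidate c) -> !c.preferredByClient)
                    .thenComparing((IndexCandidate c) -> c.asynchronous);

    // Returns the preferred candidate, or empty if there is none.
    static Optional<IndexCandidate> select(List<IndexCandidate> candidates) {
        return candidates.stream().min(ORDER);
    }
}
```

If none of the secondary factors distinguish two equally costly
candidates, either one is an acceptable answer.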

BR,

Jukka Zitting
