Hi,

>The problem with that assumption is that typically a single disk read
>to the index would return n paths, whereas loading those n nodes might
>well take n more disk reads.

Ideally, the cost returned by the index would reflect that. For single-property indexes (all property indexes are single-property right now), the cost should be lower than for a multi-property index, because one disk read operation would fetch more paths.

>Sure, but we don't use a covered index.

Yes, we are not there yet. The node is currently loaded to check access rights, but that's an implementation detail of the access control part, and it's not needed for the admin. If (when) this is changed, it depends on the query: if the query only returns the path (for example) and the indexed property, then the node does not need to be loaded.

>At least in the current design
>the query engine will always load all the matching nodes, regardless
>of any extra information stored in the index. Thus we can't use the
>performance of the index lookup as an accurate estimate of the overall
>query performance.

Yes, the query engine knows the cost overhead per entry; that's why the AdvancedQueryIndex interface is a bit different (it does not just contain the cost).

>The overhead of the index lookup is probably significant when only few
>matching paths are returned (the UUID index would be the ultimate
>example), but in those cases query performance is probably best
>optimized in other ways (caching, configuration, profiling, etc.) than
>making a more accurate estimate of the index lookup performance,
>especially since in most such cases there is only a single index with
>exact matches.

We have queries that have multiple property constraints (for example "size = 'L' and color = 'red'"). If there are two indexes, one for size and one for color, it's important to know which one is faster. (If there is a multi-property index, that one might be faster.)

>Agreed, but we should be making guesses about the overall query
>performance, not just the index lookup time.

Yes, but the index implementation doesn't know (can't know) the cost of the query engine.
The query engine needs to calculate its own cost.

Regards,
Thomas
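
To make the split of responsibilities discussed above concrete, here is a minimal sketch, not the actual Oak code. It assumes each index reports a cost per execution, a cost per entry, and an estimated entry count, and that the query engine adds its own per-entry overhead (for example, loading the node). The class, field, and method names and all the numbers are hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class IndexCostSketch {

    /** What an index might report about one plan (hypothetical, not the Oak API). */
    static final class Plan {
        final String indexName;
        final double costPerExecution; // fixed cost of running the lookup once
        final double costPerEntry;     // index-side cost of returning one path
        final long estimatedEntries;   // how many paths the index expects to return

        Plan(String indexName, double costPerExecution,
             double costPerEntry, long estimatedEntries) {
            this.indexName = indexName;
            this.costPerExecution = costPerExecution;
            this.costPerEntry = costPerEntry;
            this.estimatedEntries = estimatedEntries;
        }
    }

    /** Assumed engine-side overhead per entry (e.g. loading the node). */
    static final double ENGINE_COST_PER_ENTRY = 1.0;

    /** The engine combines the index's numbers with its own per-entry overhead. */
    static double totalCost(Plan p) {
        return p.costPerExecution
                + p.estimatedEntries * (p.costPerEntry + ENGINE_COST_PER_ENTRY);
    }

    /** Picks the plan with the lowest estimated total cost. */
    static Plan cheapest(List<Plan> plans) {
        Plan best = null;
        for (Plan p : plans) {
            if (best == null || totalCost(p) < totalCost(best)) {
                best = p;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Query: size = 'L' and color = 'red', with one index per property.
        // The entry counts are made up: many 'L' items, fewer 'red' ones.
        Plan size = new Plan("size", 1.0, 0.25, 10000);
        Plan color = new Plan("color", 1.0, 0.25, 500);

        Plan best = cheapest(Arrays.asList(size, color));
        System.out.println("Cheapest plan: " + best.indexName
                + " (estimated cost " + totalCost(best) + ")");
    }
}
```

In this sketch the index never needs to know the engine's overhead: it only reports its own numbers, and the engine adds `ENGINE_COST_PER_ENTRY` when comparing plans. If nodes no longer had to be loaded (a covered index), only that constant would change.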
