hanahmily commented on pull request #10:
URL:
https://github.com/apache/skywalking-banyandb/pull/10#issuecomment-866431221
> > From your screenshots, `Scan` is the first operation even when fields
> > exist. By design, `Select` should come first: it resolves ChunkIDs from the
> > `index`, and the results are then passed to `FetchEntity` (I didn't see
> > this step in the parsed plan). `Scan` is used only when there is no
> > `Selection` input, which means it will rarely be picked.
>
> As I understand it, the logical plan does not care about the indexes; it just
> prepares the metadata and resolves fields so that the existence of the
> referenced fields can be guaranteed.
>
> As we discussed in the last PR, we can select the indexes while
> generating physical plans, based on cost-first consideration.

Not quite. Cost-based optimization only takes place when a field belongs to
more than one index. Suppose `service_id` is indexed by both `service_id` +
`instance_id` and `service_id` + `endpoint_id`; we would use cost-based
statistics to determine which index to use. In this case, the combination
of `service_id` and `instance_id` has lower cardinality, so it costs less.
Based on that, the physical query optimizer should pick it instead of
`service_id` + `endpoint_id`.
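
A minimal sketch of that selection rule in Go. All names here (`IndexCandidate`, `PickIndex`, the `Cardinality` statistic) are hypothetical and are not part of BanyanDB; the sketch only assumes that each candidate index carries an estimated cardinality and that lower cardinality means lower cost, as described above.

```go
package main

// IndexCandidate describes one index that could serve a field criterion.
// Hypothetical type, for illustration only.
type IndexCandidate struct {
	Name        string   // e.g. "service_id+instance_id"
	Fields      []string // fields covered by the index
	Cardinality uint64   // assumed statistic: estimated number of distinct keys
}

// covers reports whether the index includes the given field.
func (c IndexCandidate) covers(field string) bool {
	for _, f := range c.Fields {
		if f == field {
			return true
		}
	}
	return false
}

// PickIndex returns the candidate that covers field and has the lowest
// estimated cardinality. The boolean is false when no candidate covers it.
func PickIndex(field string, candidates []IndexCandidate) (IndexCandidate, bool) {
	var best IndexCandidate
	found := false
	for _, c := range candidates {
		if !c.covers(field) {
			continue
		}
		if !found || c.Cardinality < best.Cardinality {
			best, found = c, true
		}
	}
	return best, found
}
```

With two candidates covering `service_id`, the one with the smaller `Cardinality` is returned; the cardinality values themselves would have to come from collected statistics.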

As I mentioned in another PR, we don't support composite indexes for now, so
we don't have to implement the plan optimization described above.

> the logical plan does not care about the indexes

That's the convention of a traditional SQL database, which has to deal with
complex index types, for example single, composite, unique, group by, etc.
Such a database can't determine query paths without statistics. BanyanDB's
query doesn't have that problem: it knows how to access a single-field index
based on the criteria.
We also removed `tags` from the query criteria because of the huge cost of
`Scan` filtering. As I mentioned, the query plan picks `Scan` only when no
fields are input.
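
To make the rule concrete, here is a rough sketch in Go. `FirstSteps` and the step strings are hypothetical and do not reflect the actual planner code; the sketch only assumes what is stated in this thread: with field criteria the plan starts from `Select` on the index and passes the results to `FetchEntity`, and without them it falls back to `Scan`.

```go
package main

// FirstSteps returns the leading operations of a plan for the given
// indexed-field criteria. Hypothetical helper, for illustration only.
func FirstSteps(fieldCriteria []string) []string {
	if len(fieldCriteria) == 0 {
		// No field criteria: nothing to look up in an index, so scan.
		return []string{"Scan"}
	}
	// Field criteria present: resolve ChunkIDs from the single-field
	// indexes first, then fetch the matching entities.
	return []string{"Select", "FetchEntity"}
}
```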