[GitHub] [skywalking-banyandb] hanahmily commented on pull request #10: Query Module: Logical plan Pt.

GitBox Tue, 22 Jun 2021 17:36:56 -0700


hanahmily commented on pull request #10:
URL: 
https://github.com/apache/skywalking-banyandb/pull/10#issuecomment-866431221



   > > From your screenshots, `Scan` is the first operation even when fields 
exist. By design, `Select` should be the first operation: it parses ChunkIDs 
from the `index`, and the results are then passed to `FetchEntity` (I didn't see 
this step in the parsed plan). `Scan` is used only when there is no 
`Selection` input, which means it will rarely be picked.
   > 
   > As I understood, the logical plan does not care about the indexes, it just 
prepares the metadata and resolves fields so that the existence of these fields 
which are referenced can be guaranteed.
   > 
   > As we've discussed in the last PR, we can select the indexes while 
generating physical plans based on cost-first consideration.
   
   Not quite. Cost-based optimization only takes place when a field belongs to 
more than one index. Suppose `service_id` is indexed by both `service_id` + 
`instance_id` and `service_id` + `endpoint_id`. We should then use cost-based 
statistics to determine which one to use. In this case, the combination 
of `service_id` and `instance_id` has lower cardinality, so it costs less. 
Based on that, the physical query optimizer should pick it instead of 
`service_id` + `endpoint_id`.
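
   To illustrate the selection rule described above, here is a minimal sketch 
in Go. The `CandidateIndex` type, the `PickIndex` function, and the cardinality 
numbers are all hypothetical and are not part of BanyanDB; the sketch only 
assumes that the optimizer has a cardinality estimate per index and prefers 
the lowest one.

```go
package main

import "fmt"

// CandidateIndex is a hypothetical descriptor for a composite index together
// with the cardinality estimate the optimizer holds for it.
type CandidateIndex struct {
	Fields      []string
	Cardinality uint64 // estimated number of distinct key combinations
}

// covers reports whether the index includes the given field.
func (c CandidateIndex) covers(field string) bool {
	for _, f := range c.Fields {
		if f == field {
			return true
		}
	}
	return false
}

// PickIndex returns the lowest-cardinality index that covers field.
// The second result is false when no candidate covers the field.
func PickIndex(field string, candidates []CandidateIndex) (CandidateIndex, bool) {
	var best CandidateIndex
	found := false
	for _, c := range candidates {
		if !c.covers(field) {
			continue
		}
		if !found || c.Cardinality < best.Cardinality {
			best, found = c, true
		}
	}
	return best, found
}

func main() {
	// The statistics below are made up for illustration only.
	candidates := []CandidateIndex{
		{Fields: []string{"service_id", "endpoint_id"}, Cardinality: 50000},
		{Fields: []string{"service_id", "instance_id"}, Cardinality: 800},
	}
	if idx, ok := PickIndex("service_id", candidates); ok {
		fmt.Println("picked index on:", idx.Fields)
	}
}
```

   With these made-up statistics the sketch picks the `service_id` + 
`instance_id` index, matching the example above.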
   
   As I mentioned in another PR, we don't support composite indexes for now, 
so we don't have to implement the plan optimization above. 
   
   >  the logical plan does not care about the indexes
   
   That's the convention in traditional SQL databases because of their complex 
index types, for example single, composite, unique, group by, etc. Such a 
database can't determine query paths without statistics. BanyanDB's query 
doesn't need them: it knows how to access a single-field index based on the 
criteria. 
   
   We also remove `tags` from the query criteria because of the huge cost of 
`Scan` filtering. As I mentioned, the query plan picks `Scan` when no fields 
are given as input. 
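
   The operator ordering discussed in this thread can be sketched as follows. 
This is only an illustration in Go: the `Condition` type and the `BuildPlan` 
function are hypothetical, and the sketch assumes every field in the criteria 
has a single-field index. The operator names `Select`, `FetchEntity` and `Scan` 
come from the discussion above.

```go
package main

import "fmt"

// Condition is a hypothetical single-field criterion, e.g. service_id = "a".
type Condition struct {
	Field string
	Value string
}

// BuildPlan returns operator names in execution order.
// With no conditions the plan falls back to a full Scan. Otherwise each
// condition becomes an index Select, and the selected chunk IDs are handed
// to FetchEntity.
func BuildPlan(conds []Condition) []string {
	if len(conds) == 0 {
		return []string{"Scan"}
	}
	plan := make([]string, 0, len(conds)+1)
	for _, c := range conds {
		plan = append(plan, fmt.Sprintf("Select(%s)", c.Field))
	}
	return append(plan, "FetchEntity")
}

func main() {
	fmt.Println(BuildPlan(nil))
	fmt.Println(BuildPlan([]Condition{{Field: "service_id", Value: "a"}}))
}
```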
   
    


