A CBO can only make worse decisions than the status quo for what I presume are the majority of queries, i.e. those that touch only primary indexes. In general, there are plenty of use cases that prefer determinism. So I agree that there should at least be a CBO implementation that makes the same decisions as the status quo, deterministically.

I do support the proposal, but would like to see some elements discussed in more detail. The maintenance and distribution of summary statistics in particular is worthy of its own CEP, and it might be preferable to split it out. The proposal also seems to imply we are aiming for coordinators to all make the same decision for a query, which I think is challenging, and it would be worth fleshing out the design here a little (perhaps just in Jira).

While I’m not a fan of ALLOW FILTERING, I’m not convinced that this CEP deprecates it. It is a concrete qualitative guard rail that I expect some users will prefer to a cost-based guard rail. Perhaps how to treat it could be left to the CBO to decide.

There’s also not much discussion of the execution model: I think it would make most sense for this to be independent of any cost and optimiser models (though they might want to operate on them), so that EXPLAIN and hints can work across optimisers (a suitable hint might essentially bypass the optimiser, if the optimiser permits it, by providing a standard execution model).

I think it would be worth considering providing the execution plan to the client as part of query preparation, as an opaque payload to supply to coordinators on first contact, as this might simplify the problem of ensuring queries behave the same without adopting a lot of complexity for synchronising statistics (which will never provide strong guarantees). Of course, re-preparing a query might lead to a new plan, though any coordinators with the query in their cache should be able to retrieve it cheaply.
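
To make the plan-as-payload idea concrete, here is a minimal sketch under loud assumptions: the `Coordinator` class, its `prepare` and `execute` methods, the JSON plan encoding, and the `preferred_index` field (a stand-in for whatever local statistics would drive a real optimiser) are all hypothetical and are not part of CEP-39 or of Cassandra. It only illustrates how a plan chosen once at preparation time could be replayed on a coordinator that would otherwise have chosen differently.

```python
import hashlib
import json
from dataclasses import dataclass, field


def query_id(cql: str) -> str:
    """Derive a stable identifier for a query string (hypothetical scheme)."""
    return hashlib.sha256(cql.encode("utf-8")).hexdigest()[:16]


@dataclass
class Coordinator:
    """Toy coordinator; `preferred_index` stands in for local statistics."""
    name: str
    preferred_index: str
    plan_cache: dict = field(default_factory=dict)

    def prepare(self, cql: str):
        """Plan the query once and return (id, opaque payload) to the client."""
        qid = query_id(cql)
        if qid not in self.plan_cache:
            plan = {"query": cql, "index": self.preferred_index}
            self.plan_cache[qid] = json.dumps(plan, sort_keys=True).encode("utf-8")
        return qid, self.plan_cache[qid]

    def execute(self, qid: str, payload: bytes = None):
        """Run with the cached plan, adopting the client's payload on first contact."""
        if qid not in self.plan_cache:
            if payload is None:
                raise LookupError("unknown query id and no plan payload supplied")
            self.plan_cache[qid] = payload
        return json.loads(self.plan_cache[qid])


# Two coordinators whose local statistics would lead to different plans.
a = Coordinator("a", preferred_index="idx_by_city")
b = Coordinator("b", preferred_index="idx_by_age")

qid, payload = a.prepare("SELECT * FROM users WHERE city = ? AND age = ?")
plan_on_b = b.execute(qid, payload)  # first contact: b adopts a's plan
plan_on_a = a.execute(qid)           # a already has the plan cached
```

In this sketch both coordinators end up executing the plan that `a` produced, without any statistics being synchronised between them. A coordinator that already holds the query in its cache ignores the payload, and re-preparing on a node without the cached entry could still yield a different plan.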
If the execution model is efficiently serialised, this might have the ancillary benefit of improving the occupancy of our prepared query cache.

On 13 Dec 2023, at 00:44, Jon Haddad <j...@jonhaddad.com> wrote: