If we can also build CBO capabilities for Cassandra in the future, it will be of great help to Cassandra in expanding its ability to analyze big data. If CEP-39 can be adopted, is it possible to divide the big task to some sub-tasks and combine the power of the community to jointly advance this matter?
But I wonder if the syntax for index selection can also be supported? I don't know the background of SAI optimization, but I think the issue of index selection is necessary and meaningful. This is why I opened this issue. For example, MySQL supports CBO, but also provides users with an entry to select/ignore indexes. Benjamin Lerer <b.le...@gmail.com> 于2024年12月31日周二 00:36写道: > I am personally -1 against the approach taken by DS for optimizing SAI > queries. I consider it as a quick fix rather than a proper long term > solution. > The issue with this approach is that it splits the optimisation logic of > queries. With one part of the optimization happening on the coordinator > while the other is performed on the replicas. By doing so it will make the > introduction of features such as global indexes much harder as those > features will need to be plugged to the optimization logic. > I strongly believe that the proper solution for query optimization is what > was proposed in CEP-39: a cost based query optimizer located on the > coordinator. > Unfortunately do not have the time to work on it. > > > Le lun. 23 déc. 2024 à 07:11, guo Maxwell <cclive1...@gmail.com> a écrit : > >> Wrong typing, it is DML not DDL ~~~😅 >> >> guo Maxwell <cclive1...@gmail.com> 于2024年12月23日周一 11:56写道: >> >>> Thanks Caleb for bringing it up. >>> >>> When I opened this issue, I originally wanted to propose the following >>> syntax: >>> 1、When we want to force the selection of certain indexes: >>> SELECT ...USE INDEX (ks.idx1, ks.idx2....) ... FROM ...WHERE ... >>> or >>> SELECT .../*+ USE INDEX (ks.idx1, ks.idx2....)*/ ... FROM ...WHERE ... >>> >>> 2、When we want to ignore the selection of some indexes: >>> SELECT ...IGNORE INDEX (ks.idx1, ks.idx2....) ... FROM ... WHERE ... >>> or >>> SELECT .../*+ IGNORE INDEX (ks.idx1, ks.idx2....) */ ... FROM ... WHERE >>> ... >>> >>> And In fact, what I originally wanted to do was to provide a hint >>> function >>> <https://dev.mysql.com/doc/refman/8.0/en/optimizer-hints.html#optimizer-hints-overview> >>> similar to MySQL, which can push down some rules dynamically. >>> The work to provide a hint for DDL needs to be accomplished in many >>> steps, CASSANDRA-18112 for index hint is part of it, maybe the first step. >>> >>> >>> Ekaterina Dimitrova <e.dimitr...@gmail.com> 于2024年12月21日周六 22:27写道: >>> >>>> Naming is hard but to me providing what Caleb mentioned through >>>> something like WITH OPTIONS sounds reasonable. Thanks for bringing it >>>> up. >>>> >>>> On Sat, 21 Dec 2024 at 2:46, Joel Shepherd <sheph...@amazon.com> wrote: >>>> >>>>> WITH INDEX (or something equivalent) seems really useful. >>>>> >>>>> Less opinionated on the specific syntax, but I think there is a lot of >>>>> value in the form of predictable, controllable performance, in giving >>>>> developers more direct control over query execution, whether that's >>>>> index selection or even lower-level decisions. If you've experienced >>>>> the >>>>> thrill of operating a database with a cost-based planner that abruptly >>>>> selects a new, sub-optimal plan due to a change in statistics or >>>>> configuration, you'll appreciate language features that yield some >>>>> planning control back to you. It does increase the burden on the >>>>> developer to understand how best to execute the query, but it makes >>>>> their intent much more obvious, and easier to adjust as the system >>>>> changes. >>>>> >>>>> -- Joel. >>>>> >>>>> On 12/20/2024 12:28 PM, Caleb Rackliffe wrote: >>>>> > Some of your are probably familiar with work in the DS fork to >>>>> improve >>>>> > the selection of indexes for SAI queries in >>>>> > >>>>> https://github.com/datastax/cassandra/commit/eeb33dd62b9b74ecf818a263fd73dbe6714b0df0#diff-2830028723b7f4af5ec7450fae2c206aeefa5a2c3455eff6f4a0734a85cb5424. >>>>> >>>>> > >>>>> > >>>>> > While I'm eagerly anticipating working on that in the new year, I'm >>>>> > also wondering whether we think some simple CQL extensions to >>>>> manually >>>>> > control index selection would be helpful. Maxwell proposed this a >>>>> > while back in CASSANDRA-18112, and I'd like to propose a syntax: >>>>> > >>>>> > >>>>> > ex. Do not use the specified index during the query. >>>>> > >>>>> > SELECT ... FROM ... WHERE ... WITHOUT INDEX <ks.idx> >>>>> > >>>>> > This could be helpful for intersection queries where one of the >>>>> > provided clauses is not very selective and could simply be handled >>>>> via >>>>> > post-filtering. >>>>> > >>>>> > ex. Require the specified index to be used. >>>>> > >>>>> > SELECT ... FROM ... WHERE ... WITH INDEX <ks.idx> >>>>> > >>>>> > This could be helpful in scenarios where multiple indexes exist on a >>>>> > column and was the primary motivation for CASSANDRA-18112. >>>>> > >>>>> > Thoughts? >>>>> >>>>