Re: [DISCUSS] The future of CREATE INDEX

2023-06-20 Thread Caleb Rackliffe
security. >> >> ____ >> From: Henrik Ingo >> Sent: Wednesday, May 17, 2023 22:32 >> To: dev@cassandra.apache.org >> Subject: Re: [DISCUSS] The future of CREATE INDEX >> >> NetApp Security WARNING: This is an exte

Re: [DISCUSS] The future of CREATE INDEX

2023-05-19 Thread Caleb Rackliffe
mpromise the security. > > > From: Henrik Ingo > Sent: Wednesday, May 17, 2023 22:32 > To: dev@cassandra.apache.org > Subject: Re: [DISCUSS] The future of CREATE INDEX > > NetApp Security WARNING: This is an external email. Do not click lin

Re: [DISCUSS] The future of CREATE INDEX

2023-05-18 Thread Miklosovic, Stefan
because different local configurations might compromise the security. From: Henrik Ingo Sent: Wednesday, May 17, 2023 22:32 To: dev@cassandra.apache.org Subject: Re: [DISCUSS] The future of CREATE INDEX NetApp Security WARNING: This is an external email

Re: [DISCUSS] The future of CREATE INDEX

2023-05-17 Thread Caleb Rackliffe
> 1. What's up with naming anything "legacy". Calling the current index type "2i" seems perfectly fine with me. From what I've heard it can work great for many users? We can give the existing default secondary index any public-facing name we like, but "2i" is too broad. It just stands for

Re: [DISCUSS] The future of CREATE INDEX

2023-05-17 Thread Henrik Ingo
I have read the thread but chose to reply to the top message... I'm coming to this with the background of having worked with MySQL, where both the storage engine and index implementation had many options, and often of course some index types were only available in some engines. I would humbly

Re: [DISCUSS] The future of CREATE INDEX

2023-05-16 Thread Caleb Rackliffe
I might as well weigh in... [POLL] Centralize existing syntax or create new syntax? 1.) CREATE INDEX ... USING ... WITH OPTIONS... (I think the more important protection for users WRT local indexes should come in the form of a guardrail prohibiting scatter/gather queries against them.) [POLL]

Re: [DISCUSS] The future of CREATE INDEX

2023-05-15 Thread guo Maxwell
> > [POLL] Centralize existing syntax or create new syntax? 1.) CREATE INDEX ... USING WITH OPTIONS... and I think we should keep CREATE CUSTOM INDEX [POLL] Should there be a default? (YES/NO) of course YES [POLL] What to do with the default? 4.) YAML config/guardrail to require

Re: [DISCUSS] The future of CREATE INDEX

2023-05-15 Thread Jonathan Ellis
On Fri, May 12, 2023 at 1:39 PM Caleb Rackliffe wrote: > [POLL] Centralize existing syntax or create new syntax? > 1 (Existing) [POLL] Should there be a default? (YES/NO) > YES > [POLL] What to do with the default? > 1 (Default SAI)

Re: [DISCUSS] The future of CREATE INDEX

2023-05-15 Thread Dinesh Joshi
> On May 12, 2023, at 11:36 AM, Caleb Rackliffe > wrote: > > [POLL] Centralize existing syntax or create new syntax? > > 1.) CREATE INDEX ... USING WITH OPTIONS... > 2.) CREATE LOCAL INDEX ... USING ... WITH OPTIONS... (same as 1, but adds > LOCAL keyword for clarity and separation from

Re: [DISCUSS] The future of CREATE INDEX

2023-05-15 Thread David Capwell
> [POLL] Centralize existing syntax or create new syntax? 1.) CREATE INDEX ... USING WITH OPTIONS... > [POLL] Should there be a default? (YES/NO) Yes > [POLL] What to do with the default? 3.) YAML config to override default index (legacy 2i remains the default) 4.) YAML config/guardrail

Re: [DISCUSS] The future of CREATE INDEX

2023-05-15 Thread Patrick McFadin
1 Yes 4 On Mon, May 15, 2023 at 3:00 AM Benedict wrote: > 3: CREATE INDEX (Otherwise 2) > No > If configurable, should be a distributed configuration. This is very > different to other local configurations, as the 2i selected has semantic > implications, not just performance (and the perf

Re: [DISCUSS] The future of CREATE INDEX

2023-05-15 Thread Benedict
3: CREATE INDEX (Otherwise 2) No If configurable, should be a distributed configuration. This is very different to other local configurations, as the 2i selected has semantic implications, not just performance (and the perf implications are also much greater) On 15 May 2023, at 10:45, Mike Adamson

Re: [DISCUSS] The future of CREATE INDEX

2023-05-15 Thread Mike Adamson
> > [POLL] Centralize existing syntax or create new syntax? > > 1.) CREATE INDEX ... USING WITH OPTIONS... > 2.) CREATE LOCAL INDEX ... USING ... WITH OPTIONS... (same as 1, but > adds LOCAL keyword for clarity and separation from future GLOBAL indexes) > 1.) CREATE INDEX ... USING

Re: [DISCUSS] The future of CREATE INDEX

2023-05-15 Thread Mick Semb Wever
[POLL] Centralize existing syntax or create new syntax? > > 1.) CREATE INDEX ... USING WITH OPTIONS... > 2.) CREATE LOCAL INDEX ... USING ... WITH OPTIONS... (same as 1, but > adds LOCAL keyword for clarity and separation from future GLOBAL indexes) > (1) CREATE INDEX … > [POLL] Should

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
I don’t think there’s going to be any real support for doing it in 5.0 anyway at this point. On May 12, 2023, at 1:48 PM, Benedict wrote: Given we have no data in front of us to make a decision regarding switching defaults, I don’t think it is suitable to include that option in this poll. In fact,

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Benedict
Given we have no data in front of us to make a decision regarding switching defaults, I don’t think it is suitable to include that option in this poll. In fact, until we have sufficient data to discuss that I’m going to put a hard veto on that on technical grounds. On 12 May 2023, at 19:41, Caleb

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Jeremiah D Jordan
> [POLL] Centralize existing syntax or create new syntax? 1.) CREATE INDEX ... USING WITH OPTIONS... > [POLL] Should there be a default? (YES/NO) YES > [POLL] What to do with the default? 3.) YAML config to override default index (legacy 2i remains the default) DESCRIBE should always

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
...and to clarify, answers should be what you'd like to see for 5.0 specifically On Fri, May 12, 2023 at 1:36 PM Caleb Rackliffe wrote: > [POLL] Centralize existing syntax or create new syntax? > > 1.) CREATE INDEX ... USING WITH OPTIONS... > 2.) CREATE LOCAL INDEX ... USING ... WITH

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
[POLL] Centralize existing syntax or create new syntax? 1.) CREATE INDEX ... USING WITH OPTIONS... 2.) CREATE LOCAL INDEX ... USING ... WITH OPTIONS... (same as 1, but adds LOCAL keyword for clarity and separation from future GLOBAL indexes) (In both cases, we deprecate w/ client warnings

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Benedict
But then we have to reconsider the existing syntax, or do we want LOCAL to be the default? We should be planning our language evolution along with our feature evolution. On 12 May 2023, at 19:28, Caleb Rackliffe wrote: If at some point in the glorious future we have global indexes, I'm sure we can

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
If at some point in the glorious future we have global indexes, I'm sure we can add GLOBAL to the syntax...sry, working on an ugly poll... On Fri, May 12, 2023 at 1:24 PM Benedict wrote: > If folk should be reading up on the index type, doesn’t that conflict with > your support of a default? >

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Benedict
If folk should be reading up on the index type, doesn’t that conflict with your support of a default? Should there be different global and local defaults, once we have global indexes, or should we always default to a local index? Or a global one? > On 12 May 2023, at 18:39, Mick Semb Wever

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Mick Semb Wever
> > > Given it seems most DBs have a default index (see Postgres, etc.), I tend > to lean toward having one, but that's me... > I'm for it too. Would be nice to enforce the setting is globally uniform to avoid the per-node problem. Or add a keyspace option. For users replaying <5 DDLs this

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Benedict
There remains the question of what the new syntax is - whether it augments CREATE INDEX to replace CREATE CUSTOM INDEX or if we introduce new syntax because we think it’s clearer. I can accept settling for modifying CREATE INDEX … USING, but I maintain that CREATE LOCAL INDEX is better On 12 May

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
Even if we don't want to allow a default, we can keep the same CREATE INDEX syntax in place, and have a guardrail forcing (or not) the selection of an implementation, right? This would be no worse than the YAML option we already have for enabling 2i creation as a whole. On Fri, May 12, 2023 at

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Benedict
I’m not convinced a default index makes any sense, no. The trade-offs in a distributed setting are much more pronounced. Indexes in a local-only RDBMS are much simpler affairs; the trade offs are much more subtle than here. On 12 May 2023, at 18:24, Caleb Rackliffe wrote: > Now, giving this

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
> Now, giving this thread, there is pushback for a config to allow default impl to change… but there is 0 pushback for new syntax to make this explicit…. So maybe we should [POLL] for what syntax people want? I think the essential question is whether we want the concept of a default index. If we

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Benedict
I still prefer introducing CREATE LOCAL INDEX, to help users understand the semantics of the index they’re creating. I think it will in future potentially be quite confusing to be able to create global and local indexes using the same DDL statement. But, depending on appetite, that could plausibly

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread David Capwell
> I really dislike the idea of the same CQL doing different things based upon a > per-node configuration. > I agree with Brandon that changing CQL behaviour like this based on node > config is really not ideal. I am cool adding such a config, and also cool keeping CREATE INDEX disabled by

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
So the weakest version of the plan that actually accomplishes something useful for 5.0: 1.) Just leave the CREATE INDEX default alone for now. Hard switch the default after 5.0. 2.) Add USING...WITH... support to CREATE INDEX, so we don't have to go to market using CREATE CUSTOM INDEX, which

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
I don't want to cut over for 5.0 either way. I was more contrasting a configurable cutover in 5.0 vs. a hard cutover later. On Fri, May 12, 2023 at 12:09 PM Benedict wrote: > If the performance characteristics are as clear cut as you think, then > maybe it will be an easy decision once the

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Benedict
If the performance characteristics are as clear cut as you think, then maybe it will be an easy decision once the evidence is available for everyone to consider? If not, then we probably can’t do the hard cutover and so the answer is still pretty simple? On 12 May 2023, at 18:04, Caleb Rackliffe

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
I don't particularly like the YAML solution either, but absent that, we're back to fighting about whether we introduce entirely new syntax or hard cut over to SAI at some point. We already have per-node configuration in the YAML that determines whether or not we can create a 2i at all, right?

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Benedict
A table is not a local concept at all, it has a global primary index - that’s the core idea of Cassandra. I agree with Brandon that changing CQL behaviour like this based on node config is really not ideal. New syntax is by far the simplest and safest solution to this IMO. It doesn’t have to use

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Brandon Williams
On Fri, May 12, 2023 at 11:29 AM Caleb Rackliffe wrote: > > Okay, so the proposal for 5.0... > > 1.) Add a YAML option that specifies a default implementation for CREATE > INDEX, and make this the legacy 2i for now. No existing DDL breaks. We don't > have to commit to the absolute superiority

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
...and if we decide before the 5.0 release that we have enough information to change the default (#1), we can change it in a matter of minutes. On Fri, May 12, 2023 at 11:28 AM Caleb Rackliffe wrote: > We don't need to know everything about SAI's performance profile to plan > and execute some

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
We don't need to know everything about SAI's performance profile to plan and execute some small, reasonable things now for 5.0. I'm going to try to summarize the least controversial package of ideas from the discussion above. I've left out creating any new syntax. For example, I think CREATE LOCAL

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Benedict
> if we didn't have copious amounts of (not all public, I know, working on it) evidence If that’s the assumption on which this proposal is based, let’s discuss the evidence base first, as given the fundamentally different way they work (almost diametrically opposite), I would want to see a very high

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Caleb Rackliffe
> This creates huge headaches for everyone successfully using 2i today though, and SAI *is not* guaranteed to perform as well or better - it has a very different performance profile. We wouldn't have even advanced it to this point if we didn't have copious amounts of (not all public, I know,

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Aleksey Yeshchenko
This. I would also consider adding CREATE LEGACY INDEX syntax as an alias for today’s CREATE INDEX, the latter to be deprecated and (in very distant future) removed. > On 12 May 2023, at 13:14, Benedict wrote: > > This creates huge headaches for everyone successfully using 2i today though, >

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Benedict
This creates huge headaches for everyone successfully using 2i today though, and SAI *is not* guaranteed to perform as well or better - it has a very different performance profile. I think we should deprecate CREATE INDEX, and introduce new syntax CREATE LOCAL INDEX to make clear that this is not a

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Mick Semb Wever
On Thu, 11 May 2023 at 05:27, Patrick McFadin wrote: > Having pulled a lot of developers out of the 2i fire, > Yes. I'm keen not to leave 2i as the default once SAI lands. Otherwise I agree with the deprecated first principle, but 2i is just too problematic. Just having no default in 5.0,

Re: [DISCUSS] The future of CREATE INDEX

2023-05-10 Thread Patrick McFadin
There will be a LOT of content around using SAI in 5.0. CCing marketing ML On Wed, May 10, 2023 at 8:38 PM Jeff Jirsa wrote: > Changes like this always scare me, but the benefits probably outweigh the > risks. Probably obviously to whoever implements but please make sure if > this happens is

Re: [DISCUSS] The future of CREATE INDEX

2023-05-10 Thread Jeff Jirsa
Changes like this always scare me, but the benefits probably outweigh the risks. Probably obviously to whoever implements but please make sure if this happens is super visible in both NEWS and simultaneously updates the to-string / to-cql representation of the schema in cqlsh / drivers / snapshots

Re: [DISCUSS] The future of CREATE INDEX

2023-05-10 Thread Patrick McFadin
Having pulled a lot of developers out of the 2i fire, I would love it if defaults got a bit more sane. Adding USING...WITH... on CREATE INDEX seems like the right move for most developers that don't read docs and assume behavior. As much as I hate that 2i would be the configured default, I get

Re: [DISCUSS] The future of CREATE INDEX

2023-05-10 Thread David Capwell
> Having to revert to CREATE CUSTOM INDEX sounds pretty awful, so I'd prefer > allowing USING...WITH... for CREATE INDEX I have 0 issues with a new syntax to make this more clear > just deprecating CREATE CUSTOM INDEX (at least after 5.0), but that's more or > less what my original proposal

Re: [DISCUSS] The future of CREATE INDEX

2023-05-10 Thread Caleb Rackliffe
tl;dr If you take my original proposal and change only the fact that CREATE INDEX retains a configurable default, I think we get to the same place? (Then it's just a matter of what we do in 5.0 vs. after 5.0...) On Wed, May 10, 2023 at 11:00 AM Caleb Rackliffe wrote: > I see a broad desire

Re: [DISCUSS] The future of CREATE INDEX

2023-05-10 Thread Caleb Rackliffe
> We could introduce new syntax that properly appreciates there’s no default index, perhaps CREATE LOCAL [type] INDEX? To also make clear that these indexes involve a partition key or scatter gather I think this is something we should handle in guardrails space on the query side for all indexes.

Re: [DISCUSS] The future of CREATE INDEX

2023-05-10 Thread Caleb Rackliffe
I see a broad desire here to have a configurable (YAML) default implementation for CREATE INDEX. I'm not strongly opposed to that, as the concept of a default index implementation is pretty standard for most DBMS (see Postgres, etc.). However, keep in mind that if we do that, we still need to

Re: [DISCUSS] The future of CREATE INDEX

2023-05-10 Thread Benedict
I’m not convinced by the changing defaults argument here. The characteristics of the two index types are very different, and users with scripts that make indexes today shouldn’t have their behaviour change. We could introduce new syntax that properly appreciates there’s no default index, perhaps

Re: [DISCUSS] The future of CREATE INDEX

2023-05-09 Thread guo Maxwell
+1, as we must improve the image of our own default indexing ability. As for *CREATE CUSTOM INDEX*, should we just leave it as it is, and disable the ability to create SAI through *CREATE CUSTOM INDEX* in some version after 5.0? As far as I know, there may be users using this as a

Re: [DISCUSS] The future of CREATE INDEX

2023-05-09 Thread Dinesh Joshi
I agree. 5.0 is a major release and provides an opportunity to switch defaults. > On May 9, 2023, at 7:00 PM, Jonathan Ellis wrote: > > +1 for this, especially in the long term. CREATE INDEX should do the right > thing for most people without requiring extra ceremony. > > On Tue, May 9, 2023

Re: [DISCUSS] The future of CREATE INDEX

2023-05-09 Thread Jonathan Ellis
+1 for this, especially in the long term. CREATE INDEX should do the right thing for most people without requiring extra ceremony. On Tue, May 9, 2023 at 5:20 PM Jeremiah D Jordan wrote: > If the consensus is that SAI is the right default index, then we should > just change CREATE INDEX to be

Re: [DISCUSS] The future of CREATE INDEX

2023-05-09 Thread Jeremiah D Jordan
> If we assume SAI is what we should use by default for the cluster, would it > make sense to allow > > CREATE INDEX [IF NOT EXISTS] [name] ON () > > But use a new yaml config that switches from legacy to SAI? > > default_2i_impl: sai > > For 5.0 we can default to “legacy” (new features

Re: [DISCUSS] The future of CREATE INDEX

2023-05-09 Thread Jeremiah D Jordan
If the consensus is that SAI is the right default index, then we should just change CREATE INDEX to be SAI, and legacy 2i to be a CUSTOM INDEX. > On May 9, 2023, at 4:44 PM, Caleb Rackliffe wrote: > > Earlier today, Mick started a thread on the future of our index creation DDL > on Slack: >

Re: [DISCUSS] The future of CREATE INDEX

2023-05-09 Thread David Capwell
If we assume SAI is what we should use by default for the cluster, would it make sense to allow CREATE INDEX [IF NOT EXISTS] [name] ON () But use a new yaml config that switches from legacy to SAI? default_2i_impl: sai For 5.0 we can default to “legacy” (new features disabled by default),
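As a rough illustration of the idea in the message above (a plain CREATE INDEX falling back to a configured default), here is a minimal Python sketch. Only the `default_2i_impl` key and the values "legacy" and "sai" come from the thread; the function name, the `using` parameter, and the validation behaviour are hypothetical and are not taken from the Cassandra codebase.

```python
from typing import Optional

# Hypothetical: the set of implementation names a node would accept.
# Only "legacy" and "sai" are mentioned in the thread.
KNOWN_IMPLS = {"legacy", "sai"}


def resolve_index_impl(using: Optional[str], default_2i_impl: str = "legacy") -> str:
    """Pick the index implementation for a CREATE INDEX statement.

    `using` models an explicit implementation named in the statement
    (None when the statement names none); `default_2i_impl` models the
    YAML setting proposed in the thread.
    """
    # An explicit choice in the statement wins over the configured default.
    chosen = using if using is not None else default_2i_impl
    chosen = chosen.lower()
    if chosen not in KNOWN_IMPLS:
        raise ValueError(f"unknown index implementation: {chosen!r}")
    return chosen
```

With the default left at "legacy", a statement that names no implementation resolves to the existing 2i; setting `default_2i_impl` to "sai" changes only statements that do not name one, which is the per-node behaviour change debated elsewhere in this thread.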

[DISCUSS] The future of CREATE INDEX

2023-05-09 Thread Caleb Rackliffe
Earlier today, Mick started a thread on the future of our index creation DDL on Slack: https://the-asf.slack.com/archives/C018YGVCHMZ/p1683527794220019 At the moment, there are two ways to create a secondary index. *1.) CREATE INDEX [IF NOT EXISTS] [name] ON ()* This creates an optionally