+1 to making DROP CS experimental (i.e. disabling it by default) w/ a link
to the docs explaining the possible side effects

The sooner we do that, the more defensible
https://issues.apache.org/jira/browse/CASSANDRA-16675 (a proposed solution
to the query performance issue mentioned above) becomes.

On Mon, Jun 7, 2021 at 3:07 AM Oleksandr Petrov <oleksandr.pet...@gmail.com>
wrote:

> Thank you for bringing this subject up.
>
> > not ready for production use unless users fully understand what they are
> doing.
>
> Thing is, there's no easy way around dropping compact storage.  At the
> moment of writing of 10857, we have collectively decided that we'll
> document that the new columns are going to be shown, and have added a
> client protocol option that would hide/show columns depending on the mode
> we're running it in for anyone who upgrades. This makes it harder to make
> a transition for anyone who controls only the server side, since you have
> to account for how clients would behave whenever they see a new column. We
> did try to patch around the shown columns, but because of ColumnFilter this
> also turned out to be non-trivial, or at least not worth the effort for the
> moment.
>
> One of the things mentioned in this list (primary key liveness) is also
> existing as a difference between UPDATE and INSERT, but I'm not sure if
> it's properly documented. Similar to some other nuances, such as nulls in
> clustering keys on partitions that only have a static row. We did recently
> discuss some of these not-commonly-known cases with Benjamin and some other
> folks. So it might be worth documenting those, too.
>
> Problem with compact storage is that very few people want to touch it, and
> it requires a non-trivial amount of "institutional" knowledge and
> remembering things about Thrift. I think it's OK to mark the feature as
> experimental, but remembering how we haven't made significant improvements
> to things we have previously marked as experimental, this one may not
> materialise into something final, too.
>
> What would a complete, non-foot-gun solution for dropping compact storage
> entail? If we're talking about avoiding showing columns to users, there are
> ways to achieve this without rewriting sstables, for example, by
> introducing "hidden" columns in table metadata. However, if we want to
> preserve deletion semantics, I'm not sure if we're doing it right at all:
> we'll just trade one source of difference for partition liveness for insert
> queries for the other, so I'd say that, by executing ALTER TABLE statement,
> you're accepting that after it propagates, there will be at least some
> difference in behaviour and semantics. We did discuss this in C-16069, and
> my thesis back then was that replacing special-casing for compact tables
> with special casing for tables that "used to be compact" isn't  bringing us
> closer to the final solution.
>
> To summarise, I don't mind if we mark this feature experimental, but if we
> want to ever make it complete, we have to discuss what we do with each of
> the special cases. And it may very well be that we just need to add
> explicit hidden columns to metadata, and allow nulls for clusterings, maybe
> several more small changes. Unless we define what it would take to get this
> feature out of experimental state, and actually make an effort to resolve
> these issues, I'd just put a huge warning and call it a power-user feature.
>
>
> On Fri, Jun 4, 2021 at 5:01 PM Joshua McKenzie <jmcken...@apache.org>
> wrote:
>
> > >
> > > not ready for production use unless users fully understand what they
> are
> > > doing.
> >
> > This statement stood out to me - in my opinion we should think carefully
> > about the surface area of the user interfaces on new features before we
> add
> > more cognitive burden to our users. We already have plenty of "foot-guns"
> > in the project and should only add more if absolutely necessary.
> >
> > Further, marking this as experimental would be another feature we've
> > released and then retroactively marked as experimental; that's a habit we
> > should not get into.
> >
> > On balance, my .02 is the benefits to our end users and operators of
> > getting 4.0 to GA outweigh the costs of flagging this as experimental now
> > so I'm a +1 to the flagging idea, but I think there's some valuable
> lessons
> > for us to learn in retrospect from not just this feature but others like
> it
> > in the past.
> >
> > Curious to hear Alex' thoughts about this situation in particular as
> author
> > of C-10857. I recall that being a pretty painful slog so apologies in
> > advance for picking at this scab. :)
> >
> >
> >
> > On Fri, Jun 4, 2021 at 9:44 AM Brandon Williams <dri...@gmail.com>
> wrote:
> >
> > > +1
> > >
> > > On Fri, Jun 4, 2021, 3:53 AM Benjamin Lerer <ble...@apache.org> wrote:
> > >
> > > > Hi everybody,
> > > >
> > > > There are a significant amount of issues with DROP COMPACT STORAGE
> that
> > > can
> > > > be pretty surprising for users.
> > > > To name a few:
> > > > * Some hidden columns will show up changing the resultset returned
> for
> > > > wildcard queries
> > > > * As COMPACT tables did not have primary key liveness there empty
> rows
> > > > inserted AFTER the ALTER will be returned whereas the one inserted
> > before
> > > > the ALTER will not.
> > > > * Also due to the lack of primary key liveness the amount of SSTables
> > > being
> > > > read will increase resulting in slower queries
> > > > * After DROP COMPACT it becomes possible to ALTER the table in a way
> > that
> > > > makes all the row disappears
> > > > * There is a loss of functionality around null clustering when
> dropping
> > > > compact storage (CASSANDRA-16069)
> > > >
> > > > In my opinion DROP COMPACT STORAGE is not ready for production use
> > unless
> > > > users fully understand what they are doing.
> > > > By consequence, I am wondering if we should not mark it as
> experimental
> > > as
> > > > we did for the Materialized Views (CASSANDRA-13959).
> > > >
> > > > What is your opinion?
> > > >
> > >
> >
>
>
> --
> alex p
>

Reply via email to