Re: [DISCUSS] Introduce CREATE TABLE LIKE grammer

guo Maxwell Tue, 29 Oct 2024 20:28:34 -0700

So we should be able to start voting on this now.

guo Maxwell <[email protected]> 于2024年10月28日周一 17:20写道：


> Here  is the latest updated CEP-43
> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-43++Apache+Cassandra+CREATE+TABLE++LIKE>
>
>
> guo Maxwell <[email protected]> 于2024年10月24日周四 19:53写道：
>
>> yes，you are right. I will add this
>>
>> Štefan Miklošovič <[email protected]>于2024年10月24日 周四下午4:42写道：
>>
>>> The CEP should also mention that copying system tables or virtual tables
>>> or materialized views and similar are not supported and an attempt of doing
>>> so will error out.
>>>
>>> On Thu, Oct 24, 2024 at 7:16 AM Dave Herrington <[email protected]>
>>> wrote:
>>>
>>>> Strong +1 to copy all options by default. This is intuitive to me.
>>>> Then I would like to explicitly override any options of my choosing.
>>>>
>>>> -Dave
>>>>
>>>> On Wed, Oct 23, 2024 at 9:57 PM guo Maxwell <[email protected]>
>>>> wrote:
>>>>
>>>>> OK，thank you for your suggestions ，I will revise the CEP and copy
>>>>> table OPTIONS by default.
>>>>>
>>>>> Jon Haddad <[email protected]>于2024年10月23日 周三下午9:18写道：
>>>>>
>>>>>> Also strongly +1 to copying all the options.
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 23, 2024 at 5:52 AM Josh McKenzie <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> I'm a very strong +1 to having the default functionality be to copy
>>>>>>> *ALL* options.
>>>>>>>
>>>>>>> Intuitively, as a user, if I tell a software system to make a clone
>>>>>>> of something I don't expect it to be shallow or a subset defined by some
>>>>>>> external developer somewhere. I expect it to be a clone.
>>>>>>>
>>>>>>> Adding in some kind of "lean" mode or "column only" is fine if
>>>>>>> someone can make a cogent argument around its inclusion. I don't 
>>>>>>> personally
>>>>>>> see a use-case for it right now but definitely open to being educated.
>>>>>>>
>>>>>>> On Wed, Oct 23, 2024, at 3:03 AM, Štefan Miklošovič wrote:
>>>>>>>
>>>>>>> options are inherently part of that table as well, same as schema.
>>>>>>> In fact, _schema_ includes all options. Not just columns and its names. 
>>>>>>> If
>>>>>>> you change some option, you effectively have a different schema, schema
>>>>>>> version changes by changing an option. So if we do not copy options 
>>>>>>> too, we
>>>>>>> are kind of faking it (when we do not specify WITH OPTIONS).
>>>>>>>
>>>>>>> Also, imagine a situation where Accord is merged to trunk. It
>>>>>>> introduces a new schema option called "transactional = full" which is 
>>>>>>> not
>>>>>>> default. (I am sorry if I did the spelling wrong here). So, when you 
>>>>>>> have a
>>>>>>> table with transactional support and you do "create table ks.tb_copy 
>>>>>>> like
>>>>>>> ks.tb", when you _do not_ copy all options, this table will _not_ become
>>>>>>> transactional.
>>>>>>>
>>>>>>> The next thing you go to do is to execute some transactions against
>>>>>>> this table but well ... you can not do that, because your table is not
>>>>>>> transactional, because you have forgotten to add "WITH OPTIONS". So you
>>>>>>> need to go back to that and do "ALTER ks.tb_copy WITH transactional = 
>>>>>>> full"
>>>>>>> just to support that.
>>>>>>>
>>>>>>> I think that you see from this pattern that it is way better if we
>>>>>>> copy all options by default instead of consciously opt-in into them.
>>>>>>>
>>>>>>> also:
>>>>>>>
>>>>>>> "but I think there are also some users want to do basic column
>>>>>>> information copy"
>>>>>>>
>>>>>>> where is this coming from? Do you have this idea somehow empirically
>>>>>>> tested? I just do not see why somebody would want to have Cassandra's
>>>>>>> defaults instead of what a base table contains.
>>>>>>>
>>>>>>> On Wed, Oct 23, 2024 at 8:28 AM guo Maxwell <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> The reason for using OPTION keyword is that I want to provide users
>>>>>>> with more choices .
>>>>>>> The default behavior for copying a table is to copy the basic item
>>>>>>> of table （column and their data type，mask，constraint），others thing 
>>>>>>> belongs
>>>>>>> to the table like option，views，trigger
>>>>>>> are optional in my mind.
>>>>>>> You are absolutely right that users may want to copy all stuff but I
>>>>>>> think there are aslo some users want to do basic column information 
>>>>>>> copy，So
>>>>>>> I just give them a choice。As we know that the number of table 
>>>>>>> parameters is
>>>>>>> not small，compression，compaction，gc_seconds，bf_chance，speculative_retry 
>>>>>>> and
>>>>>>> so on.
>>>>>>>
>>>>>>> Besides we can see that pg have also the keyword COMMENT，COMPRESSION
>>>>>>> which have the similar behavior as our OPTION keyword。
>>>>>>>
>>>>>>> So that is why I add this keyword OPTION.
>>>>>>>
>>>>>>>
>>>>>>> Štefan Miklošovič <[email protected]>于2024年10月22日 周二下午11:40写道：
>>>>>>>
>>>>>>> The problem is that when I do this minimal CQL which shows this
>>>>>>> feature:
>>>>>>>
>>>>>>> CREATE TABLE ks.tb_copy LIKE ks.tb;
>>>>>>>
>>>>>>> then you are saying that when I _do not_ specify WITH OPTIONS then I
>>>>>>> get Cassandra's defaults. Only after I specify WITH OPTIONS, it would
>>>>>>> truly be a copy.
>>>>>>>
>>>>>>> This is not a good design. Because to have an exact copy, I have to
>>>>>>> make a conscious effort to include OPTIONS as well. That should not be 
>>>>>>> the
>>>>>>> case. I just want to have a copy, totally the same stuff, when I use the
>>>>>>> minimal version of that statement. It would be better to opt-out from
>>>>>>> options like
>>>>>>>
>>>>>>> CREATE TABLE ks.tb_copy LIKE ks.tb WITHOUT OPTIONS (you feel me) but
>>>>>>> we do not support this (yet).
>>>>>>>
>>>>>>> On Tue, Oct 22, 2024 at 5:28 PM Štefan Miklošovič <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> I just don't see OPTIONS as important. When I want to copy a table,
>>>>>>> I am copying a table _with everything_. Options included, by default. 
>>>>>>> Why
>>>>>>> would I want to have a copy of a table with options different from the 
>>>>>>> base
>>>>>>> one?
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Oct 21, 2024 at 3:55 PM Bernardo Botella <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> Hi Guo,
>>>>>>>
>>>>>>> +1 for the CONSTRAINTS keyword to be added into the default behavior.
>>>>>>>
>>>>>>> Bernardo
>>>>>>>
>>>>>>> On Oct 21, 2024, at 12:01 AM, guo Maxwell <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> I think the CONSTRAINTS keyword  keyword may be in the same
>>>>>>> situation as datamask.
>>>>>>> Maybe it is better to include  constraints into  the default
>>>>>>> behavior of table copy together with column name, column data type and 
>>>>>>> data
>>>>>>> mask.
>>>>>>>
>>>>>>> guo Maxwell <[email protected]> 于2024年10月21日周一 14:56写道：
>>>>>>>
>>>>>>> To yifan :
>>>>>>> I don't mind adding the ALL keyword, and it has been updated into
>>>>>>> CEP.
>>>>>>>
>>>>>>> As all you can see, our original intention was that the grammar
>>>>>>> would not be too complicated, which is what I described in cep
>>>>>>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-43++Apache+Cassandra+CREATE+TABLE++LIKE>
>>>>>>> .
>>>>>>> We gave up PG-related grammar, including INCLUDING/EXCLUDING and so
>>>>>>> on .
>>>>>>>
>>>>>>> guo Maxwell <[email protected]> 于2024年10月21日周一 14:52写道：
>>>>>>>
>>>>>>> Hi ,
>>>>>>> To sefan :
>>>>>>> I may want to explain that if there is no OPTION keyword in the CQL
>>>>>>> statement, then the newly created table will only have the
>>>>>>> original table's  column name 、column type and data mask ,I think this 
>>>>>>> is
>>>>>>> the most basic choice when copying tables to users.
>>>>>>> Then  we do some  addition, we can add original table's table
>>>>>>> options like compaction strategy/compress strategy、index and so on.
>>>>>>>
>>>>>>> Recently, I have also thought about the situation of CONSTRAINTS
>>>>>>> keyword. I think it is similar to data mask. Agree that it should be
>>>>>>> included in the basic options of  table copy (column name, column data 
>>>>>>> type
>>>>>>> , column data mask and constraints).
>>>>>>>
>>>>>>>
>>>>>>> Dave Herrington <[email protected]> 于2024年10月19日周六 01:15写道：
>>>>>>>
>>>>>>> It seems like a natural extension of the CREATE TABLE statement.
>>>>>>> Looking forward to using it in the future.
>>>>>>>
>>>>>>> -Dave
>>>>>>>
>>>>>>> On Thu, Oct 17, 2024 at 5:11 PM Štefan Miklošovič <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> Right?! Reads like English, the impact on the existing CQL is
>>>>>>> minimal. One LIKE which basically needs to be there and keywords of 
>>>>>>> logical
>>>>>>> "components" which seamlessly integrate with WITH.
>>>>>>>
>>>>>>> I would _not_ use WITH CONSTRAINTS because constraints will be
>>>>>>> inherently part of a table schema. It is not an "option". We can not
>>>>>>> "opt-out" from them. Remember we are copying a table here so if a base 
>>>>>>> one
>>>>>>> has constraints, its copy will have them too. A user can subsequently
>>>>>>> "ALTER" them.
>>>>>>>
>>>>>>> On Thu, Oct 17, 2024 at 5:31 PM Dave Herrington <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> Basing it on CREATE TABLE, the BNF definition of the simple
>>>>>>> implementation would look something like this:
>>>>>>>
>>>>>>> create_table_statement::= CREATE TABLE [ IF NOT EXISTS ] table_name
>>>>>>> LIKE base_table_name
>>>>>>> [ WITH included_objects ] [ [ AND ] table_options ]
>>>>>>> table_options::= COMPACT STORAGE [ AND table_options ]
>>>>>>> | CLUSTERING ORDER BY '(' clustering_order ')'
>>>>>>> [ AND table_options ]  | options
>>>>>>> clustering_order::= column_name (ASC | DESC) ( ',' column_name (ASC
>>>>>>> | DESC) )*
>>>>>>> included_objects::= dependent_objects [ AND dependent_objects ]
>>>>>>> dependent_objects:= INDEXES | TRIGGERS | CONSTRAINTS | VIEWS
>>>>>>>
>>>>>>>
>>>>>>> CREATE TABLE [ IF NOT EXISTS ] [<keyspace_name>.]<table_name> LIKE
>>>>>>> [<keyspace_name>.]<base_table_name>
>>>>>>>   [ WITH [ <included_objects > ]
>>>>>>>   [ [ AND ] [ <table_options> ] ]
>>>>>>>   [ [ AND ] CLUSTERING ORDER BY [ <clustering_column_name> (ASC |
>>>>>>> DESC) ] ]
>>>>>>> ;
>>>>>>>
>>>>>>> Examples:
>>>>>>>
>>>>>>> -- Create base table:
>>>>>>> CREATE TABLE cycling.cyclist_name (
>>>>>>>   id UUID PRIMARY KEY,
>>>>>>>   lastname text,
>>>>>>>   firstname text
>>>>>>> );
>>>>>>>
>>>>>>> -- Create an exact copy of the base table, but do not create any
>>>>>>> dependent objects:
>>>>>>> CREATE TABLE cycling.cyclist_name2 LIKE cycling.cyclist_name;
>>>>>>>
>>>>>>> -- Create an exact copy with all dependent objects (constraints
>>>>>>> excluded for now):
>>>>>>> CREATE TABLE cycling.cyclist_name3 LIKE cycling.cyclist_name
>>>>>>> WITH INDEXES AND TRIGGERS AND VIEWS;
>>>>>>>
>>>>>>> -- Create a copy with LCS compaction, a default TTL and all
>>>>>>> dependent objects except indexes:
>>>>>>> CREATE TABLE cycling.cyclist_name4 LIKE cycling.cyclist_name
>>>>>>> WITH TRIGGERS AND VIEWS
>>>>>>> AND compaction = { 'class' :  'LeveledCompactionStrategy' }
>>>>>>> AND default_time_to_live = 86400;
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> This seems pretty clean & straightforward.
>>>>>>>
>>>>>>> -Dave
>>>>>>>
>>>>>>> On Thu, Oct 17, 2024 at 4:05 PM Dave Herrington <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> This simple approach resonates with me.  I think the Cassandra doc
>>>>>>> uses "INDEXES" as the plural for index, i.e.:
>>>>>>> https://cassandra.apache.org/doc/stable/cassandra/cql/indexes.html
>>>>>>>
>>>>>>> -Dave
>>>>>>>
>>>>>>> On Thu, Oct 17, 2024 at 2:39 PM Štefan Miklošovič <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> Well we could do something like:
>>>>>>>
>>>>>>> CREATE TABLE ks.tb_copy LIKE ks.tb WITH INDICES AND TRIGGERS AND
>>>>>>> compaction = {'class': '.... } AND ...
>>>>>>>
>>>>>>> but I can admit it might be seen as an overreach and I am not sure
>>>>>>> at all how it would look like in the implementation because we would 
>>>>>>> need
>>>>>>> to distinguish WITH INDICES from table options.
>>>>>>>
>>>>>>> I would
>>>>>>>
>>>>>>>    1. +0 on ALL. - we don't need this. If we have just INDICES,
>>>>>>>    TRIGGERS, VIEWS at this point, I don't think enumerating it all is 
>>>>>>> too much
>>>>>>>    to ask. This is just an implementation detail and if we find it 
>>>>>>> necessary
>>>>>>>    we can add it later. If you feel strongly about this then add that 
>>>>>>> but it
>>>>>>>    is not absolutely necessary.
>>>>>>>    2. omit OPTIONS - aren't all options copied by default? That is
>>>>>>>    the goal of the CEP, no? We might just use normal CQL while
>>>>>>>    overriding from the base table
>>>>>>>    3. mix keywords like TRIGGERS / INDICES / CONSTRAINTS into
>>>>>>>    normal table creation statement
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Oct 17, 2024 at 3:20 PM Yifan Cai <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> I would second Štefan's option for functionality simplicity. It
>>>>>>> seems to be unnecessary to have the keywords for both inclusion and
>>>>>>> exclusion in the CEP. If needed, the exclusion (WITHOUT) can be 
>>>>>>> introduced
>>>>>>> later. It would still be backward compatible.
>>>>>>>
>>>>>>> Regarding "CREATE TABLE ks.tb_copy LIKE ks.tb WITH compaction =
>>>>>>> {'class': '.... } AND ... ", I think it only overrides the table 
>>>>>>> options.
>>>>>>> The CEP suggests the coarse-grained keyword for each category like table
>>>>>>> options, indexes, etc. The functionality provided is not identical.
>>>>>>>
>>>>>>> I understand that the suggestions are to make operators' life easier
>>>>>>> by achieving table creation in a single statement. What is being 
>>>>>>> proposed
>>>>>>> in the CEP seems to be at a good balance point. Operators can alter the
>>>>>>> table options if needed in the follow-up ALTER table statement.
>>>>>>>
>>>>>>> - Yifan
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Oct 17, 2024 at 1:41 PM Štefan Miklošovič <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> I think we are starting to complicate it. For me the most important
>>>>>>> question is who is actually this feature for? If people want to just
>>>>>>> prototype something fast or they just want to have "the same table just
>>>>>>> under a different name", I think that is going to be used in 99% of 
>>>>>>> cases.
>>>>>>>
>>>>>>>
>>>>>>> My assumption of using WITH which I think I proposed first (4th post
>>>>>>> in this thread) was to just blindly copy the most important "parts"
>>>>>>> logically related to a table, be it indices, materialized views, or
>>>>>>> triggers and enable / disable them as we wish. If no "WITH" is used, 
>>>>>>> then
>>>>>>> we just get a table with nothing else. "WITH" will opt-in into that.
>>>>>>>
>>>>>>>
>>>>>>> Seeing us contemplating using "INCLUDING" and "EXCLUDING" on
>>>>>>> individual options makes me sad a little bit. I think we are
>>>>>>> over-engineering this. I just don't see a reasonable use-case where 
>>>>>>> users
>>>>>>> would need to cherry-pick what they want and what not. Isn't that just 
>>>>>>> too
>>>>>>> complicated? If a table being copied drifts away too much from the 
>>>>>>> original
>>>>>>> one then users would be better off with creating a brand new table with 
>>>>>>> CQL
>>>>>>> as they are used to, not dealing with "copying" at all. More we drift 
>>>>>>> from
>>>>>>> what the original table was like, the less useful this feature is.
>>>>>>>
>>>>>>> On Wed, Oct 16, 2024 at 10:03 PM Dave Herrington <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> Sorry that I overlooked the definition of the default in the CEP.  I
>>>>>>> did look for it but I didn’t see it.
>>>>>>>
>>>>>>> I think the default behavior you explained makes perfect sense &
>>>>>>> what one would expect.
>>>>>>>
>>>>>>> I like the flexibility of INCLUDING and EXCLUDING that you are
>>>>>>> considering.
>>>>>>>
>>>>>>> Would it make sense to use WITH for table options, which would make
>>>>>>> it easy (and less confusing IMHO) to override the defaults from the 
>>>>>>> source
>>>>>>> table, then use INCLUDING/EXCLUDING for all non-table options such as
>>>>>>> constraints and indices?
>>>>>>>
>>>>>>> It seems this would be easier to document as well, as it could just
>>>>>>> point to the CREATE TABLE doc for the options, rather than trying to
>>>>>>> explain a bunch of keywords that map to table options.
>>>>>>>
>>>>>>> -Dave
>>>>>>>
>>>>>>> David A. Herrington II
>>>>>>> President and Chief Engineer
>>>>>>> RhinoSource, Inc.
>>>>>>>
>>>>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>>>>
>>>>>>> www.rhinosource.com
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Oct 16, 2024 at 7:57 PM guo Maxwell <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> To yifan :
>>>>>>> At the beginning of the period, I also thought about adding the
>>>>>>> keyword ALL, refer to pg
>>>>>>> <https://www.postgresql.org/docs/current/sql-createtable.html> ,
>>>>>>> but  I give up when writing cep as I find that there may be not so many
>>>>>>> properties (only three) to copy for C* and
>>>>>>> It is possible to decide what is needed and what is not in a very
>>>>>>> simple cql, as our ALL is only three properties here. I want to keep it 
>>>>>>> as
>>>>>>> simple as possible (based on the advice given by Benjamin), So I grouped
>>>>>>> the properties of the table into one category and expressed it with
>>>>>>> OPTION keyword.
>>>>>>>
>>>>>>> But if we are going to split the first keyword OPTION  to
>>>>>>> COMPRESSION 、COMPACTION、COMMENT and so on. I am +1 on adding ALL back as
>>>>>>> the properties are so many and it is simple to use ALL instead of
>>>>>>> list all properties. Besides I may change my keyword WITH to
>>>>>>> INCLUDING and adding another keyword EXCLUDING to flexibly copy table
>>>>>>> properties through simple sql statements, like using   1 not  2
>>>>>>>
>>>>>>>
>>>>>>>    1.  CREATE TABLE newTb like oldTb INCLUDING ALL EXCLUDING
>>>>>>>    INDEXES AND COMMENTS.
>>>>>>>    2.  CREATE TABLE newTb like oldTb INCLUDING COMPRESSION
>>>>>>>    CONSTRAINTS GENERATED IDENTITY STATISTICS STORAGE
>>>>>>>
>>>>>>> Conclusion: If there may be more keywords to consider in the future,
>>>>>>> such as more than 4 , I am +1 on adding ALL back .
>>>>>>>
>>>>>>> To Dave :
>>>>>>>    Default behavior is only copy column name, data type ,data mask
>>>>>>> , you can see more detail from  CEP-43
>>>>>>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-43++Apache+Cassandra+CREATE+TABLE++LIKE>
>>>>>>> .
>>>>>>>
>>>>>>>
>>>>>>> Patrick McFadin <[email protected]> 于2024年10月17日周四 06:43写道：
>>>>>>>
>>>>>>> +1 That makes much more sense in my experience.
>>>>>>>
>>>>>>> On Wed, Oct 16, 2024 at 12:12 PM Dave Herrington <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> I'm coming at this with both a deep ANSI SQL background as well as
>>>>>>> CQL background.
>>>>>>>
>>>>>>> Defining the default behavior is the starting point.  What gets
>>>>>>> copied if we do "CREATE TABLE new_table LIKE original_table;" without a
>>>>>>> WITH clause?
>>>>>>>
>>>>>>> Then, you build on that with the specific WITH options.  WITH ALL
>>>>>>> catches everything.
>>>>>>>
>>>>>>> -Dave
>>>>>>>
>>>>>>> On Wed, Oct 16, 2024 at 11:16 AM Yifan Cai <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> "WITH ALL" seems to be a natural addition to the directives. What do
>>>>>>> you think about adding the fifth keyword ALL to retain all fields of the
>>>>>>> table schema?
>>>>>>>
>>>>>>> For instance, CREATE TABLE new_table LIKE original_table WITH ALL,
>>>>>>> it replicates options, indexes, triggers, constraints and any applicable
>>>>>>> kinds that are introduced in the future.
>>>>>>>
>>>>>>> - Yifan
>>>>>>>
>>>>>>> On Wed, Oct 16, 2024 at 7:46 AM guo Maxwell <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Disscussed with Bernardo on slack，and +1 with his advice on adding a
>>>>>>> fourth keyword.
>>>>>>>
>>>>>>> The keyword would be  CONSTRAINTS ， any more suggestion ？
>>>>>>>
>>>>>>> guo Maxwell <[email protected]>于2024年10月16日 周三上午9:55写道：
>>>>>>>
>>>>>>> Hi yifan,
>>>>>>> Thanks for bringing this up. The SELECT permission on the original
>>>>>>> table is needed. Mysql and PG all have mentioned this, and I also
>>>>>>> specifically noticed this in my code.
>>>>>>>
>>>>>>> I probably missed this in the cep documentation. 😅
>>>>>>>
>>>>>>> Yifan Cai <[email protected]> 于2024年10月16日周三 07:46写道：
>>>>>>>
>>>>>>> Thanks for creating the CEP! I think it is missing Bernardo's
>>>>>>> comment on "the need for read permissions on the source table".
>>>>>>>
>>>>>>> CreateTableStatement does not check the permissions outside of the
>>>>>>> enclosing keyspace. Having the SELECT permission on the original table 
>>>>>>> is a
>>>>>>> requirement for CREATE TABLE LIKE.
>>>>>>>
>>>>>>> - Yifan
>>>>>>>
>>>>>>> On Sun, Sep 29, 2024 at 11:01 PM guo Maxwell <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hello, everyone ,
>>>>>>> I have finished the doc for CEP-43 for CREATE_TABLE_LIKE
>>>>>>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-43++Apache+Cassandra+CREATE+TABLE++LIKE>
>>>>>>>  as
>>>>>>> said before, looking forward to your suggestions.
>>>>>>>
>>>>>>> Štefan Miklošovič <[email protected]> 于2024年9月25日周三 03:51写道：
>>>>>>>
>>>>>>> I am sorry I do not follow what you mean, maybe an example would
>>>>>>> help.
>>>>>>>
>>>>>>> On Tue, Sep 24, 2024 at 6:18 PM guo Maxwell <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>> If there are multiple schema information changes in one ddl
>>>>>>> statement, will there be schema conflicts in extreme cases?
>>>>>>> For example, our statement contains both table creation and index
>>>>>>> creation.
>>>>>>>
>>>>>>> guo Maxwell <[email protected]>于2024年9月24日 周二下午8:12写道：
>>>>>>>
>>>>>>> +1 on splitting this task  and adding the ability to copy tables
>>>>>>> through different keyspaces in the future.
>>>>>>>
>>>>>>> Štefan Miklošovič <[email protected]> 于2024年9月23日周一 22:05写道：
>>>>>>>
>>>>>>> If we have this table
>>>>>>>
>>>>>>> CREATE TABLE ks.tb2 (
>>>>>>>     id int PRIMARY KEY,
>>>>>>>     name text
>>>>>>> );
>>>>>>>
>>>>>>> I can either specify name of an index on my own like this:
>>>>>>>
>>>>>>> CREATE INDEX name_index ON ks.tb2 (name) ;
>>>>>>>
>>>>>>> or I can let Cassandra to figure that name on its own:
>>>>>>>
>>>>>>> CREATE INDEX ON ks.tb2 (name) ;
>>>>>>>
>>>>>>> in that case it will name that index "tb2_name_idx".
>>>>>>>
>>>>>>> Hence, I would expect that when we do
>>>>>>>
>>>>>>> ALTER TABLE ks.to_copy LIKE ks.tb2 WITH INDICES;
>>>>>>>
>>>>>>> Then ks.to_copy table will have an index which is called
>>>>>>> "to_copy_name_idx" without me doing anything.
>>>>>>>
>>>>>>> For types, we do not need to do anything when we deal with the same
>>>>>>> keyspace. For simplicity, I mentioned that we might deal with the same
>>>>>>> keyspace scenario only for now and iterate on that in the future.
>>>>>>>
>>>>>>> On Mon, Sep 23, 2024 at 8:53 AM guo Maxwell <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hello everyone,
>>>>>>>
>>>>>>> Cep is being written, and I encountered some problems during the
>>>>>>> process. I would like to discuss them with you. If you read the 
>>>>>>> description
>>>>>>> of this CASSANDRA-7662
>>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-7662>, we will
>>>>>>> find that initially the original creator of this jira did not intend to
>>>>>>> implement structural copying of indexes, views, and triggers  only the
>>>>>>> column and its data type.
>>>>>>>
>>>>>>> However, after investigating some db related syntax and function
>>>>>>> implementation, I found that it may be necessary for us to provide some
>>>>>>> rich syntax to support the replication of indexes, views, etc.
>>>>>>>
>>>>>>> In order to support selective copy of the basic structure of the
>>>>>>> table (columns and types), table options, table-related indexes, views,
>>>>>>> triggers, etc. We need some new syntax, it seems that the syntax of pg 
>>>>>>> is
>>>>>>> relatively comprehensive, it use the keyword INCLUDING/EXCLUDING to
>>>>>>> flexibly control the removal and retention of indexes, table 
>>>>>>> information,
>>>>>>> etc. see pg create table like
>>>>>>> <https://www.postgresql.org/docs/8.1/sql-createtable.html> , the
>>>>>>> new created index name is different from the original table's index 
>>>>>>> name ,
>>>>>>> seenewly copied index names are different from original
>>>>>>> <https://github.com/postgres/postgres/blob/master/doc/src/sgml/ref/create_table.sgml#L749>
>>>>>>> , the name is based on some rule.
>>>>>>> Mysql is relatively simple and copies columns and indexes by
>>>>>>> default. see mysql create table like
>>>>>>> <https://dev.mysql.com/doc/refman/8.4/en/create-table-like.html>
>>>>>>> and the newly created index name is the same with the original table's
>>>>>>> index name.
>>>>>>>
>>>>>>> So for Casandra, I hope it can also support the information copy of
>>>>>>> index and even view/trigger. And I also hope to be able to flexibly 
>>>>>>> decide
>>>>>>> which information is copied like pg.
>>>>>>>
>>>>>>> Besides, I think the copy can happen between different keyspaces.
>>>>>>> And UDT needs to be taken into account.
>>>>>>>
>>>>>>> But as we know the index/view/trigger name are all under keyspace
>>>>>>> level, so it seems that the newly created index name (or view name/ 
>>>>>>> trigger
>>>>>>> name) must be different from the original tables' ,otherwise  names 
>>>>>>> would
>>>>>>> clash .
>>>>>>>
>>>>>>> So regarding the above problem, one idea I have is that for newly
>>>>>>> created types, indexes and views under different keyspaces and the same
>>>>>>> keyspace, we first generate random names for them, and then we can add 
>>>>>>> the
>>>>>>> ability of modifying the names(for types/indexes/views/triggers) so that
>>>>>>> users can manually change the names.
>>>>>>>
>>>>>>>
>>>>>>> guo Maxwell <[email protected]> 于2024年9月20日周五 08:06写道：
>>>>>>>
>>>>>>> No，I think still need some discuss on grammar detail after I finish
>>>>>>> the first version
>>>>>>>
>>>>>>> Patrick McFadin <[email protected]>于2024年9月20日 周五上午2:24写道：
>>>>>>>
>>>>>>> Is this CEP ready for a VOTE thread?
>>>>>>>
>>>>>>> On Sat, Aug 24, 2024 at 8:56 PM guo Maxwell <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Thank you for your replies, I will prepare a CEP later.
>>>>>>>
>>>>>>> Patrick McFadin <[email protected]> 于2024年8月20日周二 02:11写道：
>>>>>>>
>>>>>>> +1 This is a CEP
>>>>>>>
>>>>>>> On Mon, Aug 19, 2024 at 10:50 AM Jon Haddad <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Given the fairly large surface area for this, i think it should be a
>>>>>>> CEP.
>>>>>>>
>>>>>>> —
>>>>>>> Jon Haddad
>>>>>>> Rustyrazorblade Consulting
>>>>>>> rustyrazorblade.com
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Aug 19, 2024 at 10:44 AM Bernardo Botella <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> Definitely a nice addition to CQL.
>>>>>>>
>>>>>>> Looking for inspiration at how Postgres and Mysql do that may also
>>>>>>> help with the final design (I like the WITH proposed by Stefan, but I 
>>>>>>> would
>>>>>>> definitely take a look at the INCLUDING keyword proposed by Postgres).
>>>>>>> https://www.postgresql.org/docs/current/sql-createtable.html
>>>>>>> https://dev.mysql.com/doc/refman/8.4/en/create-table-like.html
>>>>>>>
>>>>>>> On top of that, and as part of the interesting questions, I would
>>>>>>> like to add the permissions to the mix. Both the question about copying
>>>>>>> them over (with a WITH keyword probably), and the need for read 
>>>>>>> permissions
>>>>>>> on the source table as well.
>>>>>>>
>>>>>>> Bernardo
>>>>>>>
>>>>>>>
>>>>>>> On Aug 19, 2024, at 10:01 AM, Štefan Miklošovič <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> BTW this would be cool to do as well:
>>>>>>>
>>>>>>> ALTER TABLE ks.to_copy LIKE ks.tb WITH INDICES;
>>>>>>>
>>>>>>> This would mean that if we create a copy of a table, later we can
>>>>>>> decide that we need indices too, so we might "enrich" that table with
>>>>>>> indices from the old one without necessarily explicitly re-creating 
>>>>>>> them on
>>>>>>> that new table.
>>>>>>>
>>>>>>> On Mon, Aug 19, 2024 at 6:55 PM Štefan Miklošovič <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> I think this is an interesting idea worth exploring. I definitely
>>>>>>> agree with Benjamin who raised important questions which needs to be
>>>>>>> answered first. Also, what about triggers?
>>>>>>>
>>>>>>> It might be rather "easy" to come up with something simple but it
>>>>>>> should be a comprehensive solution with predictable behavior we all 
>>>>>>> agree
>>>>>>> on.
>>>>>>>
>>>>>>> If a keyspace of a new table does not exist we would need to create
>>>>>>> that one too before. For the simplicity, I would just make it a must to
>>>>>>> create it on same keyspace. We might iterate on that in the future.
>>>>>>>
>>>>>>> UDTs are created per keyspace so there is nothing to re-create. We
>>>>>>> just need to reference it from a new table, right?
>>>>>>>
>>>>>>> Indexes and MVs are interesting but in theory they might be
>>>>>>> re-created too.
>>>>>>>
>>>>>>> Would it be appropriate to use something like this?
>>>>>>>
>>>>>>> CREATE TABLE ks.tb_copy LIKE ks.tb WITH INDEXES AND VIEWS AND
>>>>>>> TRIGGERS ....
>>>>>>>
>>>>>>> Without "WITH" it would just copy a table with nothing else.
>>>>>>>
>>>>>>> On Mon, Aug 19, 2024 at 6:10 PM guo Maxwell <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hello， everyone：
>>>>>>> As  Jira CASSANDRA-7662
>>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-7662> has
>>>>>>> described , we would like to introduce a new grammer " CREATE TABLE 
>>>>>>> LIKE "
>>>>>>> ,which  simplifies creating new tables duplicating the existing ones .
>>>>>>> The format may be like : CREATE TABLE <new_table> LIKE <old_table>
>>>>>>> Before I implement this function, do you have any suggestions on
>>>>>>> this?
>>>>>>>
>>>>>>> Looking forward to your reply！
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> -Dave
>>>>>>>
>>>>>>> David A. Herrington II
>>>>>>> President and Chief Engineer
>>>>>>> RhinoSource, Inc.
>>>>>>>
>>>>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>>>>
>>>>>>> www.rhinosource.com
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> -Dave
>>>>>>>
>>>>>>> David A. Herrington II
>>>>>>> President and Chief Engineer
>>>>>>> RhinoSource, Inc.
>>>>>>>
>>>>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>>>>
>>>>>>> www.rhinosource.com
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> -Dave
>>>>>>>
>>>>>>> David A. Herrington II
>>>>>>> President and Chief Engineer
>>>>>>> RhinoSource, Inc.
>>>>>>>
>>>>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>>>>
>>>>>>> www.rhinosource.com
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> -Dave
>>>>>>>
>>>>>>> David A. Herrington II
>>>>>>> President and Chief Engineer
>>>>>>> RhinoSource, Inc.
>>>>>>>
>>>>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>>>>
>>>>>>> www.rhinosource.com
>>>>>>>
>>>>>>>
>>>>>>>

Re: [DISCUSS] Introduce CREATE TABLE LIKE grammer

Reply via email to