Adding dev@ back now.

-Rui

On Wed, Aug 15, 2018 at 1:01 PM Andrew Pilloud <[email protected]> wrote:

> Did we drop the dev list from this on purpose? (I haven't added it back,
> but we probably should.)
>
> I'm in favor of sticking with the simple 'CREATE TABLE' and 'CREATE
> SCHEMA' if there is only to be one option. Sticking with those names
> minimizes both our deviation from other implementations and user surprise.
> There is a lot of value in making sure our common operations closely map to
> the equivalent common operations in other SQL dialects. I think users will
> be more confused to find that 'CREATE TABLE' doesn't exist then to learn
> that it might not always create a table. This minimizes the overhead of
> learning our dialect of SQL and maximizes the odds that a user will be able
> to guess at the syntax of something and have it work. (For example, a user
> guessing at the syntax of CREATE TABLE would have a better experience with
> the error being "field LOCATION not specified" rather than "operation
> CREATE TABLE not found".)
>
> If the goal is clarity of the operation, how about 'REGISTER EXTERNAL DATA
> SOURCE' and 'REGISTER EXTERNAL DATA SOURCE PROVIDER'? Those names remove
> the ambiguity around the operation creating and the data source being a
> table.
>
> Andrew
>
> On Wed, Aug 15, 2018 at 10:54 AM Anton Kedin <[email protected]> wrote:
>
>> My preference is to make `EXTERNAL` mandatory and only support `CREATE
>> EXTERNAL TABLE` for existing semantics. My main reasons are:
>>  - user friendliness, matching expectations, readability. Current `CREATE
>> TABLE` is basically a `CREATE EXTERNAL TABLE`. It is confusing to users
>> familiar with SQL who expect that `CREATE TABLE` will actually create a
>> table;
>>  - forward-compatibility. We could potentially support non-external
>> `CREATE TABLE` at some point in the future, whatever semantics it might
>> have. It will be wrong to use the same syntax for external and non-external
>> CREATEs;
>>
>> I agree that typing extra word each time is not ideal, but my opinion is
>> on the side that readability of code (including SQL) is important (how much
>> time you spend reading / understanding code vs writing it) and we should
>> try to improve it if we can. In case of DDL every non-trivial statement
>> will already have a ton of unavoidable words (field names, types, location,
>> options) so I would argue that adding extra one word would not noticeably
>> reduce your happiness of writing it :) But it would improve readability and
>> reduce ambiguity, which I think is worth it.
>>
>> I think that making it optional only introduces more confusion (e.g.
>> what's the difference between the two DDL statements without reading the
>> doc?) and would make situation worse.
>>
>> Regards,
>> Anton
>>
>>
>>
>>
>> On Wed, Aug 15, 2018 at 10:24 AM Mingmin Xu <[email protected]> wrote:
>>
>>> I prefer to `CREATE EXTERNAL TABLE`. My question is, do you plan to
>>> support both `CREATE TABLE` and `CREATE EXTERNAL TABLE`, by making
>>> `EXTERNAL` as optional?
>>>
>>> On Wed, Aug 15, 2018 at 10:01 AM, Andrew Pilloud <[email protected]>
>>> wrote:
>>>
>>>> I think 'CREATE EXTERNAL TABLE' might make things a bit clearer from a
>>>> documentation prospective, but I'd be really unhappy if I had to type out
>>>> 'EXTERNAL' every time. (I have the same concern with 'CREATE EXTERNAL
>>>> SCHEMA'.)
>>>>
>>>> Andrew
>>>>
>>>> On Tue, Aug 14, 2018 at 12:38 PM Rui Wang <[email protected]> wrote:
>>>>
>>>>> Hi guys,
>>>>>
>>>>> I know you are probably using CREATE TABLE, Can I know your thoughts
>>>>> on this?
>>>>>
>>>>> -Rui
>>>>>
>>>>>
>>>>> On Tue, Aug 14, 2018 at 10:22 AM Rui Wang <[email protected]> wrote:
>>>>>
>>>>>> Thanks Mikhail! "Import" is an alternative option. It might be better.
>>>>>>
>>>>>> "create external" is being widely used by different systems with
>>>>>> similar meaning so "create" usually is ok to external data sources.
>>>>>>
>>>>>> -Rui
>>>>>>
>>>>>> On Tue, Aug 14, 2018 at 9:38 AM Mikhail Gryzykhin <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> The idea of clarification sounds good to me. I'd appreciate that
>>>>>>> present, when I was triaging post-commit tests.
>>>>>>>
>>>>>>> Do we have any terms that specify connection to external table?
>>>>>>> "CREATE" word triggers this reaction in my brain that there will be a 
>>>>>>> new
>>>>>>> table created. Adding "EXTERNAL" would already add distinction, but 
>>>>>>> adding
>>>>>>> something more explicit for the task might be even better.
>>>>>>>
>>>>>>> --Mikhail
>>>>>>>
>>>>>>> Have feedback <http://go/migryz-feedback>?
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Aug 13, 2018 at 2:40 PM Rafael Fernandez <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Strictly speaking, they are not necessarily tables either. We could
>>>>>>>> also introduce something like CREATE EXTERNAL DATA SOURCE (a-la
>>>>>>>> T-SQL
>>>>>>>> <https://docs.microsoft.com/en-us/sql/t-sql/statements/create-external-data-source-transact-sql?view=sql-server-2017>),
>>>>>>>> if it's somehow advantageous for us to leverage access patterns or 
>>>>>>>> restrict
>>>>>>>> DML statements.
>>>>>>>>
>>>>>>>> I think your idea of CREATE EXTERNAL TABLE is practical :)
>>>>>>>>
>>>>>>>> On Mon, Aug 13, 2018 at 2:12 PM Rui Wang <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Community,
>>>>>>>>>
>>>>>>>>> BeamSQL allows CREATE TABLE
>>>>>>>>> <https://beam.apache.org/documentation/dsls/sql/create-table/>
>>>>>>>>> statements to register virtual tables from external storage systems 
>>>>>>>>> (e.g.
>>>>>>>>> BigQuery).
>>>>>>>>>
>>>>>>>>> BeamSQL is not a storage system, so any table registered by
>>>>>>>>> "CREATE TABLE" statement is essentially equivalent to be registered by
>>>>>>>>> "CREATE EXTERNAL TABLE", which requires the user to provide a 
>>>>>>>>> LOCATION and
>>>>>>>>> BeamSQL will register the table outside of current execution 
>>>>>>>>> environment
>>>>>>>>> based on LOCATION.
>>>>>>>>>
>>>>>>>>> So I propose to add EXTERNAL keyword to "CREATE TABLE" in BeamSQL
>>>>>>>>> to help users understand they are registering tables, and BeamSQL 
>>>>>>>>> does not
>>>>>>>>> create non existing tables by running CREATE TABLE (at least on some
>>>>>>>>> storage systems, if not all).
>>>>>>>>>
>>>>>>>>> We can make the EXTERNAL keyword either required or optional.
>>>>>>>>>
>>>>>>>>> If we make the EXTERNAL keyword required:
>>>>>>>>>
>>>>>>>>> Pros:
>>>>>>>>> a. We can get rid of the registering table semantic on CREATE
>>>>>>>>> TABLE.
>>>>>>>>> b, We keep the room that we could add CREATE TABLE back in the
>>>>>>>>> future if we want CREATE TABLE to create, rather than not only 
>>>>>>>>> register
>>>>>>>>> tables in BeamSQL.
>>>>>>>>>
>>>>>>>>> Cons:
>>>>>>>>> 1. CREATE TABLE syntax will not be supported so existing BeamSQL
>>>>>>>>> pipelines which has CREATE TABLE require changes.
>>>>>>>>> 2. It's required to type tedious EXTERNAL keyword every time,
>>>>>>>>> especially in SQL Shell.
>>>>>>>>>
>>>>>>>>> If we make the EXTERNAL keyword optional, we will have reversed
>>>>>>>>> pros and cons above.
>>>>>>>>>
>>>>>>>>> Any thoughts on adding EXTERNAL keyword, and make it required or
>>>>>>>>> optional?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Rui
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>
>>>
>>> --
>>> ----
>>> Mingmin
>>>
>>

Reply via email to