GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/12618

    [SPARK-14857] [SQL] [WIP] Table/Database Name Validation in SessionCatalog

    #### What changes were proposed in this pull request?
    This PR validates database/table names before they are stored in 
`ExternalCatalog`. 
    
    This PR is part of the effort to reduce the dependency on the Hive 
metastore for catching illegal inputs. 
    
    For example, if users use backticks to quote table/database names 
containing illegal characters, the Spark parser accepts these names, but the 
Hive metastore does not. We need to catch them in `SessionCatalog` and 
issue an appropriate error message. The same issue arises when creating 
data source tables (e.g., from DataFrames).
    ```
    CREATE TABLE `tab:1`  ...
    ```
    
    This PR enforces the Spark SQL naming rule for tables, databases, and views: 
names can contain only alphanumeric and underscore characters. Unlike Hive, 
we allow names that start with an underscore. 
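
A minimal sketch of the rule described above, written in Python for brevity. The names `is_valid_name` and `require_valid_name`, the ASCII-only interpretation of "alphanumeric", and the error wording are all illustrative assumptions, not the code in this PR.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits.
# A leading underscore is allowed, as the description states.
_VALID_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_valid_name(name: str) -> bool:
    """Return True if name has only alphanumeric and underscore characters."""
    return _VALID_NAME.fullmatch(name) is not None


def require_valid_name(name: str, kind: str = "table") -> None:
    """Raise ValueError with a descriptive message if the name is invalid."""
    if not is_valid_name(name):
        raise ValueError(
            f"`{name}` is not a valid {kind} name: names can contain only "
            "alphanumeric and underscore characters."
        )
```

Under these assumptions, `is_valid_name("_tab1")` is true, while `require_valid_name("tab:1")` raises an error before the name would reach the catalog.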
    
    When the `ExternalCatalog` is the Hive metastore, we use 
`MetaStoreUtils.validateName` to check the name's validity. **Question**: should we 
enforce both checks? My guess is yes.
    
    In future PRs, we will continue this work by validating 
function/column names. 
    
    #### How was this patch tested?
    TODO: add more test cases to ensure coverage.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark nameValidation

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/12618.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #12618
    
----
commit 1ca5fccc1ab92e6a26cef1ae424cce98a4761c30
Author: gatorsmile <[email protected]>
Date:   2016-04-21T12:23:50Z

    fix.

commit 6f9143c1e9a46f7442106a159841158e266574bc
Author: gatorsmile <[email protected]>
Date:   2016-04-21T17:48:04Z

    rollback

commit 2696f8e69fbc0f31e49c2f00a2d771948be9a4b6
Author: gatorsmile <[email protected]>
Date:   2016-04-22T16:57:29Z

    Merge remote-tracking branch 'upstream/master' into nameValidation

commit 56f25b17b727c835869996e739315c65c27b1b0c
Author: gatorsmile <[email protected]>
Date:   2016-04-22T17:23:27Z

    code cleaning

commit ea977c1a41ac4be40b9d000bcf12745fe439fe96
Author: gatorsmile <[email protected]>
Date:   2016-04-22T19:49:02Z

    Also enforce checking in SessionCatalog

commit f2d6ad7d5b5715642df20819fb9f63a89fa03ef0
Author: gatorsmile <[email protected]>
Date:   2016-04-22T20:14:28Z

    remove unnecessary test case.

----


