GitHub user gengliangwang opened a pull request:

    https://github.com/apache/spark/pull/21655

    [SPARK-24675][SQL]Rename table: validate existence of new location

    ## What changes were proposed in this pull request?
    If a table is renamed to a location that already exists, its data won't show up.
    ```
    scala>  Seq("hello").toDF("a").write.format("parquet").saveAsTable("t")
                                                                                
    
    scala> sql("select * from t").show()
    +-----+
    |    a|
    +-----+
    |hello|
    +-----+
    
    
    scala> sql("alter table t rename to test")
    res2: org.apache.spark.sql.DataFrame = []
    
    scala> sql("select * from test").show()
    +---+
    |  a|
    +---+
    +---+
    ```
    The resulting file layout is:
    ```
    $ tree test
    test
    ├── gabage
    └── t
        ├── _SUCCESS
        └── part-00000-856b0f10-08f1-42d6-9eb3-7719261f3d5e-c000.snappy.parquet
    ```
    
    In Hive, if the new location exists, the rename fails even if the location is empty.
    
    We should perform the same validation in Catalog to guard against unexpected bugs.
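
    The kind of check meant here can be sketched roughly as follows. This is only an illustration, not the code in this patch: `RenameValidation`, `validateNewLocation`, and `warehouseDir` are hypothetical names, and the sketch uses `java.nio.file` on a local directory instead of Spark's catalog and Hadoop file system classes.

    ```scala
    import java.nio.file.{Files, Path}

    // Hypothetical helper for illustration only; not Spark's actual API.
    object RenameValidation {
      // Refuse a rename when the directory the new table name maps to already exists.
      def validateNewLocation(warehouseDir: Path, newTableName: String): Unit = {
        val newLocation = warehouseDir.resolve(newTableName)
        // Match the Hive behavior described above: fail even if the directory is empty.
        if (Files.exists(newLocation)) {
          throw new IllegalStateException(
            s"Cannot rename table to '$newTableName': location '$newLocation' already exists.")
        }
      }
    }
    ```

    Calling such a check before the rename would reject `alter table t rename to test` in the example above, because the `test` directory is already present, rather than leaving a table that reads as empty.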
    
    
    ## How was this patch tested?
    
    New unit test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gengliangwang/spark validate_rename_table

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21655.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21655
    
----
commit 1622d9cb669c4a481d5f697c3f4f9a47993d9dbf
Author: Gengliang Wang <gengliang.wang@...>
Date:   2018-06-28T07:41:23Z

    Rename table: validate existence of new location

----

