GitHub user ottomata opened a pull request:

    https://github.com/apache/spark/pull/21012

    [SPARK-23890][SQL] Support CHANGE COLUMN to add nested fields to structs

    ## What changes were proposed in this pull request?
    
    SPARK-23525 added the ability to `ALTER TABLE CHANGE COLUMN` for simple
    COMMENT-only changes.  For safety, column changes are generally restricted.
    However, `ALTER TABLE ADD COLUMN` is safe and allowed.  To add fields to a
    nested struct type, we must instead run an `ALTER TABLE CHANGE COLUMN`
    command, because in DDL a struct appears as a single top-level column with
    a complex type.
    
    Given an origin column declared as:
    ```
    nested struct<s1:string>
    ```
    
    This patch allows users to issue SQL DDL like:
    
    ```
    ALTER TABLE t1 CHANGE COLUMN nested nested struct<s1:string,s2:string>
    ```
    
    to add subfields to a struct.  It does this by recursing over StructTypes
    and comparing the dataType of each field shared between the origin table
    and the new destination type.
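
The recursive comparison described above can be illustrated with a small, self-contained Python model. This is only a sketch and not the code in this patch: `Field`, `Struct`, and `is_compatible_change` are hypothetical names that stand in for Spark's StructField and StructType, primitive types are modelled as plain strings, and the rule that dropping an existing field is rejected is an assumption the description does not state.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a struct field: a name plus a data type."""
    name: str
    data_type: "DataType"


@dataclass(frozen=True)
class Struct:
    """Hypothetical stand-in for a struct type: an ordered tuple of fields."""
    fields: Tuple[Field, ...]


# A data type is either a primitive type name such as "string" or a Struct.
DataType = Union[str, Struct]


def is_compatible_change(origin: DataType, new: DataType) -> bool:
    """Return True if `new` equals `origin` or only adds struct fields."""
    if isinstance(origin, Struct) and isinstance(new, Struct):
        new_by_name = {f.name: f.data_type for f in new.fields}
        # Assumption: every origin field must still exist in the new type,
        # and its type must be recursively compatible. Extra fields in
        # `new` are accepted as additions.
        return all(
            f.name in new_by_name
            and is_compatible_change(f.data_type, new_by_name[f.name])
            for f in origin.fields
        )
    # Primitive types (or a struct/primitive mismatch) must match exactly.
    return origin == new


# nested struct<s1:string>
origin = Struct((Field("s1", "string"),))
# nested struct<s1:string,s2:string>
added = Struct((Field("s1", "string"), Field("s2", "string")))
# nested struct<s1:int,s2:string>
retyped = Struct((Field("s1", "int"), Field("s2", "string")))

print(is_compatible_change(origin, added))    # True
print(is_compatible_change(origin, retyped))  # False
```

In this model, adding `s2` is accepted because the shared field `s1` keeps its type, while changing `s1` from string to int is rejected. Field order is ignored here, which is a simplification.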
    
    ## How was this patch tested?
    
    `testChangeColumn` in DDLSuite.scala was amended to include tests for
    altering struct type columns.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ottomata/spark SPARK-23890

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21012.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21012
    
----
commit 3aa98636a875528b53167459447d567eb7d73ff3
Author: Andrew Otto <acotto@...>
Date:   2018-04-09T18:59:03Z

    [SPARK-23890][SQL] Support ALTER TABLE CHANGE COLUMN to add fields to structs
    
    SPARK-23525 added the ability to ALTER TABLE CHANGE COLUMN for
    simple COMMENT only changes.  For safety, column changes are
    generally restricted.  However, ALTER TABLE ADD COLUMN is safe
    and allowed.  In order to add columns to a nested struct type,
    we must run an ALTER TABLE CHANGE COLUMN command, because in
    DDL a struct appears as a single top-level column with
    a complex type.
    
    Given an origin column declared as
      nested struct<s1:string>
    
    This patch allows users to issue SQL DDL like:
    
      ALTER TABLE t1 CHANGE COLUMN nested nested struct<s1:string,s2:string>
    
    to add sub columns to a struct.  It does this by recursing over
    StructTypes, and comparing dataTypes for each shared field between
    the origin table and the new destination type.
    
    testChangeColumn in DDLSuite.scala was amended to include tests
    for altering struct type columns.

----

