GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/14855

    [SPARK-17284] [SQL] Remove Statistics-related Table Properties from SHOW CREATE TABLE

    ### What changes were proposed in this pull request?
    The statistics-related table properties should be skipped by
    ```SHOW CREATE TABLE```, since they could be incorrect in the newly created
    table. See the Hive JIRA: https://issues.apache.org/jira/browse/HIVE-13792
    
    ```SQL
    CREATE TABLE t1 (
      c1 INT COMMENT 'bla',
      c2 STRING
    )
    LOCATION '$dir'
    TBLPROPERTIES (
      'prop1' = 'value1',
      'prop2' = 'value2'
    )
    ```
    Before the fix, the output of ```SHOW CREATE TABLE t1``` is:
    
    ```SQL
    CREATE EXTERNAL TABLE `t1`(`c1` int COMMENT 'bla', `c2` string)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
    WITH SERDEPROPERTIES (
      'serialization.format' = '1'
    )
    STORED AS
      INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
      OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
    LOCATION 'file:/private/var/folders/4b/sgmfldk15js406vk7lw5llzw0000gn/T/spark-ee317538-0f8c-42d0-b08c-cf077d94fe75'
    TBLPROPERTIES (
      'rawDataSize' = '-1',
      'numFiles' = '0',
      'transient_lastDdlTime' = '1472424052',
      'totalSize' = '0',
      'prop1' = 'value1',
      'prop2' = 'value2',
      'COLUMN_STATS_ACCURATE' = 'false',
      'numRows' = '-1'
    )
    ```
    
    After the fix, the output becomes:
    ```SQL
    CREATE EXTERNAL TABLE `t1`(`c1` int COMMENT 'bla', `c2` string)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
    WITH SERDEPROPERTIES (
      'serialization.format' = '1'
    )
    STORED AS
      INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
      OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
    LOCATION 'file:/private/var/folders/4b/sgmfldk15js406vk7lw5llzw0000gn/T/spark-74058a6d-db8b-41c1-9bda-bd449f1a78ed'
    TBLPROPERTIES (
      'transient_lastDdlTime' = '1472423603',
      'prop1' = 'value1',
      'prop2' = 'value2'
    )
    ```
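
The filtering shown by the two outputs above can be sketched roughly as follows. This is only an illustration in Python, not the code in this patch: the set of keys is inferred from the difference between the "before" and "after" outputs, and the names `STATS_PROPERTY_KEYS`, `drop_stats_properties`, and `render_tblproperties` are hypothetical.

```python
# Illustrative sketch only; not the Spark implementation.
# The key set is an assumption inferred from the example outputs above.
STATS_PROPERTY_KEYS = frozenset({
    "COLUMN_STATS_ACCURATE",
    "numFiles",
    "numRows",
    "rawDataSize",
    "totalSize",
})


def drop_stats_properties(props):
    """Return a copy of props without the statistics-related keys."""
    return {k: v for k, v in props.items() if k not in STATS_PROPERTY_KEYS}


def render_tblproperties(props):
    """Render a TBLPROPERTIES clause, skipping statistics-related keys.

    Returns an empty string when no properties remain after filtering.
    """
    filtered = drop_stats_properties(props)
    if not filtered:
        return ""
    body = ",\n".join(f"  '{k}' = '{v}'" for k, v in filtered.items())
    return f"TBLPROPERTIES (\n{body}\n)"
```

Applied to the eight properties from the "before" output, this sketch keeps only `transient_lastDdlTime`, `prop1`, and `prop2`, which matches the "after" output.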
    
    ### How was this patch tested?
    Updated the existing test cases.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark showCreateTable

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14855.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14855
    
----
commit 92474c5a142fb9db2c86549c8347f910fc01fcbd
Author: gatorsmile <[email protected]>
Date:   2016-08-28T22:28:15Z

    remove stats-related props

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
