GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/14855
[SPARK-17284] [SQL] Remove Statistics-related Table Properties from SHOW CREATE TABLE
### What changes were proposed in this pull request?
```SHOW CREATE TABLE``` should skip the statistics-related table properties, since their values could be incorrect for the newly created table. See the Hive
JIRA: https://issues.apache.org/jira/browse/HIVE-13792
For example, given the following table:
```SQL
CREATE TABLE t1 (
c1 INT COMMENT 'bla',
c2 STRING
)
LOCATION '$dir'
TBLPROPERTIES (
'prop1' = 'value1',
'prop2' = 'value2'
)
```
Before the fix, the output of ```SHOW CREATE TABLE t1``` is
```SQL
CREATE EXTERNAL TABLE `t1`(`c1` int COMMENT 'bla', `c2` string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1'
)
STORED AS
INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'file:/private/var/folders/4b/sgmfldk15js406vk7lw5llzw0000gn/T/spark-ee317538-0f8c-42d0-b08c-cf077d94fe75'
TBLPROPERTIES (
'rawDataSize' = '-1',
'numFiles' = '0',
'transient_lastDdlTime' = '1472424052',
'totalSize' = '0',
'prop1' = 'value1',
'prop2' = 'value2',
'COLUMN_STATS_ACCURATE' = 'false',
'numRows' = '-1'
)
```
After the fix, the output becomes
```SQL
CREATE EXTERNAL TABLE `t1`(`c1` int COMMENT 'bla', `c2` string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1'
)
STORED AS
INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'file:/private/var/folders/4b/sgmfldk15js406vk7lw5llzw0000gn/T/spark-74058a6d-db8b-41c1-9bda-bd449f1a78ed'
TBLPROPERTIES (
'transient_lastDdlTime' = '1472423603',
'prop1' = 'value1',
'prop2' = 'value2'
)
```
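The difference between the two outputs amounts to filtering a fixed set of keys out of the table properties before they are rendered. Below is a minimal sketch of that idea, not the actual Spark implementation: the name `strip_stats_properties` and the constant `STATS_PROPERTY_KEYS` are hypothetical, and the key set is inferred only from the properties that disappear in the example above.

```python
# Hypothetical key set: the properties present in the "before" output
# and absent from the "after" output in this description.
STATS_PROPERTY_KEYS = frozenset({
    "rawDataSize",
    "numFiles",
    "totalSize",
    "COLUMN_STATS_ACCURATE",
    "numRows",
})


def strip_stats_properties(props):
    """Return a copy of props without the statistics-related keys."""
    return {k: v for k, v in props.items() if k not in STATS_PROPERTY_KEYS}


# Table properties as they appear in the "before" output.
before = {
    "rawDataSize": "-1",
    "numFiles": "0",
    "transient_lastDdlTime": "1472424052",
    "totalSize": "0",
    "prop1": "value1",
    "prop2": "value2",
    "COLUMN_STATS_ACCURATE": "false",
    "numRows": "-1",
}

after = strip_stats_properties(before)
```

User-defined properties such as `prop1` and `prop2` pass through untouched, and `transient_lastDdlTime` is kept because it still appears in the output after the fix.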
### How was this patch tested?
Updated the existing test cases.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/gatorsmile/spark showCreateTable
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/14855.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #14855
----
commit 92474c5a142fb9db2c86549c8347f910fc01fcbd
Author: gatorsmile <[email protected]>
Date: 2016-08-28T22:28:15Z
remove stats-related props
----