[ 
https://issues.apache.org/jira/browse/SPARK-23525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-23525.
---------------------------------

Issue resolved by pull request 20696
[https://github.com/apache/spark/pull/20696]

> ALTER TABLE CHANGE COLUMN doesn't work for external hive table
> --------------------------------------------------------------
>
>                 Key: SPARK-23525
>                 URL: https://issues.apache.org/jira/browse/SPARK-23525
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0, 2.3.0
>            Reporter: Pavlo Skliar
>            Assignee: Jiang Xingbo
>            Priority: Major
>             Fix For: 2.4.0, 2.3.1
>
>
> {code:java}
> print(spark.sql("""
> SHOW CREATE TABLE test.trends
> """).collect()[0].createtab_stmt)
> // OUTPUT
> CREATE EXTERNAL TABLE `test`.`trends`(`id` string COMMENT '', `metric` string 
> COMMENT '', `amount` bigint COMMENT '')
> COMMENT ''
> PARTITIONED BY (`date` string COMMENT '')
> ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
> WITH SERDEPROPERTIES (
>   'serialization.format' = '1'
> )
> STORED AS
>   INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
>   OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
> LOCATION 's3://xxxxx/xxxxx/xxxx'
> TBLPROPERTIES (
>   'transient_lastDdlTime' = '1519729384',
>   'last_modified_time' = '1519645652',
>   'last_modified_by' = 'pavlo',
>   'last_castor_run_ts' = '1513561658.0'
> )
> spark.sql("""
> DESCRIBE test.trends
> """).collect()
> // OUTPUT
> [Row(col_name='id', data_type='string', comment=''),
>  Row(col_name='metric', data_type='string', comment=''),
>  Row(col_name='amount', data_type='bigint', comment=''),
>  Row(col_name='date', data_type='string', comment=''),
>  Row(col_name='# Partition Information', data_type='', comment=''),
>  Row(col_name='# col_name', data_type='data_type', comment='comment'),
>  Row(col_name='date', data_type='string', comment='')]
> spark.sql("""alter table test.trends change column id id string comment 
> 'unique identifier'""")
> spark.sql("""
> DESCRIBE test.trends
> """).collect()
> // OUTPUT
> [Row(col_name='id', data_type='string', comment=''),
>  Row(col_name='metric', data_type='string', comment=''),
>  Row(col_name='amount', data_type='bigint', comment=''),
>  Row(col_name='date', data_type='string', comment=''),
>  Row(col_name='# Partition Information', data_type='', comment=''),
>  Row(col_name='# col_name', data_type='data_type', comment='comment'),
>  Row(col_name='date', data_type='string', comment='')]
> {code}
> The strange thing is that I've successfully assigned a comment to the id 
> field from Hive, and it's visible in the Hue UI, but it's still not visible 
> from Spark, and Spark requests have no effect on the comments.
>  
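
The reproduction above compares two DESCRIBE outputs by eye. A minimal sketch of the same check done programmatically follows. The helper name `column_comment` and the namedtuple standing in for a Spark `Row` are assumptions made for illustration; they are not part of the report or of Spark itself.

```python
from collections import namedtuple

# Stand-in for pyspark.sql.Row so the sketch runs without a Spark session (assumption).
Row = namedtuple("Row", ["col_name", "data_type", "comment"])


def column_comment(describe_rows, column):
    """Return the comment of `column` from DESCRIBE output, or None if the column is absent.

    Hypothetical helper: scans only the rows before the '# Partition Information'
    marker, because partition columns are listed a second time after it.
    """
    for row in describe_rows:
        if row.col_name.startswith("#"):
            break  # reached the partition section; stop scanning
        if row.col_name == column:
            return row.comment
    return None


# DESCRIBE output copied from the report, taken after the ALTER TABLE statement.
rows = [
    Row("id", "string", ""),
    Row("metric", "string", ""),
    Row("amount", "bigint", ""),
    Row("date", "string", ""),
    Row("# Partition Information", "", ""),
    Row("# col_name", "data_type", "comment"),
    Row("date", "string", ""),
]

# True only if the comment set by ALTER TABLE ... CHANGE COLUMN is visible.
comment_applied = column_comment(rows, "id") == "unique identifier"
```

Against a live session the same helper could presumably be fed `spark.sql("DESCRIBE test.trends").collect()` directly, since the rows there expose the same three fields.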



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
