Pavlo Skliar created SPARK-23525:
------------------------------------

             Summary: Update column comment doesn't work from spark
                 Key: SPARK-23525
                 URL: https://issues.apache.org/jira/browse/SPARK-23525
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.2.0
            Reporter: Pavlo Skliar


{code:java}
print(spark.sql("""
SHOW CREATE TABLE test.trends
""").collect()[0].createtab_stmt)

// OUTPUT
CREATE EXTERNAL TABLE `test`.`trends`(`id` string COMMENT '', `metric` string COMMENT '', `amount` bigint COMMENT '')
COMMENT ''
PARTITIONED BY (`date` string COMMENT '')
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
WITH SERDEPROPERTIES (
  'serialization.format' = '1'
)
STORED AS
  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION 's3://xxxxx/xxxxx/xxxx'
TBLPROPERTIES (
  'transient_lastDdlTime' = '1519729384',
  'last_modified_time' = '1519645652',
  'last_modified_by' = 'pavlo',
  'last_castor_run_ts' = '1513561658.0'
)


spark.sql("""
DESCRIBE test.trends
""").collect()

// OUTPUT
[Row(col_name='id', data_type='string', comment=''),
 Row(col_name='metric', data_type='string', comment=''),
 Row(col_name='amount', data_type='bigint', comment=''),
 Row(col_name='date', data_type='string', comment=''),
 Row(col_name='# Partition Information', data_type='', comment=''),
 Row(col_name='# col_name', data_type='data_type', comment='comment'),
 Row(col_name='date', data_type='string', comment='')]


spark.sql("""alter table test.trends change column id id string comment 'unique identifier'""")


spark.sql("""
DESCRIBE test.trends
""").collect()

// OUTPUT
[Row(col_name='id', data_type='string', comment=''),
 Row(col_name='metric', data_type='string', comment=''),
 Row(col_name='amount', data_type='bigint', comment=''),
 Row(col_name='date', data_type='string', comment=''),
 Row(col_name='# Partition Information', data_type='', comment=''),
 Row(col_name='# col_name', data_type='data_type', comment='comment'),
 Row(col_name='date', data_type='string', comment='')]
{code}
The strange thing is that I assigned a comment to the id field from Hive 
successfully, and it is visible in the Hue UI, but it is still not visible 
from Spark, and no Spark statement has any effect on the comments.
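
For anyone trying to reproduce this, the before/after comparison may be easier to read with a small helper along the following lines. This is only a sketch: `column_comments` and `comment_was_applied` are hypothetical names, not part of Spark, and the helper assumes each row has the three fields shown in the DESCRIBE output above (col_name, data_type, comment). The sample rows are copied from the second DESCRIBE output in this report, as plain tuples.

```python
def column_comments(rows):
    """Map column name -> comment from DESCRIBE-style rows.

    Hypothetical helper: each row is assumed to unpack into
    (col_name, data_type, comment), as in the output above.
    """
    comments = {}
    for col_name, _data_type, comment in rows:
        # Stop at the "# Partition Information" section; an empty
        # col_name is also treated as a separator row.
        if not col_name or col_name.startswith("#"):
            break
        comments[col_name] = comment or ""
    return comments


def comment_was_applied(rows, column, expected):
    """Return True if `column` carries exactly the `expected` comment."""
    return column_comments(rows).get(column) == expected


# Rows copied from the second DESCRIBE output in this report.
after_alter = [
    ("id", "string", ""),
    ("metric", "string", ""),
    ("amount", "bigint", ""),
    ("date", "string", ""),
    ("# Partition Information", "", ""),
    ("# col_name", "data_type", "comment"),
    ("date", "string", ""),
]

print(comment_was_applied(after_alter, "id", "unique identifier"))  # prints False
```

With a live session, the same check should work on `spark.sql("DESCRIBE test.trends").collect()`, since PySpark `Row` objects unpack like tuples.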

 



