[ https://issues.apache.org/jira/browse/SPARK-25367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
yy updated SPARK-25367:
-----------------------
    Description:
We saved a DataFrame as a Hive table in ORC/Parquet format from spark-shell. After we changed a column's type (int to double) on this table through Hive JDBC, the column type reported by spark-shell did not change, although it did change in Hive JDBC. After we restarted spark-shell, the column type still did not match the one shown in Hive JDBC.

The steps to reproduce are as follows:

spark-shell:
val df = spark.read.json("examples/src/main/resources/people.json");
df.write.format("orc").saveAsTable("people_test");
spark.catalog.refreshTable("people_test")
spark.sql("desc people_test").show()

hive:
alter table people_test change column age age1 double;
desc people_test;

spark-shell:
spark.sql("desc people_test").show()

We also tested a table created in spark-shell with spark.sql("create table XXX()"); in that case the modified columns were consistent between Spark and Hive.

  was:
We saved the created DataFrame as a Hive table in ORC/Parquet format from spark-shell. After we changed a column's type (int to double) on this table through Hive JDBC, the column type reported by spark-shell did not change, although it did change in Hive JDBC. After we restarted spark-shell, the column type still did not match the one shown in Hive JDBC.

The steps to reproduce are as follows:

spark-shell:
val df = spark.read.json("examples/src/main/resources/people.json");
df.write.format("orc").saveAsTable("people_test");
spark.catalog.refreshTable("people_test")
spark.sql("desc people_test").show()

hive:
alter table people_test change column age age1 double;
desc people_test;

spark-shell:
spark.sql("desc people_test").show()

We also tested a table created in spark-shell with spark.sql("create table XXX()"), and the modified columns also changed in Spark.

> Hive table created by Spark dataFrame has incompatible schema in spark and hive
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-25367
>                 URL: https://issues.apache.org/jira/browse/SPARK-25367
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell
>    Affects Versions: 2.2.1
>         Environment: spark2.2.1
>                      hive1.2.1
>            Reporter: yy
>            Priority: Major
>              Labels: sparksql
>
> We saved a DataFrame as a Hive table in ORC/Parquet format from spark-shell.
> After we changed a column's type (int to double) on this table through Hive
> JDBC, the column type reported by spark-shell did not change, although it did
> change in Hive JDBC. After we restarted spark-shell, the column type still did
> not match the one shown in Hive JDBC.
> The steps to reproduce are as follows:
> spark-shell:
> val df = spark.read.json("examples/src/main/resources/people.json");
> df.write.format("orc").saveAsTable("people_test");
> spark.catalog.refreshTable("people_test")
> spark.sql("desc people_test").show()
> hive:
> alter table people_test change column age age1 double;
> desc people_test;
> spark-shell:
> spark.sql("desc people_test").show()
>
> We also tested a table created in spark-shell with spark.sql("create table
> XXX()"); in that case the modified columns were consistent between Spark and
> Hive.
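
To make the reported mismatch easy to check, the following is a minimal sketch of a helper that compares the column list Spark reports with the one Hive reports. The object name SchemaCheck, the function schemaDiff, and the sample column values are hypothetical and are not part of the original report; the Hive side is assumed to be copied by hand from the output of "desc people_test".

```scala
object SchemaCheck {
  // Compare two schemas given as (columnName, typeName) pairs in column order.
  // Returns one message per position where the name or the type differs.
  def schemaDiff(sparkCols: Seq[(String, String)],
                 hiveCols: Seq[(String, String)]): Seq[String] = {
    val missing = ("<missing>", "")
    sparkCols.zipAll(hiveCols, missing, missing).collect {
      case (s, h) if s != h => s"spark=${s._1}:${s._2} hive=${h._1}:${h._2}"
    }
  }
}

// In spark-shell, the Spark side could be collected like this
// (assumes the people_test table from the report exists):
//   val sparkCols = spark.table("people_test").schema.fields
//     .map(f => (f.name, f.dataType.simpleString)).toSeq

// Illustrative values only, not output taken from the report.
val sparkCols = Seq(("age", "bigint"), ("name", "string"))
val hiveCols  = Seq(("age1", "double"), ("name", "string"))
val diff = SchemaCheck.schemaDiff(sparkCols, hiveCols)
// diff contains: "spark=age:bigint hive=age1:double"
```

An empty result would mean both sides agree; a non-empty result after the Hive "alter table" step would be consistent with the behaviour described above.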