[ https://issues.apache.org/jira/browse/SPARK-25367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
yy updated SPARK-25367:
-----------------------
Description:
We saved a DataFrame as a Hive table in ORC/Parquet format from the spark-shell. After we changed a column's type (int to double) on this table through Hive JDBC, the column type queried in the spark-shell did not change, although it did change in Hive JDBC. Even after restarting the spark-shell, the table's column type remained incompatible with the one shown in Hive JDBC.

The steps to reproduce are as follows:

spark-shell:
{code:java}
val df = spark.read.json("examples/src/main/resources/people.json")
df.write.format("orc").saveAsTable("people_test")
spark.catalog.refreshTable("people_test")
spark.sql("desc people_test").show()
{code}

hive:
{code:java}
alter table people_test change column age age1 double;
desc people_test;
{code}

spark-shell:
{code:java}
spark.sql("desc people_test").show()
{code}

We also tested a table created directly in the spark-shell with spark.sql("create table XXX()"); in that case the modified columns stay consistent between Spark and Hive.
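For context, when a table is created via saveAsTable, Spark persists its own JSON copy of the schema in Hive table properties (e.g. spark.sql.sources.schema.part.0), and Hive's ALTER TABLE does not rewrite those properties, which would explain the stale column type seen above. A minimal sketch to inspect the stored copy, assuming a spark-shell with Hive support (property names vary across Spark versions):

{code:java}
// Sketch, assuming a Hive-enabled spark-shell and the people_test table above.
// Lists the table properties Spark wrote to the metastore; look for the
// spark.sql.sources.schema.* entries still describing age as an int.
spark.sql("SHOW TBLPROPERTIES people_test").show(truncate = false)
{code}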
> Hive table created by Spark DataFrame has incompatible schema in Spark and
> Hive
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-25367
>                 URL: https://issues.apache.org/jira/browse/SPARK-25367
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell, SQL
>    Affects Versions: 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.3.1
>         Environment: spark2.2.1-hadoop-2.6.0-chd-5.4.2
> hive-1.2.1
>            Reporter: yy
>            Priority: Major
>              Labels: sparksql
>             Fix For: 2.3.2
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org