[GitHub] spark issue #20846: [SPARK-5498][SQL][FOLLOW] add schema to table partition
Github user srowen commented on the issue: https://github.com/apache/spark/pull/20846 What JIRA was this really about? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20846: [SPARK-5498][SQL][FOLLOW] add schema to table partition
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20846 Can one of the admins verify this patch?
[GitHub] spark issue #20846: [SPARK-5498][SQL][FOLLOW] add schema to table partition
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20846 Right, @gatorsmile .
[GitHub] spark issue #20846: [SPARK-5498][SQL][FOLLOW] add schema to table partition
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20846 We do not allow users to change a table's column types. Currently, when users issue the command through Spark, only column comments can be changed. However, users can still change the column type through Hive, so there is nothing we can do from the Spark side, right?
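Editor's note: the rule described in the comment above (comment-only changes accepted, type changes rejected) can be sketched as a small standalone check. This is only an illustration under stated assumptions: `Column`, `ColumnChangeCheck`, and `validateChange` are hypothetical names invented for this sketch, types are compared as plain strings, and none of this is Spark's actual implementation.

```scala
// Hypothetical, simplified stand-in for a column definition.
// This is NOT Spark's StructField; types are plain strings for illustration.
final case class Column(name: String, dataType: String, comment: Option[String] = None)

object ColumnChangeCheck {
  /** Returns Right(updated) when only the comment differs, Left(reason) otherwise. */
  def validateChange(existing: Column, updated: Column): Either[String, Column] = {
    if (!existing.name.equalsIgnoreCase(updated.name)) {
      // Assumption: renames are rejected, since only comments may change.
      Left(s"renaming column '${existing.name}' to '${updated.name}' is not supported")
    } else if (!existing.dataType.equalsIgnoreCase(updated.dataType)) {
      Left(s"changing column '${existing.name}' from ${existing.dataType} " +
        s"to ${updated.dataType} is not supported")
    } else {
      // Name and type are unchanged, so at most the comment differs.
      Right(updated)
    }
  }

  def main(args: Array[String]): Unit = {
    val a = Column("a", "string")
    // A comment-only change is accepted.
    println(validateChange(a, a.copy(comment = Some("user id"))))
    // A type change (string to bigint, as in this thread) is rejected.
    println(validateChange(a, a.copy(dataType = "bigint")))
  }
}
```

A check of this kind only covers changes issued through the component that runs it. As the comment notes, a change made directly through Hive bypasses it.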
[GitHub] spark issue #20846: [SPARK-5498][SQL][FOLLOW] add schema to table partition
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20846

Definitely, +1 for your idea. Since that is different from this PR, could you open a separate PR for it instead of using this one?

> We should prevent users from changing a table's column type
[GitHub] spark issue #20846: [SPARK-5498][SQL][FOLLOW] add schema to table partition
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/20846 The exception is not thrown in `ALTER TABLE`. We should prevent users from changing a table's column type. But for historical data, should we take some compatibility measures?
[GitHub] spark issue #20846: [SPARK-5498][SQL][FOLLOW] add schema to table partition
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20846

@liutang123, Spark should not do this kind of risky thing. Hive 2.3.2 also disallows incompatible schema changes like the following.

```sql
hive> CREATE TABLE test_par(a string) PARTITIONED BY (b bigint) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat';
OK
Time taken: 0.262 seconds

hive> ALTER TABLE test_par CHANGE a a bigint RESTRICT;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions : a

hive> SELECT VERSION();
OK
2.3.2 r857a9fd8ad725a53bd95c1b2d6612f9b1155f44d
Time taken: 0.711 seconds, Fetched: 1 row(s)
```

cc @gatorsmile.
[GitHub] spark issue #20846: [SPARK-5498][SQL][FOLLOW] add schema to table partition
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20846

@liutang123, did you test this with the latest Apache Spark 2.3? Apache Spark 2.3 works without any problem with your example.

```scala
scala> sql("create table test_par(a string) PARTITIONED BY (b bigint) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'")
res0: org.apache.spark.sql.DataFrame = []

hive> ALTER TABLE test_par CHANGE a a bigint restrict;
OK
Time taken: 1.358 seconds

scala> sql("select * from test_par").show
18/03/16 17:33:52 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
18/03/16 17:33:53 WARN HiveExternalCatalog: The table schema given by Hive metastore(struct) is different from the schema when this table was created by Spark SQL(struct). We have to fall back to the table schema from Hive metastore which is not case preserving.
18/03/16 17:33:54 WARN HiveExternalCatalog: The table schema given by Hive metastore(struct) is different from the schema when this table was created by Spark SQL(struct). We have to fall back to the table schema from Hive metastore which is not case preserving.
+---+---+
|  a|  b|
+---+---+
+---+---+

scala> sc.version
res1: String = 2.3.0
```

So, please include a test case for this PR. You may insert some data to illustrate your issue.