[GitHub] spark pull request #19773: [SPARK-22546][SQL] Supporting for changing column...

xuanyuanking Tue, 11 Sep 2018 02:18:37 -0700

Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19773#discussion_r216600156
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ---
    @@ -350,16 +366,11 @@ case class AlterTableChangeColumnCommand(
             s"${schema.fieldNames.mkString("[`", "`, `", "`]")}"))
       }
     
    -  // Add the comment to a column, if comment is empty, return the original column.
    -  private def addComment(column: StructField, comment: Option[String]): StructField = {
    -    comment.map(column.withComment(_)).getOrElse(column)
    -  }
    -
    --- End diff --
    
    ```
    Although the query above doesn't work well, why do users change column types?
    ```
    As in the scenario described above, a user first defines the column as Int but later finds that a Long is needed. They can write the new data as Long and load it into new partitions. If we do not support the type change, the user has to recreate the whole table just to change the column type.
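
    A rough sketch of that scenario in HiveQL, for illustration only. The table name `events`, its columns, and the partition values are hypothetical, and I have not run this exact sequence; whether the old partition stays readable depends on the storage format.
    ```
    -- Hypothetical partitioned table whose id column starts out as INT.
    CREATE TABLE events (id INT, payload STRING) PARTITIONED BY (dt STRING);
    INSERT INTO events PARTITION (dt = '2018-09-10') VALUES (1, 'a');

    -- Later the user finds that INT is too narrow and widens the column.
    ALTER TABLE events CHANGE id id BIGINT;

    -- New data is written as BIGINT and loaded into a new partition.
    INSERT INTO events PARTITION (dt = '2018-09-11') VALUES (3000000000, 'b');

    -- The old partition was written while id was still INT.
    SELECT * FROM events;
    ```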
    
    Yep, as long as the data is not stored in a binary file format, the query works OK:
    ```
    Logging initialized using configuration in jar:file:/Users/XuanYuan/Source/hive/apache-hive-1.2.2-bin/lib/hive-common-1.2.2.jar!/hive-log4j.properties
    hive> CREATE TABLE t(a INT, b STRING, c INT);
    OK
    Time taken: 2.576 seconds
    hive> INSERT INTO t VALUES (1, 'a', 3);;
    Query ID = XuanYuan_20180911164348_32238a6c-b0a4-4cfd-aa3d-00a7628031cf
    Total jobs = 3
    Launching Job 1 out of 3
    Number of reduce tasks is set to 0 since there's no reduce operator
    Job running in-process (local Hadoop)
    2018-09-11 16:43:51,684 Stage-1 map = 100%,  reduce = 0%
    Ended Job = job_local1624238888_0001
    Stage-4 is selected by condition resolver.
    Stage-3 is filtered out by condition resolver.
    Stage-5 is filtered out by condition resolver.
    Moving data to: file:/Users/XuanYuan/Source/hive/apache-hive-1.2.2-bin/warehouse/t/.hive-staging_hive_2018-09-11_16-43-48_117_2262603440504094412-1/-ext-10000
    Loading data to table default.t
    Table default.t stats: [numFiles=1, numRows=1, totalSize=6, rawDataSize=5]
    MapReduce Jobs Launched:
    Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 SUCCESS
    Total MapReduce CPU Time Spent: 0 msec
    OK
    Time taken: 4.025 seconds
    hive> select * from t;;
    OK
    1   a       3
    Time taken: 0.164 seconds, Fetched: 1 row(s)
    hive> ALTER TABLE t CHANGE a a STRING;
    OK
    Time taken: 0.177 seconds
    hive> select * from t;
    OK
    1   a       3
    Time taken: 0.12 seconds, Fetched: 1 row(s)
    hive> quit;
    ```


