[spark] branch branch-3.0 updated: [SPARK-30937][DOC] Group Hive upgrade guides together

wenchen Thu, 27 Feb 2020 05:37:30 -0800

This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new b00895c  [SPARK-30937][DOC] Group Hive upgrade guides together
b00895c is described below

commit b00895ceded4da49793314833e5442249d05f461
Author: yi.wu <[email protected]>
AuthorDate: Thu Feb 27 21:29:42 2020 +0800

    [SPARK-30937][DOC] Group Hive upgrade guides together
    
    ### What changes were proposed in this pull request?
    
    This PR groups all Hive-upgrade-related migration guides for Spark 3.0 together.
    
    It also documents another behavior change of `ScriptTransform` in the new Hive section.
    
    ### Why are the changes needed?
    
    Make the doc clearer to users.
    
    ### Does this PR introduce any user-facing change?
    
    No, new doc for Spark 3.0.
    
    ### How was this patch tested?
    
    N/A.
    
    Closes #27670 from Ngone51/hive_migration.
    
    Authored-by: yi.wu <[email protected]>
    Signed-off-by: Wenchen Fan <[email protected]>
    (cherry picked from commit 22dfd15a4574a5cccdc54c96f11de28d58363016)
    Signed-off-by: Wenchen Fan <[email protected]>
---
 docs/sql-migration-guide.md                                    | 10 +++++++---
 .../spark/sql/hive/execution/ScriptTransformationSuite.scala   |  5 ++---
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 7e0a536..d241a66 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -254,7 +254,7 @@ license: |
         </tr>
     </table>
     
-  - Since Spark 3.0, CREATE TABLE without a specific provider will use the value of `spark.sql.sources.default` as its provider. In Spark version 2.4 and earlier, it was hive. To restore the behavior before Spark 3.0, you can set `spark.sql.legacy.createHiveTableByDefault.enabled` to `true`.
+  - Since Spark 3.0, `CREATE TABLE` without a specific provider will use the value of `spark.sql.sources.default` as its provider. In Spark version 2.4 and earlier, it was hive. To restore the behavior before Spark 3.0, you can set `spark.sql.legacy.createHiveTableByDefault.enabled` to `true`.
 
   - Since Spark 3.0, the unary arithmetic operator plus(`+`) only accepts string, numeric and interval type values as inputs. Besides, `+` with a integral string representation will be coerced to double value, e.g. `+'1'` results `1.0`. In Spark version 2.4 and earlier, this operator is ignored. There is no type checking for it, thus, all type values with a `+` prefix are valid, e.g. `+ array(1, 2)` is valid and results `[1, 2]`. Besides, there is no type coercion for it at all, e.g. in  [...]
 
@@ -332,10 +332,14 @@ license: |
 
   - Since Spark 3.0, `SHOW CREATE TABLE` will always return Spark DDL, even when the given table is a Hive serde table. For generating Hive DDL, please use `SHOW CREATE TABLE AS SERDE` command instead.
 
-  - Since Spark 3.0, we upgraded the built-in Hive from 1.2 to 2.3. This may need to set `spark.sql.hive.metastore.version` and `spark.sql.hive.metastore.jars` according to the version of the Hive metastore.
+  - Since Spark 3.0, we upgraded the built-in Hive from 1.2 to 2.3 and it brings following impacts:
+  
+    - You may need to set `spark.sql.hive.metastore.version` and `spark.sql.hive.metastore.jars` according to the version of the Hive metastore you want to connect to.
   For example: set `spark.sql.hive.metastore.version` to `1.2.1` and `spark.sql.hive.metastore.jars` to `maven` if your Hive metastore version is 1.2.1.
   
-  - Since Spark 3.0, we upgraded the built-in Hive from 1.2 to 2.3. You need to migrate your custom SerDes to Hive 2.3 or build your own Spark with `hive-1.2` profile. See HIVE-15167 for more details.
+    - You need to migrate your custom SerDes to Hive 2.3 or build your own Spark with `hive-1.2` profile. See HIVE-15167 for more details.
+
+    - The decimal string representation can be different between Hive 1.2 and Hive 2.3 when using `TRANSFORM` operator in SQL for script transformation, which depends on hive's behavior. In Hive 1.2, the string representation omits trailing zeroes. But in Hive 2.3, it is always padded to 18 digits with trailing zeroes if necessary.
 
 ## Upgrading from Spark SQL 2.4.4 to 2.4.5
 
diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/ScriptTransformationSuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/ScriptTransformationSuite.scala
index 7d01fc5..7153d3f 100644
--- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/ScriptTransformationSuite.scala
+++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/ScriptTransformationSuite.scala
@@ -212,9 +212,8 @@ class ScriptTransformationSuite extends SparkPlanTest with SQLTestUtils with Tes
           |FROM v
         """.stripMargin)
 
-      // In Hive1.2, it does not do well on Decimal conversion. For example, in this case,
-      // it converts a decimal value's type from Decimal(38, 18) to Decimal(1, 0). So we need
-      // do extra cast here for Hive1.2. But in Hive2.3, it still keeps the original Decimal type.
+      // In Hive 1.2, the string representation of a decimal omits trailing zeroes.
+      // But in Hive 2.3, it is always padded to 18 digits with trailing zeroes if necessary.
       val decimalToString: Column => Column = if (HiveUtils.isHive23) {
         c => c.cast("string")
       } else {

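As a rough illustration of the metastore settings named in the migration-guide hunk above, the following sketch sets both options when building a session. It assumes `spark-sql` and `spark-hive` are on the classpath and that the job is launched through `spark-submit` (so the master is supplied externally). The object name and application name are hypothetical; only the two configuration keys and their values come from the guide text.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical entry point; the name is illustrative only.
object MetastoreConfigSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("metastore-config-sketch") // hypothetical application name
      // Values taken from the guide's example for a Hive 1.2.1 metastore.
      .config("spark.sql.hive.metastore.version", "1.2.1")
      .config("spark.sql.hive.metastore.jars", "maven")
      .enableHiveSupport()
      .getOrCreate()

    // Any catalog access goes through the configured metastore client.
    spark.sql("SHOW DATABASES").show()

    spark.stop()
  }
}
```

The same two keys could equally be passed with `--conf` on the command line; the builder form is used here only to keep the sketch in one place.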

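The decimal behavior change described in the new guide entry can be pictured with plain `java.math.BigDecimal`. This is only a sketch of the two output shapes (trailing zeroes omitted versus padded to 18 fractional digits); it does not call Hive or Spark, and the object and method names are hypothetical.

```scala
import java.math.BigDecimal

// Hypothetical helper; it imitates the two string shapes, not Hive's own code.
object DecimalStringSketch {
  // Shape described for Hive 1.2: trailing zeroes are omitted.
  def withoutTrailingZeroes(d: BigDecimal): String =
    d.stripTrailingZeros().toPlainString()

  // Shape described for Hive 2.3: fractional part padded with zeroes to `scale` digits.
  // setScale without a rounding mode throws ArithmeticException if `d` has more
  // than `scale` significant fractional digits, so this sketch assumes it does not.
  def paddedToScale(d: BigDecimal, scale: Int = 18): String =
    d.setScale(scale).toPlainString()

  def main(args: Array[String]): Unit = {
    val value = new BigDecimal("1.50")
    println(withoutTrailingZeroes(value))
    println(paddedToScale(value))
  }
}
```

A script that parses `TRANSFORM` output as text would see the first shape under Hive 1.2 and the second under Hive 2.3, which is why the test in the diff casts differently depending on `HiveUtils.isHive23`.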
