Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/18853#discussion_r150227062
--- Diff: docs/sql-programming-guide.md ---
@@ -1460,6 +1460,13 @@ that these options will be deprecated in future
release as more optimizations ar
Configures the number of partitions to use when shuffling data for
joins or aggregations.
</td>
</tr>
+ <tr>
+ <td><code>spark.sql.typeCoercion.mode</code></td>
+ <td><code>legacy</code></td>
+ <td>
+ The <code>legacy</code> type coercion mode was used in Spark prior
to 2.3, and it remains the default to avoid breaking existing behavior.
However, it has logical inconsistencies. The <code>hive</code> mode is
preferred for most new applications, though it may require additional manual
casting.
--- End diff --
I don't agree that Hive's type coercion rules are the most reasonable. One
example is casting both sides to double when comparing a string and a long,
which may lead to wrong results because of precision loss.
I'd like to stay neutral here and just say users can choose different type
coercion modes, like hive, mysql, etc. By default it's spark.
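The precision hazard can be sketched outside Spark with plain Python, since it is a property of IEEE 754 doubles rather than of any particular engine. A double has 53 bits of significand, so longs above 2^53 are not all representable, and two distinct values can compare equal once both sides are coerced to double (the value `9007199254740993` below is just an illustrative pick):

```python
# A string/long comparison under double coercion, as in the Hive-style rule
# the comment criticizes. 2**53 + 1 is not exactly representable as a double.
s = "9007199254740993"          # string side of the comparison (2**53 + 1)
n = 9007199254740992            # long side of the comparison (2**53)

# Exact integer comparison: the values differ.
print(int(s) == n)              # False

# Coerce both sides to double first: both round to 9007199254740992.0,
# so two different values spuriously compare equal.
print(float(s) == float(n))     # True
```

The second comparison returning `True` is the "wrong result because of precision loss" referred to above.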
---