[
https://issues.apache.org/jira/browse/SPARK-21811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon resolved SPARK-21811.
----------------------------------
Resolution: Fixed
Assignee: Jiang Xingbo
Fix Version/s: 2.4.0
Fixed in https://github.com/apache/spark/pull/21074
> Inconsistency when finding the widest common type of a combination of
> DateType, StringType, and NumericType
> -----------------------------------------------------------------------------------------------------------
>
> Key: SPARK-21811
> URL: https://issues.apache.org/jira/browse/SPARK-21811
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.3.0
> Reporter: Ryan Bald
> Assignee: Jiang Xingbo
> Priority: Minor
> Fix For: 2.4.0
>
>
> Finding the widest common type for the arguments of a variadic function (such
> as IN or COALESCE) is order-dependent when the argument types are a
> combination of DateType/TimestampType, StringType, and NumericType: some
> argument orders fail with an AnalysisException, while others succeed with a
> common type of StringType.
> The examples below reproduce the error and assume a schema of:
> {{[c1: date, c2: string, c3: int]}}
> The following succeeds:
> {{SELECT coalesce(c1, c2, c3) FROM table}}
> While the following produces an exception:
> {{SELECT coalesce(c1, c3, c2) FROM table}}
> The order of arguments affects the behavior because the widest common type
> appears to be found by repeatedly looking at two arguments at a time: the
> widest common type found thus far and the next argument. In the failing
> query, c1 (date) and c3 (int) are compared first and presumably have no
> common type, so analysis fails before c2 (string) is considered. My initial
> thought for a fix is that the way the widest common type is found would have
> to change to look at all arguments first before deciding what the widest
> common type should be.
> As my boss is out of office for the rest of the day, I will give a pull
> request a shot, but as I am not super familiar with Scala or Spark's coding
> style guidelines, a pull request is not promised. In my attempted pull
> request, I will assume that having DateType/TimestampType, StringType, and
> NumericType arguments in an IN expression or COALESCE function (or any other
> function/expression where this combination of argument types can occur) is
> valid. I also find it quite reasonable for this combination of argument types
> to be invalid, so if that is what is decided, then oh well.
> If I were a betting man, I'd say the fix would be made in the following file:
> [TypeCoercion.scala|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala]
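
The order dependence described in the report can be illustrated with a small toy model. This is only a sketch in Python, not Spark's code: the `PAIRWISE` table, the type names, and the functions `widen_pair`, `fold_widest`, and `widest_of_all` are all hypothetical, and the table is inferred solely from the two queries in the report. The second function shows one possible way to "look at all arguments first"; it is not necessarily how the linked pull request resolved the issue.

```python
from functools import reduce
from typing import Optional, Sequence

# Hypothetical pairwise widening table, inferred from the report:
# date/string and string/int widen to string; date/int has no common type.
PAIRWISE = {
    frozenset({"date", "string"}): "string",
    frozenset({"string", "int"}): "string",
}


def widen_pair(a: Optional[str], b: str) -> Optional[str]:
    """Widest common type of two types, or None if there is none."""
    if a is None:
        return None  # an earlier step already failed
    if a == b:
        return a
    return PAIRWISE.get(frozenset({a, b}))


def fold_widest(types: Sequence[str]) -> Optional[str]:
    """Order-dependent: combine the running result with the next argument."""
    return reduce(widen_pair, types[1:], types[0])


def widest_of_all(types: Sequence[str]) -> Optional[str]:
    """Order-independent: pick a type that every argument can widen to."""
    distinct = sorted(set(types))
    for candidate in distinct:
        if all(widen_pair(candidate, t) == candidate for t in distinct):
            return candidate
    return None


print(fold_widest(["date", "string", "int"]))    # coalesce(c1, c2, c3)
print(fold_widest(["date", "int", "string"]))    # coalesce(c1, c3, c2)
print(widest_of_all(["date", "int", "string"]))  # same arguments, any order
```

In this model the pairwise fold returns `string` for the first ordering and `None` for the second, mirroring the success and the AnalysisException in the report, while the variant that inspects all arguments returns `string` regardless of order.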