Ryan Bald created SPARK-21811:
---------------------------------

             Summary: Inconsistency when finding the widest common type of a 
combination of DateType, StringType, and NumericType
                 Key: SPARK-21811
                 URL: https://issues.apache.org/jira/browse/SPARK-21811
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.3.0
            Reporter: Ryan Bald
            Priority: Minor


Finding the widest common type for the arguments of a variadic function (such 
as IN or COALESCE) is order-dependent when the argument types are a 
combination of DateType/TimestampType, StringType, and NumericType: some 
argument orders fail with an AnalysisException, while others succeed with a 
common type of StringType.

The examples below, which reproduce the error, assume a schema of:
{{[c1: date, c2: string, c3: int]}}

The following succeeds:
{{SELECT coalesce(c1, c2, c3) FROM table}}

While the following produces an exception:
{{SELECT coalesce(c1, c3, c2) FROM table}}

The order of the arguments appears to matter because the widest common type is 
found by looking at two types at a time: the widest common type found so far 
and the type of the next argument. My initial thought on a fix is that the 
search would have to change so that it looks at all of the arguments before 
deciding what the widest common type should be.
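
To illustrate the order dependence described above, here is a minimal Python 
sketch. It is not Spark's implementation: the type names are plain strings, 
and the pairwise rule (StringType absorbs any other type, while date and int 
have no common type) is an assumption chosen only to match the behavior 
reported in this issue. All function names are hypothetical, and the argument 
list is assumed to be non-empty.

```python
from functools import reduce
from typing import Optional, Sequence


def wider_of_two(a: Optional[str], b: str) -> Optional[str]:
    """Toy pairwise rule (an assumption, not Spark's code)."""
    if a is None:
        return None  # an earlier pair already failed, so stay failed
    if a == b:
        return a
    if "string" in (a, b):
        return "string"  # assumed: string absorbs date and int
    return None  # assumed: date and int have no common type


def pairwise_common_type(types: Sequence[str]) -> Optional[str]:
    """Fold left to right over (widest type so far, next argument)."""
    return reduce(wider_of_two, types[1:], types[0])


def whole_list_common_type(types: Sequence[str]) -> Optional[str]:
    """One possible order-independent rule: inspect every type first."""
    distinct = set(types)
    if len(distinct) == 1:
        return types[0]
    if "string" in distinct:
        return "string"
    return None


print(pairwise_common_type(["date", "string", "int"]))    # string
print(pairwise_common_type(["date", "int", "string"]))    # None
print(whole_list_common_type(["date", "int", "string"]))  # string
```

In this sketch the pairwise fold fails for (date, int, string) because the 
first pair, date and int, has no common type, so the string argument is never 
reached. The whole-list variant gives the same answer for every ordering, 
which is one way to read the fix suggested above.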

As my boss is out of office for the rest of the day, I will give a pull 
request a shot, but as I am not very familiar with Scala or Spark's coding 
style guidelines, a pull request is not promised. In my attempt, I will assume 
that having DateType/TimestampType, StringType, and NumericType arguments in 
an IN expression or COALESCE function (and any other function/expression where 
this combination of argument types can occur) is valid. I would also find it 
quite reasonable for this combination of argument types to be invalid, so if 
that is what is decided, so be it.

If I were a betting man, I'd say the fix would be made in the following file: 
[TypeCoercion.scala|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala]


