gengliangwang commented on a change in pull request #25300: [SPARK-28503][SQL]
Return null result on cast an out-of-range value to a integral type
URL: https://github.com/apache/spark/pull/25300#discussion_r309043580
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
##########
@@ -456,9 +474,52 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String
case DateType =>
buildCast[Int](_, d => null)
case TimestampType =>
- buildCast[Long](_, t => timestampToLong(t).toInt)
- case x: NumericType =>
- b => x.numeric.asInstanceOf[Numeric[Any]].toInt(b)
+ buildCast[Long](_, t => {
+ val longValue = timestampToLong(t)
+ if (longValue == longValue.toInt) {
+ longValue.toInt
+ } else {
+ null
+ }
+ })
+ case ByteType =>
+ b => b.asInstanceOf[Byte].toInt
+ case ShortType =>
+ b => b.asInstanceOf[Short].toInt
+ case LongType =>
+ buildCast[Long](_, l =>
+ if (l == l.toInt) {
+ l.toInt
+ } else {
+ null
+ }
+ )
+ case x: FloatType =>
+ buildCast[Float](_, f =>
+ if (f <= Int.MaxValue && f >= Int.MinValue) {
+ f.toInt
Review comment:
Actually it is tricky to compare a `float` with an `int` when the value is around
`Int.MaxValue` or `Int.MinValue`.
```
scala> BigDecimal((Int.MaxValue + 1L).toString).toFloat <= Int.MaxValue
res1: Boolean = true
```
This is because `float` is only 32 bits wide and spends 8 of them on the exponent,
leaving a 24-bit significand, while `int` uses all 32 bits for the value. Integers
near `Int.MaxValue` or `Int.MinValue` are therefore rounded when promoted to
`float` (here `Int.MaxValue` rounds up to 2^31), so the comparison is not accurate.
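For reference, one way to dodge the rounding is to do the bounds check in `double`
space, where every `Int` value is exactly representable (a `double` has a 52-bit
significand). A minimal sketch, not the code in this PR; `floatToIntOrNull` is just
an illustrative helper name:
```
// Minimal sketch: range-check a Float against Int bounds using Double,
// where Int.MaxValue and Int.MinValue are represented exactly.
def floatToIntOrNull(f: Float): Any = {
  val d = f.toDouble
  // 2147483648f (== (Int.MaxValue + 1L).toFloat) is correctly rejected here,
  // because 2147483647 does not round when promoted to Double.
  if (d >= Int.MinValue && d <= Int.MaxValue) f.toInt else null
}

floatToIntOrNull(123.9f)                       // 123
floatToIntOrNull((Int.MaxValue + 1L).toFloat)  // null
floatToIntOrNull(Float.NaN)                    // null (plain f.toInt would give 0)
```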