GitHub user datumbox commented on the issue:
https://github.com/apache/spark/pull/17059
@srowen The following snippet handles Longs explicitly. It could be rewritten
to remove the duplicated code by introducing booleans for overflow detection,
but I don't think it is worth it. In theory you could also match other types
explicitly, such as Byte and Short, but I think that is overkill.
As far as I can see, all SQL numeric types inherit from Number, so comparing
doubleValue with intValue is enough to check whether a value is within
integer range.
```scala
val u = udf { (n: Any) =>
  n match {
    case v: Int => v // already an Int, no cast needed
    case v: Long =>
      // The round trip through Int only preserves values within Int range.
      val intV = v.intValue
      if (v == intV) {
        intV
      } else {
        throw new IllegalArgumentException("out of range")
      }
    //case v: Byte => v.toInt
    //case v: Short => v.toInt
    case v: Number =>
      // Catches every other numeric type; also rejects non-integral values.
      val intV = v.intValue
      if (v.doubleValue == intV) {
        intV
      } else {
        throw new IllegalArgumentException("out of range")
      }
    case _ => throw new IllegalArgumentException("invalid type")
  }
}
```
Personally, I would remove the explicit Long case, as it introduces
duplicate code and does not help much. The remaining snippet avoids any
casting if the ID is an Int (which should be the majority of cases and yields
the biggest memory/speed gains) or non-numeric, and handles all corner cases
(all Scala/Java numeric types plus SQL numerics). Agree?
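
For reference, here is a minimal sketch of that simplified version, written as a plain Scala function so it can be tried without Spark. The name `toIntId` and the exception messages are only illustrative and are not part of the PR; inside the PR the same match would sit in the body of the udf.

```scala
// Hypothetical standalone helper (name chosen for illustration only).
def toIntId(n: Any): Int = n match {
  // Already an Int: returned as-is, no cast and no range check.
  case v: Int => v
  // Any other boxed numeric (Long, Short, Byte, Double, BigDecimal, ...).
  case v: Number =>
    val intV = v.intValue
    // Equal only when the value is integral and fits in the Int range.
    if (v.doubleValue == intV) intV
    else throw new IllegalArgumentException(s"out of range: $v")
  // Everything else (String, null, ...) is rejected.
  case _ => throw new IllegalArgumentException("invalid type")
}
```

With this sketch, `toIntId(5L)` and `toIntId(7.0)` return 5 and 7, while `toIntId(3000000000L)`, `toIntId(2.5)` and `toIntId("abc")` all throw IllegalArgumentException.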