Github user concretevitamin commented on the pull request:
https://github.com/apache/spark/pull/1238#issuecomment-48645064
To handle potential overflow (one last TODO), I think there are a couple
alternatives:
- A: Throw exceptions for overflowing operations. Similar to [1].
- B: Use [1], but replace the overflow situations with a Top and a Bottom
that absorb/saturate things correspondingly. Similar concepts here [2].
- C: Use [3] (or reimplement parts of it), which just carry out an
overflowing operation in the lifted BigInt counterparts.
I think Approach A is bad as when dealing with big data we'd almost
certainly run into this case in the future. Approach B is reasonable in that
whenever we see a Top/Bottom, we could just disable/special-case the cost
estimation. Approach C looks okay too but may be too heavy.
Let me know what do you guys think should go into this PR.
[1]
https://github.com/twitter/util/blob/master/util-core/src/main/scala/com/twitter/util/LongOverflowArith.scala
[2]
https://github.com/twitter/util/blob/master/util-core/src/main/scala/com/twitter/util/Duration.scala
[3]
https://github.com/non/spire/blob/master/core/src/main/scala/spire/math/SafeLong.scala
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---