[ https://issues.apache.org/jira/browse/SPARK-10439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730096#comment-14730096 ]
Marcelo Vanzin commented on SPARK-10439:
----------------------------------------
There's another overflow in {{toJulianDay}}, since the day count derived from
the timestamp may not fit in the day field of the format expected by Hive (32 bits).
A second issue in that method is that it can return negative day and nanosecond
values. The spec I have in hand (from Impala, which defined that format)
explicitly states negative values are not supported (high bit is ignored), and
while trying to read those using Hive code, I do run into issues (different
exceptions depending on the value). I haven't found a public version of the
document though. :-/
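
To make the second issue concrete, here is a minimal sketch of a conversion that cannot produce a negative nanosecond value and that rejects day values outside the non-negative 32-bit range. It is not the code in {{DateTimeUtils}}: the class and method names are hypothetical, the input is assumed to be microseconds since the Unix epoch, and the nanosecond part is assumed to count from midnight, with 2440588 as the Julian day number of 1970-01-01.

```java
// Hypothetical sketch, not Spark's implementation.
final class JulianDaySketch {
    static final long MICROS_PER_DAY = 86_400_000_000L;
    // Julian day number of 1970-01-01.
    static final long JULIAN_DAY_OF_EPOCH = 2_440_588L;

    // Returns {julianDay, nanosOfDay}; the input unit (microseconds since the
    // Unix epoch) and the midnight-based nanosecond field are assumptions.
    static long[] toJulianDayChecked(long micros) {
        // floorDiv/floorMod round toward negative infinity, so the remainder
        // is always in [0, MICROS_PER_DAY) even for pre-1970 inputs.
        long day = Math.addExact(Math.floorDiv(micros, MICROS_PER_DAY), JULIAN_DAY_OF_EPOCH);
        long nanosOfDay = Math.floorMod(micros, MICROS_PER_DAY) * 1000L;
        // Defensive range check: the day must be non-negative and fit in 32 bits.
        if (day < 0 || day > Integer.MAX_VALUE) {
            throw new ArithmeticException("Julian day out of range: " + day);
        }
        return new long[] { day, nanosOfDay };
    }
}
```

Whether an out-of-range value should throw or be clamped is a separate design question; the sketch throws so that the bad value is not silently written out.
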
> Catalyst should check for overflow / underflow of date and timestamp values
> ---------------------------------------------------------------------------
>
> Key: SPARK-10439
> URL: https://issues.apache.org/jira/browse/SPARK-10439
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.5.0
> Reporter: Marcelo Vanzin
> Priority: Minor
>
> While testing some code, I noticed that a few methods in {{DateTimeUtils}}
> are prone to overflow and underflow.
> For example, {{millisToDays}} can overflow the return type ({{Int}}) if a
> large enough input value is provided.
> Similarly, {{fromJavaTimestamp}} converts milliseconds to microseconds, which
> can overflow if the input is {{> Long.MAX_VALUE / 1000}} (or underflow in the
> negative case).
> There might be others but these were the ones that caught my eye.
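
The two overflows named in the description can be illustrated with checked arithmetic. This is a sketch only, not a proposed patch: the class and method names are hypothetical, and the day conversion ignores any time zone handling the real {{millisToDays}} may do.

```java
// Hypothetical checked counterparts of the conversions named in the issue.
final class CheckedTimeConversions {
    static final long MILLIS_PER_DAY = 86_400_000L;

    // Milliseconds since the epoch to whole days. toIntExact throws
    // ArithmeticException instead of silently truncating to 32 bits.
    static int millisToDaysChecked(long millis) {
        return Math.toIntExact(Math.floorDiv(millis, MILLIS_PER_DAY));
    }

    // Milliseconds to microseconds. multiplyExact throws ArithmeticException
    // once |millis| exceeds roughly Long.MAX_VALUE / 1000.
    static long millisToMicrosChecked(long millis) {
        return Math.multiplyExact(millis, 1000L);
    }
}
```

For example, {{millisToMicrosChecked(Long.MAX_VALUE / 1000 + 1)}} throws rather than wrapping around to a negative value.
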