[ 
https://issues.apache.org/jira/browse/SPARK-10439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730096#comment-14730096
 ] 

Marcelo Vanzin commented on SPARK-10439:
----------------------------------------

There's another overflow in {{toJulianDay}}, since the day count computed from 
the timestamp may not fit in the field the format expected by Hive uses for it 
(32 bits). 

A second issue in that method is that it can return negative day and nanosecond 
values. The spec I have in hand (from Impala, which defined that format) 
explicitly states negative values are not supported (high bit is ignored), and 
while trying to read those using Hive code, I do run into issues (different 
exceptions depending on the value). I haven't found a public version of the 
document though. :-/
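
To make the negative-value problem concrete, here is a minimal sketch of a checked conversion. It is not the actual {{toJulianDay}} code: the class and method names are hypothetical, the input is assumed to be microseconds since the Unix epoch, and the Julian day number used for 1970-01-01 (2440588) is an assumption that should be checked against the format's spec. Floor division keeps the nanoseconds-of-day part non-negative, and a negative day is rejected rather than written out.

```java
// Hypothetical sketch, not Spark's DateTimeUtils.
final class JulianDaySketch {
    static final long MICROS_PER_DAY = 86_400_000_000L;
    // Assumption: Julian day number of 1970-01-01.
    static final long JULIAN_DAY_OF_EPOCH = 2_440_588L;

    /** Returns {julianDay, nanosOfDay}; throws if the day is out of range. */
    static long[] toJulianDayChecked(long micros) {
        // floorDiv/floorMod round toward negative infinity, so the
        // time-of-day part is always in [0, MICROS_PER_DAY).
        long daysSinceEpoch = Math.floorDiv(micros, MICROS_PER_DAY);
        long microsOfDay = Math.floorMod(micros, MICROS_PER_DAY);
        long julianDay = daysSinceEpoch + JULIAN_DAY_OF_EPOCH;
        if (julianDay < 0) {
            throw new ArithmeticException("negative Julian day: " + julianDay);
        }
        // Guard the narrowing to the 32-bit day field.
        int day = Math.toIntExact(julianDay);
        return new long[] { day, microsOfDay * 1000L };
    }
}
```

With plain {{/}} and {{%}}, an input of -1 microsecond would instead yield a negative nanosecond value, which is the case the spec says is unsupported.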

> Catalyst should check for overflow / underflow of date and timestamp values
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-10439
>                 URL: https://issues.apache.org/jira/browse/SPARK-10439
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: Marcelo Vanzin
>            Priority: Minor
>
> While testing some code, I noticed that a few methods in {{DateTimeUtils}} 
> are prone to overflow and underflow.
> For example, {{millisToDays}} can overflow the return type ({{Int}}) if a 
> large enough input value is provided.
> Similarly, {{fromJavaTimestamp}} converts milliseconds to microseconds, which 
> can overflow if the input is {{> Long.MAX_VALUE / 1000}} (or underflow in the 
> negative case).
> There might be others but these were the ones that caught my eye.
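
The two cases in the quoted description could be guarded along these lines. This is only a sketch: the class and method names are hypothetical stand-ins rather than the real {{DateTimeUtils}} signatures, and the day conversion assumes UTC with no time zone offset applied. It relies on {{Math.multiplyExact}} and {{Math.toIntExact}}, which throw {{ArithmeticException}} instead of silently wrapping.

```java
// Hypothetical sketch, not Spark's DateTimeUtils.
final class OverflowChecksSketch {
    static final long MILLIS_PER_DAY = 86_400_000L;

    /** Milliseconds to microseconds; throws if the product overflows a long. */
    static long millisToMicrosChecked(long millis) {
        return Math.multiplyExact(millis, 1000L);
    }

    /** Milliseconds to whole days (UTC assumed); throws if the result does not fit in an int. */
    static int millisToDaysChecked(long millis) {
        // floorDiv keeps pre-epoch instants on the correct (earlier) day.
        return Math.toIntExact(Math.floorDiv(millis, MILLIS_PER_DAY));
    }
}
```

Whether throwing is the right behavior, as opposed to clamping or returning null, is a separate design question that this sketch does not settle.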
