Srinath created SPARK-17158:
-------------------------------

             Summary: Improve error message for numeric literal parsing
                 Key: SPARK-17158
                 URL: https://issues.apache.org/jira/browse/SPARK-17158
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.0.0
            Reporter: Srinath
            Priority: Minor


Spark currently gives confusing and inconsistent error messages for numeric 
literals. For example:
scala> sql("select 123456Y")
org.apache.spark.sql.catalyst.parser.ParseException:
Value out of range. Value:"123456" Radix:10(line 1, pos 7)

== SQL ==
select 123456Y
-------^^^
scala> sql("select 123456S")
org.apache.spark.sql.catalyst.parser.ParseException:
Value out of range. Value:"123456" Radix:10(line 1, pos 7)

== SQL ==
select 123456S
-------^^^
scala> sql("select 12345623434523434564565L")
org.apache.spark.sql.catalyst.parser.ParseException:
For input string: "12345623434523434564565"(line 1, pos 7)

== SQL ==
select 12345623434523434564565L
-------^^^
The problem is that we rely on the JDK's parsing implementations, and 
those functions throw exceptions with different messages. This code lives in 
the AstBuilder.numericLiteral function.
The proposal is that instead of using `_.toByte` (and its siblings) to turn 
the string directly into a numeric value, we always parse the numeric literal 
string into a BigDecimal first, validate that it fits the target type's range, 
and only then narrow it. This gives us full control over the error message.
If BigDecimal fails to parse the number, we should throw a clearer exception 
than "For input string ...".
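A minimal sketch of the proposed approach (illustrative only, not the actual 
Spark patch; the function name parseTinyInt and the error wording are 
hypothetical):

```scala
import scala.util.Try

// Parse the digits of a TINYINT literal via BigDecimal first, then
// validate the range before narrowing to Byte. The same pattern would
// apply to SMALLINT (Short) and BIGINT (Long) literals.
def parseTinyInt(digits: String): Byte = {
  val v = Try(BigDecimal(digits)).getOrElse(
    // Replaces the JDK's opaque "For input string ..." message.
    throw new IllegalArgumentException(s"Invalid numeric literal: $digits"))
  if (v < BigDecimal(Byte.MinValue) || v > BigDecimal(Byte.MaxValue)) {
    // Replaces the JDK's "Value out of range. Value:... Radix:10" message.
    throw new IllegalArgumentException(
      s"Numeric literal $digits does not fit in the TINYINT range " +
        s"[${Byte.MinValue}, ${Byte.MaxValue}]")
  }
  v.toByte
}
```

With this, `select 123456Y` would report that 123456 is out of the TINYINT 
range [-128, 127], instead of the current radix message.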



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
