This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
     new 83ae6f4084f [SPARK-38796][SQL] Implement the to_number and 
try_to_number SQL functions according to a new specification
83ae6f4084f is described below

commit 83ae6f4084f1ebbff96df27f076a283515424c9c
Author: Daniel Tenedorio <[email protected]>
AuthorDate: Mon Apr 11 17:40:39 2022 +0800

    [SPARK-38796][SQL] Implement the to_number and try_to_number SQL functions 
according to a new specification
    
    ### What changes were proposed in this pull request?
    
    This PR implements the `to_number` and `try_to_number` SQL function 
expressions according to new semantics described below. The former is 
equivalent to the latter except that it throws an exception instead of 
returning NULL for cases where the input string does not match the format 
string.
    
    -----------
    
    # `try_to_number` function
    
    Returns `expr` cast to DECIMAL using formatting `fmt`, or `NULL` if `expr` 
is not a valid match for the given format.
    
    ## Syntax
    
    ```
    try_to_number(expr, fmt)
    fmt
      { ' [ S ] [ L | $ ]
          [ 0 | 9 | G | , ] [...]
          [ . | D ]
          [ 0 | 9 ] [...]
          [ L | $ ] [ PR | MI | S ] ' }
    ```
    
    ## Arguments
    
    - `expr`: A STRING expression representing a number. `expr` may include 
leading or trailing spaces.
    - `fmt`: An STRING literal, specifying the expected format of `expr`.
    
    ## Returns
    
    A DECIMAL(p, s) where `p` is the total number of digits (`0` or `9`) and 
`s` is the number of digits after the decimal point, or 0 if there is none.
    
    `fmt` can contain the following elements (case insensitive):
    
    - **`0`** or **`9`**
    
      Specifies an expected digit between `0` and `9`.
      A `0` to the left of the decimal points indicates that `expr` must have 
at least as many digits.
      A leading `9` indicates that `expr` may omit these digits.
    
      `expr` must not be larger than the number of digits to the left of the 
decimal point allowed by the format string.
    
      Digits to the right of the decimal point in the format string indicate 
the most digits that `expr` may have to the right of the decimal point.
    
    - **`.`** or **`D`**
    
      Specifies the position of the decimal point.
    
      `expr` does not need to include a decimal point.
    
    - **`,`** or **`G`**
    
      Specifies the position of the `,` grouping (thousands) separator.
      There must be a `0` or `9` to the left of the rightmost grouping 
separator.
      `expr` must match the grouping separator relevant for the size of the 
number.
    
    - **`L`** or **`$`**
    
      Specifies the location of the `$` currency sign. This character may only 
be specified once.
    
    - **`S`**
    
      Specifies the position of an option '+' or '-' sign. This character may 
only be specified once.
    
    - **`MI`**
    
      Specifies that `expr` has an optional `-` sign at the end, but no `+`.
    
    - **`PR`**
    
      Specifies that `expr` indicates a negative number with wrapping angled 
brackets (`<1>`).
    
    If `expr` contains any characters other than `0` through `9` and those 
permitted in `fmt` a `NULL` is returned.
    
    ## Examples
    
    ```sql
    -- The format expects:
    --  * an optional sign at the beginning,
    --  * followed by a dollar sign,
    --  * followed by a number between 3 and 6 digits long,
    --  * thousands separators,
    --  * up to two dights beyond the decimal point.
    > SELECT try_to_number('-$12,345.67', 'S$999,099.99');
     -12345.67
    -- The plus sign is optional, and so are fractional digits.
    > SELECT try_to_number('$345', 'S$999,099.99');
     345.00
    -- The format requires at least three digits.
    > SELECT try_to_number('$45', 'S$999,099.99');
     NULL
    -- The format requires at least three digits.
    > SELECT try_to_number('$045', 'S$999,099.99');
     45.00
    -- Using brackets to denote negative values
    > SELECT try_to_number('<1234>', '999999PR');
     -1234
    ```
    
    ### Why are the changes needed?
    
    The new semantics bring Spark into consistency with other engines and grant 
the user flexibility about how to handle cases where inputs do not match the 
format string.
    
    ### Does this PR introduce _any_ user-facing change?
    
    Yes.
    
    * The minus sign `-` is no longer supported in the format string (`S` 
replaces it).
    * `MI` and `PR` are new options in the format string.
    * `to_number` and `try_to_number` are separate functions with different 
error behavior.
    
    ### How was this patch tested?
    
    * New positive and negative unit tests cover both `to_number` and 
`try_to_number` functions.
    * Query tests update as needed according to the behavior changes.
    
    Closes #36066 from dtenedor/to-number.
    
    Authored-by: Daniel Tenedorio <[email protected]>
    Signed-off-by: Wenchen Fan <[email protected]>
    (cherry picked from commit 7a6b98965bf40993ea2e7837ded1c79813bec5d8)
    Signed-off-by: Wenchen Fan <[email protected]>
---
 .../sql/catalyst/analysis/FunctionRegistry.scala   |   1 +
 .../expressions/numberFormatExpressions.scala      | 109 +++-
 .../spark/sql/catalyst/util/NumberFormatter.scala  | 243 ---------
 .../spark/sql/catalyst/util/ToNumberParser.scala   | 579 +++++++++++++++++++++
 .../expressions/StringExpressionsSuite.scala       | 297 ++++++-----
 .../sql/catalyst/util/NumberFormatterSuite.scala   | 315 -----------
 .../sql-functions/sql-expression-schema.md         |   5 +-
 .../sql-tests/inputs/string-functions.sql          |   7 +-
 .../results/ansi/string-functions.sql.out          |  22 +-
 .../sql-tests/results/postgreSQL/numeric.sql.out   |  18 +-
 .../sql-tests/results/string-functions.sql.out     |  22 +-
 11 files changed, 859 insertions(+), 759 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
index f6bd9891681..91bc34bef37 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
@@ -541,6 +541,7 @@ object FunctionRegistry {
     expression[FormatNumber]("format_number"),
     expression[FormatString]("format_string"),
     expression[ToNumber]("to_number"),
+    expression[TryToNumber]("try_to_number"),
     expression[GetJsonObject]("get_json_object"),
     expression[InitCap]("initcap"),
     expression[StringInstr]("instr"),
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/numberFormatExpressions.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/numberFormatExpressions.scala
index e29a425eef1..88947c5c87a 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/numberFormatExpressions.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/numberFormatExpressions.scala
@@ -22,48 +22,58 @@ import java.util.Locale
 import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
 import org.apache.spark.sql.catalyst.expressions.codegen.{CodegenContext, 
CodeGenerator, ExprCode}
 import org.apache.spark.sql.catalyst.expressions.codegen.Block.BlockHelper
-import org.apache.spark.sql.catalyst.util.NumberFormatter
+import org.apache.spark.sql.catalyst.util.ToNumberParser
 import org.apache.spark.sql.types.{DataType, StringType}
 import org.apache.spark.unsafe.types.UTF8String
 
 /**
- * A function that converts string to numeric.
+ * A function that converts strings to decimal values, returning an exception 
if the input string
+ * fails to match the format string.
  */
 @ExpressionDescription(
   usage = """
-     _FUNC_(strExpr, formatExpr) - Convert `strExpr` to a number based on the 
`formatExpr`.
-       The format can consist of the following characters:
-         '0' or '9':  digit position
-         '.' or 'D':  decimal point (only allowed once)
-         ',' or 'G':  group (thousands) separator
-         '-' or 'S':  sign anchored to number (only allowed once)
-         '$':  value with a leading dollar sign (only allowed once)
+     _FUNC_(expr, fmt) - Convert string 'expr' to a number based on the string 
format 'fmt'.
+       Throws an exception if the conversion fails. The format can consist of 
the following
+       characters, case insensitive:
+         '0' or '9': Specifies an expected digit between 0 and 9. A sequence 
of 0 or 9 in the format
+           string matches a sequence of digits in the input string. If the 0/9 
sequence starts with
+           0 and is before the decimal point, it can only match a digit 
sequence of the same size.
+           Otherwise, if the sequence starts with 9 or is after the decimal 
poin, it can match a
+           digit sequence that has the same or smaller size.
+         '.' or 'D': Specifies the position of the decimal point (optional, 
only allowed once).
+         ',' or 'G': Specifies the position of the grouping (thousands) 
separator (,). There must be
+           one or more 0 or 9 to the left of the rightmost grouping separator. 
'expr' must match the
+           grouping separator relevant for the size of the number.
+         '$': Specifies the location of the $ currency sign. This character 
may only be specified
+           once.
+         'S': Specifies the position of a '-' or '+' sign (optional, only 
allowed once).
+         'MI': Specifies that 'expr' has an optional '-' sign, but no '+' 
(only allowed once).
+         'PR': Only allowed at the end of the format string; specifies that 
'expr' indicates a
+           negative number with wrapping angled brackets.
+           ('<1>').
   """,
   examples = """
     Examples:
       > SELECT _FUNC_('454', '999');
        454
-      > SELECT _FUNC_('454.00', '000D00');
+      > SELECT _FUNC_('454.00', '000.00');
        454.00
-      > SELECT _FUNC_('12,454', '99G999');
+      > SELECT _FUNC_('12,454', '99,999');
        12454
       > SELECT _FUNC_('$78.12', '$99.99');
        78.12
-      > SELECT _FUNC_('12,454.8-', '99G999D9S');
+      > SELECT _FUNC_('12,454.8-', '99,999.9S');
        -12454.8
   """,
   since = "3.3.0",
   group = "string_funcs")
 case class ToNumber(left: Expression, right: Expression)
   extends BinaryExpression with ImplicitCastInputTypes with NullIntolerant {
-
   private lazy val numberFormat = 
right.eval().toString.toUpperCase(Locale.ROOT)
-  private lazy val numberFormatter = new NumberFormatter(numberFormat)
+  private lazy val numberFormatter = new ToNumberParser(numberFormat, true)
 
   override def dataType: DataType = numberFormatter.parsedDecimalType
-
   override def inputTypes: Seq[DataType] = Seq(StringType, StringType)
-
   override def checkInputDataTypes(): TypeCheckResult = {
     val inputTypeCheck = super.checkInputDataTypes()
     if (inputTypeCheck.isSuccess) {
@@ -76,17 +86,14 @@ case class ToNumber(left: Expression, right: Expression)
       inputTypeCheck
     }
   }
-
   override def prettyName: String = "to_number"
-
   override def nullSafeEval(string: Any, format: Any): Any = {
     val input = string.asInstanceOf[UTF8String]
     numberFormatter.parse(input)
   }
-
   override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
     val builder =
-      ctx.addReferenceObj("builder", numberFormatter, 
classOf[NumberFormatter].getName)
+      ctx.addReferenceObj("builder", numberFormatter, 
classOf[ToNumberParser].getName)
     val eval = left.genCode(ctx)
     ev.copy(code =
       code"""
@@ -98,8 +105,66 @@ case class ToNumber(left: Expression, right: Expression)
         |}
       """.stripMargin)
   }
-
   override protected def withNewChildrenInternal(
-      newLeft: Expression, newRight: Expression): ToNumber = copy(left = 
newLeft, right = newRight)
+      newLeft: Expression, newRight: Expression): ToNumber =
+    copy(left = newLeft, right = newRight)
 }
 
+/**
+ * A function that converts strings to decimal values, returning NULL if the 
input string fails to
+ * match the format string.
+ */
+@ExpressionDescription(
+  usage = """
+     _FUNC_(expr, fmt) - Convert string 'expr' to a number based on the string 
format `fmt`.
+       Returns NULL if the string 'expr' does not match the expected format. 
The format follows the
+       same semantics as the to_number function.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('454', '999');
+       454
+      > SELECT _FUNC_('454.00', '000.00');
+       454.00
+      > SELECT _FUNC_('12,454', '99,999');
+       12454
+      > SELECT _FUNC_('$78.12', '$99.99');
+       78.12
+      > SELECT _FUNC_('12,454.8-', '99,999.9S');
+       -12454.8
+  """,
+  since = "3.3.0",
+  group = "string_funcs")
+case class TryToNumber(left: Expression, right: Expression)
+  extends BinaryExpression with ImplicitCastInputTypes with NullIntolerant {
+  private lazy val numberFormat = 
right.eval().toString.toUpperCase(Locale.ROOT)
+  private lazy val numberFormatter = new ToNumberParser(numberFormat, false)
+
+  override def dataType: DataType = numberFormatter.parsedDecimalType
+  override def inputTypes: Seq[DataType] = Seq(StringType, StringType)
+  override def nullable: Boolean = true
+  override def checkInputDataTypes(): TypeCheckResult = ToNumber(left, 
right).checkInputDataTypes()
+  override def prettyName: String = "try_to_number"
+  override def nullSafeEval(string: Any, format: Any): Any = {
+    val input = string.asInstanceOf[UTF8String]
+    numberFormatter.parse(input)
+  }
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val builder =
+      ctx.addReferenceObj("builder", numberFormatter, 
classOf[ToNumberParser].getName)
+    val eval = left.genCode(ctx)
+    ev.copy(code =
+      code"""
+        |${eval.code}
+        |boolean ${ev.isNull} = ${eval.isNull};
+        |${CodeGenerator.javaType(dataType)} ${ev.value} = 
${CodeGenerator.defaultValue(dataType)};
+        |if (!${ev.isNull}) {
+        |  ${ev.value} = $builder.parse(${eval.value});
+        |}
+      """.stripMargin)
+  }
+  override protected def withNewChildrenInternal(
+      newLeft: Expression,
+      newRight: Expression): TryToNumber =
+    copy(left = newLeft, right = newRight)
+}
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberFormatter.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberFormatter.scala
deleted file mode 100644
index a14aceb6922..00000000000
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberFormatter.scala
+++ /dev/null
@@ -1,243 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *    http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.catalyst.util
-
-import java.math.BigDecimal
-import java.text.{DecimalFormat, ParsePosition}
-import java.util.Locale
-
-import org.apache.spark.sql.AnalysisException
-import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
-import org.apache.spark.sql.errors.QueryExecutionErrors
-import org.apache.spark.sql.types.{Decimal, DecimalType}
-import org.apache.spark.unsafe.types.UTF8String
-
-object NumberFormatter {
-  final val POINT_SIGN = '.'
-  final val POINT_LETTER = 'D'
-  final val COMMA_SIGN = ','
-  final val COMMA_LETTER = 'G'
-  final val MINUS_SIGN = '-'
-  final val MINUS_LETTER = 'S'
-  final val DOLLAR_SIGN = '$'
-  final val NINE_DIGIT = '9'
-  final val ZERO_DIGIT = '0'
-  final val POUND_SIGN = '#'
-
-  final val COMMA_SIGN_STRING = COMMA_SIGN.toString
-  final val POUND_SIGN_STRING = POUND_SIGN.toString
-
-  final val SIGN_SET = Set(POINT_SIGN, COMMA_SIGN, MINUS_SIGN, DOLLAR_SIGN)
-}
-
-class NumberFormatter(originNumberFormat: String, isParse: Boolean = true) 
extends Serializable {
-  import NumberFormatter._
-
-  protected val normalizedNumberFormat = normalize(originNumberFormat)
-
-  private val transformedFormat = transform(normalizedNumberFormat)
-
-  private lazy val numberDecimalFormat = {
-    val decimalFormat = new DecimalFormat(transformedFormat)
-    decimalFormat.setParseBigDecimal(true)
-    decimalFormat
-  }
-
-  private lazy val (precision, scale) = {
-    val formatSplits = 
normalizedNumberFormat.split(POINT_SIGN).map(_.filterNot(isSign))
-    assert(formatSplits.length <= 2)
-    val precision = formatSplits.map(_.length).sum
-    val scale = if (formatSplits.length == 2) formatSplits.last.length else 0
-    (precision, scale)
-  }
-
-  def parsedDecimalType: DecimalType = DecimalType(precision, scale)
-
-  /**
-   * DecimalFormat provides '#' and '0' as placeholder of digit, ',' as 
grouping separator,
-   * '.' as decimal separator, '-' as minus, '$' as dollar, but not '9', 'G', 
'D', 'S'. So we need
-   * replace them show below:
-   * 1. '9' -> '#'
-   * 2. 'G' -> ','
-   * 3. 'D' -> '.'
-   * 4. 'S' -> '-'
-   *
-   * Note: When calling format, we must preserve the digits after decimal 
point, so the digits
-   * after decimal point should be replaced as '0'. For example: '999.9' will 
be normalized as
-   * '###.0' and '999.99' will be normalized as '###.00', so if the input is 
454, the format
-   * output will be 454.0 and 454.00 respectively.
-   *
-   * @param format number format string
-   * @return normalized number format string
-   */
-  private def normalize(format: String): String = {
-    var notFindDecimalPoint = true
-    val normalizedFormat = format.toUpperCase(Locale.ROOT).map {
-      case NINE_DIGIT if notFindDecimalPoint => POUND_SIGN
-      case ZERO_DIGIT if isParse && notFindDecimalPoint => POUND_SIGN
-      case NINE_DIGIT if !notFindDecimalPoint => ZERO_DIGIT
-      case COMMA_LETTER => COMMA_SIGN
-      case POINT_LETTER | POINT_SIGN =>
-        notFindDecimalPoint = false
-        POINT_SIGN
-      case MINUS_LETTER => MINUS_SIGN
-      case other => other
-    }
-    // If the comma is at the beginning or end of number format, then 
DecimalFormat will be
-    // invalid. For example, "##,###," or ",###,###" for DecimalFormat is 
invalid, so we must use
-    // "##,###" or "###,###".
-    
normalizedFormat.stripPrefix(COMMA_SIGN_STRING).stripSuffix(COMMA_SIGN_STRING)
-  }
-
-  private def isSign(c: Char): Boolean = {
-    SIGN_SET.contains(c)
-  }
-
-  private def transform(format: String): String = {
-    if (format.contains(MINUS_SIGN)) {
-      // For example: '#.######' represents a positive number,
-      // but '#.######;#.######-' represents a negative number.
-      val positiveFormatString = format.replaceAll("-", "")
-      s"$positiveFormatString;$format"
-    } else {
-      format
-    }
-  }
-
-  def check(): TypeCheckResult = {
-    def invalidSignPosition(c: Char): Boolean = {
-      val signIndex = normalizedNumberFormat.indexOf(c)
-      signIndex > 0 && signIndex < normalizedNumberFormat.length - 1
-    }
-
-    def multipleSignInNumberFormatError(message: String): String = {
-      s"At most one $message is allowed in the number format: 
'$originNumberFormat'"
-    }
-
-    def nonFistOrLastCharInNumberFormatError(message: String): String = {
-      s"$message must be the first or last char in the number format: 
'$originNumberFormat'"
-    }
-
-    if (normalizedNumberFormat.length == 0) {
-      TypeCheckResult.TypeCheckFailure("Number format cannot be empty")
-    } else if (normalizedNumberFormat.count(_ == POINT_SIGN) > 1) {
-      TypeCheckResult.TypeCheckFailure(
-        multipleSignInNumberFormatError(s"'$POINT_LETTER' or '$POINT_SIGN'"))
-    } else if (normalizedNumberFormat.count(_ == MINUS_SIGN) > 1) {
-      TypeCheckResult.TypeCheckFailure(
-        multipleSignInNumberFormatError(s"'$MINUS_LETTER' or '$MINUS_SIGN'"))
-    } else if (normalizedNumberFormat.count(_ == DOLLAR_SIGN) > 1) {
-      
TypeCheckResult.TypeCheckFailure(multipleSignInNumberFormatError(s"'$DOLLAR_SIGN'"))
-    } else if (invalidSignPosition(MINUS_SIGN)) {
-      TypeCheckResult.TypeCheckFailure(
-        nonFistOrLastCharInNumberFormatError(s"'$MINUS_LETTER' or 
'$MINUS_SIGN'"))
-    } else if (invalidSignPosition(DOLLAR_SIGN)) {
-      TypeCheckResult.TypeCheckFailure(
-        nonFistOrLastCharInNumberFormatError(s"'$DOLLAR_SIGN'"))
-    } else {
-      TypeCheckResult.TypeCheckSuccess
-    }
-  }
-
-  /**
-   * Convert string to numeric based on the given number format.
-   * The format can consist of the following characters:
-   * '0' or '9': digit position
-   * '.' or 'D': decimal point (only allowed once)
-   * ',' or 'G': group (thousands) separator
-   * '-' or 'S': sign anchored to number (only allowed once)
-   * '$': value with a leading dollar sign (only allowed once)
-   *
-   * @param input the string need to converted
-   * @return decimal obtained from string parsing
-   */
-  def parse(input: UTF8String): Decimal = {
-    val inputStr = input.toString.trim
-    val inputSplits = inputStr.split(POINT_SIGN)
-    assert(inputSplits.length <= 2)
-    if (inputSplits.length == 1) {
-      if (inputStr.filterNot(isSign).length > precision - scale) {
-        throw QueryExecutionErrors.invalidNumberFormatError(input, 
originNumberFormat)
-      }
-    } else if (inputSplits(0).filterNot(isSign).length > precision - scale ||
-      inputSplits(1).filterNot(isSign).length > scale) {
-      throw QueryExecutionErrors.invalidNumberFormatError(input, 
originNumberFormat)
-    }
-
-    try {
-      val number = numberDecimalFormat.parse(inputStr, new ParsePosition(0))
-      assert(number.isInstanceOf[BigDecimal])
-      Decimal(number.asInstanceOf[BigDecimal])
-    } catch {
-      case _: IllegalArgumentException =>
-        throw QueryExecutionErrors.invalidNumberFormatError(input, 
originNumberFormat)
-    }
-  }
-
-  /**
-   * Convert numeric to string based on the given number format.
-   * The format can consist of the following characters:
-   * '9': digit position (can be dropped if insignificant)
-   * '0': digit position (will not be dropped, even if insignificant)
-   * '.' or 'D': decimal point (only allowed once)
-   * ',' or 'G': group (thousands) separator
-   * '-' or 'S': sign anchored to number (only allowed once)
-   * '$': value with a leading dollar sign (only allowed once)
-   *
-   * @param input the decimal to format
-   * @param numberFormat the format string
-   * @return The string after formatting input decimal
-   */
-  def format(input: Decimal): String = {
-    val bigDecimal = input.toJavaBigDecimal
-    val decimalPlainStr = bigDecimal.toPlainString
-    if (decimalPlainStr.length > transformedFormat.length) {
-      transformedFormat.replaceAll("0", POUND_SIGN_STRING)
-    } else {
-      var resultStr = numberDecimalFormat.format(bigDecimal)
-      // Since we trimmed the comma at the beginning or end of number format 
in function
-      // `normalize`, we restore the comma to the result here.
-      // For example, if the specified number format is "99,999," or 
",999,999", function
-      // `normalize` normalize them to "##,###" or "###,###".
-      // new DecimalFormat("##,###").parse(12454) and new 
DecimalFormat("###,###").parse(124546)
-      // will return "12,454" and "124,546" respectively. So we add ',' at the 
end and head of
-      // the result, then the final output are "12,454," or ",124,546".
-      if (originNumberFormat.last == COMMA_SIGN || originNumberFormat.last == 
COMMA_LETTER) {
-        resultStr = resultStr + COMMA_SIGN
-      }
-      if (originNumberFormat.charAt(0) == COMMA_SIGN ||
-        originNumberFormat.charAt(0) == COMMA_LETTER) {
-        resultStr = COMMA_SIGN + resultStr
-      }
-
-      resultStr
-    }
-  }
-}
-
-// Visible for testing
-class TestNumberFormatter(originNumberFormat: String, isParse: Boolean = true)
-  extends NumberFormatter(originNumberFormat, isParse) {
-  def checkWithException(): Unit = {
-    check() match {
-      case TypeCheckResult.TypeCheckFailure(message) =>
-        throw new AnalysisException(message)
-      case _ =>
-    }
-  }
-}
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ToNumberParser.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ToNumberParser.scala
new file mode 100644
index 00000000000..afba683efad
--- /dev/null
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ToNumberParser.scala
@@ -0,0 +1,579 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.util
+
+import scala.collection.mutable
+
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import org.apache.spark.sql.errors.QueryExecutionErrors
+import org.apache.spark.sql.types.{Decimal, DecimalType}
+import org.apache.spark.unsafe.types.UTF8String
+
+// This object contains some definitions of characters and tokens for the 
parser below.
+object ToNumberParser {
+  final val ANGLE_BRACKET_CLOSE = '>'
+  final val ANGLE_BRACKET_OPEN = '<'
+  final val COMMA_LETTER = 'G'
+  final val COMMA_SIGN = ','
+  final val DOLLAR_SIGN = '$'
+  final val MINUS_SIGN = '-'
+  final val NINE_DIGIT = '9'
+  final val OPTIONAL_PLUS_OR_MINUS_LETTER = 'S'
+  final val PLUS_SIGN = '+'
+  final val POINT_LETTER = 'D'
+  final val POINT_SIGN = '.'
+  final val ZERO_DIGIT = '0'
+
+  final val OPTIONAL_MINUS_STRING = "MI"
+  final val WRAPPING_ANGLE_BRACKETS_TO_NEGATIVE_NUMBER = "PR"
+
+  final val OPTIONAL_MINUS_STRING_START = 'M'
+  final val OPTIONAL_MINUS_STRING_END = 'I'
+
+  final val WRAPPING_ANGLE_BRACKETS_TO_NEGATIVE_NUMBER_START = 'P'
+  final val WRAPPING_ANGLE_BRACKETS_TO_NEGATIVE_NUMBER_END = 'R'
+
+  // This class represents one or more characters that we expect to be present 
in the input string
+  // based on the format string.
+  abstract class InputToken()
+  // Represents some number of digits (0-9).
+  abstract class Digits extends InputToken
+  // Represents exactly 'num' digits (0-9).
+  case class ExactlyAsManyDigits(num: Int) extends Digits
+  // Represents at most 'num' digits (0-9).
+  case class AtMostAsManyDigits(num: Int) extends Digits
+  // Represents one decimal point (.).
+  case class DecimalPoint() extends InputToken
+  // Represents one thousands separator (,).
+  case class ThousandsSeparator() extends InputToken
+  // Represents one or more groups of Digits (0-9) with ThousandsSeparators 
(,) between each group.
+  // The 'tokens' are the Digits and ThousandsSeparators in order; the 
'digits' are just the Digits.
+  case class DigitGroups(tokens: Seq[InputToken], digits: Seq[Digits]) extends 
InputToken
+  // Represents one dollar sign ($).
+  case class DollarSign() extends InputToken
+  // Represents one optional plus sign (+) or minus sign (-).
+  case class OptionalPlusOrMinusSign() extends InputToken
+  // Represents one optional minus sign (-).
+  case class OptionalMinusSign() extends InputToken
+  // Represents one opening angle bracket (<).
+  case class OpeningAngleBracket() extends InputToken
+  // Represents one closing angle bracket (>).
+  case class ClosingAngleBracket() extends InputToken
+  // Represents any unrecognized character other than the above.
+  case class InvalidUnrecognizedCharacter(char: Char) extends InputToken
+}
+
+/**
+ * This class represents a parser to implement the to_number or try_to_number 
SQL functions.
+ *
+ * It works by consuming an input string and a format string. This class 
accepts the format string
+ * as a field, and proceeds to iterate through the format string to generate a 
sequence of tokens
+ * (or throw an exception if the format string is invalid). Then when the 
function is called with an
+ * input string, this class steps through the sequence of tokens and compares 
them against the input
+ * string, returning a Spark Decimal object if they match (or throwing an 
exception otherwise).
+ *
+ * @param numberFormat the format string describing the expected inputs.
+ * @param errorOnFail true if evaluation should throw an exception if the 
input string fails to
+ *                    match the format string. Otherwise, returns NULL instead.
+ */
+class ToNumberParser(numberFormat: String, errorOnFail: Boolean) extends 
Serializable {
+  import ToNumberParser._
+
+  // Consumes the format string and produce a sequence of input tokens 
expected from each input
+  // string.
+  private lazy val formatTokens: Seq[InputToken] = {
+    val tokens = mutable.Buffer.empty[InputToken]
+    var i = 0
+    var reachedDecimalPoint = false
+    val len = numberFormat.length
+    while (i < len) {
+      val char: Char = numberFormat(i)
+      char match {
+        case ZERO_DIGIT =>
+          val prevI = i
+          do {
+            i += 1
+          } while (i < len && (numberFormat(i) == ZERO_DIGIT || 
numberFormat(i) == NINE_DIGIT))
+          if (reachedDecimalPoint) {
+            tokens.append(AtMostAsManyDigits(i - prevI))
+          } else {
+            tokens.append(ExactlyAsManyDigits(i - prevI))
+          }
+        case NINE_DIGIT =>
+          val prevI = i
+          do {
+            i += 1
+          } while (i < len && (numberFormat(i) == ZERO_DIGIT || 
numberFormat(i) == NINE_DIGIT))
+          tokens.append(AtMostAsManyDigits(i - prevI))
+        case POINT_SIGN | POINT_LETTER =>
+          tokens.append(DecimalPoint())
+          reachedDecimalPoint = true
+          i += 1
+        case COMMA_SIGN | COMMA_LETTER =>
+          tokens.append(ThousandsSeparator())
+          i += 1
+        case DOLLAR_SIGN =>
+          tokens.append(DollarSign())
+          i += 1
+        case OPTIONAL_PLUS_OR_MINUS_LETTER =>
+          tokens.append(OptionalPlusOrMinusSign())
+          i += 1
+        case OPTIONAL_MINUS_STRING_START if i < len - 1 &&
+          OPTIONAL_MINUS_STRING_END == numberFormat(i + 1) =>
+          tokens.append(OptionalMinusSign())
+          i += 2
+        case WRAPPING_ANGLE_BRACKETS_TO_NEGATIVE_NUMBER_START if i < len - 1 &&
+          WRAPPING_ANGLE_BRACKETS_TO_NEGATIVE_NUMBER_END == numberFormat(i + 
1) =>
+          tokens.prepend(OpeningAngleBracket())
+          tokens.append(ClosingAngleBracket())
+          i += 2
+        case c: Char =>
+          tokens.append(InvalidUnrecognizedCharacter(c))
+          i += 1
+      }
+    }
+
+    // Combine each group of consecutive Digits and ThousandsSeparator tokens 
into a DigitGroups.
+    val groupedTokens = mutable.Buffer.empty[InputToken]
+    var currentGroup = mutable.Buffer.empty[InputToken]
+    var currentDigits = mutable.Buffer.empty[Digits]
+    for (token <- tokens) {
+      token match {
+        case digits: Digits =>
+          currentGroup.append(token)
+          currentDigits.append(digits)
+        case _: ThousandsSeparator =>
+          currentGroup.append(token)
+        case other =>
+          if (currentGroup.nonEmpty) {
+            // We reverse the expected digit tokens in this new DigitGroups 
here, and we do the same
+            // for actual groups of 0-9 characters in each input string. In 
this way, we can safely
+            // ignore any leading optional groups of digits in the format 
string.
+            groupedTokens.append(
+              DigitGroups(currentGroup.reverse.toSeq, 
currentDigits.reverse.toSeq))
+            currentGroup = mutable.Buffer.empty[InputToken]
+            currentDigits = mutable.Buffer.empty[Digits]
+          }
+          groupedTokens.append(other)
+      }
+    }
+    if (currentGroup.nonEmpty) {
+      groupedTokens.append(DigitGroups(currentGroup.reverse.toSeq, 
currentDigits.reverse.toSeq))
+    }
+    groupedTokens.toSeq
+  }
+
+  /**
+   * Precision is the number of digits in a number. Scale is the number of 
digits to the right of
+   * the decimal point in a number. For example, the number 123.45 has a 
precision of 5 and a
+   * scale of 2.
+   */
+  private lazy val precision: Int = {
+    val lengths = formatTokens.map {
+      case DigitGroups(_, digits) => digits.map {
+        case ExactlyAsManyDigits(num) => num
+        case AtMostAsManyDigits(num) => num
+      }.sum
+      case _ => 0
+    }
+    lengths.sum
+  }
+
+  private lazy val scale: Int = {
+    val index = formatTokens.indexOf(DecimalPoint())
+    if (index != -1) {
+      val suffix: Seq[InputToken] = formatTokens.drop(index)
+      val lengths: Seq[Int] = suffix.map {
+        case DigitGroups(_, digits) => digits.map {
+          case ExactlyAsManyDigits(num) => num
+          case AtMostAsManyDigits(num) => num
+        }.sum
+        case _ => 0
+      }
+      lengths.sum
+    } else {
+      0
+    }
+  }
+
+  // Holds all digits (0-9) before the decimal point (.) while parsing each 
input string.
+  private lazy val beforeDecimalPoint = new StringBuilder(precision)
+  // Holds all digits (0-9) after the decimal point (.) while parsing each 
input string.
+  private lazy val afterDecimalPoint = new StringBuilder(scale)
+  // Number of digits (0-9) in each group of the input string, split by 
thousands separators.
+  private lazy val parsedDigitGroupSizes = mutable.Buffer.empty[Int]
+  // Increments to count the number of digits (0-9) in the current group 
within the input string.
+  private var numDigitsInCurrentGroup: Int = 0
+
+  /**
+   * The result type of this parsing is a Decimal value with the appropriate 
precision and scale.
+   */
+  def parsedDecimalType: DecimalType = DecimalType(precision, scale)
+
+  /**
+   * Consumes the format string to check validity and computes an appropriate 
Decimal output type.
+   */
+  def check(): TypeCheckResult = {
+    val validateResult: String = validateFormatString
+    if (validateResult.nonEmpty) {
+      TypeCheckResult.TypeCheckFailure(validateResult)
+    } else {
+      TypeCheckResult.TypeCheckSuccess
+    }
+  }
+
+  /**
+   * This implementation of the [[check]] method returns any error, or the 
empty string on success.
+   */
+  private def validateFormatString: String = {
+    def multipleSignInNumberFormatError(message: String) = {
+      s"At most one $message is allowed in the number format: '$numberFormat'"
+    }
+
+    def notAtEndOfNumberFormatError(message: String) = {
+      s"$message must be at the end of the number format: '$numberFormat'"
+    }
+
+    val inputTokenCounts = formatTokens.groupBy(identity).mapValues(_.size)
+
+    val firstDollarSignIndex: Int = formatTokens.indexOf(DollarSign())
+    val firstDigitIndex: Int = formatTokens.indexWhere {
+      case _: DigitGroups => true
+      case _ => false
+    }
+    val firstDecimalPointIndex: Int = formatTokens.indexOf(DecimalPoint())
+    val digitGroupsBeforeDecimalPoint: Seq[DigitGroups] =
+      formatTokens.zipWithIndex.flatMap {
+        case (d@DigitGroups(_, _), i)
+          if firstDecimalPointIndex == -1 ||
+            i < firstDecimalPointIndex =>
+          Seq(d)
+        case _ => Seq()
+      }
+    val digitGroupsAfterDecimalPoint: Seq[DigitGroups] =
+      formatTokens.zipWithIndex.flatMap {
+        case (d@DigitGroups(_, _), i)
+          if firstDecimalPointIndex != -1 &&
+            i > firstDecimalPointIndex =>
+          Seq(d)
+        case _ => Seq()
+      }
+
+    // Make sure the format string contains at least one token.
+    if (numberFormat.isEmpty) {
+      "The format string cannot be empty"
+    }
+    // Make sure the format string does not contain any unrecognized 
characters.
+    else if 
(formatTokens.exists(_.isInstanceOf[InvalidUnrecognizedCharacter])) {
+      val unrecognizedChars =
+        formatTokens.filter {
+          _.isInstanceOf[InvalidUnrecognizedCharacter]
+        }.map {
+          case i: InvalidUnrecognizedCharacter => i.char
+        }
+      val char: Char = unrecognizedChars.head
+      s"Encountered invalid character $char in the number format: 
'$numberFormat'"
+    }
+    // Make sure the format string contains at least one digit.
+    else if (!formatTokens.exists(
+      token => token.isInstanceOf[DigitGroups])) {
+      "The format string requires at least one number digit"
+    }
+    // Make sure the format string contains at most one decimal point.
+    else if (inputTokenCounts.getOrElse(DecimalPoint(), 0) > 1) {
+      multipleSignInNumberFormatError(s"'$POINT_LETTER' or '$POINT_SIGN'")
+    }
+    // Make sure the format string contains at most one plus or minus sign.
+    else if (inputTokenCounts.getOrElse(OptionalPlusOrMinusSign(), 0) > 1) {
+      multipleSignInNumberFormatError(s"'$OPTIONAL_PLUS_OR_MINUS_LETTER'")
+    }
+    // Make sure the format string contains at most one dollar sign.
+    else if (inputTokenCounts.getOrElse(DollarSign(), 0) > 1) {
+      multipleSignInNumberFormatError(s"'$DOLLAR_SIGN'")
+    }
+    // Make sure the format string contains at most one "MI" sequence.
+    else if (inputTokenCounts.getOrElse(OptionalMinusSign(), 0) > 1) {
+      multipleSignInNumberFormatError(s"'$OPTIONAL_MINUS_STRING'")
+    }
+    // Make sure the format string contains at most one closing angle bracket 
at the end.
+    else if (inputTokenCounts.getOrElse(ClosingAngleBracket(), 0) > 1 ||
+      (inputTokenCounts.getOrElse(ClosingAngleBracket(), 0) == 1 &&
+        formatTokens.last != ClosingAngleBracket())) {
+      
notAtEndOfNumberFormatError(s"'$WRAPPING_ANGLE_BRACKETS_TO_NEGATIVE_NUMBER'")
+    }
+    // Make sure that any dollar sign in the format string occurs before any 
digits.
+    else if (firstDigitIndex < firstDollarSignIndex) {
+      s"Currency characters must appear before digits in the number format: 
'$numberFormat'"
+    }
+    // Make sure that any dollar sign in the format string occurs before any 
decimal point.
+    else if (firstDecimalPointIndex != -1 &&
+      firstDecimalPointIndex < firstDollarSignIndex) {
+      "Currency characters must appear before any decimal point in the " +
+        s"number format: '$numberFormat'"
+    }
+    // Make sure that any thousands separators in the format string have 
digits before and after.
+    else if (digitGroupsBeforeDecimalPoint.exists {
+      case DigitGroups(tokens, _) =>
+        tokens.zipWithIndex.exists({
+          case (_: ThousandsSeparator, j: Int) if j == 0 || j == tokens.length 
- 1 =>
+            true
+          case (_: ThousandsSeparator, j: Int) if tokens(j - 
1).isInstanceOf[ThousandsSeparator] =>
+            true
+          case (_: ThousandsSeparator, j: Int) if tokens(j + 
1).isInstanceOf[ThousandsSeparator] =>
+            true
+          case _ =>
+            false
+        })
+    }) {
+      "Thousands separators (,) must have digits in between them " +
+        s"in the number format: '$numberFormat'"
+    }
+    // Thousands separators are not allowed after the decimal point, if any.
+    else if (digitGroupsAfterDecimalPoint.exists {
+      case DigitGroups(tokens, digits) =>
+        tokens.length > digits.length
+    }) {
+      "Thousands separators (,) may not appear after the decimal point " +
+        s"in the number format: '$numberFormat'"
+    }
+    // Validation of the format string finished successfully.
+    else {
+      ""
+    }
+  }
+
+  /**
+   * Convert string to numeric based on the given number format.
+   *
+   * Iterates through the [[formatTokens]] obtained from processing the format 
string, while also
+   * keeping a parallel index into the input string. Throws an exception if 
the latter does not
+   * contain expected characters at any point.
+   *
+   * @param input the string that needs to converted
+   * @return the result Decimal value obtained from string parsing
+   */
+  def parse(input: UTF8String): Decimal = {
+    val inputString = input.toString
+    val inputLength = inputString.length
+    // Build strings representing all digits before and after the decimal 
point, respectively.
+    beforeDecimalPoint.clear()
+    afterDecimalPoint.clear()
+    var reachedDecimalPoint = false
+    // Record whether the input specified a negative result, such as with a 
minus sign.
+    var negateResult = false
+    // This is an index into the characters of the provided input string.
+    var inputIndex = 0
+    // This is an index into the tokens of the provided format string.
+    var formatIndex = 0
+
+    // Iterate through the tokens representing the provided format string, in 
order.
+    while (formatIndex < formatTokens.size) {
+      val token: InputToken = formatTokens(formatIndex)
+      token match {
+        case d: DigitGroups =>
+          inputIndex = parseDigitGroups(d, inputString, inputIndex, 
reachedDecimalPoint).getOrElse(
+            return formatMatchFailure(input, numberFormat))
+        case DecimalPoint() =>
+          if (inputIndex < inputLength &&
+            inputString(inputIndex) == POINT_SIGN) {
+            reachedDecimalPoint = true
+            inputIndex += 1
+          } else {
+            // There is no decimal point. Consume the token and remain at the 
same character in the
+            // input string.
+          }
+        case DollarSign() =>
+          if (inputIndex >= inputLength ||
+            inputString(inputIndex) != DOLLAR_SIGN) {
+            // The input string did not contain an expected dollar sign.
+            return formatMatchFailure(input, numberFormat)
+          }
+          inputIndex += 1
+        case OptionalPlusOrMinusSign() =>
+          if (inputIndex < inputLength &&
+            inputString(inputIndex) == PLUS_SIGN) {
+            inputIndex += 1
+          } else if (inputIndex < inputLength &&
+            inputString(inputIndex) == MINUS_SIGN) {
+            negateResult = !negateResult
+            inputIndex += 1
+          } else {
+            // There is no plus or minus sign. Consume the token and remain at 
the same character in
+            // the input string.
+          }
+        case OptionalMinusSign() =>
+          if (inputIndex < inputLength &&
+            inputString(inputIndex) == MINUS_SIGN) {
+            negateResult = !negateResult
+            inputIndex += 1
+          } else {
+            // There is no minus sign. Consume the token and remain at the 
same character in the
+            // input string.
+          }
+        case OpeningAngleBracket() =>
+          if (inputIndex >= inputLength ||
+            inputString(inputIndex) != ANGLE_BRACKET_OPEN) {
+            // The input string did not contain an expected opening angle 
bracket.
+            return formatMatchFailure(input, numberFormat)
+          }
+          inputIndex += 1
+        case ClosingAngleBracket() =>
+          if (inputIndex >= inputLength ||
+            inputString(inputIndex) != ANGLE_BRACKET_CLOSE) {
+            // The input string did not contain an expected closing angle 
bracket.
+            return formatMatchFailure(input, numberFormat)
+          }
+          negateResult = !negateResult
+          inputIndex += 1
+      }
+      formatIndex += 1
+    }
+    if (inputIndex < inputLength) {
+      // If we have consumed all the tokens in the format string, but 
characters remain unconsumed
+      // in the input string, then the input string does not match the format 
string.
+      formatMatchFailure(input, numberFormat)
+    } else {
+      getDecimal(negateResult)
+    }
+  }
+
+  /**
+   * Handle parsing the input string for the given expected DigitGroups from 
the format string.
+   *
+   * @param digitGroups the expected DigitGroups from the format string
+   * @param inputString the input string provided to the original parsing 
method
+   * @param startingInputIndex the input index within the input string to 
begin parsing here
+   * @param reachedDecimalPoint true if we have already parsed past the 
decimal point
+   * @return the new updated index within the input string to resume parsing, 
or None on error
+   */
+  private def parseDigitGroups(
+      digitGroups: DigitGroups,
+      inputString: String,
+      startingInputIndex: Int,
+      reachedDecimalPoint: Boolean): Option[Int] = {
+    val expectedDigits: Seq[Digits] = digitGroups.digits
+    val inputLength = inputString.length
+    // Consume characters from the current input index forwards in the input 
string as long as
+    // they are digits (0-9) or the thousands separator (,).
+    numDigitsInCurrentGroup = 0
+    var inputIndex = startingInputIndex
+    parsedDigitGroupSizes.clear()
+
+    while (inputIndex < inputLength &&
+      matchesDigitOrComma(inputString(inputIndex), reachedDecimalPoint)) {
+      inputIndex += 1
+    }
+    if (inputIndex == inputLength) {
+      parsedDigitGroupSizes.prepend(numDigitsInCurrentGroup)
+    }
+    // Compare the number of digits encountered in each group (separated by 
thousands
+    // separators) with the expected numbers from the format string.
+    if (parsedDigitGroupSizes.length > expectedDigits.length) {
+      // The input contains more thousands separators than the format string.
+      return None
+    }
+    for (i <- 0 until expectedDigits.length) {
+      val expectedToken: Digits = expectedDigits(i)
+      val actualNumDigits: Int =
+        if (i < parsedDigitGroupSizes.length) {
+          parsedDigitGroupSizes(i)
+        } else {
+          0
+        }
+      expectedToken match {
+        case ExactlyAsManyDigits(expectedNumDigits)
+          if actualNumDigits != expectedNumDigits =>
+          // The input contained more or fewer digits than required.
+          return None
+        case AtMostAsManyDigits(expectedMaxDigits)
+          if actualNumDigits > expectedMaxDigits =>
+          // The input contained more digits than allowed.
+          return None
+        case _ =>
+      }
+    }
+    Some(inputIndex)
+  }
+
+  /**
+   * Returns true if the given character matches a digit (0-9) or a comma, 
updating fields of
+   * this class related to parsing during the process.
+   */
+  private def matchesDigitOrComma(char: Char, reachedDecimalPoint: Boolean): 
Boolean = {
+    char match {
+      case _ if char.isWhitespace =>
+        // Ignore whitespace and keep advancing through the input string.
+        true
+      case _ if char >= ZERO_DIGIT && char <= NINE_DIGIT =>
+        numDigitsInCurrentGroup += 1
+        // Append each group of input digits to the appropriate 
before/afterDecimalPoint
+        // string for later use in constructing the result Decimal value.
+        if (reachedDecimalPoint) {
+          afterDecimalPoint.append(char)
+        } else {
+          beforeDecimalPoint.append(char)
+        }
+        true
+      case COMMA_SIGN =>
+        parsedDigitGroupSizes.prepend(numDigitsInCurrentGroup)
+        numDigitsInCurrentGroup = 0
+        true
+      case _ =>
+        parsedDigitGroupSizes.prepend(numDigitsInCurrentGroup)
+        false
+    }
+  }
+
+  /**
+   * This method executes when the input string fails to match the format 
string. It throws an
+   * exception if indicated on construction of this class, or returns NULL 
otherwise.
+   */
+  private def formatMatchFailure(input: UTF8String, originNumberFormat: 
String): Decimal = {
+    if (errorOnFail) {
+      throw QueryExecutionErrors.invalidNumberFormatError(input, 
originNumberFormat)
+    }
+    null
+  }
+
+  /**
+   * Computes the final Decimal value from the beforeDecimalPoint and 
afterDecimalPoint fields of
+   * this class, as a result of parsing.
+   *
+   * @param negateResult whether the input string specified to negate the 
result
+   * @return a Decimal value with the value indicated by the input string and 
the precision and
+   *         scale indicated by the format string
+   */
+  private def getDecimal(negateResult: Boolean): Decimal = {
+    // Append zeros to the afterDecimalPoint until it comprises the same 
number of digits as the
+    // scale. This is necessary because we must determine the scale from the 
format string alone but
+    // each input string may include a variable number of digits after the 
decimal point.
+    val extraZeros = "0" * (scale - afterDecimalPoint.length)
+    val afterDecimalPadded = afterDecimalPoint.toString + extraZeros
+    val prefix = if (negateResult) "-" else ""
+    val suffix = if (afterDecimalPadded.nonEmpty) "." + afterDecimalPadded 
else ""
+    val numStr = s"$prefix$beforeDecimalPoint$suffix"
+    val javaDecimal = new java.math.BigDecimal(numStr)
+    if (precision <= Decimal.MAX_LONG_DIGITS) {
+      // Constructs a `Decimal` with an unscaled `Long` value if possible.
+      Decimal(javaDecimal.unscaledValue().longValue(), precision, scale)
+    } else {
+      // Otherwise, resorts to an unscaled `BigInteger` instead.
+      Decimal(javaDecimal, precision, scale)
+    }
+  }
+}
diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala
index b05142add0b..4936bce7bf2 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala
+++ 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala
@@ -17,6 +17,8 @@
 
 package org.apache.spark.sql.catalyst.expressions
 
+import java.math.{BigDecimal => JavaBigDecimal}
+
 import org.apache.spark.SparkFunSuite
 import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
 import org.apache.spark.sql.catalyst.dsl.expressions._
@@ -892,169 +894,160 @@ class StringExpressionsSuite extends SparkFunSuite with 
ExpressionEvalHelper {
       Literal.create(null, IntegerType), Literal.create(null, IntegerType)), 
null)
   }
 
-  test("ToNumber") {
-    ToNumber(Literal("454"), Literal("")).checkInputDataTypes() match {
-      case TypeCheckResult.TypeCheckFailure(msg) =>
-        assert(msg.contains("Number format cannot be empty"))
-    }
-    ToNumber(Literal("454"), NonFoldableLiteral.create("999", StringType))
-      .checkInputDataTypes() match {
-      case TypeCheckResult.TypeCheckFailure(msg) =>
-        assert(msg.contains("Format expression must be foldable"))
-    }
-
-    // Test '0' and '9'
-
-    Seq("454", "054", "54", "450").foreach { input =>
-      val invalidFormat1 = 0.until(input.length - 1).map(_ => '0').mkString
-      val invalidFormat2 = 0.until(input.length - 2).map(_ => '0').mkString
-      val invalidFormat3 = 0.until(input.length - 1).map(_ => '9').mkString
-      val invalidFormat4 = 0.until(input.length - 2).map(_ => '9').mkString
-      Seq(invalidFormat1, invalidFormat2, invalidFormat3, invalidFormat4)
-        .filter(_.nonEmpty).foreach { format =>
-        checkExceptionInExpression[IllegalArgumentException](
-          ToNumber(Literal(input), Literal(format)),
-          s"The input string '$input' does not match the given number format: 
'$format'")
-      }
-
-      val format1 = 0.until(input.length).map(_ => '0').mkString
-      val format2 = 0.until(input.length).map(_ => '9').mkString
-      val format3 = 0.until(input.length).map(i => i % 2 * 9).mkString
-      val format4 = 0.until(input.length + 1).map(_ => '0').mkString
-      val format5 = 0.until(input.length + 1).map(_ => '9').mkString
-      val format6 = 0.until(input.length + 1).map(i => i % 2 * 9).mkString
-      Seq(format1, format2, format3, format4, format5, format6).foreach { 
format =>
-        checkEvaluation(ToNumber(Literal(input), Literal(format)), 
Decimal(input))
-      }
-    }
-
-    // Test '.' and 'D'
-    checkExceptionInExpression[IllegalArgumentException](
-      ToNumber(Literal("454.2"), Literal("999")),
-      "The input string '454.2' does not match the given number format: '999'")
-    Seq("999.9", "000.0", "99.99", "00.00", "0000.0", "9999.9", "00.000", 
"99.999")
-      .foreach { format =>
-        checkExceptionInExpression[IllegalArgumentException](
-          ToNumber(Literal("454.23"), Literal(format)),
-          s"The input string '454.23' does not match the given number format: 
'$format'")
-        val format2 = format.replace('.', 'D')
-        checkExceptionInExpression[IllegalArgumentException](
-          ToNumber(Literal("454.23"), Literal(format2)),
-          s"The input string '454.23' does not match the given number format: 
'$format2'")
-    }
-
+  test("ToNumber: positive tests") {
     Seq(
-      ("454.2", "000.0") -> Decimal(454.2),
-      ("454.23", "000.00") -> Decimal(454.23),
-      ("454.2", "000.00") -> Decimal(454.2),
-      ("454.0", "000.0") -> Decimal(454),
-      ("454.00", "000.00") -> Decimal(454),
-      (".4542", ".0000") -> Decimal(0.4542),
-      ("4542.", "0000.") -> Decimal(4542)
-    ).foreach { case ((str, format), expected) =>
-      checkEvaluation(ToNumber(Literal(str), Literal(format)), expected)
-      val format2 = format.replace('.', 'D')
-      checkEvaluation(ToNumber(Literal(str), Literal(format2)), expected)
-      val format3 = format.replace('0', '9')
-      checkEvaluation(ToNumber(Literal(str), Literal(format3)), expected)
-      val format4 = format3.replace('.', 'D')
-      checkEvaluation(ToNumber(Literal(str), Literal(format4)), expected)
+      ("$345", "S$999,099.99") -> Decimal(345),
+      ("-$12,345.67", "S$999,099.99") -> Decimal(-12345.67),
+      ("454,123", "999,099") -> Decimal(454123),
+      ("$045", "S$999,099.99") -> Decimal(45),
+      ("454", "099") -> Decimal(454),
+      ("454.", "099.99") -> Decimal(454.0),
+      ("454.6", "099D99") -> Decimal(454.6),
+      ("454.67", "099.00") -> Decimal(454.67),
+      ("454", "000") -> Decimal(454),
+      ("  454 ", "9099") -> Decimal(454),
+      ("454", "099") -> Decimal(454),
+      ("454.67", "099.99") -> Decimal(454.67),
+      ("$454", "$999") -> Decimal(454),
+      ("  454,123 ", "999G099") -> Decimal(454123),
+      ("$454,123", "$999G099") -> Decimal(454123),
+      ("+$89,1,2,3,45.123", "S$999,0,0,0,999.00000") -> Decimal(8912345.123),
+      ("-454", "S999") -> Decimal(-454),
+      ("+454", "S999") -> Decimal(454),
+      ("<454>", "999PR") -> Decimal(-454),
+      ("454-", "999MI") -> Decimal(-454),
+      ("-$54", "MI$99") -> Decimal(-54),
+      ("$4-4", "$9MI9") -> Decimal(-44),
+      // The input string contains more digits than fit in a long integer.
+      ("123,456,789,123,456,789,123", "999,999,999,999,999,999,999") ->
+        Decimal(new JavaBigDecimal("123456789123456789123"))
+    ).foreach { case ((str: String, format: String), expected: Decimal) =>
+      val toNumberExpr = ToNumber(Literal(str), Literal(format))
+      assert(toNumberExpr.checkInputDataTypes() == 
TypeCheckResult.TypeCheckSuccess)
+      checkEvaluation(toNumberExpr, expected)
+
+      val tryToNumberExpr = TryToNumber(Literal(str), Literal(format))
+      assert(tryToNumberExpr.checkInputDataTypes() == 
TypeCheckResult.TypeCheckSuccess)
+      checkEvaluation(tryToNumberExpr, expected)
     }
 
-    Seq("999.9.9", "999D9D9", "999.9D9", "999D9.9").foreach { str =>
-      ToNumber(Literal("454.3.2"), Literal(str)).checkInputDataTypes() match {
-        case TypeCheckResult.TypeCheckFailure(msg) =>
-          assert(msg.contains(s"At most one 'D' or '.' is allowed in the 
number format: '$str'"))
+    for (i <- 0 to 2) {
+      for (j <- 3 to 5) {
+        for (k <- 6 to 9) {
+          Seq(
+            (s"$i$j$k", "999") -> Decimal(s"$i$j$k".toInt),
+            (s"$i$j$k", "S099.") -> Decimal(s"$i$j$k".toInt),
+            (s"$i$j.$k", "99.9") -> Decimal(s"$i$j.$k".toDouble),
+            (s"$i,$j,$k", "999,999,0") -> Decimal(s"$i$j$k".toInt)
+          ).foreach { case ((str: String, format: String), expected: Decimal) 
=>
+            val toNumberExpr = ToNumber(Literal(str), Literal(format))
+            assert(toNumberExpr.checkInputDataTypes() == 
TypeCheckResult.TypeCheckSuccess)
+            checkEvaluation(toNumberExpr, expected)
+
+            val tryToNumberExpr = TryToNumber(Literal(str), Literal(format))
+            assert(tryToNumberExpr.checkInputDataTypes() == 
TypeCheckResult.TypeCheckSuccess)
+            checkEvaluation(tryToNumberExpr, expected)
+          }
+        }
       }
     }
+  }
 
-    // Test ',' and 'G'
-    checkExceptionInExpression[IllegalArgumentException](
-      ToNumber(Literal("123,456"), Literal("9G9")),
-      "The input string '123,456' does not match the given number format: 
'9G9'")
-    checkExceptionInExpression[IllegalArgumentException](
-      ToNumber(Literal("123,456,789"), Literal("999,999")),
-      "The input string '123,456,789' does not match the given number format: 
'999,999'")
-
-    Seq(
-      ("12,454", "99,999") -> Decimal(12454),
-      ("12,454", "99,999,999") -> Decimal(12454),
-      ("12,454,367", "99,999,999") -> Decimal(12454367),
-      ("12,454,", "99,999,") -> Decimal(12454),
-      (",454,367", ",999,999") -> Decimal(454367),
-      (",454,367", "999,999") -> Decimal(454367)
-    ).foreach { case ((str, format), expected) =>
-      checkEvaluation(ToNumber(Literal(str), Literal(format)), expected)
-      val format2 = format.replace(',', 'G')
-      checkEvaluation(ToNumber(Literal(str), Literal(format2)), expected)
-      val format3 = format.replace('9', '0')
-      checkEvaluation(ToNumber(Literal(str), Literal(format3)), expected)
-      val format4 = format3.replace(',', 'G')
-      checkEvaluation(ToNumber(Literal(str), Literal(format4)), expected)
-      val format5 = s"${format}9"
-      checkEvaluation(ToNumber(Literal(str), Literal(format5)), expected)
-      val format6 = s"${format}0"
-      checkEvaluation(ToNumber(Literal(str), Literal(format6)), expected)
-      val format7 = s"9${format}9"
-      checkEvaluation(ToNumber(Literal(str), Literal(format7)), expected)
-      val format8 = s"0${format}0"
-      checkEvaluation(ToNumber(Literal(str), Literal(format8)), expected)
-      val format9 = s"${format3}9"
-      checkEvaluation(ToNumber(Literal(str), Literal(format9)), expected)
-      val format10 = s"${format3}0"
-      checkEvaluation(ToNumber(Literal(str), Literal(format10)), expected)
-      val format11 = s"9${format3}9"
-      checkEvaluation(ToNumber(Literal(str), Literal(format11)), expected)
-      val format12 = s"0${format3}0"
-      checkEvaluation(ToNumber(Literal(str), Literal(format12)), expected)
-    }
-
-    // Test '$'
+  test("ToNumber: negative tests (the format string is invalid)") {
+    val invalidCharacter = "Encountered invalid character"
+    val thousandsSeparatorDigitsBetween =
+      "Thousands separators (,) must have digits in between them"
+    val mustBeAtEnd = "must be at the end of the number format"
+    val atMostOne = "At most one"
     Seq(
-      ("$78.12", "$99.99") -> Decimal(78.12),
-      ("$78.12", "$00.00") -> Decimal(78.12),
-      ("78.12$", "99.99$") -> Decimal(78.12),
-      ("78.12$", "00.00$") -> Decimal(78.12)
-    ).foreach { case ((str, format), expected) =>
-      checkEvaluation(ToNumber(Literal(str), Literal(format)), expected)
-    }
+      // The format string must not be empty.
+      ("454", "") -> "The format string cannot be empty",
+      // Make sure the format string does not contain any unrecognized 
characters.
+      ("454", "999@") -> invalidCharacter,
+      ("454", "999M") -> invalidCharacter,
+      ("454", "999P") -> invalidCharacter,
+      // Make sure the format string contains at least one digit.
+      ("454", "$") -> "The format string requires at least one number digit",
+      // Make sure the format string contains at most one decimal point.
+      ("454", "99.99.99") -> atMostOne,
+      // Make sure the format string contains at most one dollar sign.
+      ("454", "$$99") -> atMostOne,
+      // Make sure the format string contains at most one minus sign at the 
end.
+      ("--$54", "SS$99") -> atMostOne,
+      ("-$54", "MI$99MI") -> atMostOne,
+      ("$4-4", "$9MI9MI") -> atMostOne,
+      // Make sure the format string contains at most one closing angle 
bracket at the end.
+      ("<$45>", "PR$99") -> mustBeAtEnd,
+      ("$4<4>", "$9PR9") -> mustBeAtEnd,
+      ("<<454>>", "999PRPR") -> mustBeAtEnd,
+      // Make sure that any dollar sign in the format string occurs before any 
digits.
+      ("4$54", "9$99") -> "Currency characters must appear before digits",
+      // Make sure that any dollar sign in the format string occurs before any 
decimal point.
+      (".$99", ".$99") -> "Currency characters must appear before any decimal 
point",
+      // Thousands separators must have digits in between them.
+      (",123", ",099") -> thousandsSeparatorDigitsBetween,
+      (",123,456", ",999,099") -> thousandsSeparatorDigitsBetween,
+      (",,345", "9,,09.99") -> thousandsSeparatorDigitsBetween,
+      (",,345", "9,99,.99") -> thousandsSeparatorDigitsBetween,
+      (",,345", "9,99,") -> thousandsSeparatorDigitsBetween,
+      (",,345", ",,999,099.99") -> thousandsSeparatorDigitsBetween,
+      // Thousands separators must not appear after the decimal point.
+      ("123.45,6", "099.99,9") -> "Thousands separators (,) may not appear 
after the decimal point"
+    ).foreach { case ((str: String, format: String), expectedErrMsg: String) =>
+      val toNumberResult = ToNumber(Literal(str), 
Literal(format)).checkInputDataTypes()
+      assert(toNumberResult != TypeCheckResult.TypeCheckSuccess,
+        s"The format string should have been invalid: $format")
+      toNumberResult match {
+        case TypeCheckResult.TypeCheckFailure(message) =>
+          assert(message.contains(expectedErrMsg))
+      }
 
-    ToNumber(Literal("$78$.12"), Literal("$99$.99")).checkInputDataTypes() 
match {
-      case TypeCheckResult.TypeCheckFailure(msg) =>
-        assert(msg.contains("At most one '$' is allowed in the number format: 
'$99$.99'"))
-    }
-    ToNumber(Literal("78$.12"), Literal("99$.99")).checkInputDataTypes() match 
{
-      case TypeCheckResult.TypeCheckFailure(msg) =>
-        assert(msg.contains("'$' must be the first or last char in the number 
format: '99$.99'"))
+      val tryToNumberResult = TryToNumber(Literal(str), 
Literal(format)).checkInputDataTypes()
+      assert(tryToNumberResult != TypeCheckResult.TypeCheckSuccess,
+        s"The format string should have been invalid: $format")
+      tryToNumberResult match {
+        case TypeCheckResult.TypeCheckFailure(message) =>
+          assert(message.contains(expectedErrMsg))
+      }
     }
+  }
 
-    // Test '-' and 'S'
+  test("ToNumber: negative tests (the input string does not match the format 
string)") {
     Seq(
-      ("454-", "999-") -> Decimal(-454),
-      ("-454", "-999") -> Decimal(-454),
-      ("12,454.8-", "99G999D9-") -> Decimal(-12454.8),
-      ("00,454.8-", "99G999.9-") -> Decimal(-454.8)
-    ).foreach { case ((str, format), expected) =>
-      checkEvaluation(ToNumber(Literal(str), Literal(format)), expected)
-      val format2 = format.replace('9', '0')
-      checkEvaluation(ToNumber(Literal(str), Literal(format2)), expected)
-      val format3 = format.replace('-', 'S')
-      checkEvaluation(ToNumber(Literal(str), Literal(format3)), expected)
-      val format4 = format2.replace('-', 'S')
-      checkEvaluation(ToNumber(Literal(str), Literal(format4)), expected)
-    }
-
-    ToNumber(Literal("454.3--"), Literal("999D9SS")).checkInputDataTypes() 
match {
-      case TypeCheckResult.TypeCheckFailure(msg) =>
-        assert(msg.contains("At most one 'S' or '-' is allowed in the number 
format: '999D9SS'"))
-    }
-
-    Seq("9S99", "9-99").foreach { str =>
-      ToNumber(Literal("-454"), Literal(str)).checkInputDataTypes() match {
-        case TypeCheckResult.TypeCheckFailure(msg) =>
-          assert(msg.contains(
-            s"'S' or '-' must be the first or last char in the number format: 
'$str'"))
-      }
+      // The input contained more thousands separators than the format string.
+      ("45", "0,9"),
+      // The input contained more or fewer digits than required.
+      ("4", "09"),
+      ("454", "09"),
+      // The input contained more digits than allowed.
+      ("454", "99"),
+      // The input string did not contain an expected dollar sign.
+      ("45", "$99"),
+      // The input string did not contain an expected opening angle bracket.
+      ("45>", "99PR"),
+      // The input string did not contain an expected closing angle bracket.
+      ("<45", "99PR"),
+      // The trailing MI did not match against any trailing +.
+      ("454+", "999MI"),
+      // The trailing PR required exactly one leading < and trailing >.
+      ("<454", "999PR"),
+      ("454>", "999PR"),
+      ("<<454>>", "999PR"),
+      // At least three digits were required.
+      ("45", "S$999,099.99"),
+      // Groups of digits with leading zeros are not optional.
+      ("$345", "S$099,099.99"),
+      // The letter 'D' is allowed in the format string, but not the input 
string.
+      ("4D5", "0D9")
+    ).foreach { case (str: String, format: String) =>
+      val toNumberExpr = ToNumber(Literal(str), Literal(format))
+      assert(toNumberExpr.checkInputDataTypes() == 
TypeCheckResult.TypeCheckSuccess)
+      checkExceptionInExpression[IllegalArgumentException](
+        toNumberExpr, "does not match the given number format")
+
+      val tryToNumberExpr = TryToNumber(Literal(str), Literal(format))
+      assert(tryToNumberExpr.checkInputDataTypes() == 
TypeCheckResult.TypeCheckSuccess)
+      checkEvaluation(tryToNumberExpr, null)
     }
   }
 
diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/NumberFormatterSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/NumberFormatterSuite.scala
deleted file mode 100644
index 81264f4e850..00000000000
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/NumberFormatterSuite.scala
+++ /dev/null
@@ -1,315 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *    http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.catalyst.util
-
-import org.apache.spark.SparkFunSuite
-import org.apache.spark.sql.AnalysisException
-import org.apache.spark.sql.types.Decimal
-import org.apache.spark.unsafe.types.UTF8String
-
-class NumberFormatterSuite extends SparkFunSuite {
-
-  private def invalidNumberFormat(numberFormat: String, errorMsg: String): 
Unit = {
-    val testNumberFormatter = new TestNumberFormatter(numberFormat)
-    val e = 
intercept[AnalysisException](testNumberFormatter.checkWithException())
-    assert(e.getMessage.contains(errorMsg))
-  }
-
-  private def failParseWithInvalidInput(
-      input: UTF8String, numberFormat: String, errorMsg: String): Unit = {
-    val testNumberFormatter = new TestNumberFormatter(numberFormat)
-    val e = 
intercept[IllegalArgumentException](testNumberFormatter.parse(input))
-    assert(e.getMessage.contains(errorMsg))
-  }
-
-  test("parse") {
-    invalidNumberFormat("", "Number format cannot be empty")
-
-    // Test '9' and '0'
-    failParseWithInvalidInput(UTF8String.fromString("454"), "9",
-      "The input string '454' does not match the given number format: '9'")
-    failParseWithInvalidInput(UTF8String.fromString("454"), "99",
-      "The input string '454' does not match the given number format: '99'")
-
-    Seq(
-      ("454", "999") -> Decimal(454),
-      ("054", "999") -> Decimal(54),
-      ("54", "999") -> Decimal(54),
-      ("404", "999") -> Decimal(404),
-      ("450", "999") -> Decimal(450),
-      ("454", "9999") -> Decimal(454),
-      ("054", "9999") -> Decimal(54),
-      ("404", "9999") -> Decimal(404),
-      ("450", "9999") -> Decimal(450)
-    ).foreach { case ((str, format), expected) =>
-      val builder = new TestNumberFormatter(format)
-      builder.check()
-      assert(builder.parse(UTF8String.fromString(str)) === expected)
-    }
-
-    failParseWithInvalidInput(UTF8String.fromString("454"), "0",
-      "The input string '454' does not match the given number format: '0'")
-    failParseWithInvalidInput(UTF8String.fromString("454"), "00",
-      "The input string '454' does not match the given number format: '00'")
-
-    Seq(
-      ("454", "000") -> Decimal(454),
-      ("054", "000") -> Decimal(54),
-      ("54", "000") -> Decimal(54),
-      ("404", "000") -> Decimal(404),
-      ("450", "000") -> Decimal(450),
-      ("454", "0000") -> Decimal(454),
-      ("054", "0000") -> Decimal(54),
-      ("404", "0000") -> Decimal(404),
-      ("450", "0000") -> Decimal(450)
-    ).foreach { case ((str, format), expected) =>
-      val builder = new TestNumberFormatter(format)
-      builder.check()
-      assert(builder.parse(UTF8String.fromString(str)) === expected)
-    }
-
-    // Test '.' and 'D'
-    failParseWithInvalidInput(UTF8String.fromString("454.2"), "999",
-      "The input string '454.2' does not match the given number format: '999'")
-    failParseWithInvalidInput(UTF8String.fromString("454.23"), "999.9",
-      "The input string '454.23' does not match the given number format: 
'999.9'")
-
-    Seq(
-      ("454.2", "999.9") -> Decimal(454.2),
-      ("454.2", "000.0") -> Decimal(454.2),
-      ("454.2", "999D9") -> Decimal(454.2),
-      ("454.2", "000D0") -> Decimal(454.2),
-      ("454.23", "999.99") -> Decimal(454.23),
-      ("454.23", "000.00") -> Decimal(454.23),
-      ("454.23", "999D99") -> Decimal(454.23),
-      ("454.23", "000D00") -> Decimal(454.23),
-      ("454.0", "999.9") -> Decimal(454),
-      ("454.0", "000.0") -> Decimal(454),
-      ("454.0", "999D9") -> Decimal(454),
-      ("454.0", "000D0") -> Decimal(454),
-      ("454.00", "999.99") -> Decimal(454),
-      ("454.00", "000.00") -> Decimal(454),
-      ("454.00", "999D99") -> Decimal(454),
-      ("454.00", "000D00") -> Decimal(454),
-      (".4542", ".9999") -> Decimal(0.4542),
-      (".4542", ".0000") -> Decimal(0.4542),
-      (".4542", "D9999") -> Decimal(0.4542),
-      (".4542", "D0000") -> Decimal(0.4542),
-      ("4542.", "9999.") -> Decimal(4542),
-      ("4542.", "0000.") -> Decimal(4542),
-      ("4542.", "9999D") -> Decimal(4542),
-      ("4542.", "0000D") -> Decimal(4542)
-    ).foreach { case ((str, format), expected) =>
-      val builder = new TestNumberFormatter(format)
-      builder.check()
-      assert(builder.parse(UTF8String.fromString(str)) === expected)
-    }
-
-    invalidNumberFormat(
-      "999.9.9", "At most one 'D' or '.' is allowed in the number format: 
'999.9.9'")
-    invalidNumberFormat(
-      "999D9D9", "At most one 'D' or '.' is allowed in the number format: 
'999D9D9'")
-    invalidNumberFormat(
-      "999.9D9", "At most one 'D' or '.' is allowed in the number format: 
'999.9D9'")
-    invalidNumberFormat(
-      "999D9.9", "At most one 'D' or '.' is allowed in the number format: 
'999D9.9'")
-
-    // Test ',' and 'G'
-    Seq(
-      ("12,454", "99,999") -> Decimal(12454),
-      ("12,454", "00,000") -> Decimal(12454),
-      ("12,454", "99G999") -> Decimal(12454),
-      ("12,454", "00G000") -> Decimal(12454),
-      ("12,454,367", "99,999,999") -> Decimal(12454367),
-      ("12,454,367", "00,000,000") -> Decimal(12454367),
-      ("12,454,367", "99G999G999") -> Decimal(12454367),
-      ("12,454,367", "00G000G000") -> Decimal(12454367),
-      ("12,454,", "99,999,") -> Decimal(12454),
-      ("12,454,", "00,000,") -> Decimal(12454),
-      ("12,454,", "99G999G") -> Decimal(12454),
-      ("12,454,", "00G000G") -> Decimal(12454),
-      (",454,367", ",999,999") -> Decimal(454367),
-      (",454,367", ",000,000") -> Decimal(454367),
-      (",454,367", "G999G999") -> Decimal(454367),
-      (",454,367", "G000G000") -> Decimal(454367),
-      (",454,367", "999,999") -> Decimal(454367),
-      (",454,367", "000,000") -> Decimal(454367),
-      (",454,367", "999G999") -> Decimal(454367),
-      (",454,367", "000G000") -> Decimal(454367)
-    ).foreach { case ((str, format), expected) =>
-      val builder = new TestNumberFormatter(format)
-      builder.check()
-      assert(builder.parse(UTF8String.fromString(str)) === expected)
-    }
-
-    // Test '$'
-    Seq(
-      ("$78.12", "$99.99") -> Decimal(78.12),
-      ("$78.12", "$00.00") -> Decimal(78.12),
-      ("78.12$", "99.99$") -> Decimal(78.12),
-      ("78.12$", "00.00$") -> Decimal(78.12)
-    ).foreach { case ((str, format), expected) =>
-      val builder = new TestNumberFormatter(format)
-      builder.check()
-      assert(builder.parse(UTF8String.fromString(str)) === expected)
-    }
-
-    invalidNumberFormat(
-      "99$.99", "'$' must be the first or last char in the number format: 
'99$.99'")
-    invalidNumberFormat("$99.99$", "At most one '$' is allowed in the number 
format: '$99.99$'")
-
-    // Test '-' and 'S'
-    Seq(
-      ("454-", "999-") -> Decimal(-454),
-      ("454-", "999S") -> Decimal(-454),
-      ("-454", "-999") -> Decimal(-454),
-      ("-454", "S999") -> Decimal(-454),
-      ("454-", "000-") -> Decimal(-454),
-      ("454-", "000S") -> Decimal(-454),
-      ("-454", "-000") -> Decimal(-454),
-      ("-454", "S000") -> Decimal(-454),
-      ("12,454.8-", "99G999D9S") -> Decimal(-12454.8),
-      ("00,454.8-", "99G999.9S") -> Decimal(-454.8)
-    ).foreach { case ((str, format), expected) =>
-      val builder = new TestNumberFormatter(format)
-      builder.check()
-      assert(builder.parse(UTF8String.fromString(str)) === expected)
-    }
-
-    invalidNumberFormat(
-      "9S99", "'S' or '-' must be the first or last char in the number format: 
'9S99'")
-    invalidNumberFormat(
-      "9-99", "'S' or '-' must be the first or last char in the number format: 
'9-99'")
-    invalidNumberFormat(
-      "999D9SS", "At most one 'S' or '-' is allowed in the number format: 
'999D9SS'")
-  }
-
-  test("format") {
-
-    // Test '9' and '0'
-    Seq(
-      (Decimal(454), "9") -> "#",
-      (Decimal(454), "99") -> "##",
-      (Decimal(454), "999") -> "454",
-      (Decimal(54), "999") -> "54",
-      (Decimal(404), "999") -> "404",
-      (Decimal(450), "999") -> "450",
-      (Decimal(454), "9999") -> "454",
-      (Decimal(54), "9999") -> "54",
-      (Decimal(404), "9999") -> "404",
-      (Decimal(450), "9999") -> "450",
-      (Decimal(454), "0") -> "#",
-      (Decimal(454), "00") -> "##",
-      (Decimal(454), "000") -> "454",
-      (Decimal(54), "000") -> "054",
-      (Decimal(404), "000") -> "404",
-      (Decimal(450), "000") -> "450",
-      (Decimal(454), "0000") -> "0454",
-      (Decimal(54), "0000") -> "0054",
-      (Decimal(404), "0000") -> "0404",
-      (Decimal(450), "0000") -> "0450"
-    ).foreach { case ((decimal, format), expected) =>
-      val builder = new TestNumberFormatter(format, false)
-      builder.check()
-      assert(builder.format(decimal) === expected)
-    }
-
-    // Test '.' and 'D'
-    Seq(
-      (Decimal(454.2), "999.9") -> "454.2",
-      (Decimal(454.2), "000.0") -> "454.2",
-      (Decimal(454.2), "999D9") -> "454.2",
-      (Decimal(454.2), "000D0") -> "454.2",
-      (Decimal(454), "999.9") -> "454.0",
-      (Decimal(454), "000.0") -> "454.0",
-      (Decimal(454), "999D9") -> "454.0",
-      (Decimal(454), "000D0") -> "454.0",
-      (Decimal(454), "999.99") -> "454.00",
-      (Decimal(454), "000.00") -> "454.00",
-      (Decimal(454), "999D99") -> "454.00",
-      (Decimal(454), "000D00") -> "454.00",
-      (Decimal(0.4542), ".9999") -> ".####",
-      (Decimal(0.4542), ".0000") -> ".####",
-      (Decimal(0.4542), "D9999") -> ".####",
-      (Decimal(0.4542), "D0000") -> ".####",
-      (Decimal(4542), "9999.") -> "4542.",
-      (Decimal(4542), "0000.") -> "4542.",
-      (Decimal(4542), "9999D") -> "4542.",
-      (Decimal(4542), "0000D") -> "4542."
-    ).foreach { case ((decimal, format), expected) =>
-      val builder = new TestNumberFormatter(format, false)
-      builder.check()
-      assert(builder.format(decimal) === expected)
-    }
-
-    // Test ',' and 'G'
-    Seq(
-      (Decimal(12454), "99,999") -> "12,454",
-      (Decimal(12454), "00,000") -> "12,454",
-      (Decimal(12454), "99G999") -> "12,454",
-      (Decimal(12454), "00G000") -> "12,454",
-      (Decimal(12454367), "99,999,999") -> "12,454,367",
-      (Decimal(12454367), "00,000,000") -> "12,454,367",
-      (Decimal(12454367), "99G999G999") -> "12,454,367",
-      (Decimal(12454367), "00G000G000") -> "12,454,367",
-      (Decimal(12454), "99,999,") -> "12,454,",
-      (Decimal(12454), "00,000,") -> "12,454,",
-      (Decimal(12454), "99G999G") -> "12,454,",
-      (Decimal(12454), "00G000G") -> "12,454,",
-      (Decimal(454367), ",999,999") -> ",454,367",
-      (Decimal(454367), ",000,000") -> ",454,367",
-      (Decimal(454367), "G999G999") -> ",454,367",
-      (Decimal(454367), "G000G000") -> ",454,367"
-    ).foreach { case ((decimal, format), expected) =>
-      val builder = new TestNumberFormatter(format, false)
-      builder.check()
-      assert(builder.format(decimal) === expected)
-    }
-
-    // Test '$'
-    Seq(
-      (Decimal(78.12), "$99.99") -> "$78.12",
-      (Decimal(78.12), "$00.00") -> "$78.12",
-      (Decimal(78.12), "99.99$") -> "78.12$",
-      (Decimal(78.12), "00.00$") -> "78.12$"
-    ).foreach { case ((decimal, format), expected) =>
-      val builder = new TestNumberFormatter(format, false)
-      builder.check()
-      assert(builder.format(decimal) === expected)
-    }
-
-    // Test '-' and 'S'
-    Seq(
-      (Decimal(-454), "999-") -> "454-",
-      (Decimal(-454), "999S") -> "454-",
-      (Decimal(-454), "-999") -> "-454",
-      (Decimal(-454), "S999") -> "-454",
-      (Decimal(-454), "000-") -> "454-",
-      (Decimal(-454), "000S") -> "454-",
-      (Decimal(-454), "-000") -> "-454",
-      (Decimal(-454), "S000") -> "-454",
-      (Decimal(-12454.8), "99G999D9S") -> "12,454.8-",
-      (Decimal(-454.8), "99G999.9S") -> "454.8-"
-    ).foreach { case ((decimal, format), expected) =>
-      val builder = new TestNumberFormatter(format, false)
-      builder.check()
-      assert(builder.format(decimal) === expected)
-    }
-  }
-
-}
diff --git a/sql/core/src/test/resources/sql-functions/sql-expression-schema.md 
b/sql/core/src/test/resources/sql-functions/sql-expression-schema.md
index 1dbf9678af9..14902b08549 100644
--- a/sql/core/src/test/resources/sql-functions/sql-expression-schema.md
+++ b/sql/core/src/test/resources/sql-functions/sql-expression-schema.md
@@ -1,6 +1,6 @@
 <!-- Automatically generated by ExpressionsSchemaSuite -->
 ## Summary
-  - Number of queries: 386
+  - Number of queries: 387
   - Number of expressions that missing example: 12
   - Expressions missing examples: 
bigint,binary,boolean,date,decimal,double,float,int,smallint,string,timestamp,tinyint
 ## Schema of Built-in Functions
@@ -317,6 +317,7 @@
 | org.apache.spark.sql.catalyst.expressions.TryMultiply | try_multiply | 
SELECT try_multiply(2, 3) | struct<try_multiply(2, 3):int> |
 | org.apache.spark.sql.catalyst.expressions.TrySubtract | try_subtract | 
SELECT try_subtract(2, 1) | struct<try_subtract(2, 1):int> |
 | org.apache.spark.sql.catalyst.expressions.TryToBinary | try_to_binary | 
SELECT try_to_binary('abc', 'utf-8') | struct<try_to_binary(abc, utf-8):binary> 
|
+| org.apache.spark.sql.catalyst.expressions.TryToNumber | try_to_number | 
SELECT try_to_number('454', '999') | struct<try_to_number(454, 
999):decimal(3,0)> |
 | org.apache.spark.sql.catalyst.expressions.TypeOf | typeof | SELECT typeof(1) 
| struct<typeof(1):string> |
 | org.apache.spark.sql.catalyst.expressions.UnBase64 | unbase64 | SELECT 
unbase64('U3BhcmsgU1FM') | struct<unbase64(U3BhcmsgU1FM):binary> |
 | org.apache.spark.sql.catalyst.expressions.UnaryMinus | negative | SELECT 
negative(1) | struct<negative(1):int> |
@@ -391,4 +392,4 @@
 | org.apache.spark.sql.catalyst.expressions.xml.XPathList | xpath | SELECT 
xpath('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>','a/b/text()') | 
struct<xpath(<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>, 
a/b/text()):array<string>> |
 | org.apache.spark.sql.catalyst.expressions.xml.XPathLong | xpath_long | 
SELECT xpath_long('<a><b>1</b><b>2</b></a>', 'sum(a/b)') | 
struct<xpath_long(<a><b>1</b><b>2</b></a>, sum(a/b)):bigint> |
 | org.apache.spark.sql.catalyst.expressions.xml.XPathShort | xpath_short | 
SELECT xpath_short('<a><b>1</b><b>2</b></a>', 'sum(a/b)') | 
struct<xpath_short(<a><b>1</b><b>2</b></a>, sum(a/b)):smallint> |
-| org.apache.spark.sql.catalyst.expressions.xml.XPathString | xpath_string | 
SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c') | 
struct<xpath_string(<a><b>b</b><c>cc</c></a>, a/c):string> |
\ No newline at end of file
+| org.apache.spark.sql.catalyst.expressions.xml.XPathString | xpath_string | 
SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c') | 
struct<xpath_string(<a><b>b</b><c>cc</c></a>, a/c):string> |
diff --git a/sql/core/src/test/resources/sql-tests/inputs/string-functions.sql 
b/sql/core/src/test/resources/sql-tests/inputs/string-functions.sql
index 0db28ad9f3e..e1c97b468f2 100644
--- a/sql/core/src/test/resources/sql-tests/inputs/string-functions.sql
+++ b/sql/core/src/test/resources/sql-tests/inputs/string-functions.sql
@@ -162,10 +162,11 @@ select to_number('454', '000');
 select to_number('454.2', '000.0');
 select to_number('12,454', '00,000');
 select to_number('$78.12', '$00.00');
-select to_number('-454', '-000');
+select to_number('+454', 'S000');
 select to_number('-454', 'S000');
-select to_number('12,454.8-', '00,000.9-');
-select to_number('00,454.8-', '00,000.9-');
+select to_number('12,454.8-', '00,000.9MI');
+select to_number('00,454.8-', '00,000.9MI');
+select to_number('<00,454.8>', '00,000.9PR');
 
 -- to_binary
 select to_binary('abc');
diff --git 
a/sql/core/src/test/resources/sql-tests/results/ansi/string-functions.sql.out 
b/sql/core/src/test/resources/sql-tests/results/ansi/string-functions.sql.out
index 52d70e22a44..e330cafa73a 100644
--- 
a/sql/core/src/test/resources/sql-tests/results/ansi/string-functions.sql.out
+++ 
b/sql/core/src/test/resources/sql-tests/results/ansi/string-functions.sql.out
@@ -1004,11 +1004,11 @@ struct<to_number($78.12, $00.00):decimal(4,2)>
 
 
 -- !query
-select to_number('-454', '-000')
+select to_number('+454', 'S000')
 -- !query schema
-struct<to_number(-454, -000):decimal(3,0)>
+struct<to_number(+454, S000):decimal(3,0)>
 -- !query output
--454
+454
 
 
 -- !query
@@ -1020,17 +1020,25 @@ struct<to_number(-454, S000):decimal(3,0)>
 
 
 -- !query
-select to_number('12,454.8-', '00,000.9-')
+select to_number('12,454.8-', '00,000.9MI')
 -- !query schema
-struct<to_number(12,454.8-, 00,000.9-):decimal(6,1)>
+struct<to_number(12,454.8-, 00,000.9MI):decimal(6,1)>
 -- !query output
 -12454.8
 
 
 -- !query
-select to_number('00,454.8-', '00,000.9-')
+select to_number('00,454.8-', '00,000.9MI')
+-- !query schema
+struct<to_number(00,454.8-, 00,000.9MI):decimal(6,1)>
+-- !query output
+-454.8
+
+
+-- !query
+select to_number('<00,454.8>', '00,000.9PR')
 -- !query schema
-struct<to_number(00,454.8-, 00,000.9-):decimal(6,1)>
+struct<to_number(<00,454.8>, 00,000.9PR):decimal(6,1)>
 -- !query output
 -454.8
 
diff --git 
a/sql/core/src/test/resources/sql-tests/results/postgreSQL/numeric.sql.out 
b/sql/core/src/test/resources/sql-tests/results/postgreSQL/numeric.sql.out
index 41fc9908d0c..9a6cc7eac02 100644
--- a/sql/core/src/test/resources/sql-tests/results/postgreSQL/numeric.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/postgreSQL/numeric.sql.out
@@ -4597,9 +4597,10 @@ struct<>
 -- !query
 SELECT '' AS to_number_1,  to_number('-34,338,492', '99G999G999')
 -- !query schema
-struct<to_number_1:string,to_number(-34,338,492, 99G999G999):decimal(8,0)>
+struct<>
 -- !query output
-       -34338492
+java.lang.IllegalArgumentException
+The input string '-34,338,492' does not match the given number format: 
'99G999G999'
 
 
 -- !query
@@ -4607,8 +4608,8 @@ SELECT '' AS to_number_2,  
to_number('-34,338,492.654,878', '99G999G999D999G999'
 -- !query schema
 struct<>
 -- !query output
-java.lang.IllegalArgumentException
-The input string '-34,338,492.654,878' does not match the given number format: 
'99G999G999D999G999'
+org.apache.spark.sql.AnalysisException
+cannot resolve 'to_number('-34,338,492.654,878', '99G999G999D999G999')' due to 
data type mismatch: Thousands separators (,) may not appear after the decimal 
point in the number format: '99G999G999D999G999'; line 1 pos 27
 
 
 -- !query
@@ -4656,16 +4657,17 @@ SELECT '' AS to_number_15, to_number('123,000','999G')
 -- !query schema
 struct<>
 -- !query output
-java.lang.IllegalArgumentException
-The input string '123,000' does not match the given number format: '999G'
+org.apache.spark.sql.AnalysisException
+cannot resolve 'to_number('123,000', '999G')' due to data type mismatch: 
Thousands separators (,) must have digits in between them in the number format: 
'999G'; line 1 pos 27
 
 
 -- !query
 SELECT '' AS to_number_16, to_number('123456','999G999')
 -- !query schema
-struct<to_number_16:string,to_number(123456, 999G999):decimal(6,0)>
+struct<>
 -- !query output
-       123456
+java.lang.IllegalArgumentException
+The input string '123456' does not match the given number format: '999G999'
 
 
 -- !query
diff --git 
a/sql/core/src/test/resources/sql-tests/results/string-functions.sql.out 
b/sql/core/src/test/resources/sql-tests/results/string-functions.sql.out
index ff14da143da..af861e3913b 100644
--- a/sql/core/src/test/resources/sql-tests/results/string-functions.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/string-functions.sql.out
@@ -1000,11 +1000,11 @@ struct<to_number($78.12, $00.00):decimal(4,2)>
 
 
 -- !query
-select to_number('-454', '-000')
+select to_number('+454', 'S000')
 -- !query schema
-struct<to_number(-454, -000):decimal(3,0)>
+struct<to_number(+454, S000):decimal(3,0)>
 -- !query output
--454
+454
 
 
 -- !query
@@ -1016,17 +1016,25 @@ struct<to_number(-454, S000):decimal(3,0)>
 
 
 -- !query
-select to_number('12,454.8-', '00,000.9-')
+select to_number('12,454.8-', '00,000.9MI')
 -- !query schema
-struct<to_number(12,454.8-, 00,000.9-):decimal(6,1)>
+struct<to_number(12,454.8-, 00,000.9MI):decimal(6,1)>
 -- !query output
 -12454.8
 
 
 -- !query
-select to_number('00,454.8-', '00,000.9-')
+select to_number('00,454.8-', '00,000.9MI')
+-- !query schema
+struct<to_number(00,454.8-, 00,000.9MI):decimal(6,1)>
+-- !query output
+-454.8
+
+
+-- !query
+select to_number('<00,454.8>', '00,000.9PR')
 -- !query schema
-struct<to_number(00,454.8-, 00,000.9-):decimal(6,1)>
+struct<to_number(<00,454.8>, 00,000.9PR):decimal(6,1)>
 -- !query output
 -454.8
 


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to