dtenedor commented on code in PR #36066:
URL: https://github.com/apache/spark/pull/36066#discussion_r846303565


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/numberFormatExpressions.scala:
##########
@@ -22,48 +22,59 @@ import java.util.Locale
 import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
 import org.apache.spark.sql.catalyst.expressions.codegen.{CodegenContext, 
CodeGenerator, ExprCode}
 import org.apache.spark.sql.catalyst.expressions.codegen.Block.BlockHelper
-import org.apache.spark.sql.catalyst.util.NumberFormatter
+import org.apache.spark.sql.catalyst.util.ToNumberParser
 import org.apache.spark.sql.types.{DataType, StringType}
 import org.apache.spark.unsafe.types.UTF8String
 
 /**
- * A function that converts string to numeric.
+ * A function that converts strings to decimal values, returning an exception 
if the input string
+ * fails to match the format string.
  */
 @ExpressionDescription(
   usage = """
-     _FUNC_(strExpr, formatExpr) - Convert `strExpr` to a number based on the 
`formatExpr`.
-       The format can consist of the following characters:
-         '0' or '9':  digit position
-         '.' or 'D':  decimal point (only allowed once)
-         ',' or 'G':  group (thousands) separator
-         '-' or 'S':  sign anchored to number (only allowed once)
-         '$':  value with a leading dollar sign (only allowed once)
+     _FUNC_(expr, fmt) - Convert string 'expr' to a number based on the string 
format 'fmt'.
+       Throws an exception if the conversion fails. The format can consist of 
the following
+       characters, case insensitive:
+         '0' or '9': Specifies an expected digit between 0 and 9. A sequence 
of 0 or 9 in the format
+           string matches a sequence of digits in the input string. If the 0/9 
sequence starts with
+           0 and is before the decimal point, it can only match a digit 
sequence of the same size.
+           Otherwise, if the sequence starts with 9 or is after the decimal 
point, it can match a
+           digit sequence that has the same or smaller size.
+         '.' or 'D': Specifies the position of the decimal point (optional, 
only allowed once).
+         ',' or 'G': Specifies the position of the grouping (thousands) 
separator (,). There must be
+           one or more 0 or 9 to the left of the rightmost grouping separator. 
'expr' must match the
+           grouping separator relevant for the size of the number.
+         '$': Specifies the location of the $ currency sign. This character 
may only be specified
+           once.
+         'S': Specifies the position of a '+' or '-' sign (optional, only 
allowed once).

Review Comment:
   Looking closer, `S` matches either `-` or `+`, whereas `MI` matches only 
`-`. So technically they are still different (but similar). I updated this 
part of the doc to clarify.
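
   To make the distinction concrete, here is a minimal sketch of the rule as
   stated in this comment: `S` accepts `+` or `-`, while `MI` accepts only `-`.
   The names `SignTokenSketch`, `accepted`, and `matchesSign` are hypothetical
   and exist only for illustration; this is a toy model, not the logic in
   `ToNumberParser`.

```scala
// Toy model of the two sign tokens. All names here are hypothetical;
// this is not the ToNumberParser implementation from the PR.
object SignTokenSketch {
  // Sign characters each format token accepts, per the review comment.
  val accepted: Map[String, Set[Char]] = Map(
    "S"  -> Set('+', '-'),
    "MI" -> Set('-')
  )

  // True if `c` is a sign character that `token` would match.
  // Tokens are compared case-insensitively, as the doc says the format is.
  def matchesSign(token: String, c: Char): Boolean =
    accepted.get(token.toUpperCase).exists(_.contains(c))

  def main(args: Array[String]): Unit = {
    // Print the full 2x2 comparison of tokens against sign characters.
    for (token <- Seq("S", "MI"); c <- Seq('+', '-'))
      println(s"$token vs '$c': ${matchesSign(token, c)}")
  }
}
```

   The only cell where the two tokens disagree is the `+` character, which is
   why they are similar but not interchangeable.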



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

