This is an automated email from the ASF dual-hosted git repository.
wenchen pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.3 by this push:
new f96b96d7971 [SPARK-38796][SQL] Update documentation for number format
strings with the {try_}to_number functions
f96b96d7971 is described below
commit f96b96d7971a03740a15c029bb76e87fb99306ad
Author: Daniel Tenedorio <[email protected]>
AuthorDate: Fri Jul 1 16:49:50 2022 +0800
[SPARK-38796][SQL] Update documentation for number format strings with the
{try_}to_number functions
### What changes were proposed in this pull request?
Update documentation for number format strings with the `{try_}to_number`
functions.
### Why are the changes needed?
The existing documentation is incomplete.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Built the doc site locally to provide the following initial screenshot:
<img width="1224" alt="image"
src="https://user-images.githubusercontent.com/99207096/174898847-4d42b6d7-f119-4001-bbf6-6d3ceb60fd77.png">
Closes #36950 from dtenedor/number-docs.
Lead-authored-by: Daniel Tenedorio <[email protected]>
Co-authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit 09d1bae95be2af01da65573d57867346f3833907)
Signed-off-by: Wenchen Fan <[email protected]>
---
docs/sql-ref-number-pattern.md | 166 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 165 insertions(+), 1 deletion(-)
diff --git a/docs/sql-ref-number-pattern.md b/docs/sql-ref-number-pattern.md
index dc7d696e32f..dd4997a6aec 100644
--- a/docs/sql-ref-number-pattern.md
+++ b/docs/sql-ref-number-pattern.md
@@ -19,4 +19,168 @@ license: |
limitations under the License.
---
-TODO: Add the content of Number Patterns for Formatting and Parsing
+### Description
+
+Functions such as `to_number` and `to_char` support converting between values
of string and
+Decimal type. Such functions accept format strings indicating how to map
between these types.
+
+### Syntax
+
+Number format strings support the following syntax:
+
+```
+ { ' [ MI | S ] [ $ ]
+ [ 0 | 9 | G | , ] [...]
+ [ . | D ]
+ [ 0 | 9 ] [...]
+ [ $ ] [ PR | MI | S ] ' }
+```
+
+### Elements
+
+Each number format string can contain the following elements (case
insensitive):
+
+- **`0`** or **`9`**
+
+ Specifies an expected digit between `0` and `9`.
+
+ A sequence of 0 or 9 in the format string matches a sequence of digits with
the same or smaller
+ size. If the 0/9 sequence starts with 0 and is before the decimal point, it
requires matching the
+ number of digits: when parsing, it matches only a digit sequence of the same
size; when
+ formatting, the result string adds left-padding with zeros to the digit
sequence to reach the
+ same size. Otherwise, the 0/9 sequence matches any digit sequence with the
same or smaller size
+ when parsing, and pads the digit sequence with spaces in the result string
when formatting. Note
+ that the digit sequence will become a '#' sequence if the size is larger
than the 0/9 sequence.
+
+- **`.`** or **`D`**
+
+ Specifies the position of the decimal point. This character may only be
specified once.
+
+ When parsing, the input string does not need to include a decimal point.
+
+- **`,`** or **`G`**
+
+ Specifies the position of the `,` grouping (thousands) separator.
+
+ There must be a `0` or `9` to the left and right of each grouping separator.
When parsing,
+ the input string must match the grouping separator relevant for the size of
the number.
+
+- **`$`**
+
+ Specifies the location of the `$` currency sign. This character may only be
specified once.
+
+- **`S`**
+
+ Specifies the position of an optional '+' or '-' sign. This character may
only be specified once.
+
+- **`MI`**
+
+ Specifies the position of an optional '-' sign (no '+'). This character may
only be specified once.
+
+ When formatting, it prints a space for positive values.
+
+- **`PR`**
+
+ Maps negative input values to wrapping angle brackets (`<1>`) in the
corresponding string.
+
+ Positive input values do not receive wrapping angle brackets.
+
+### Function types and error handling
+
+* The `to_number` function accepts an input string and a format string
argument. It requires that
+the input string matches the provided format and raises an error otherwise.
The function then
+returns the corresponding Decimal value.
+* The `try_to_number` function accepts an input string and a format string
argument. It works the
+same as the `to_number` function except that it returns NULL instead of
raising an error if the
+input string does not match the given number format.
+* The `to_char` function accepts an input decimal and a format string
argument. It requires that
+the input decimal matches the provided format and raises an error otherwise.
The function then
+returns the corresponding string value.
+* All functions will fail if the given format string is invalid.
+
+### Examples
+
+The following examples use the `to_number`, `try_to_number`, `to_char`, and
`try_to_char` SQL
+functions.
+
+Note that the format string used in most of these examples expects:
+* an optional sign at the beginning,
+* followed by a dollar sign,
+* followed by a number between 3 and 6 digits long,
+* thousands separators,
+* up to two digits beyond the decimal point.
+
+#### The `to_number` function
+
+```sql
+-- The negative number with currency symbol maps to characters in the format
string.
+> SELECT to_number('-$12,345.67', 'S$999,099.99');
+ -12345.67
+
+-- The '$' sign is not optional.
+> SELECT to_number('5', '$9');
+ Error: the input string does not match the given number format
+
+-- The plus sign is optional, and so are fractional digits.
+> SELECT to_number('$345', 'S$999,099.99');
+ 345.00
+
+-- The format requires at least three digits.
+> SELECT to_number('$45', 'S$999,099.99');
+ Error: the input string does not match the given number format
+
+-- The format requires at least three digits.
+> SELECT to_number('$045', 'S$999,099.99');
+ 45.00
+
+-- MI indicates an optional minus sign at the beginning or end of the input
string.
+> SELECT to_number('1234-', '999999MI');
+ -1234
+
+-- PR indicates optional wrapping angel brakets.
+> SELECT to_number('9', '999PR')
+ 9
+```
+
+#### The `try_to_number` function:
+
+```sql
+-- The '$' sign is not optional.
+> SELECT try_to_number('5', '$9');
+ NULL
+
+-- The format requires at least three digits.
+> SELECT try_to_number('$45', 'S$999,099.99');
+ NULL
+```
+
+#### The `to_char` function:
+
+```sql
+> SELECT to_char(decimal(454), '999');
+ "454"
+
+-- '99' can format digit sequence with a smaller size.
+> SELECT to_char(decimal(1), '99.9');
+ " 1.0"
+
+-- '000' left-pads 0 for digit sequence with a smaller size.
+> SELECT to_char(decimal(45.00), '000.00');
+ "045.00"
+
+> SELECT to_char(decimal(12454), '99,999');
+ "12,454"
+
+-- digit sequence with a larger size leads to '#' sequence.
+> SELECT to_char(decimal(78.12), '$9.99');
+ "$#.##"
+
+-- 'S' can be at the end.
+> SELECT try_to_char(decimal(-12454.8), '99,999.9S');
+ "12,454.8-"
+
+> SELECT try_to_char(decimal(12454.8), 'L99,999.9');
+ Error: cannot resolve 'try_to_char(Decimal(12454.8), 'L99,999.9')' due to
data type mismatch:
+ Unexpected character 'L' found in the format string 'L99,999.9'; the
structure of the format
+ string must match: [MI|S] [$] [0|9|G|,]* [.|D] [0|9]* [$] [PR|MI|S]; line 1
pos 25
+```
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]