This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 39228d4ed60 [SPARK-45361][SQL][DOCS] Describe characters unescaping in string literals
39228d4ed60 is described below

commit 39228d4ed60de18a57915ede12013f415d5471aa
Author: Max Gekk <max.g...@gmail.com>
AuthorDate: Mon Oct 2 23:16:55 2023 -0700

    [SPARK-45361][SQL][DOCS] Describe characters unescaping in string literals
    
    ### What changes were proposed in this pull request?
    In the PR, I propose to update the doc ([link](https://spark.apache.org/docs/latest/sql-ref-literals.html#string-literal)) about string literals, and describe unescaping.
    <img width="1193" alt="Screenshot 2023-09-28 at 21 23 01" src="https://github.com/apache/spark/assets/1580697/7b871ded-50e1-4c93-9d86-60a2ce93f5e7">
    
    ### Why are the changes needed?
    To make clear how string literals are preprocessed. This should make string literals less confusing for Spark SQL users.
    
    ### Does this PR introduce _any_ user-facing change?
    No.
    
    ### How was this patch tested?
    By building docs.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    No.
    
    Closes #43152 from MaxGekk/doc-string-literal.
    
    Authored-by: Max Gekk <max.g...@gmail.com>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 docs/sql-ref-literals.md | 11 +++++++++++
 1 file changed, 11 insertions(+)
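
Note: the escape rules that the diff below adds to docs/sql-ref-literals.md can be modelled in a few lines of Python. This is only a sketch of the documented rules, not Spark's parser code. The function name `unescape`, the `raw` flag (standing in for the `r` prefix), and the handling of a trailing lone backslash are assumptions made for illustration; the sketch covers only the sequences listed in the diff.

```python
# Illustrative model of the documented unescaping rules.
# NOT Spark's implementation; names and edge cases are assumptions.

# Escape sequences that are replaced by a single control character.
ESCAPES = {
    "0": "\u0000",  # unicode character with the code 0
    "b": "\u0008",  # backspace
    "n": "\u000a",  # linefeed
    "r": "\u000d",  # carriage return
    "t": "\u0009",  # horizontal tab
    "Z": "\u001a",  # substitute
}

# Sequences that are left untouched, backslash included: \% and \_
KEEP_BACKSLASH = {"%", "_"}


def unescape(body: str, raw: bool = False) -> str:
    """Apply the documented rules to the text between the quotes.

    `raw=True` models the `r` prefix: the body is returned unchanged.
    """
    if raw:
        return body
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            if nxt in ESCAPES:
                out.append(ESCAPES[nxt])
            elif nxt in KEEP_BACKSLASH:
                out.append("\\" + nxt)
            else:
                # \<other char>: skip the slash, keep the character as is.
                out.append(nxt)
            i += 2
        else:
            # Ordinary character, or a lone trailing backslash
            # (kept as is here; that edge case is an assumption).
            out.append(ch)
            i += 1
    return "".join(out)
```

With this model, `unescape(r"a\tb")` yields `a`, a tab, and `b`, while `unescape(r"a\tb", raw=True)` returns the four characters `a\tb` unchanged, mirroring the difference between a regular and an `r`-prefixed literal.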

diff --git a/docs/sql-ref-literals.md b/docs/sql-ref-literals.md
index 44164d94fdc..e9447af71c5 100644
--- a/docs/sql-ref-literals.md
+++ b/docs/sql-ref-literals.md
@@ -51,6 +51,17 @@ A string literal is used to specify a character string value.
 
     Case insensitive, indicates `RAW`. If a string literal starts with `r` prefix, neither special characters nor unicode characters are escaped by `\`.
 
+The following escape sequences are recognized in regular string literals (without the `r` prefix), and replaced according to the following rules:
+- `\0` -> `\u0000`, unicode character with the code 0;
+- `\b` -> `\u0008`, backspace;
+- `\n` -> `\u000a`, linefeed;
+- `\r` -> `\u000d`, carriage return;
+- `\t` -> `\u0009`, horizontal tab;
+- `\Z` -> `\u001A`, substitute;
+- `\%` -> `\%`;
+- `\_` -> `\_`;
+- `\<other char>` -> `<other char>`, skip the slash and leave the character as is.
+
 #### Examples
 
 ```sql

