Hi Aitozi,

I think improving support for backslash escapes in string literals is a
good idea. However, I lean a bit more toward the Postgres approach[1],
which is more standard-compliant. PG recognizes backslash escapes in a
string constant only when the letter E (upper or lower case) is written
just before the opening single quote, e.g., E'foo\n'.
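
To illustrate, here is a small sketch of how the two kinds of constants
behave in PostgreSQL. It assumes standard_conforming_strings is on (the
default in recent PG versions); the comments describe the behavior I
would expect, not verified output:

```sql
-- Regular string constant: the backslash is an ordinary character,
-- so this is the four characters a, \, n, b.
SELECT 'a\nb';

-- Escape string constant: \n is interpreted as a newline,
-- so this is a, a line feed, then b.
SELECT E'a\nb';
```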

Recognizing backslash escapes in regular string constants as well as in
escape string constants would not be backward compatible in Flink, and
that behavior is also deprecated in PG.

In addition, Flink already supports Unicode escape string constants,
written with U& before the opening quote[2], which work in a similar way
to backslash escape strings.
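
For instance, the newline split from the original question could probably
be written with a Unicode escape today. This is an untested sketch based
on the U&'...' syntax shown in [2]; col is the column from the quoted
query, and 000A is the code point for line feed:

```sql
-- Flink SQL: U&'\000A' should be a one-character string holding a line feed
SELECT SPLIT_INDEX(col, U&'\000A', 0);
```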

Best,
Jark

[1]: https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-CONSTANTS
[2]: https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/overview/

On Sat, 4 Mar 2023 at 23:31, Aitozi <gjying1...@gmail.com> wrote:

> Hi,
>   I encountered a problem when using string literals in Flink. Currently,
> Flink escapes the string literal during codegen, so for the query
> below:
>
> SELECT 'a\nb'; it will print => a\nb
>
> then for the query
>
> SELECT SPLIT_INDEX(col, '\n', 0);
>
> col cannot be split on the newline character. If we want to split on a
> newline, we have to use
>
> SELECT SPLIT_INDEX(col, '
> ', 0)
>
> or
>
> SELECT SPLIT_INDEX(col, CHR(10), 0)
>
> These workarounds are not very intuitive. Some other databases support
> "Special Character Escape Sequences"[1].
>
> In this way, we can directly use
> SELECT SPLIT_INDEX(col, '\n', 0); for the query.
>
> I know this is not standard behavior in ANSI SQL. I'm opening this thread
> to gather opinions from the community.
>
> [1]:
>
> https://dev.mysql.com/doc/refman/8.0/en/string-literals.html#character-escape-sequences
>
> Thanks,
> Aitozi
>
