[
https://issues.apache.org/jira/browse/SPARK-34102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Noah Kawasaki updated SPARK-34102:
----------------------------------
Labels: escaping filter string-manipulation (was: )
> Spark SQL cannot escape both \ and other special characters
> ------------------------------------------------------------
>
> Key: SPARK-34102
> URL: https://issues.apache.org/jira/browse/SPARK-34102
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.0.2, 2.1.3, 2.2.2, 2.3.0, 2.4.5, 3.0.1
> Reporter: Noah Kawasaki
> Priority: Minor
> Labels: escaping, filter, string-manipulation
>
> Spark string-literal parsing cannot correctly escape both backslashes and other
> special characters at the same time. This is an extension of this issue:
> https://issues.apache.org/jira/browse/SPARK-17647#
>
> The issue is that, depending on how spark.sql.parser.escapedStringLiterals is
> set, you can either get correctly escaped backslashes in a string literal but
> not correctly escaped special characters, or correctly escaped special
> characters but not correctly escaped backslashes. So you have to choose which
> behavior you care about more.
> I have tested Spark versions 2.1, 2.2, 2.3, 2.4, and 3.0, and they all exhibit
> the issue:
> {code:sql}
> # These do not return the expected backslash
> SET spark.sql.parser.escapedStringLiterals=false;
> SELECT '\\';
> > \
> (should return \\)
> SELECT 'hi\hi';
> > hihi
> (should return hi\hi)
> # These are correctly escaped
> SELECT '\"';
> > "
> SELECT '\'';
> > '
> {code}
> If I switch this:
> {code:sql}
> # These now work
> SET spark.sql.parser.escapedStringLiterals=true;
> SELECT '\\';
> > \\
> SELECT 'hi\hi';
> > hi\hi
> # These are now not correctly escaped
> SELECT '\"';
> > \"
> (should return ")
> SELECT '\'';
> > \'
> (should return ')
> {code}
> So basically we have to choose:
> * SET spark.sql.parser.escapedStringLiterals=false; if we want backslashes
>   correctly escaped but not other special characters
> * SET spark.sql.parser.escapedStringLiterals=true; if we want other special
>   characters correctly escaped but not backslashes
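> For completeness, below is a minimal, self-contained Scala sketch that drives
> the same reproduction through a SparkSession instead of the SQL shell. The
> object name, app name, and local master are illustrative assumptions; the
> per-query output should match the values shown in the two blocks above.
> {code:scala}
> import org.apache.spark.sql.SparkSession
>
> // Minimal reproduction of the escapedStringLiterals trade-off described above.
> object Spark34102Repro {
>   def main(args: Array[String]): Unit = {
>     val spark = SparkSession.builder()
>       .master("local[1]")           // illustrative: any running session works
>       .appName("SPARK-34102 repro")
>       .getOrCreate()
>
>     // Triple-quoted Scala strings keep the backslashes exactly as written,
>     // so the SQL text matches the statements in the blocks above.
>     val queries = Seq(
>       """SELECT '\\' AS s""",
>       """SELECT 'hi\hi' AS s""",
>       """SELECT '\"' AS s""",
>       """SELECT '\'' AS s"""
>     )
>
>     for (escaped <- Seq("false", "true")) {
>       spark.conf.set("spark.sql.parser.escapedStringLiterals", escaped)
>       println(s"spark.sql.parser.escapedStringLiterals=$escaped")
>       queries.foreach { q =>
>         // first() returns the single Row produced by each constant query.
>         println(s"  $q  =>  ${spark.sql(q).first().getString(0)}")
>       }
>     }
>
>     spark.stop()
>   }
> }
> {code}
> Running this against the versions listed above should print the same
> mismatched results that the two SQL blocks show for each setting.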