HeartSaVioR edited a comment on issue #25942: [SPARK-21914][SQL][TESTS] Check 
results of expression examples
URL: https://github.com/apache/spark/pull/25942#issuecomment-536159134
 
 
   I came across the same observation and found a different issue. Please take 
a look at the example for `LIKE`:
   
   
https://github.com/apache/spark/blob/d72f39897b00d0bbd7a4db9de281a1256fcf908d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala#L97-L106
   
   With spark.sql.parser.escapedStringLiterals=false (the default), this example 
should fail because the pattern contains `\U`, but it doesn't fail. The 
documented escape rule is:
   
   ```
   The escape character is '\'. If an escape character precedes a special 
symbol or another
   escape character, the following character is matched literally. It is 
invalid to escape
   any other character.
   ``` 
   
   For the query 
   
   ```
   SET spark.sql.parser.escapedStringLiterals=false;
   SELECT '%SystemDrive%\Users\John' like '\%SystemDrive\%\Users%';
   ```
   
   The SQL parser removes each single `\` (not sure whether that is intended), so 
the expressions of `Like` are constructed as follows:
   
   > LIKE - left `%SystemDrive%UsersJohn` / right `\%SystemDrive\%Users%`
   
   which no longer reflect the original intention.
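
One way to see what happens to the two literals is to model the unescaping step outside Spark. The Python below is only a sketch: `unescape_sql_literal` is a hypothetical helper, not Spark's parser. It covers just the backslash behavior observed in this comment (`\%` kept as-is, `\\` reduced to `\`, any other `\x` reduced to `x`), assumes `\_` is treated like `\%`, and ignores unicode, octal, and control-character escapes.

```python
def unescape_sql_literal(body: str, escaped_string_literals: bool = False) -> str:
    """Hypothetical, simplified model of how a string literal body is unescaped."""
    if escaped_string_literals:
        # Assumption: with the flag enabled, backslashes are kept verbatim.
        return body
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            if nxt in "%_":
                # Kept together with the backslash so LIKE still sees the escape.
                out.append("\\" + nxt)
            else:
                # `\\` becomes `\`; `\U`, `\J`, ... simply lose the backslash.
                out.append(nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Raw strings stand in for the text between the SQL quotes.
print(unescape_sql_literal(r"%SystemDrive%\Users\John"))  # -> %SystemDrive%UsersJohn
print(unescape_sql_literal(r"\%SystemDrive\%\Users%"))    # -> \%SystemDrive\%Users%
```

Under this model the `\U` never reaches `LIKE`, which would explain why the example does not fail.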
   
   The query below tests the original intention:
   
   ```
   SET spark.sql.parser.escapedStringLiterals=false;
   SELECT '%SystemDrive%\\Users\\John' like '\%SystemDrive\%\\\\Users%';
   ```
   
   > LIKE - left `%SystemDrive%\Users\John` / right `\%SystemDrive\%\\Users%`
   
   Note that `\\\\` is needed in the pattern because `StringUtils.escapeLikeRegex` 
requires `\\` to represent a literal `\` character.
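
The escape rule quoted earlier can be sketched as a small LIKE-to-regex translation. This is a hypothetical Python model written from that rule, not the code of `StringUtils.escapeLikeRegex`; the names `like_to_regex` and `like` are made up for illustration, and the inputs are the strings as they look after the parser has already unescaped them.

```python
import re


def like_to_regex(pattern: str) -> str:
    """Hypothetical model: translate a LIKE pattern into a regular expression."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            if i + 1 >= len(pattern):
                raise ValueError("pattern must not end with the escape character")
            nxt = pattern[i + 1]
            if nxt not in "%_\\":
                # "It is invalid to escape any other character."
                raise ValueError(f"escape character may not precede {nxt!r}")
            out.append(re.escape(nxt))  # escaped character is matched literally
            i += 2
        elif ch == "%":
            out.append(".*")  # any sequence of characters
            i += 1
        elif ch == "_":
            out.append(".")  # exactly one character
            i += 1
        else:
            out.append(re.escape(ch))
            i += 1
    return "(?s)" + "".join(out)


def like(value: str, pattern: str) -> bool:
    # LIKE has to match the whole value, hence fullmatch.
    return re.fullmatch(like_to_regex(pattern), value) is not None


# Pattern as LIKE sees it after unescaping: `\\` stands for one literal backslash.
print(like(r"%SystemDrive%\Users\John", r"\%SystemDrive\%\\Users%"))  # -> True
```

In this model a pattern that still contained `\U` would raise an error, which is the failure expected from the original example.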
   
   Same for RLIKE: 
   
   ```
   SET spark.sql.parser.escapedStringLiterals=true;
   SELECT '%SystemDrive%\Users\John' rlike '%SystemDrive%\\Users.*';
   ```
   
   > RLIKE - left `%SystemDrive%\Users\John` / right `%SystemDrive%\\Users.*`
   
   which is OK, but
   
   ```
   SET spark.sql.parser.escapedStringLiterals=false;
   SELECT '%SystemDrive%\Users\John' rlike '%SystemDrive%\Users.*';
   ```
   
   > RLIKE - left `%SystemDrive%UsersJohn` / right `%SystemDrive%Users.*`
   
   which no longer has the original intention.
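
A quick check with Python's `re` module, used here only as a stand-in for the Java regex engine behind `RLIKE`, shows why this matters: the stripped pattern still matches the stripped value, but it would not match a real Windows path. The helper name `rlike` is hypothetical.

```python
import re


def rlike(value: str, pattern: str) -> bool:
    # Assumption: RLIKE succeeds when the regex matches anywhere in the value.
    return re.search(pattern, value) is not None


real_path = r"%SystemDrive%\Users\John"    # what the example means to test
stripped_value = "%SystemDrive%UsersJohn"  # what the parser actually produces
stripped_pattern = "%SystemDrive%Users.*"

print(rlike(stripped_value, stripped_pattern))         # -> True, but no backslash is involved
print(rlike(real_path, stripped_pattern))              # -> False
print(rlike(real_path, r"%SystemDrive%\\Users.*"))     # -> True, regex `\\` matches one `\`
```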
   
   The query below tests the original intention:
   ```
   SET spark.sql.parser.escapedStringLiterals=false;
   SELECT '%SystemDrive%\\Users\\John' rlike '%SystemDrive%\\\\Users.*';
   ```
   
   > RLIKE - left `%SystemDrive%\Users\John` / right `%SystemDrive%\\Users.*`
   
   I'll raise a new patch to correct the examples.
