[GitHub] [spark] HeartSaVioR opened a new pull request #25957: [SPARK-29281][SQL] Correct example of Like/RLike to test the origin intention correctly

GitBox Fri, 27 Sep 2019 23:54:22 -0700

HeartSaVioR opened a new pull request #25957: [SPARK-29281][SQL] Correct 
example of Like/RLike to test the origin intention correctly
URL: https://github.com/apache/spark/pull/25957
 
 
   ### What changes were proposed in this pull request?
   
   This patch fixes the Like/RLike examples so that they test what they were originally intended to test. The current examples don't account for the default value of spark.sql.parser.escapedStringLiterals, which is false.
   
   Please take a look at the current example of Like:
   
   
https://github.com/apache/spark/blob/d72f39897b00d0bbd7a4db9de281a1256fcf908d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala#L97-L106
   
   If spark.sql.parser.escapedStringLiterals=false (the default), the example should fail because the pattern contains `\U`, which the rule quoted below treats as an invalid escape. However, it doesn't fail.
   
   ```
   The escape character is '\'. If an escape character precedes a special 
symbol or another
   escape character, the following character is matched literally. It is 
invalid to escape
   any other character.
   ``` 
   
   For the query 
   
   ```
   SET spark.sql.parser.escapedStringLiterals=false;
   SELECT '%SystemDrive%\Users\John' like '\%SystemDrive\%\Users%';
   ```
   
   the SQL parser removes every single `\` (not sure that is intended), so the Like expressions are constructed as follows (I've printed the left and right expressions for Like/RLike):
   
   > LIKE - left `%SystemDrive%UsersJohn` / right `\%SystemDrive\%Users%`
   
   which no longer reflects the original intention (see the left side, where the backslashes are gone).
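
As a rough illustration of that parser step, here is a simplified Python model of how a string literal appears to be unescaped when spark.sql.parser.escapedStringLiterals=false. It is inferred only from the debug output in this description and is not Spark's implementation. The name `unescape_sql_literal` is hypothetical, and other escapes (such as `\n` or unicode sequences) are deliberately not modelled.

```python
def unescape_sql_literal(body: str) -> str:
    """Simplified, assumed model of literal unescaping (not Spark's code).

    `body` is the text between the quotes of a SQL string literal.
    """
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            if nxt in "%_":
                # The backslash is kept in front of LIKE wildcards.
                out.append("\\" + nxt)
            else:
                # `\\` collapses to `\`; any other `\x` just loses its backslash.
                out.append(nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Operands of the current example, as written in the SQL text.
print(unescape_sql_literal(r"%SystemDrive%\Users\John"))  # %SystemDrive%UsersJohn
print(unescape_sql_literal(r"\%SystemDrive\%\Users%"))    # \%SystemDrive\%Users%
```

Under this model both operands lose the backslash in front of `Users`, so the comparison never exercises a literal backslash.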
   
   The query below tests the original intention:
   
   ```
   SET spark.sql.parser.escapedStringLiterals=false;
   SELECT '%SystemDrive%\\Users\\John' like '\%SystemDrive\%\\\\Users%';
   ```
   
   > LIKE - left `%SystemDrive%\Users\John` / right `\%SystemDrive\%\\Users%`
   
   Note that `\\\\` is needed in the pattern: the parser reduces it to `\\`, and `StringUtils.escapeLikeRegex` requires `\\` to represent a literal `\` character.
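
The second level of escaping can be sketched in Python as follows. The model follows the rule quoted earlier (the escape character may only precede `%`, `_`, or another escape character), but it is an approximation written for illustration, not the code of `StringUtils.escapeLikeRegex`. The names `like_to_regex` and `like` are hypothetical, and the inputs are the strings as they look after parsing.

```python
import re


def like_to_regex(pattern: str, escape: str = "\\") -> str:
    """Translate a LIKE pattern into a regular expression (assumed model)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape:
            if i + 1 >= len(pattern):
                raise ValueError("pattern must not end with the escape character")
            nxt = pattern[i + 1]
            if nxt not in ("%", "_", escape):
                raise ValueError(f"escape character may not precede {nxt!r}")
            out.append(re.escape(nxt))  # escaped character is matched literally
            i += 2
        elif ch == "%":
            out.append(".*")            # any sequence of characters
            i += 1
        elif ch == "_":
            out.append(".")             # exactly one character
            i += 1
        else:
            out.append(re.escape(ch))
            i += 1
    return "(?s)" + "".join(out)


def like(value: str, pattern: str) -> bool:
    """LIKE must match the whole value, hence fullmatch."""
    return re.fullmatch(like_to_regex(pattern), value) is not None


# Post-parse operands of the corrected query: the backslash is really tested.
print(like(r"%SystemDrive%\Users\John", r"\%SystemDrive\%\\Users%"))  # True
# Post-parse operands of the old query: also True, but no backslash is left.
print(like("%SystemDrive%UsersJohn", r"\%SystemDrive\%Users%"))       # True
```

In this sketch a pattern that still contained `\U` after parsing would raise an error, which is the failure the old example was expected to trigger but never reached.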
   
   Same for RLIKE: 
   
   ```
   SET spark.sql.parser.escapedStringLiterals=true;
   SELECT '%SystemDrive%\Users\John' rlike '%SystemDrive%\\Users.*';
   ```
   
   > RLIKE - left `%SystemDrive%\Users\John` / right `%SystemDrive%\\Users.*`
   
   which is OK, but
   
   ```
   SET spark.sql.parser.escapedStringLiterals=false;
   SELECT '%SystemDrive%\Users\John' rlike '%SystemDrive%\Users.*';
   ```
   
   > RLIKE - left `%SystemDrive%UsersJohn` / right `%SystemDrive%Users.*`
   
   which no longer reflects the original intention.
   
   The query below tests the original intention:
   ```
   SET spark.sql.parser.escapedStringLiterals=false;
   SELECT '%SystemDrive%\\Users\\John' rlike '%SystemDrive%\\\\Users.*';
   ```
   
   > RLIKE - left `%SystemDrive%\Users\John` / right `%SystemDrive%\\Users.*`
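
The RLIKE cases can be checked the same way on the post-parse strings printed above. This is a minimal sketch that uses Python's `re` module as a stand-in for the Java regex engine and assumes RLIKE succeeds when the pattern matches anywhere in the value; `rlike` is a hypothetical helper name.

```python
import re


def rlike(value: str, regex: str) -> bool:
    # Assumption: a match anywhere in the value is enough for RLIKE.
    return re.search(regex, value) is not None


# Corrected example: the regex `\\` matches the literal backslash in the value.
print(rlike(r"%SystemDrive%\Users\John", r"%SystemDrive%\\Users.*"))  # True
# Old example with escapedStringLiterals=false: still True, backslashes are gone.
print(rlike("%SystemDrive%UsersJohn", "%SystemDrive%Users.*"))        # True
# The backslash-free pattern does not match a value that kept its backslashes.
print(rlike(r"%SystemDrive%\Users\John", "%SystemDrive%Users.*"))     # False
```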
   
   ### Why are the changes needed?
   
   Because the examples don't test what they were originally intended to test. Spark now runs automated tests from these examples, so this is not only a documentation issue but also a test issue.
   
   ### Does this PR introduce any user-facing change?
   
   No, as it only corrects documentation.
   
   ### How was this patch tested?
   
   Added debug logging (as shown above) and ran the queries from `spark-sql`.

