[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-10-23 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r338011227
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -149,10 +153,17 @@ case class Like(left: Expression, right: Expression) 
extends StringRegexExpressi
 } else {
   val pattern = ctx.freshName("pattern")
   val rightStr = ctx.freshName("rightStr")
+  val escapeChar = escapeCharOpt.map { str =>
+if (str.equals("\"") || str.equals("\\")) {
 
 Review comment:
   I remember this is to accept [the 
query](https://github.com/apache/spark/pull/25001#issuecomment-511803526). The 
suggested looks nice and we need comments here.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-08-01 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r309971143
 
 

 ##
 File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/StringUtilsSuite.scala
 ##
 @@ -23,13 +23,30 @@ import org.apache.spark.sql.catalyst.util.StringUtils._
 class StringUtilsSuite extends SparkFunSuite {
 
   test("escapeLikeRegex") {
-assert(escapeLikeRegex("abdef") === 
"(?s)\\Qa\\E\\Qb\\E\\Qd\\E\\Qe\\E\\Qf\\E")
-assert(escapeLikeRegex("a\\__b") === "(?s)\\Qa\\E\\Q_\\E.\\Qb\\E")
-assert(escapeLikeRegex("a_%b") === "(?s)\\Qa\\E..*\\Qb\\E")
-assert(escapeLikeRegex("a%\\%b") === "(?s)\\Qa\\E.*\\Q%\\E\\Qb\\E")
-assert(escapeLikeRegex("a%") === "(?s)\\Qa\\E.*")
-assert(escapeLikeRegex("**") === "(?s)\\Q*\\E\\Q*\\E")
-assert(escapeLikeRegex("a_b") === "(?s)\\Qa\\E.\\Qb\\E")
+val expectedEscapedStrArr = 
Array("(?s)\\Qa\\E\\Qb\\E\\Qd\\E\\Qe\\E\\Qf\\E",
 
 Review comment:
   Can you assign these expected strings to each independent variable?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-08-01 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r309970475
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,33 +83,39 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
-
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a optional string. The default escape character is the '\' . 
If an escape character
+  precedes a special symbol or another escape character, the following 
character is matched
+  literally. It is invalid to escape any other character. You can 
specify '' as escape
+  character so that disables the escape mechanism, which makes it 
impossible to turn off
+  the special meaning of underscore and percent signs in the pattern.
 
 Review comment:
   For now, can you fix code to throw an exception for the case?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-29 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308523852
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,33 +83,39 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
-
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a optional string. The default escape character is the '\' . 
If an escape character
+  precedes a special symbol or another escape character, the following 
character is matched
+  literally. It is invalid to escape any other character. You can 
specify '' as escape
+  character so that disables the escape mechanism, which makes it 
impossible to turn off
+  the special meaning of underscore and percent signs in the pattern.
 
 Review comment:
   I like the standard behivour, but we should wait for the other review's 
comment. cc: @dongjoon-hyun @gatorsmile 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-29 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308523852
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,33 +83,39 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
-
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a optional string. The default escape character is the '\' . 
If an escape character
+  precedes a special symbol or another escape character, the following 
character is matched
+  literally. It is invalid to escape any other character. You can 
specify '' as escape
+  character so that disables the escape mechanism, which makes it 
impossible to turn off
+  the special meaning of underscore and percent signs in the pattern.
 
 Review comment:
   I like the behivour, but we should wait for the other review's comment. cc: 
@dongjoon-hyun @gatorsmile 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-29 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308523229
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,33 +83,39 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
-
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a optional string. The default escape character is the '\' . 
If an escape character
+  precedes a special symbol or another escape character, the following 
character is matched
+  literally. It is invalid to escape any other character. You can 
specify '' as escape
+  character so that disables the escape mechanism, which makes it 
impossible to turn off
+  the special meaning of underscore and percent signs in the pattern.
 
 Review comment:
   If we follow `1)`, we should throw an exception for the empty case?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-29 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308085976
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,33 +83,39 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
-
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a optional string. The default escape character is the '\' . 
If an escape character
+  precedes a special symbol or another escape character, the following 
character is matched
+  literally. It is invalid to escape any other character. You can 
specify '' as escape
+  character so that disables the escape mechanism, which makes it 
impossible to turn off
+  the special meaning of underscore and percent signs in the pattern.
 
 Review comment:
   oh... that's a weird case... the standard says something about this?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-29 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308085976
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,33 +83,39 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
-
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a optional string. The default escape character is the '\' . 
If an escape character
+  precedes a special symbol or another escape character, the following 
character is matched
+  literally. It is invalid to escape any other character. You can 
specify '' as escape
+  character so that disables the escape mechanism, which makes it 
impossible to turn off
+  the special meaning of underscore and percent signs in the pattern.
 
 Review comment:
   oh... the standard says something about this?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-29 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308081254
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,33 +83,39 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
-
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a optional string. The default escape character is the '\' . 
If an escape character
+  precedes a special symbol or another escape character, the following 
character is matched
+  literally. It is invalid to escape any other character. You can 
specify '' as escape
+  character so that disables the escape mechanism, which makes it 
impossible to turn off
+  the special meaning of underscore and percent signs in the pattern.
 
 Review comment:
   Before that, I just want to make sure that that behaivour is consistent 
among the other DBMSs. (I'm not sure we need to accept that empty case.)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-29 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308081254
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,33 +83,39 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
-
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a optional string. The default escape character is the '\' . 
If an escape character
+  precedes a special symbol or another escape character, the following 
character is matched
+  literally. It is invalid to escape any other character. You can 
specify '' as escape
+  character so that disables the escape mechanism, which makes it 
impossible to turn off
+  the special meaning of underscore and percent signs in the pattern.
 
 Review comment:
   Before that, I just want to make sure that that behaivour is consistent 
among the other DBMSs.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-28 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308059472
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,33 +83,39 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
-
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a optional string. The default escape character is the '\' . 
If an escape character
+  precedes a special symbol or another escape character, the following 
character is matched
+  literally. It is invalid to escape any other character. You can 
specify '' as escape
+  character so that disables the escape mechanism, which makes it 
impossible to turn off
+  the special meaning of underscore and percent signs in the pattern.
 
 Review comment:
   Could you show us these behivours? (I'm not sure that the current one is ok)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-28 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308059472
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,33 +83,39 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
-
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a optional string. The default escape character is the '\' . 
If an escape character
+  precedes a special symbol or another escape character, the following 
character is matched
+  literally. It is invalid to escape any other character. You can 
specify '' as escape
+  character so that disables the escape mechanism, which makes it 
impossible to turn off
+  the special meaning of underscore and percent signs in the pattern.
 
 Review comment:
   Could you show us these behivours?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-28 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308057910
 
 

 ##
 File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/RegexpExpressionsSuite.scala
 ##
 @@ -118,6 +118,295 @@ class RegexpExpressionsSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 checkLiteralRow("""%SystemDrive%\Users\John""" like _, 
"""\%SystemDrive\%\\Users%""", true)
   }
 
+  test("LIKE Pattern ESCAPE '/'") {
+
 
 Review comment:
   These tests below are super duplicated between each other. IMHO we don't 
need these over-killing tests and more simpler tests are ok to me here (Basic 
tests already have been done in 
https://github.com/apache/spark/pull/25001/files#diff-c36b16be93e0c63a4e3fad37bd904b29R25)
   Can you write tests like this?
   ```
   Seq("/", "#", "\"", ...).foreach { escapeChar =>
 test(s"LIKE Pattern ESCAPE '$escapeChar'") {
   // More simpler tests here...
 }
   }
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-28 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308056865
 
 

 ##
 File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/StringUtilsSuite.scala
 ##
 @@ -23,13 +23,27 @@ import org.apache.spark.sql.catalyst.util.StringUtils._
 class StringUtilsSuite extends SparkFunSuite {
 
   test("escapeLikeRegex") {
-assert(escapeLikeRegex("abdef") === 
"(?s)\\Qa\\E\\Qb\\E\\Qd\\E\\Qe\\E\\Qf\\E")
-assert(escapeLikeRegex("a\\__b") === "(?s)\\Qa\\E\\Q_\\E.\\Qb\\E")
-assert(escapeLikeRegex("a_%b") === "(?s)\\Qa\\E..*\\Qb\\E")
-assert(escapeLikeRegex("a%\\%b") === "(?s)\\Qa\\E.*\\Q%\\E\\Qb\\E")
-assert(escapeLikeRegex("a%") === "(?s)\\Qa\\E.*")
-assert(escapeLikeRegex("**") === "(?s)\\Q*\\E\\Q*\\E")
-assert(escapeLikeRegex("a_b") === "(?s)\\Qa\\E.\\Qb\\E")
+assert(escapeLikeRegex("abdef", "\\") === 
"(?s)\\Qa\\E\\Qb\\E\\Qd\\E\\Qe\\E\\Qf\\E")
+assert(escapeLikeRegex("abdef", "/") === 
"(?s)\\Qa\\E\\Qb\\E\\Qd\\E\\Qe\\E\\Qf\\E")
+assert(escapeLikeRegex("abdef", "\"") === 
"(?s)\\Qa\\E\\Qb\\E\\Qd\\E\\Qe\\E\\Qf\\E")
+assert(escapeLikeRegex("a\\__b", "\\") === "(?s)\\Qa\\E\\Q_\\E.\\Qb\\E")
+assert(escapeLikeRegex("a/__b", "/") === "(?s)\\Qa\\E\\Q_\\E.\\Qb\\E")
+assert(escapeLikeRegex("a\"__b", "\"") === "(?s)\\Qa\\E\\Q_\\E.\\Qb\\E")
 
 Review comment:
   nit: To make it easy to see, how about this format?
   ```
   val expectedEscapedStr = "(?s)\\Qa\\E\\Q_\\E.\\Qb\\E"
   assert(escapeLikeRegex("a\\__b", "\\") === expectedEscapedStr)
   assert(escapeLikeRegex("a/__b", "/") === expectedEscapedStr)
   assert(escapeLikeRegex("a\"__b", "\"") === expectedEscapedStr)
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-28 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308054916
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala
 ##
 @@ -97,7 +97,9 @@ package object dsl {
   case _ => In(expr, list)
 }
 
-def like(other: Expression): Expression = Like(expr, other)
+def like(other: Expression): Expression = Like(expr, other, None)
+def like(other: Expression, escapeChar: String): Expression =
+  Like(expr, other, Some(escapeChar))
 
 Review comment:
   ```
   def like(other: Expression): Expression = Like(expr, other, None)
   def like(other: Expression, escapeChar: String): Expression =
 Like(expr, other, Some(escapeChar))
   ```
   ->
   ```
   def like(other: Expression, escapeCharOption: Option[String] = None): 
Expression =
   Like(expr, other, escapeCharOption)
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-28 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308054143
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,33 +83,39 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
-
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a optional string. The default escape character is the '\' . 
If an escape character
+  precedes a special symbol or another escape character, the following 
character is matched
+  literally. It is invalid to escape any other character. You can 
specify '' as escape
+  character so that disables the escape mechanism, which makes it 
impossible to turn off
+  the special meaning of underscore and percent signs in the pattern.
 
 Review comment:
   > `You can specify '' as escape character so that disables the escape 
mechanism...`
   
   Is this behaviour the same with the other DBMSs?
   
   >  I adds some description of '' specified as escape character, we don't 
need it?
   
   Yeah, I feel its a corner case.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-28 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308054143
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,33 +83,39 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
-
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a optional string. The default escape character is the '\' . 
If an escape character
+  precedes a special symbol or another escape character, the following 
character is matched
+  literally. It is invalid to escape any other character. You can 
specify '' as escape
+  character so that disables the escape mechanism, which makes it 
impossible to turn off
+  the special meaning of underscore and percent signs in the pattern.
 
 Review comment:
   > `You can specify '' as escape character so that disables the escape 
mechanism...`
   
   Is this behaviour the same with the other DBMSs?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-28 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308054143
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,33 +83,39 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
-
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a optional string. The default escape character is the '\' . 
If an escape character
+  precedes a special symbol or another escape character, the following 
character is matched
+  literally. It is invalid to escape any other character. You can 
specify '' as escape
+  character so that disables the escape mechanism, which makes it 
impossible to turn off
+  the special meaning of underscore and percent signs in the pattern.
 
 Review comment:
   > `You can specify '' as escape character so that disables the escape 
mechanism...`
   Is this behaviour the same with the other DBMSs?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-28 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308053733
 
 

 ##
 File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala
 ##
 @@ -175,7 +175,13 @@ class ExpressionParserSuite extends AnalysisTest {
 
   test("like expressions") {
 assertEqual("a like 'pattern%'", 'a like "pattern%")
+assertEqual("a like 'pattern%' escape '#'", 'a.like("pattern%", "#"))
+assertEqual("a like 'pattern%' escape '\"'", 'a.like("pattern%", "\""))
+intercept("a like 'pattern%' escape '##'", "Escape string must be empty or 
one character.")
 assertEqual("a not like 'pattern%'", !('a like "pattern%"))
+assertEqual("a not like 'pattern%' escape '#'", !('a.like("pattern%", 
"#")))
+assertEqual("a not like 'pattern%' escape '\"'", !('a.like("pattern%", 
"\"")))
+intercept("a not like 'pattern%' escape '\"/'", "Escape string must be 
empty or one character.")
 assertEqual("a rlike 'pattern%'", 'a rlike "pattern%")
 
 Review comment:
   Can you move these new assertions into a new test unit?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-28 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308052836
 
 

 ##
 File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/RegexpExpressionsSuite.scala
 ##
 @@ -118,6 +118,295 @@ class RegexpExpressionsSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 checkLiteralRow("""%SystemDrive%\Users\John""" like _, 
"""\%SystemDrive\%\\Users%""", true)
   }
 
+  test("LIKE Pattern ESCAPE '/'") {
+
+// null handling
+checkLiteralRow(Literal.create(null, StringType).like(_, "/"), "a", null)
+checkEvaluation(
+  Literal.create("a", StringType).like(Literal.create(null, StringType), 
"/"), null)
+checkEvaluation(
+  Literal.create(null, StringType).like(Literal.create(null, StringType), 
"/"), null)
+checkEvaluation(
+  Literal.create("a", StringType).like(NonFoldableLiteral.create("a", 
StringType), "/"), true)
+checkEvaluation(
+  Literal.create("a", StringType).like(NonFoldableLiteral.create(null, 
StringType), "/"), null)
+checkEvaluation(
+  Literal.create(null, StringType).like(NonFoldableLiteral.create("a", 
StringType), "/"), null)
+checkEvaluation(
+  Literal.create(null, StringType).like(NonFoldableLiteral.create(null, 
StringType), "/"), null)
+
+// simple patterns
+checkLiteralRow("abdef" like(_, "/"), "abdef", true)
+checkLiteralRow("a_%b" like(_, "/"), "a/__b", true)
+checkLiteralRow("addb" like(_, "/"), "a_%b", true)
+checkLiteralRow("addb" like(_, "/"), "a/__b", false)
+checkLiteralRow("addb" like(_, "/"), "a%/%b", false)
+checkLiteralRow("a_%b" like(_, "/"), "a%/%b", true)
+checkLiteralRow("addb" like(_, "/"), "a%", true)
+checkLiteralRow("addb" like(_, "/"), "**", false)
+checkLiteralRow("abc" like(_, "/"), "a%", true)
+checkLiteralRow("abc"  like(_, "/"), "b%", false)
+checkLiteralRow("abc"  like(_, "/"), "bc%", false)
+checkLiteralRow("a\nb" like(_, "/"), "a_b", true)
+checkLiteralRow("ab" like(_, "/"), "a%b", true)
+checkLiteralRow("a\nb" like(_, "/"), "a%b", true)
+
+// empty input
+checkLiteralRow("" like(_, "/"), "", true)
+checkLiteralRow("a" like(_, "/"), "", false)
+checkLiteralRow("" like(_, "/"), "a", false)
+
+// SI-17647 double-escaping backslash
+checkLiteralRow("""""" like(_, "/"), """%//%""", true)
+checkLiteralRow("""%%""" like(_, "/"), """%%""", true)
+checkLiteralRow("""/__""" like(_, "/"), """///__""", true)
+checkLiteralRow("""///__""" like(_, "/"), """%//%/%""", false)
+checkLiteralRow("""_///%""" like(_, "/"), """%//""", false)
+
+// unicode
+// scalastyle:off nonascii
+checkLiteralRow("a\u20ACa" like(_, "/"), "_\u20AC_", true)
+checkLiteralRow("a€a" like(_, "/"), "_€_", true)
+checkLiteralRow("a€a" like(_, "/"), "_\u20AC_", true)
+checkLiteralRow("a\u20ACa" like(_, "/"), "_€_", true)
+// scalastyle:on nonascii
+
+// invalid escaping
+val invalidEscape = intercept[AnalysisException] {
+  evaluateWithoutCodegen("""a""" like("""/a""", "/"))
+}
+assert(invalidEscape.getMessage.contains("pattern"))
+
+val endEscape = intercept[AnalysisException] {
+  evaluateWithoutCodegen("""a""" like("""a/""", "/"))
+}
+assert(endEscape.getMessage.contains("pattern"))
+
+// case
+checkLiteralRow("A" like(_, "/"), "a%", false)
+checkLiteralRow("a" like(_, "/"), "A%", false)
+checkLiteralRow("AaA" like(_, "/"), "_a_", true)
+
+// example
+checkLiteralRow(
+  """%SystemDrive%/Users/John""" like(_, "/"), 
"""/%SystemDrive/%//Users%""", true)
+  }
+
+  test("LIKE Pattern ESCAPE '#'") {
+
+// null handling
+checkLiteralRow(Literal.create(null, StringType).like(_, "#"), "a", null)
+checkEvaluation(
+  Literal.create("a", StringType).like(Literal.create(null, StringType), 
"#"), null)
+checkEvaluation(
+  Literal.create(null, StringType).like(Literal.create(null, StringType), 
"#"), null)
+checkEvaluation(
+  Literal.create("a", StringType).like(NonFoldableLiteral.create("a", 
StringType), "#"), true)
+checkEvaluation(
+  Literal.create("a", StringType).like(NonFoldableLiteral.create(null, 
StringType), "#"), null)
+checkEvaluation(
+  Literal.create(null, StringType).like(NonFoldableLiteral.create("a", 
StringType), "#"), null)
+checkEvaluation(
+  Literal.create(null, StringType).like(NonFoldableLiteral.create(null, 
StringType), "#"), null)
+
+// simple patterns
+checkLiteralRow("abdef" like(_, "#"), "abdef", true)
+checkLiteralRow("a_%b" like(_, "#"), "a#__b", true)
+checkLiteralRow("addb" like(_, "#"), "a_%b", true)
+checkLiteralRow("addb" like(_, "#"), "a#__b", false)
+checkLiteralRow("addb" like(_, "#"), "a%#%b", false)
+checkLiteralRow("a_%b" like(_, "#"), "a%#%b", true)
+checkLiteralRow("addb" 

[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-28 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308052738
 
 

 ##
 File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/RegexpExpressionsSuite.scala
 ##
 @@ -118,6 +118,295 @@ class RegexpExpressionsSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 checkLiteralRow("""%SystemDrive%\Users\John""" like _, 
"""\%SystemDrive\%\\Users%""", true)
   }
 
+  test("LIKE Pattern ESCAPE '/'") {
+
 
 Review comment:
   nit: remove a blank.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-28 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308052815
 
 

 ##
 File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/RegexpExpressionsSuite.scala
 ##
 @@ -118,6 +118,295 @@ class RegexpExpressionsSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 checkLiteralRow("""%SystemDrive%\Users\John""" like _, 
"""\%SystemDrive\%\\Users%""", true)
   }
 
+  test("LIKE Pattern ESCAPE '/'") {
+
+// null handling
+checkLiteralRow(Literal.create(null, StringType).like(_, "/"), "a", null)
+checkEvaluation(
+  Literal.create("a", StringType).like(Literal.create(null, StringType), 
"/"), null)
+checkEvaluation(
+  Literal.create(null, StringType).like(Literal.create(null, StringType), 
"/"), null)
+checkEvaluation(
+  Literal.create("a", StringType).like(NonFoldableLiteral.create("a", 
StringType), "/"), true)
+checkEvaluation(
+  Literal.create("a", StringType).like(NonFoldableLiteral.create(null, 
StringType), "/"), null)
+checkEvaluation(
+  Literal.create(null, StringType).like(NonFoldableLiteral.create("a", 
StringType), "/"), null)
+checkEvaluation(
+  Literal.create(null, StringType).like(NonFoldableLiteral.create(null, 
StringType), "/"), null)
+
+// simple patterns
+checkLiteralRow("abdef" like(_, "/"), "abdef", true)
+checkLiteralRow("a_%b" like(_, "/"), "a/__b", true)
+checkLiteralRow("addb" like(_, "/"), "a_%b", true)
+checkLiteralRow("addb" like(_, "/"), "a/__b", false)
+checkLiteralRow("addb" like(_, "/"), "a%/%b", false)
+checkLiteralRow("a_%b" like(_, "/"), "a%/%b", true)
+checkLiteralRow("addb" like(_, "/"), "a%", true)
+checkLiteralRow("addb" like(_, "/"), "**", false)
+checkLiteralRow("abc" like(_, "/"), "a%", true)
+checkLiteralRow("abc"  like(_, "/"), "b%", false)
+checkLiteralRow("abc"  like(_, "/"), "bc%", false)
+checkLiteralRow("a\nb" like(_, "/"), "a_b", true)
+checkLiteralRow("ab" like(_, "/"), "a%b", true)
+checkLiteralRow("a\nb" like(_, "/"), "a%b", true)
+
+// empty input
+checkLiteralRow("" like(_, "/"), "", true)
+checkLiteralRow("a" like(_, "/"), "", false)
+checkLiteralRow("" like(_, "/"), "a", false)
+
+// SI-17647 double-escaping backslash
+checkLiteralRow("""""" like(_, "/"), """%//%""", true)
+checkLiteralRow("""%%""" like(_, "/"), """%%""", true)
+checkLiteralRow("""/__""" like(_, "/"), """///__""", true)
+checkLiteralRow("""///__""" like(_, "/"), """%//%/%""", false)
+checkLiteralRow("""_///%""" like(_, "/"), """%//""", false)
+
+// unicode
+// scalastyle:off nonascii
+checkLiteralRow("a\u20ACa" like(_, "/"), "_\u20AC_", true)
+checkLiteralRow("a€a" like(_, "/"), "_€_", true)
+checkLiteralRow("a€a" like(_, "/"), "_\u20AC_", true)
+checkLiteralRow("a\u20ACa" like(_, "/"), "_€_", true)
+// scalastyle:on nonascii
+
+// invalid escaping
+val invalidEscape = intercept[AnalysisException] {
+  evaluateWithoutCodegen("""a""" like("""/a""", "/"))
+}
+assert(invalidEscape.getMessage.contains("pattern"))
+
+val endEscape = intercept[AnalysisException] {
+  evaluateWithoutCodegen("""a""" like("""a/""", "/"))
+}
+assert(endEscape.getMessage.contains("pattern"))
+
+// case
+checkLiteralRow("A" like(_, "/"), "a%", false)
+checkLiteralRow("a" like(_, "/"), "A%", false)
+checkLiteralRow("AaA" like(_, "/"), "_a_", true)
+
+// example
+checkLiteralRow(
+  """%SystemDrive%/Users/John""" like(_, "/"), 
"""/%SystemDrive/%//Users%""", true)
+  }
+
+  test("LIKE Pattern ESCAPE '#'") {
+
 
 Review comment:
   nit: remove a blank.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-28 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308052627
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
 ##
 @@ -39,9 +39,16 @@ object StringUtils extends Logging {
* throw an [[AnalysisException]].
*
* @param pattern the SQL pattern to convert
+   * @param escapeStr the escape string contains one character.
* @return the equivalent Java regular expression of the pattern
*/
-  def escapeLikeRegex(pattern: String): String = {
+  def escapeLikeRegex(pattern: String, escapeStr: String): String = {
+assert(escapeStr.length <= 1)
+val escapeChar = if (escapeStr.length == 0) {
+  ""
+} else {
+  escapeStr.charAt(0)
+}
 
 Review comment:
   Can you move this check to `AstBuilder`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-28 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308052627
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
 ##
 @@ -39,9 +39,16 @@ object StringUtils extends Logging {
* throw an [[AnalysisException]].
*
* @param pattern the SQL pattern to convert
+   * @param escapeStr the escape string contains one character.
* @return the equivalent Java regular expression of the pattern
*/
-  def escapeLikeRegex(pattern: String): String = {
+  def escapeLikeRegex(pattern: String, escapeStr: String): String = {
+assert(escapeStr.length <= 1)
+val escapeChar = if (escapeStr.length == 0) {
+  ""
+} else {
+  escapeStr.charAt(0)
+}
 
 Review comment:
   Can you move this logic to `AstBuilder`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-28 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r308052538
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,33 +83,39 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
-
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a optional string. The default escape character is the '\' . 
If an escape character
+  precedes a special symbol or another escape character, the following 
character is matched
+  literally. It is invalid to escape any other character. You can 
specify '' as escape
+  character so that disables the escape mechanism, which makes it 
impossible to turn off
+  the special meaning of underscore and percent signs in the pattern.
 
 Review comment:
   I like the previous one.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-22 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r306098357
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
 ##
 @@ -39,27 +39,36 @@ object StringUtils extends Logging {
* throw an [[AnalysisException]].
*
* @param pattern the SQL pattern to convert
+   * @param escapeStr the escape string contains one character.
* @return the equivalent Java regular expression of the pattern
*/
-  def escapeLikeRegex(pattern: String): String = {
+  def escapeLikeRegex(pattern: String, escapeStr: String): String = {
+val escapeChar = escapeStr.charAt(0)
 val in = pattern.toIterator
 val out = new StringBuilder()
 
 def fail(message: String) = throw new AnalysisException(
   s"the pattern '$pattern' is invalid, $message")
 
 while (in.hasNext) {
-  in.next match {
-case '\\' if in.hasNext =>
+  val cur = in.next
+  if (cur == escapeChar) {
 
 Review comment:
   You can write like this?
   ```
 in.next match {
   case c if c == escapeChar && in.hasNext =>
   ```
   but, I think this rewriting has no harm. So, I leave it to other reviewers.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-18 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r305181626
 
 

 ##
 File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala
 ##
 @@ -484,6 +484,8 @@ class SQLQueryTestSuite extends QueryTest with 
SharedSQLContext {
 """.stripMargin)
   .load(testFile("test-data/postgresql/tenk.data"))
   .createOrReplaceTempView("tenk1")
+
+  session.sql("select * from tenk1 where stringu1 like stringu2 escape 
'\"'")
 
 Review comment:
   This is not correct and I meant we need test s in 
`resources/sql-tests/inputs/`. Or, to check if the parser works well for the 
new token `ESCAPE`, its ok to add tests in `ExpressionParserSuite`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-18 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r305180881
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ##
 @@ -1240,7 +1240,14 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   case SqlBaseParser.IN =>
 invertIfNotDefined(In(e, ctx.expression.asScala.map(expression)))
   case SqlBaseParser.LIKE =>
-invertIfNotDefined(Like(e, expression(ctx.pattern)))
+val escapeOpt = Option(ctx.escapeChar).map(string).map { str =>
+  if (str.length > 1) {
+throw new ParseException("Invalid escape string." +
+  "Escape string must be empty or one character.", ctx)
 
 Review comment:
   Any test for this exception?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-18 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r305178415
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
 ##
 @@ -39,27 +39,36 @@ object StringUtils extends Logging {
* throw an [[AnalysisException]].
*
* @param pattern the SQL pattern to convert
+   * @param escapeStr the escape string contains one character.
* @return the equivalent Java regular expression of the pattern
*/
-  def escapeLikeRegex(pattern: String): String = {
+  def escapeLikeRegex(pattern: String, escapeStr: String): String = {
+val escapeChar = escapeStr.charAt(0)
 val in = pattern.toIterator
 val out = new StringBuilder()
 
 def fail(message: String) = throw new AnalysisException(
   s"the pattern '$pattern' is invalid, $message")
 
 while (in.hasNext) {
-  in.next match {
-case '\\' if in.hasNext =>
+  val cur = in.next
+  if (cur == escapeChar) {
 
 Review comment:
   Any reason to rewrite this part from pattern-matching to if-then?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-18 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r305172239
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
 ##
 @@ -39,27 +39,36 @@ object StringUtils extends Logging {
* throw an [[AnalysisException]].
*
* @param pattern the SQL pattern to convert
+   * @param escapeStr the escape string contains one character.
* @return the equivalent Java regular expression of the pattern
*/
-  def escapeLikeRegex(pattern: String): String = {
+  def escapeLikeRegex(pattern: String, escapeStr: String): String = {
+val escapeChar = escapeStr.charAt(0)
 
 Review comment:
   this might need to `assert(escapeStr.length == 1)`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-18 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r305171884
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,33 +83,38 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
-
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a optional string. The default escape character is the '\' . 
If an escape character
+  precedes a special symbol or another escape character, the following 
character is matched
+  literally. It is invalid to escape any other character.
   """,
   examples = """
 Examples:
   > SELECT '%SystemDrive%\Users\John' _FUNC_ '\%SystemDrive\%\\Users%'
   true
+  > SELECT '%SystemDrive%/Users/John' _FUNC_ '/%SystemDrive/%//Users%' 
ESCAPE '/'
+  true
   """,
   note = """
 Use RLIKE to match with standard regular expressions.
   """,
   since = "1.0.0")
-case class Like(left: Expression, right: Expression) extends 
StringRegexExpression {
+case class Like(left: Expression, right: Expression, escapeCharOpt: 
Option[String] = None)
+  extends StringRegexExpression {
+
+  private lazy val escapeStr = escapeCharOpt.getOrElse("\\")
 
-  override def escape(v: String): String = StringUtils.escapeLikeRegex(v)
+  override def escape(v: String): String = StringUtils.escapeLikeRegex(v, 
escapeStr)
 
   override def matches(regex: Pattern, str: String): Boolean = 
regex.matcher(str).matches()
 
-  override def toString: String = s"$left LIKE $right"
+  override def toString: String = s"$left LIKE $right" +
+escapeCharOpt.map(str => s" ESCAPE $str").getOrElse("")
 
 Review comment:
   `str` has the same handling with line 150-156 for special chars?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-18 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r305171610
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,33 +83,38 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
-
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a optional string. The default escape character is the '\' . 
If an escape character
+  precedes a special symbol or another escape character, the following 
character is matched
+  literally. It is invalid to escape any other character.
   """,
   examples = """
 Examples:
   > SELECT '%SystemDrive%\Users\John' _FUNC_ '\%SystemDrive\%\\Users%'
   true
+  > SELECT '%SystemDrive%/Users/John' _FUNC_ '/%SystemDrive/%//Users%' 
ESCAPE '/'
+  true
   """,
   note = """
 Use RLIKE to match with standard regular expressions.
   """,
   since = "1.0.0")
-case class Like(left: Expression, right: Expression) extends 
StringRegexExpression {
+case class Like(left: Expression, right: Expression, escapeCharOpt: 
Option[String] = None)
+  extends StringRegexExpression {
+
+  private lazy val escapeStr = escapeCharOpt.getOrElse("\\")
 
-  override def escape(v: String): String = StringUtils.escapeLikeRegex(v)
+  override def escape(v: String): String = StringUtils.escapeLikeRegex(v, 
escapeStr)
 
 Review comment:
   nit: How about this? Then, remove the `escapeStr` variable.
   ```
 override def escape(v: String): String =
   StringUtils.escapeLikeRegex(v, escapeCharOpt.getOrElse("\\"))
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-16 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r303892528
 
 

 ##
 File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/RegexpExpressionsSuite.scala
 ##
 @@ -118,6 +118,154 @@ class RegexpExpressionsSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 checkLiteralRow("""%SystemDrive%\Users\John""" like _, 
"""\%SystemDrive\%\\Users%""", true)
   }
 
+  test("LIKE Pattern ESCAPE '/'") {
+
+// null handling
+checkLiteralRow(Literal.create(null, StringType).like(_, "/"), "a", null)
+checkEvaluation(
+  Literal.create("a", StringType).like(Literal.create(null, StringType), 
"/"), null)
+checkEvaluation(
+  Literal.create(null, StringType).like(Literal.create(null, StringType), 
"/"), null)
+checkEvaluation(
+  Literal.create("a", StringType).like(NonFoldableLiteral.create("a", 
StringType), "/"), true)
+checkEvaluation(
+  Literal.create("a", StringType).like(NonFoldableLiteral.create(null, 
StringType), "/"), null)
+checkEvaluation(
+  Literal.create(null, StringType).like(NonFoldableLiteral.create("a", 
StringType), "/"), null)
+checkEvaluation(
+  Literal.create(null, StringType).like(NonFoldableLiteral.create(null, 
StringType), "/"), null)
+
+// simple patterns
+checkLiteralRow("abdef" like(_, "/"), "abdef", true)
+checkLiteralRow("a_%b" like(_, "/"), "a/__b", true)
+checkLiteralRow("addb" like(_, "/"), "a_%b", true)
+checkLiteralRow("addb" like(_, "/"), "a/__b", false)
+checkLiteralRow("addb" like(_, "/"), "a%/%b", false)
+checkLiteralRow("a_%b" like(_, "/"), "a%/%b", true)
+checkLiteralRow("addb" like(_, "/"), "a%", true)
+checkLiteralRow("addb" like(_, "/"), "**", false)
+checkLiteralRow("abc" like(_, "/"), "a%", true)
+checkLiteralRow("abc"  like(_, "/"), "b%", false)
+checkLiteralRow("abc"  like(_, "/"), "bc%", false)
+checkLiteralRow("a\nb" like(_, "/"), "a_b", true)
+checkLiteralRow("ab" like(_, "/"), "a%b", true)
+checkLiteralRow("a\nb" like(_, "/"), "a%b", true)
+
+// empty input
+checkLiteralRow("" like(_, "/"), "", true)
+checkLiteralRow("a" like(_, "/"), "", false)
+checkLiteralRow("" like(_, "/"), "a", false)
+
+// SI-17647 double-escaping backslash
+checkLiteralRow("""""" like(_, "/"), """%//%""", true)
+checkLiteralRow("""%%""" like(_, "/"), """%%""", true)
+checkLiteralRow("""/__""" like(_, "/"), """///__""", true)
+checkLiteralRow("""///__""" like(_, "/"), """%//%/%""", false)
+checkLiteralRow("""_///%""" like(_, "/"), """%//""", false)
+
+// unicode
+// scalastyle:off nonascii
+checkLiteralRow("a\u20ACa" like(_, "/"), "_\u20AC_", true)
+checkLiteralRow("a€a" like(_, "/"), "_€_", true)
+checkLiteralRow("a€a" like(_, "/"), "_\u20AC_", true)
+checkLiteralRow("a\u20ACa" like(_, "/"), "_€_", true)
+// scalastyle:on nonascii
+
+// invalid escaping
+val invalidEscape = intercept[AnalysisException] {
+  evaluateWithoutCodegen("""a""" like("""/a""", "/"))
+}
+assert(invalidEscape.getMessage.contains("pattern"))
 
 Review comment:
   "pattern"? 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-16 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r303878494
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,33 +83,39 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
-
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a optional string. The default escape character is the '\' 
but a different one
+  can be selected by using the ESCAPE clause. If an escape character 
precedes a special
+  symbol or another escape character, the following character is 
matched literally. It is
+  invalid to escape any other character.
   """,
   examples = """
 Examples:
   > SELECT '%SystemDrive%\Users\John' _FUNC_ '\%SystemDrive\%\\Users%'
   true
+  > SELECT '%SystemDrive%/Users/John' _FUNC_ '/%SystemDrive/%//Users%' 
ESCAPE '/'
+  true
   """,
   note = """
 Use RLIKE to match with standard regular expressions.
   """,
   since = "1.0.0")
-case class Like(left: Expression, right: Expression) extends 
StringRegexExpression {
+case class Like(left: Expression, right: Expression, escapeCharOpt: 
Option[String] = None)
+  extends StringRegexExpression {
+
+  private lazy val escapeStr = escapeCharOpt.getOrElse("\\")
 
-  override def escape(v: String): String = StringUtils.escapeLikeRegex(v)
+  override def escape(v: String): String = StringUtils.escapeLikeRegex(v, 
escapeStr)
 
   override def matches(regex: Pattern, str: String): Boolean = 
regex.matcher(str).matches()
 
-  override def toString: String = s"$left LIKE $right"
+  override def toString: String = s"$left LIKE $right" +
+escapeCharOpt.map(str => s" ESCAPE $str")
 
 Review comment:
   ```
   override def toString: String = s"$left LIKE $right" + escapeCharOpt.map(str 
=> s" ESCAPE $str")
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-16 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r303890645
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -142,10 +148,11 @@ case class Like(left: Expression, right: Expression) 
extends StringRegexExpressi
 } else {
   val pattern = ctx.freshName("pattern")
   val rightStr = ctx.freshName("rightStr")
+  val escapeChar = escapeCharOpt.getOrElse("""""")
 
 Review comment:
   Have you checked if the other meta characters (e.g., ") work correctly?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-16 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r303876036
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -70,8 +70,8 @@ abstract class StringRegexExpression extends BinaryExpression
  * Simple RegEx pattern matching function
  */
 @ExpressionDescription(
-  usage = "str _FUNC_ pattern - Returns true if str matches pattern, " +
-"null if any arguments are null, false otherwise.",
+  usage = "str _FUNC_ pattern[ escape] - Returns true if str matches `pattern` 
with `escape`" +
 
 Review comment:
   `[ escape]` -> `[ ESCAPE escape]`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-16 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r303877847
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,33 +83,39 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
-
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a optional string. The default escape character is the '\' 
but a different one
+  can be selected by using the ESCAPE clause. If an escape character 
precedes a special
+  symbol or another escape character, the following character is 
matched literally. It is
+  invalid to escape any other character.
 
 Review comment:
   How about this?
   ```
 * escape - an optional string. The default escape character is the '\' 
. If an escape character
   precedes a special symbol or another escape character, the 
following character is matched
   literally. It is invalid to escape any other character.
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-10 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r302360877
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -103,13 +103,16 @@ abstract class StringRegexExpression extends 
BinaryExpression
 Use RLIKE to match with standard regular expressions.
   """,
   since = "1.0.0")
-case class Like(left: Expression, right: Expression) extends 
StringRegexExpression {
+case class Like(left: Expression, right: Expression, escapeCharOpt: 
Option[String] = None)
+  extends StringRegexExpression {
+
+  private lazy val escapeStr = escapeCharOpt.getOrElse("\\")
 
-  override def escape(v: String): String = StringUtils.escapeLikeRegex(v)
+  override def escape(v: String): String = StringUtils.escapeLikeRegex(v, 
escapeStr)
 
   override def matches(regex: Pattern, str: String): Boolean = 
regex.matcher(str).matches()
 
-  override def toString: String = s"$left LIKE $right"
+  override def toString: String = s"$left LIKE $right ESCAPE $escapeStr"
 
 Review comment:
   This is an optional feature, so I think its better to hide this if 
escapeCharOpt = None.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-10 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r302302687
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -70,7 +70,7 @@ abstract class StringRegexExpression extends BinaryExpression
  * Simple RegEx pattern matching function
  */
 @ExpressionDescription(
-  usage = "str _FUNC_ pattern - Returns true if str matches pattern, " +
+  usage = "str _FUNC_ pattern escape - Returns true if str matches pattern 
with escape, " +
 
 Review comment:
   Is `escape` optional?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-10 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r302307011
 
 

 ##
 File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/RegexpExpressionsSuite.scala
 ##
 @@ -51,71 +51,123 @@ class RegexpExpressionsSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 
 // null handling
 checkLiteralRow(Literal.create(null, StringType).like(_), "a", null)
+checkLiteralRow(Literal.create(null, StringType).like(_, "/"), "a", null)
 
 Review comment:
   plz drop these tests and add a new test case for various escape characters?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-10 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r302304973
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -70,7 +70,7 @@ abstract class StringRegexExpression extends BinaryExpression
  * Simple RegEx pattern matching function
  */
 @ExpressionDescription(
-  usage = "str _FUNC_ pattern - Returns true if str matches pattern, " +
+  usage = "str _FUNC_ pattern escape - Returns true if str matches pattern 
with escape, " +
 
 Review comment:
   Also, you need to add an example below... (could you check carefully again 
by referencing the other examples...?)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-10 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r302305938
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -142,10 +145,11 @@ case class Like(left: Expression, right: Expression) 
extends StringRegexExpressi
 } else {
   val pattern = ctx.freshName("pattern")
   val rightStr = ctx.freshName("rightStr")
+  val escapeChar = escapeCharOpt.getOrElse("")
 
 Review comment:
   plz use `"""`


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-10 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r302306449
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -103,13 +106,16 @@ abstract class StringRegexExpression extends 
BinaryExpression
 Use RLIKE to match with standard regular expressions.
   """,
   since = "1.0.0")
-case class Like(left: Expression, right: Expression) extends 
StringRegexExpression {
+case class Like(left: Expression, right: Expression, escapeCharOpt: 
Option[String] = None)
+  extends StringRegexExpression {
 
-  override def escape(v: String): String = StringUtils.escapeLikeRegex(v)
+  private lazy val escapeStr = escapeCharOpt.getOrElse("\\")
 
 Review comment:
   duplicated in the line 148?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-10 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r302305300
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -103,13 +103,16 @@ abstract class StringRegexExpression extends 
BinaryExpression
 Use RLIKE to match with standard regular expressions.
   """,
   since = "1.0.0")
-case class Like(left: Expression, right: Expression) extends 
StringRegexExpression {
+case class Like(left: Expression, right: Expression, escapeCharOpt: 
Option[String] = None)
+  extends StringRegexExpression {
+
+  private lazy val escapeStr = escapeCharOpt.getOrElse("\\")
 
-  override def escape(v: String): String = StringUtils.escapeLikeRegex(v)
+  override def escape(v: String): String = StringUtils.escapeLikeRegex(v, 
escapeStr)
 
   override def matches(regex: Pattern, str: String): Boolean = 
regex.matcher(str).matches()
 
-  override def toString: String = s"$left LIKE $right"
+  override def toString: String = s"$left LIKE $right ESCAPE $escapeStr"
 
 Review comment:
   `escapeStr` always printed if `escapeCharOpt` = None?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-07 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r300851495
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -103,13 +106,16 @@ abstract class StringRegexExpression extends 
BinaryExpression
 Use RLIKE to match with standard regular expressions.
   """,
   since = "1.0.0")
-case class Like(left: Expression, right: Expression) extends 
StringRegexExpression {
+case class Like(left: Expression, right: Expression, escapeCharOpt: 
Option[String] = None)
+  extends StringRegexExpression {
 
-  override def escape(v: String): String = StringUtils.escapeLikeRegex(v)
+  private lazy val escapeStr = escapeCharOpt.getOrElse("\\")
 
 Review comment:
   we need this lazy?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-07 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r300851663
 
 

 ##
 File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/RegexpExpressionsSuite.scala
 ##
 @@ -51,71 +51,123 @@ class RegexpExpressionsSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 
 // null handling
 checkLiteralRow(Literal.create(null, StringType).like(_), "a", null)
+checkLiteralRow(Literal.create(null, StringType).like(_, "/"), "a", null)
 
 Review comment:
   Why we need these tests?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-07 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r300851464
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,16 +83,19 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
+  The default escape character is '\'. If an escape character precedes 
a special symbol
+  or another escape character, the following character is matched 
literally. It is
+  invalid to escape any other character.
 
 Review comment:
   How about just moving this section into the description for `escape`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-07 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r300850906
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala
 ##
 @@ -97,7 +97,9 @@ package object dsl {
   case _ => In(expr, list)
 }
 
-def like(other: Expression): Expression = Like(expr, other)
+def like(other: Expression): Expression = Like(expr, other, None)
+def like(other: Expression, escapeChar: String): Expression =
+  Like(expr, other, Option(escapeChar))
 
 Review comment:
   Option->Some


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-07 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r300851598
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ##
 @@ -1240,7 +1240,15 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   case SqlBaseParser.IN =>
 invertIfNotDefined(In(e, ctx.expression.asScala.map(expression)))
   case SqlBaseParser.LIKE =>
-invertIfNotDefined(Like(e, expression(ctx.pattern)))
+val escapeOpt = Option(ctx.escapeChar).map{ str =>
+  if (str.getText.length() > 1) {
 
 Review comment:
   nit: length() -> length


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-07 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r300851675
 
 

 ##
 File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/StringUtilsSuite.scala
 ##
 @@ -23,13 +23,20 @@ import org.apache.spark.sql.catalyst.util.StringUtils._
 class StringUtilsSuite extends SparkFunSuite {
 
   test("escapeLikeRegex") {
-assert(escapeLikeRegex("abdef") === 
"(?s)\\Qa\\E\\Qb\\E\\Qd\\E\\Qe\\E\\Qf\\E")
-assert(escapeLikeRegex("a\\__b") === "(?s)\\Qa\\E\\Q_\\E.\\Qb\\E")
-assert(escapeLikeRegex("a_%b") === "(?s)\\Qa\\E..*\\Qb\\E")
-assert(escapeLikeRegex("a%\\%b") === "(?s)\\Qa\\E.*\\Q%\\E\\Qb\\E")
-assert(escapeLikeRegex("a%") === "(?s)\\Qa\\E.*")
-assert(escapeLikeRegex("**") === "(?s)\\Q*\\E\\Q*\\E")
-assert(escapeLikeRegex("a_b") === "(?s)\\Qa\\E.\\Qb\\E")
+assert(escapeLikeRegex("abdef", "\\") === 
"(?s)\\Qa\\E\\Qb\\E\\Qd\\E\\Qe\\E\\Qf\\E")
 
 Review comment:
   Could you add a new test case instead of adding them in the existing one?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-07 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r300851549
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ##
 @@ -1240,7 +1240,15 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   case SqlBaseParser.IN =>
 invertIfNotDefined(In(e, ctx.expression.asScala.map(expression)))
   case SqlBaseParser.LIKE =>
-invertIfNotDefined(Like(e, expression(ctx.pattern)))
+val escapeOpt = Option(ctx.escapeChar).map{ str =>
 
 Review comment:
   nit: `.map { str =>`


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-07 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r300851471
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,16 +83,19 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
+  The default escape character is '\'. If an escape character precedes 
a special symbol
+  or another escape character, the following character is matched 
literally. It is
+  invalid to escape any other character.
 
 Review comment:
   Also update the `usage` section.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-04 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r300519376
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
 ##
 @@ -65,6 +65,38 @@ object StringUtils extends Logging {
 "(?s)" + out.result() // (?s) enables dotall mode, causing "." to match 
new lines
   }
 
+  def escapeLikeRegex(pattern: String, escapeStr: String): String = {
 
 Review comment:
   Why we need the two version of `escapeLikeRegex`? Can we merge them?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-04 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r300518433
 
 

 ##
 File path: 
sql/core/src/test/resources/sql-tests/results/typeCoercion/native/implicitTypeCasts.sql.out
 ##
 @@ -317,38 +317,38 @@ struct
 -- !query 39
 select 1 like '%' FROM t
 -- !query 39 schema
-struct
+struct
 
 Review comment:
   Could you remove this change by overriding some functions, e.g., 
simpleString?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-04 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r300517662
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -83,16 +83,19 @@ abstract class StringRegexExpression extends 
BinaryExpression
   % matches zero or more characters in the input (similar to .* in 
posix regular
   expressions)
 
-  The escape character is '\'. If an escape character precedes a 
special symbol or another
-  escape character, the following character is matched literally. It 
is invalid to escape
-  any other character.
+  The default escape character is '\'. If an escape character precedes 
a special symbol
+  or another escape character, the following character is matched 
literally. It is
+  invalid to escape any other character.
 
   Since Spark 2.0, string literals are unescaped in our SQL parser. 
For example, in order
   to match "\abc", the pattern should be "\\abc".
 
   When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, 
it fallbacks
   to Spark 1.6 behavior regarding string literal parsing. For example, 
if the config is
   enabled, the pattern to match "\abc" should be "\abc".
+  * escape - a string expression. The default escape character is the '\' 
but a different one
 
 Review comment:
   a string expression?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-04 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r300516911
 
 

 ##
 File path: 
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
 ##
 @@ -662,7 +662,8 @@ predicate
 : NOT? kind=BETWEEN lower=valueExpression AND upper=valueExpression
 | NOT? kind=IN '(' expression (',' expression)* ')'
 | NOT? kind=IN '(' query ')'
-| NOT? kind=(RLIKE | LIKE) pattern=valueExpression
+| NOT? kind=RLIKE pattern=valueExpression
+| NOT? kind=LIKE pattern=valueExpression (ESCAPE 
escapeChar=valueExpression)?
 
 Review comment:
   `valueExpression` -> `STRING`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-04 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r300517831
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -103,13 +106,20 @@ abstract class StringRegexExpression extends 
BinaryExpression
 Use RLIKE to match with standard regular expressions.
   """,
   since = "1.0.0")
-case class Like(left: Expression, right: Expression) extends 
StringRegexExpression {
+case class Like(
+  left: Expression,
+  right: Expression,
+  escapeCharOpt: Option[String] = None) extends StringRegexExpression {
 
 Review comment:
   ```
   case class Like(left: Expression, right: Expression, escapeCharOpt: 
Option[String] = None)
 extends StringRegexExpression {
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-04 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r300517573
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ##
 @@ -1240,7 +1240,9 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   case SqlBaseParser.IN =>
 invertIfNotDefined(In(e, ctx.expression.asScala.map(expression)))
   case SqlBaseParser.LIKE =>
-invertIfNotDefined(Like(e, expression(ctx.pattern)))
+val escapeOpt = Option(ctx.escapeChar).map(_.getText)
+val like = Like(e, expression(ctx.pattern), escapeOpt)
+invertIfNotDefined(like)
 
 Review comment:
   Needs to check if `escapeOpt` has a valid char for an escape ?
   In PostgreSQL;
   ```
   postgres=# select * from t where a like 'a%c' escape 'invalid';
   ERROR:  invalid escape string
   HINT:  Escape string must be empty or one character.
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-04 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r300519634
 
 

 ##
 File path: sql/core/src/main/scala/org/apache/spark/sql/Column.scala
 ##
 @@ -822,6 +822,16 @@ class Column(val expr: Expression) extends Logging {
*/
   def like(literal: String): Column = withExpr { Like(expr, lit(literal).expr) 
}
 
+  /**
+   * SQL like expression. Returns a boolean column based on a SQL LIKE ESCAPE 
match.
+   *
+   * @group expr_ops
+   * @since 3.0.0
+   */
+  def like(literal: String, escapeStr: String): Column = withExpr {
+Like(expr, lit(literal).expr, Option(escapeStr))
+  }
+
 
 Review comment:
   I'm neutral on adding this. Is that enough to just use the SQL interface for 
this escape? 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-04 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r300519115
 
 

 ##
 File path: docs/sql-keywords.md
 ##
 @@ -103,6 +103,7 @@ Below is a list of all the keywords in Spark SQL.
   
DROPnon-reservednon-reservedreserved
   ELSEreservednon-reservedreserved
   ENDreservednon-reservedreserved
+  
ESCAPEnon-reservednon-reservedreserved
 
 Review comment:
   Ur, IMHO this should be reserved even in spark.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-03 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r300204709
 
 

 ##
 File path: docs/sql-keywords.md
 ##
 @@ -103,6 +103,7 @@ Below is a list of all the keywords in Spark SQL.
   
DROPnon-reservednon-reservedreserved
   ELSEreservednon-reservedreserved
   ENDreservednon-reservedreserved
+  
ESCAPEnon-reservednon-reservedreserved
 
 Review comment:
   Also, update `TableIdentifierParserSuite`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

2019-07-03 Thread GitBox
maropu commented on a change in pull request #25001: [SPARK-28083][SQL] Support 
LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#discussion_r300204451
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ##
 @@ -65,6 +65,59 @@ abstract class StringRegexExpression extends 
BinaryExpression
   override def sql: String = s"${left.sql} 
${prettyName.toUpperCase(Locale.ROOT)} ${right.sql}"
 }
 
+abstract class StringRegexV2Expression extends TernaryExpression
 
 Review comment:
   like this? 
https://github.com/apache/spark/compare/master...maropu:SPARK-28083
   ```
   /* 066 */   String filter_rightStr_0 = scan_value_1.toString();
   /* 067 */   java.util.regex.Pattern filter_pattern_0 = 
java.util.regex.Pattern.compile(org.apache.spark.sql.catalyst.util.StringUtils.escapeLikeRegex(filter_rightStr_0,
 "\"));
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org