[jira] [Commented] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2021-01-13 Thread Noah Kawasaki (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17264452#comment-17264452
 ] 

Noah Kawasaki commented on SPARK-17647:
---

I can also confirm that this issue is not fully resolved. Like what [~swiegleb] 
has shown, escape characters are not fully supported. 

I have tested Spark versions 2.1, 2.2, 2.3, 2.4, and 3.0 and they all 
experience the issue:
{code:java}
# These do not return the expected backslash
SET spark.sql.parser.escapedStringLiterals=false;
SELECT '\\';
> \
(should return \\)

SELECT 'hi\hi';
> hihi
(should return hi\hi) 


# These are correctly escaped
SELECT '\"';
> "

 SELECT '\'';
> '{code}
If I switch this: 
{code:java}
# These now work
SET spark.sql.parser.escapedStringLiterals=true;
SELECT '\\';
> \\

SELECT 'hi\hi';
> hi\hi


# These are now not correctly escaped
SELECT '\"';
> \"
(should return ")

SELECT '\'';
> \'
(should return ' ){code}
 So basically we have to choose:

SET spark.sql.parser.escapedStringLiterals=false; if we want backslashes 
correctly escaped but not other special characters


SET spark.sql.parser.escapedStringLiterals=true; if we want other special 
characters correctly escaped but not backslashes

> SQL LIKE does not handle backslashes correctly
> --
>
> Key: SPARK-17647
> URL: https://issues.apache.org/jira/browse/SPARK-17647
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>Priority: Major
>  Labels: correctness
> Fix For: 2.1.1, 2.2.0
>
>
> Try the following in SQL shell:
> {code}
> select '' like '%\\%';
> {code}
> It returned false, which is wrong.
> cc: [~yhuai] [~joshrosen]
> A false-negative considered previously:
> {code}
> select '' rlike '.*.*';
> {code}
> It returned true, which is correct if we assume that the pattern is treated 
> as a Java string but not raw string.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2019-07-03 Thread Sascha Wiegleb (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877599#comment-16877599
 ] 

Sascha Wiegleb commented on SPARK-17647:


I have tested this behavior with spark 2.4.2 and it seems to be that the bug is 
still there.

With: 
{code:java}
spark.sql.parser.escapedStringLiterals=true
{code}
backslash it is working, but I run into failures with escaping " and ' .
{code:java}
the escape character is not allowed to precede '\"'
{code}
I have tested the behavior with the following special characters and their 
escaping:
{code:java}
_ % \ ' "
{code}
So both modes will not cover the full spectrum of escaping special characters.

 

> SQL LIKE does not handle backslashes correctly
> --
>
> Key: SPARK-17647
> URL: https://issues.apache.org/jira/browse/SPARK-17647
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>Priority: Major
>  Labels: correctness
> Fix For: 2.1.1, 2.2.0
>
>
> Try the following in SQL shell:
> {code}
> select '' like '%\\%';
> {code}
> It returned false, which is wrong.
> cc: [~yhuai] [~joshrosen]
> A false-negative considered previously:
> {code}
> select '' rlike '.*.*';
> {code}
> It returned true, which is correct if we assume that the pattern is treated 
> as a Java string but not raw string.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2017-12-20 Thread Wenchen Fan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16298456#comment-16298456
 ] 

Wenchen Fan commented on SPARK-17647:
-

{{spark.sql("select '' like '%\\%'").show}} actually is {{select '\\' like 
'%\%}} as SQL statement, after the java string escaping.

> SQL LIKE does not handle backslashes correctly
> --
>
> Key: SPARK-17647
> URL: https://issues.apache.org/jira/browse/SPARK-17647
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>  Labels: correctness
> Fix For: 2.1.1, 2.2.0
>
>
> Try the following in SQL shell:
> {code}
> select '' like '%\\%';
> {code}
> It returned false, which is wrong.
> cc: [~yhuai] [~joshrosen]
> A false-negative considered previously:
> {code}
> select '' rlike '.*.*';
> {code}
> It returned true, which is correct if we assume that the pattern is treated 
> as a Java string but not raw string.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2017-12-18 Thread Takeshi Yamamuro (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16296234#comment-16296234
 ] 

Takeshi Yamamuro commented on SPARK-17647:
--

Probably, is it okay to set spark.sql.parser.escapedStringLiterals to true?

> SQL LIKE does not handle backslashes correctly
> --
>
> Key: SPARK-17647
> URL: https://issues.apache.org/jira/browse/SPARK-17647
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>  Labels: correctness
> Fix For: 2.1.1, 2.2.0
>
>
> Try the following in SQL shell:
> {code}
> select '' like '%\\%';
> {code}
> It returned false, which is wrong.
> cc: [~yhuai] [~joshrosen]
> A false-negative considered previously:
> {code}
> select '' rlike '.*.*';
> {code}
> It returned true, which is correct if we assume that the pattern is treated 
> as a Java string but not raw string.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2017-12-18 Thread Takeshi Yamamuro (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295992#comment-16295992
 ] 

Takeshi Yamamuro commented on SPARK-17647:
--

It seems the master still handle ''%\\%' as `(?s).*\Q%\E` instead of  
`(?s).*\Q\\E.*`
.

> SQL LIKE does not handle backslashes correctly
> --
>
> Key: SPARK-17647
> URL: https://issues.apache.org/jira/browse/SPARK-17647
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>  Labels: correctness
> Fix For: 2.1.1, 2.2.0
>
>
> Try the following in SQL shell:
> {code}
> select '' like '%\\%';
> {code}
> It returned false, which is wrong.
> cc: [~yhuai] [~joshrosen]
> A false-negative considered previously:
> {code}
> select '' rlike '.*.*';
> {code}
> It returned true, which is correct if we assume that the pattern is treated 
> as a Java string but not raw string.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2017-12-18 Thread Takeshi Yamamuro (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295976#comment-16295976
 ] 

Takeshi Yamamuro commented on SPARK-17647:
--

I'm looking  into the code and I'll make a follow-up pr.

> SQL LIKE does not handle backslashes correctly
> --
>
> Key: SPARK-17647
> URL: https://issues.apache.org/jira/browse/SPARK-17647
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>  Labels: correctness
> Fix For: 2.1.1, 2.2.0
>
>
> Try the following in SQL shell:
> {code}
> select '' like '%\\%';
> {code}
> It returned false, which is wrong.
> cc: [~yhuai] [~joshrosen]
> A false-negative considered previously:
> {code}
> select '' rlike '.*.*';
> {code}
> It returned true, which is correct if we assume that the pattern is treated 
> as a Java string but not raw string.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2017-12-18 Thread Dong Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295261#comment-16295261
 ] 

Dong Jiang commented on SPARK-17647:


Are we sure this issue is resolved, I tested the following on spark-shell 2.2.0
{code}
Welcome to
    __
 / __/__  ___ _/ /__
_\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
  /_/
 
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_25)
Type in expressions to have them evaluated.
Type :help for more information.

scala> spark.sql("select '' like '%\\%'").show
+--+
|\ LIKE %\%|
+--+
| false|
+--+
{code}
same in spark-sql
{code}
spark-sql> select '' like '%\\%';
false
Time taken: 2.296 seconds, Fetched 1 row(s)
{code}


> SQL LIKE does not handle backslashes correctly
> --
>
> Key: SPARK-17647
> URL: https://issues.apache.org/jira/browse/SPARK-17647
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>  Labels: correctness
> Fix For: 2.1.1, 2.2.0
>
>
> Try the following in SQL shell:
> {code}
> select '' like '%\\%';
> {code}
> It returned false, which is wrong.
> cc: [~yhuai] [~joshrosen]
> A false-negative considered previously:
> {code}
> select '' rlike '.*.*';
> {code}
> It returned true, which is correct if we assume that the pattern is treated 
> as a Java string but not raw string.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2017-04-17 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15971675#comment-15971675
 ] 

Apache Spark commented on SPARK-17647:
--

User 'felixcheung' has created a pull request for this issue:
https://github.com/apache/spark/pull/17663

> SQL LIKE does not handle backslashes correctly
> --
>
> Key: SPARK-17647
> URL: https://issues.apache.org/jira/browse/SPARK-17647
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>  Labels: correctness
> Fix For: 2.1.1, 2.2.0
>
>
> Try the following in SQL shell:
> {code}
> select '' like '%\\%';
> {code}
> It returned false, which is wrong.
> cc: [~yhuai] [~joshrosen]
> A false-negative considered previously:
> {code}
> select '' rlike '.*.*';
> {code}
> It returned true, which is correct if we assume that the pattern is treated 
> as a Java string but not raw string.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2016-12-09 Thread Jakob Odersky (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736107#comment-15736107
 ] 

Jakob Odersky commented on SPARK-17647:
---

I rebased the PR and resolved the conflict. However, there is still the 
incompatibility issue with the sql ANTLR parser. I talk about it in my last 
[two comments | 
https://github.com/apache/spark/pull/15398#issuecomment-255917940 ] and propose 
a few solutions. Any feedback is welcome!

> SQL LIKE does not handle backslashes correctly
> --
>
> Key: SPARK-17647
> URL: https://issues.apache.org/jira/browse/SPARK-17647
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Xiangrui Meng
>  Labels: correctness
>
> Try the following in SQL shell:
> {code}
> select '' like '%\\%';
> {code}
> It returned false, which is wrong.
> cc: [~yhuai] [~joshrosen]
> A false-negative considered previously:
> {code}
> select '' rlike '.*.*';
> {code}
> It returned true, which is correct if we assume that the pattern is treated 
> as a Java string but not raw string.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2016-12-09 Thread Xiangrui Meng (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734678#comment-15734678
 ] 

Xiangrui Meng commented on SPARK-17647:
---

[~r...@databricks.com] [~yhuai] I think this is a critical correctness bug, 
which should be fixed in 2.1. Thoughts?

> SQL LIKE does not handle backslashes correctly
> --
>
> Key: SPARK-17647
> URL: https://issues.apache.org/jira/browse/SPARK-17647
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Xiangrui Meng
>  Labels: correctness
>
> Try the following in SQL shell:
> {code}
> select '' like '%\\%';
> {code}
> It returned false, which is wrong.
> cc: [~yhuai] [~joshrosen]
> A false-negative considered previously:
> {code}
> select '' rlike '.*.*';
> {code}
> It returned true, which is correct if we assume that the pattern is treated 
> as a Java string but not raw string.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2016-10-07 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556767#comment-15556767
 ] 

Apache Spark commented on SPARK-17647:
--

User 'jodersky' has created a pull request for this issue:
https://github.com/apache/spark/pull/15398

> SQL LIKE does not handle backslashes correctly
> --
>
> Key: SPARK-17647
> URL: https://issues.apache.org/jira/browse/SPARK-17647
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Xiangrui Meng
>  Labels: correctness
>
> Try the following in SQL shell:
> {code}
> select '' like '%\\%';
> {code}
> It returned false, which is wrong.
> cc: [~yhuai] [~joshrosen]
> A false-negative considered previously:
> {code}
> select '' rlike '.*.*';
> {code}
> It returned true, which is correct if we assume that the pattern is treated 
> as a Java string but not raw string.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2016-10-06 Thread Jakob Odersky (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553517#comment-15553517
 ] 

Jakob Odersky commented on SPARK-17647:
---

Xiao pointed me to this issue, I can take a look at it

> SQL LIKE does not handle backslashes correctly
> --
>
> Key: SPARK-17647
> URL: https://issues.apache.org/jira/browse/SPARK-17647
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Xiangrui Meng
>  Labels: correctness
>
> Try the following in SQL shell:
> {code}
> select '' like '%\\%';
> {code}
> It returned false, which is wrong.
> cc: [~yhuai] [~joshrosen]
> A false-negative considered previously:
> {code}
> select '' rlike '.*.*';
> {code}
> It returned true, which is correct if we assume that the pattern is treated 
> as a Java string but not raw string.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2016-10-06 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553424#comment-15553424
 ] 

Xiao Li commented on SPARK-17647:
-

Sure, let me check it with my teammates. Thanks!

> SQL LIKE does not handle backslashes correctly
> --
>
> Key: SPARK-17647
> URL: https://issues.apache.org/jira/browse/SPARK-17647
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Xiangrui Meng
>  Labels: correctness
>
> Try the following in SQL shell:
> {code}
> select '' like '%\\%';
> {code}
> It returned false, which is wrong.
> cc: [~yhuai] [~joshrosen]
> A false-negative considered previously:
> {code}
> select '' rlike '.*.*';
> {code}
> It returned true, which is correct if we assume that the pattern is treated 
> as a Java string but not raw string.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2016-10-06 Thread Yin Huai (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553357#comment-15553357
 ] 

Yin Huai commented on SPARK-17647:
--

[~smilegator] Anyone from your side has time to take a look at it?

> SQL LIKE does not handle backslashes correctly
> --
>
> Key: SPARK-17647
> URL: https://issues.apache.org/jira/browse/SPARK-17647
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Xiangrui Meng
>  Labels: correctness
>
> Try the following in SQL shell:
> {code}
> select '' like '%\\%';
> {code}
> It returned false, which is wrong.
> cc: [~yhuai] [~joshrosen]
> A false-negative considered previously:
> {code}
> select '' rlike '.*.*';
> {code}
> It returned true, which is correct if we assume that the pattern is treated 
> as a Java string but not raw string.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2016-09-26 Thread Xiangrui Meng (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15523623#comment-15523623
 ] 

Xiangrui Meng commented on SPARK-17647:
---

Thanks [~joshrosen]! I updated the JIRA description. The LIKE escaping 
behaviors in MySQL/PostgreSQL are documented here:

* MySQL: 
http://dev.mysql.com/doc/refman/5.7/en/string-comparison-functions.html#operator_like
* PostgreSQL: https://www.postgresql.org/docs/8.3/static/functions-matching.html

In particular, MySQL:

{noformat}
Exception: At the end of the pattern string, backslash can be specified as 
“\\”. At the end of the string, backslash stands for itself because there is 
nothing following to escape. Suppose that a table contains the following values:
{noformat}

That explains why MySQL returns true for both `\\` like `` and `\\` like 
`\\`.

> SQL LIKE does not handle backslashes correctly
> --
>
> Key: SPARK-17647
> URL: https://issues.apache.org/jira/browse/SPARK-17647
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Xiangrui Meng
>  Labels: correctness
>
> Try the following in SQL shell:
> {code}
> select '' like '%\\%';
> {code}
> It returned false, which is wrong.
> cc: [~yhuai] [~joshrosen]
> A false-negative considered previously:
> {code}
> select '' rlike '.*.*';
> {code}
> It returned true, which is correct if we assume that the pattern is treated 
> as a Java string but not raw string.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org