[jira] [Commented] (SPARK-24753) bad backslah parsing in SQL statements
[ https://issues.apache.org/jira/browse/SPARK-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16539490#comment-16539490 ]

Takeshi Yamamuro commented on SPARK-24753:
------------------------------------------

I closed this as 'not a problem'. Feel free to open a PR for the doc fix (you don't need to file a JIRA for tiny doc/comment fixes).

> bad backslah parsing in SQL statements
> --------------------------------------
>
>                 Key: SPARK-24753
>                 URL: https://issues.apache.org/jira/browse/SPARK-24753
>             Project: Spark
>          Issue Type: Documentation
>          Components: SQL
>    Affects Versions: 2.3.0
>         Environment:
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/  '_/
>    /__ / .__/\_,_/_/ /_/\_\   version 2.3.0
>       /_/
> Using Python version 2.7.12 (default, Jul 15 2016 11:23:12)
>            Reporter: mathieu longtin
>            Priority: Trivial
>
> When putting backslashes in SQL code, you need to double them (or rather double double them).
> Code in Python but I verified the problem is the same in Scala.
> Line [3] should return the line, and line 4 shouldn't.
>
> {code:java}
> In [1]: df = spark.createDataFrame([("abc def ghi",)], schema=["s"])
> In [2]: df.filter(df.s.rlike('\\bdef\\b')).show()
> +-----------+
> |          s|
> +-----------+
> |abc def ghi|
> +-----------+
> In [3]: df.filter("s rlike '\\bdef\\b'").show()
> +---+
> |  s|
> +---+
> +---+
> In [4]: df.filter("s rlike '\\\\bdef\\\\b'").show()
> +-----------+
> |          s|
> +-----------+
> |abc def ghi|
> +-----------+
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-24753) bad backslah parsing in SQL statements
[ https://issues.apache.org/jira/browse/SPARK-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16539470#comment-16539470 ]

Hyukjin Kwon commented on SPARK-24753:
--------------------------------------

If the example is wrong, please go ahead and open a PR after testing it out and filling in the PR description.

> bad backslah parsing in SQL statements
> --------------------------------------
>
>                 Key: SPARK-24753
>                 URL: https://issues.apache.org/jira/browse/SPARK-24753
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: mathieu longtin
>            Priority: Minor
[jira] [Commented] (SPARK-24753) bad backslah parsing in SQL statements
[ https://issues.apache.org/jira/browse/SPARK-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16539023#comment-16539023 ]

mathieu longtin commented on SPARK-24753:
-----------------------------------------

Thanks for the response. Yes, it does work with escapedStringLiterals.

However, this is inconsistent behavior. In the doc example (https://spark.apache.org/docs/2.3.0/api/sql/index.html#rlike):

{code:java}
SELECT '%SystemDrive%\Users\John' rlike '%SystemDrive%\\Users.*'
{code}

The examples are totally wrong. In fact, they produce an error. To reproduce the example, using the *spark-sql* command with _escapedStringLiterals=False_, I need this:

{code:java}
When spark.sql.parser.escapedStringLiterals is disabled (default).
> SELECT '%SystemDrive%\\Users\\John' rlike '%SystemDrive%\\\\Users.*'
true
{code}

Notice the double and quadruple backslashes. Somehow, the right side of rlike gets decoded, and then passed to the rlike function, which then decodes it again.

BTW, from spark-sql:

{code:java}
> SELECT '%SystemDrive%\Users\John' ;
%SystemDrive%UsersJohn
{code}

Oops, the backslashes get swallowed.
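The double decoding described in the comment above can be modeled without a Spark session. The following is only a rough sketch: `unescape_sql_literal` is a hypothetical helper, not Spark's parser, and it assumes that with `spark.sql.parser.escapedStringLiterals=false` a backslash simply makes the next character literal. Python's `re` module stands in for the Java regex engine that Spark uses, which is also an assumption.

```python
import re


def unescape_sql_literal(body: str) -> str:
    """Simplified, hypothetical model of how the body of a SQL string literal
    is decoded when spark.sql.parser.escapedStringLiterals is false.

    Assumption: a backslash makes the next character literal. The real parser
    also maps sequences such as \\n and \\t to control characters, which this
    sketch ignores.
    """
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            out.append(body[i + 1])  # drop the backslash, keep the next char
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)


# SQL text as typed at the spark-sql prompt (raw strings keep every backslash).
value_sql = r"%SystemDrive%\\Users\\John"
pattern_sql = r"%SystemDrive%\\\\Users.*"

value = unescape_sql_literal(value_sql)      # first decoding: SQL literal -> string
pattern = unescape_sql_literal(pattern_sql)  # first decoding: SQL literal -> regex source
matched = re.search(pattern, value) is not None  # second decoding: regex engine

print(value)
print(pattern)
print(matched)
```

Under this model the value keeps one backslash per path separator, the pattern keeps two, and the regex engine reads those two as one literal backslash. That is why four are needed in the SQL text.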
[jira] [Commented] (SPARK-24753) bad backslah parsing in SQL statements
[ https://issues.apache.org/jira/browse/SPARK-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16536495#comment-16536495 ]

Hyukjin Kwon commented on SPARK-24753:
--------------------------------------

gentle ping [~mathieulongtin]
[jira] [Commented] (SPARK-24753) bad backslah parsing in SQL statements
[ https://issues.apache.org/jira/browse/SPARK-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16535660#comment-16535660 ]

Takeshi Yamamuro commented on SPARK-24753:
------------------------------------------

Did you try `spark.sql.parser.escapedStringLiterals=true`?
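To illustrate what the setting suggested above changes, here is a small sketch of how a regex would have to be written inside the SQL text in each mode. `to_sql_string_literal` is a hypothetical helper made up for this illustration, not a Spark API. It assumes that the default mode needs every backslash doubled and that `escapedStringLiterals=true` passes the literal through unchanged, and it does not handle single quotes.

```python
def to_sql_string_literal(text: str, escaped_string_literals: bool = False) -> str:
    """Hypothetical helper: render `text` as a single-quoted SQL string literal.

    escaped_string_literals mirrors spark.sql.parser.escapedStringLiterals:
    - False (default): assume the parser decodes backslash escapes, so each
      backslash in `text` must be doubled in the SQL text.
    - True: assume the parser leaves backslashes alone, so `text` is used as is.
    """
    if "'" in text:
        raise ValueError("single quotes are not handled in this sketch")
    if escaped_string_literals:
        return "'" + text + "'"
    return "'" + text.replace("\\", "\\\\") + "'"


regex = r"\bdef\b"  # the regex the engine should finally see

default_sql = "s rlike " + to_sql_string_literal(regex)
raw_sql = "s rlike " + to_sql_string_literal(regex, escaped_string_literals=True)

print(default_sql)
print(raw_sql)
```

Note that these are the SQL texts Spark receives. Writing the default-mode text as an ordinary (non-raw) Python string literal doubles the backslashes once more, which gives the "double double" count from the report.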
[jira] [Commented] (SPARK-24753) bad backslah parsing in SQL statements
[ https://issues.apache.org/jira/browse/SPARK-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16535659#comment-16535659 ]

Takeshi Yamamuro commented on SPARK-24753:
------------------------------------------

Did you try `spark.sql.parser.quotedRegexColumnNames=true`?