mathieu longtin created SPARK-24753:
---------------------------------------
Summary: bad backslash parsing in SQL statements
Key: SPARK-24753
URL: https://issues.apache.org/jira/browse/SPARK-24753
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.3.0
Environment: Spark version 2.3.0
Using Python version 2.7.12 (default, Jul 15 2016 11:23:12)
Reporter: mathieu longtin
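When putting backslashes in SQL code, you need to double them (or rather, double
them twice: once for the host-language string literal and once more for the SQL
parser).
The code below is in Python, but I verified that the problem is the same in Scala.
Statement [3] should return the row, and statement [4] should not.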
{code:java}
In [1]: df = spark.createDataFrame([("abc def ghi",)], schema=["s"])
In [2]: df.filter(df.s.rlike('\\bdef\\b')).show()
+-----------+
|          s|
+-----------+
|abc def ghi|
+-----------+
In [3]: df.filter("s rlike '\\bdef\\b'").show()
+---+
|  s|
+---+
+---+
In [4]: df.filter("s rlike '\\\\bdef\\\\b'").show()
+-----------+
|          s|
+-----------+
|abc def ghi|
+-----------+
{code}
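To make the two escaping layers explicit, here is a minimal sketch in plain Python that needs no Spark session. The helper sql_unescape is hypothetical: it is a simplified stand-in for how a SQL parser might unescape a string literal, not Spark's actual implementation, and it only handles a few escapes. Python's re module stands in for the regex engine behind rlike.

```python
import re


def sql_unescape(literal):
    """Hypothetical, simplified model of SQL string-literal unescaping.

    Assumption: a backslash followed by a known letter becomes the
    matching control character; any other escaped character is kept
    as-is. A real SQL parser handles more cases than this.
    """
    escapes = {"b": "\b", "n": "\n", "t": "\t", "\\": "\\", "'": "'"}
    out = []
    i = 0
    while i < len(literal):
        ch = literal[i]
        if ch == "\\" and i + 1 < len(literal):
            nxt = literal[i + 1]
            out.append(escapes.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


row = "abc def ghi"

# Statement [2]: the DataFrame API gets the pattern directly, so only
# Python's own escaping applies and the regex engine sees \bdef\b.
api_matches = re.search("\\bdef\\b", row) is not None

# Statement [3]: the SQL literal contains \bdef\b; under this model the
# unescaping step turns each \b into a backspace character.
stmt3_pattern = sql_unescape("\\bdef\\b")
stmt3_matches = re.search(stmt3_pattern, row) is not None

# Statement [4]: four backslashes in the Python source leave two in the
# SQL literal, which unescape to the single backslash the regex needs.
stmt4_pattern = sql_unescape("\\\\bdef\\\\b")
stmt4_matches = re.search(stmt4_pattern, row) is not None

# A raw string removes the Python layer, so two backslashes are enough.
raw_literal = r"\\bdef\\b"

print(api_matches, stmt3_matches, stmt4_matches)
print(raw_literal == "\\\\bdef\\\\b")
```

Under these assumptions the sketch reproduces the observed behavior: the pattern in [3] reaches the regex engine with backspace characters instead of word boundaries, so nothing matches. Spark also documents a spark.sql.parser.escapedStringLiterals setting that affects how string literals are parsed; whether it is the right workaround here would need to be checked against the 2.3.0 documentation.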