[
https://issues.apache.org/jira/browse/SPARK-20837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon updated SPARK-20837:
---------------------------------
Labels: bulk-closed (was: )
> Spark SQL doesn't support escape of single/double quote as SQL standard.
> ------------------------------------------------------------------------
>
> Key: SPARK-20837
> URL: https://issues.apache.org/jira/browse/SPARK-20837
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.6.1, 1.6.2, 1.6.3, 2.0.0, 2.0.1, 2.0.2, 2.1.0, 2.1.1
> Reporter: bing huang
> Priority: Major
> Labels: bulk-closed
>
> 1. If we run the code below against Spark 1.6.x, we get the error message
> "Exception in thread "main" java.lang.RuntimeException: [1.44] failure: ``)''
> expected but "york" found".
> 2. If we run the same code against Spark 2.x.x, it runs successfully, but the
> result is (1,2,3,4,5). Per the SQL standard, doubling a single quote inside a
> string literal escapes it, so the literal should parse as "New 'york' city"
> and the result should be (6,7,8,9,10). Running the same SQL on the same data
> in MySQL Workbench or SQL Server confirms this expectation.
> The code snippet used to demonstrate the issue:
> import org.apache.spark.{SparkConf, SparkContext}
> import org.apache.spark.sql.{Row, SQLContext}
> import org.apache.spark.sql.types.{DataTypes, StructField, StructType}
>
> val conf = new SparkConf().setAppName("appName").setMaster("local[3]")
> val sc = new SparkContext(conf)
> val sqlContext = new SQLContext(sc)
> // create the test dataset: rows 1-5 contain a quoted 'york', rows 6-10 do not
> val data = (1 to 10).map {
>   case t if t <= 5 =>
>     Row("New 'york' city", t.toString, "2015-01-01 13:59:59.123",
>       2147483647.0, Double.PositiveInfinity)
>   case t =>
>     Row("New york city", t.toString, "2015-01-02 23:59:59.456",
>       1.0, Double.PositiveInfinity)
> }
> // create the schema of the test dataset
> val schema = StructType(Array(
>   StructField("A1", DataTypes.StringType),
>   StructField("A2", DataTypes.StringType),
>   StructField("A3", DataTypes.StringType),
>   StructField("A4", DataTypes.DoubleType),
>   StructField("A5", DataTypes.DoubleType)
> ))
> val rdd = sc.parallelize(data)
> val df = sqlContext.createDataFrame(rdd, schema)
> df.registerTempTable("test")
> val sqlString = "select A2 from test where A1 not in ('New ''york'' city')"
> sqlContext.sql(sqlString).show(false)
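The expected result in point 2 follows from the SQL-standard doubling rule: a doubled '' inside a single-quoted literal denotes one literal quote character. A minimal pure-Scala sketch of that unescaping rule (a hypothetical helper for illustration only, not Spark's actual parser):

```scala
// SQL-standard unescaping for a single-quoted string literal:
// strip the outer quotes, then collapse each doubled '' to a single '.
// Hypothetical helper for illustration; Spark's real parser works differently.
def unescapeSqlLiteral(quoted: String): String = {
  require(quoted.length >= 2 && quoted.startsWith("'") && quoted.endsWith("'"))
  quoted.substring(1, quoted.length - 1).replace("''", "'")
}

// Under the standard, the literal in the query above yields the value
// stored in rows 1-5, so NOT IN should filter them out and return A2 = 6..10.
println(unescapeSqlLiteral("'New ''york'' city'"))  // New 'york' city
```

One plausible reading of the (1,2,3,4,5) result is that Spark 2.x's parser treats the adjacent quoted fragments as implicit string concatenation ('New ' 'york' ' city' becoming "New york city"), which would instead exclude rows 6-10. As a workaround, a backslash escape ('New \'york\' city') should produce the intended literal quote in Spark's default (Hive-style) escaping.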
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]