[ https://issues.apache.org/jira/browse/SPARK-16386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hao Ren updated SPARK-16386:
----------------------------
Description:
I just want to figure out why the two contexts behave differently even on a
simple query.
In a nutshell, I have a query in which a String contains a single quote and a
column is cast to Array/Map.
I have tried every combination of SQL context type and query call API
(sql, df.select, df.selectExpr).
I can't find one that works in all cases.
Here is the code for reproducing the problem.
{code:scala}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.{SparkConf, SparkContext}

object Test extends App {
  val sc = new SparkContext("local[2]", "test", new SparkConf)
  val hiveContext = new HiveContext(sc)
  val sqlContext = new SQLContext(sc)
  val context = hiveContext
  // val context = sqlContext

  import context.implicits._

  val df = Seq((Seq(1, 2), 2)).toDF("a", "b")
  df.registerTempTable("tbl")
  df.printSchema()

  // case 1
  context.sql("select cast(a as array<string>) from tbl").show()
  // HiveContext => org.apache.spark.sql.AnalysisException: cannot recognize input near 'array' '<' 'string' in primitive type specification; line 1 pos 17
  // SQLContext  => OK

  // case 2
  context.sql("select 'a\\'b'").show()
  // HiveContext => OK
  // SQLContext  => failure: ``union'' expected but ErrorToken(unclosed string literal) found

  // case 3
  df.selectExpr("cast(a as array<string>)").show()
  // HiveContext, SQLContext => OK

  // case 4
  df.selectExpr("'a\\'b'").show()
  // HiveContext, SQLContext => failure: end of input expected
}
{code}
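For reference, a possible workaround (just a sketch, not fully verified on 1.6) is to express both the cast and the quoted literal through the DataFrame API instead of a SQL string, so neither context's SQL parser is involved. This assumes the same df as in the snippet above:
{code:scala}
import org.apache.spark.sql.functions.{col, lit}
import org.apache.spark.sql.types.{ArrayType, StringType}

// Build the cast programmatically, so no "array<string>" type string has to be parsed.
df.select(col("a").cast(ArrayType(StringType))).show()

// Pass the literal through lit(), so no SQL string escaping of the single quote is needed.
df.select(lit("a'b")).show()
{code}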
> SQLContext and HiveContext parse a query string differently
> -----------------------------------------------------------
>
> Key: SPARK-16386
> URL: https://issues.apache.org/jira/browse/SPARK-16386
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.6.0, 1.6.1, 1.6.2
> Environment: scala 2.10, 2.11
> Reporter: Hao Ren
> Labels: patch
>