OK, this is what I have:
object SQLHiveContextSingleton {
  @transient private var instance: HiveContext = _
  def getInstance(sparkContext: SparkContext): HiveContext = {
    synchronized {
      if (instance == null || sparkContext.isStopped) {
        instance = new HiveContext(sparkContext)
      }
      instance
    }
  }
}
and then in my application I have:
linesStream.foreachRDD(linesRdd => {
  if (!linesRdd.isEmpty()) {
    val jsonRdd = linesRdd.map(x => parser.parse(x))
    val sqlContext = SQLHiveContextSingleton.getInstance(linesRdd.sparkContext)
    import sqlContext.implicits._
Again this is for a streaming app and I am following best practice from here:
https://spark.apache.org/docs/latest/streaming-programming-guide.html#dataframe-and-sql-operations
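The lazy-singleton pattern above can be checked without a Spark cluster. Below is a plain-Scala sketch of the same recreate-if-stopped logic, using a hypothetical ContextLike stand-in for HiveContext (ContextLike, ContextSingleton, and the id counter are illustration only, not part of the code above):

```scala
// Hypothetical stand-in for HiveContext: just an id and a stopped flag.
class ContextLike(val id: Int, var stopped: Boolean = false)

object ContextSingleton {
  @transient private var instance: ContextLike = _
  private var counter = 0

  def getInstance(): ContextLike = synchronized {
    // Recreate the context if it has never been built or was stopped,
    // mirroring the `instance == null || sparkContext.isStopped` check above.
    if (instance == null || instance.stopped) {
      counter += 1
      instance = new ContextLike(counter)
    }
    instance
  }
}
```

Repeated calls return the same instance until the context is stopped, at which point the next call builds a fresh one; the synchronized block keeps that check-then-create step atomic across threads.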
Date: Fri, 4 Mar 2016 14:52:58 -0800
Subject: Re: Error building a self contained Spark app
From: [email protected]
To: [email protected]
CC: [email protected]
Can you show your code snippet? Here is an example:
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._
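The reason the import must come after `new SQLContext(sc)` is that the toDF enrichment lives inside the instance, so there is nothing to import until the instance exists. A Spark-free sketch of that instance-import pattern (MiniContext, DF, and RichSeq are hypothetical stand-ins, not Spark APIs):

```scala
// MiniContext plays the role of SQLContext: its `implicits` member holds
// the implicit conversion that adds toDF to ordinary Seqs.
class MiniContext {
  case class DF(rows: Seq[(String, Int)], cols: Seq[String])
  object implicits {
    implicit class RichSeq(s: Seq[(String, Int)]) {
      def toDF(cols: String*): DF = DF(s, cols)
    }
  }
}

val sqlContext = new MiniContext
import sqlContext.implicits._  // only now is toDF in scope

val df = Seq(("Mich", 20), ("James", 13)).toDF("Name", "score")
```

Without the `import sqlContext.implicits._` line, `toDF` fails to compile with exactly the "value toDF is not a member of Seq[(String, Int)]" shape of error seen below.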
On Fri, Mar 4, 2016 at 1:55 PM, Mich Talebzadeh <[email protected]>
wrote:
Hi Ted,
I am getting the following error after adding that import
[error] /home/hduser/dba/bin/scala/Sequence/src/main/scala/Sequence.scala:5:
not found: object sqlContext
[error] import sqlContext.implicits._
[error] ^
[error] /home/hduser/dba/bin/scala/Sequence/src/main/scala/Sequence.scala:15:
value toDF is not a member of Seq[(String, Int)]
Dr Mich Talebzadeh
LinkedIn
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com
On 4 March 2016 at 21:39, Ted Yu <[email protected]> wrote:
Can you add the following into your code ? import sqlContext.implicits._
On Fri, Mar 4, 2016 at 1:14 PM, Mich Talebzadeh <[email protected]>
wrote:
Hi,
I have a simple Scala program as below
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext

object Sequence {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Sequence")
    val sc = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
    val a = Seq(("Mich",20), ("Christian", 18), ("James",13), ("Richard",16))
    // Sort option 1 using tempTable
    val b = a.toDF("Name","score").registerTempTable("tmp")
    sql("select Name,score from tmp order by score desc").show
    // Sort option 2 with FP
    a.toDF("Name","score").sort(desc("score")).show
  }
}
I build this using sbt tool as below
cat sequence.sbt
name := "Sequence"
version := "1.0"
scalaVersion := "2.10.5"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.0.0"
libraryDependencies += "org.apache.spark" %% "spark-hive" % "1.5.0"
But it fails compilation as below
[info] Compilation completed in 12.366 s
[error] /home/hduser/dba/bin/scala/Sequence/src/main/scala/Sequence.scala:15:
value toDF is not a member of Seq[(String, Int)]
[error] val b = a.toDF("Name","score").registerTempTable("tmp")
[error] ^
[error] /home/hduser/dba/bin/scala/Sequence/src/main/scala/Sequence.scala:16:
not found: value sql
[error] sql("select Name,score from tmp order by score desc").show
[error] ^
[error] /home/hduser/dba/bin/scala/Sequence/src/main/scala/Sequence.scala:18:
value toDF is not a member of Seq[(String, Int)]
[error] a.toDF("Name","score").sort(desc("score")).show
[error] ^
[error] three errors found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 95 s, completed Mar 4, 2016 9:06:40 PM
I think I am missing some dependencies here
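One detail worth checking in the sbt file above: spark-core and spark-hive are pinned at 1.5.0 while spark-sql is at 1.0.0. Mixing Spark module versions is a common source of missing-symbol errors at compile time; a consistent dependency block (a sketch only, assuming 1.5.0 is the intended version throughout) would look like:

```
name := "Sequence"
version := "1.0"
scalaVersion := "2.10.5"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.5.0"
libraryDependencies += "org.apache.spark" %% "spark-hive" % "1.5.0"
```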