No, creating the DataFrame with createDataFrame doesn't work either:

val peopleDF = sqlContext.createDataFrame(people)

the code compiles, but at runtime it raises the same error as toDF did, at the
same line.
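
For context, a minimal self-contained sketch of the setup this one-liner assumes (the object name and app name are made up; Person, people and the file path follow the programming-guide example). The createDataFrame(people) overload infers the schema from the case class, so it goes through the same runtime reflection path as toDF:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Case class defined at the top level, as recommended earlier in the thread.
case class Person(name: String, age: Int)

object CreateDataFrameSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("createDataFrame-sketch"))
    val sqlContext = new SQLContext(sc)

    val people = sc.textFile("examples/src/main/resources/people.txt")
      .map(_.split(","))
      .map(p => Person(p(0), p(1).trim.toInt))

    // Compiles, but hits the same runtime reflection path as toDF.
    val peopleDF = sqlContext.createDataFrame(people)
    peopleDF.show()
  }
}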

On Wed, May 13, 2015 at 6:22 PM, Sebastian Alfers
<sebastian.alf...@googlemail.com> wrote:

> I use:
>
> val conf = new SparkConf()...
> val sc = new SparkContext(conf)
> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
>
> val rdd: RDD[...] = ...
> val schema: StructType = ...
>
> sqlContext.createDataFrame(rdd, schema)
>
>
>
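
To make the elided rdd and schema concrete, here is a hedged sketch with a hypothetical two-column schema, reusing the sc and sqlContext from the snippet above. This explicit-schema overload of createDataFrame takes an RDD[Row] plus a StructType and does not rely on case-class reflection:

import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StructType, StructField, StringType, IntegerType}

// Hypothetical schema; the real rdd and schema are elided in the mail above.
val schema = StructType(Seq(
  StructField("name", StringType, nullable = true),
  StructField("age", IntegerType, nullable = true)))

val rowRDD = sc.textFile("examples/src/main/resources/people.txt")
  .map(_.split(","))
  .map(p => Row(p(0), p(1).trim.toInt))

val df = sqlContext.createDataFrame(rowRDD, schema)
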
> 2015-05-13 12:00 GMT+02:00 SLiZn Liu <sliznmail...@gmail.com>:
>
>> Additionally, after I successfully packaged the code and submitted it via
>> spark-submit webcat_2.11-1.0.jar, the following error was thrown at the
>> line where toDF() is called:
>>
>> Exception in thread "main" java.lang.NoSuchMethodError: 
>> scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaUniverse$JavaMirror;
>>   at WebcatApp$.main(webcat.scala:49)
>>   at WebcatApp.main(webcat.scala)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>   at 
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>   at 
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>   at java.lang.reflect.Method.invoke(Method.java:606)
>>   at 
>> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:569)
>>   at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
>>   at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
>>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
>>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>
>> Unsurprisingly, if I remove toDF, no error occurs.
>>
>> I have moved the case class definition outside of main but inside the
>> outer object scope, and removed the "provided" specification in build.sbt.
>> However, when I tried *Dean Wampler*'s suggestion of using
>> sc.createDataFrame(), the compiler says that function is not a member of
>> sc, and I cannot find any reference to it in the latest documentation.
>> What else should I try?
>>
>> REGARDS,
>> Todd Leo
>>
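
For readers of the archive: a NoSuchMethodError on scala.reflect.api.JavaUniverse.runtimeMirror at run time usually points to a Scala binary-version mismatch, for example an application built for Scala 2.11 submitted to a Spark distribution built for Scala 2.10. A minimal build.sbt sketch that keeps everything on one Scala binary version, assuming the Spark build used by spark-submit is also on 2.10:

scalaVersion := "2.10.4"

// %% derives the _2.10 artifact suffix from scalaVersion, so all Spark
// modules stay on the same Scala binary version.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "1.3.1" % "provided",
  "org.apache.spark" %% "spark-sql"   % "1.3.1" % "provided",
  "org.apache.spark" %% "spark-mllib" % "1.3.1"
)
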
>> On Wed, May 13, 2015 at 11:27 AM SLiZn Liu <sliznmail...@gmail.com>
>> wrote:
>>
>>> Thanks folks, I really appreciate all your replies! I tried each of your
>>> suggestions and, in particular, *Animesh*'s second suggestion of *making
>>> the case class definition global* helped me get out of the trap.
>>>
>>> Plus, I should have pasted my entire code in this mail to help with the
>>> diagnosis.
>>>
>>> REGARDS,
>>> Todd Leo
>>>
>>> On Wed, May 13, 2015 at 12:10 AM Dean Wampler <deanwamp...@gmail.com>
>>> wrote:
>>>
>>>> It's the import statement Olivier showed that makes the method
>>>> available.
>>>>
>>>> Note that you can also use `sc.createDataFrame(myRDD)`, without the
>>>> need for the import statement. I personally prefer this approach.
>>>>
>>>> Dean Wampler, Ph.D.
>>>> Author: Programming Scala, 2nd Edition
>>>> <http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
>>>> Typesafe <http://typesafe.com>
>>>> @deanwampler <http://twitter.com/deanwampler>
>>>> http://polyglotprogramming.com
>>>>
>>>> On Tue, May 12, 2015 at 9:33 AM, Olivier Girardot <ssab...@gmail.com>
>>>> wrote:
>>>>
>>>>> you need to instantiate a SQLContext :
>>>>> val sc : SparkContext = ...
>>>>> val sqlContext = new SQLContext(sc)
>>>>> import sqlContext.implicits._
>>>>>
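
To make that concrete, a hedged continuation of the snippet above (the Word case class and values are illustrative; sc and sqlContext are the ones defined above, and in a compiled application the case class must live at the top level, not inside main):

import sqlContext.implicits._

// Illustrative case class; in a compiled app, define it at the top level.
case class Word(text: String, count: Int)

val words = sc.parallelize(Seq(Word("spark", 3), Word("sql", 2)))
val wordsDF = words.toDF()   // toDF is provided by the implicits import
wordsDF.show()
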
>>>>> On Tue, May 12, 2015 at 12:29 PM, SLiZn Liu <sliznmail...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I added `libraryDependencies += "org.apache.spark" % "spark-sql_2.11"
>>>>>> % "1.3.1"` to `build.sbt` but the error remains. Do I need to import
>>>>>> modules other than `import org.apache.spark.sql.{ Row, SQLContext }`?
>>>>>>
>>>>>> On Tue, May 12, 2015 at 5:56 PM Olivier Girardot <ssab...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> toDF is part of Spark SQL, so you need the Spark SQL dependency plus
>>>>>>> import sqlContext.implicits._ to get the toDF method.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Olivier.
>>>>>>>
>>>>>>> On Tue, May 12, 2015 at 11:36 AM, SLiZn Liu <sliznmail...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi User Group,
>>>>>>>>
>>>>>>>> I'm trying to reproduce the example in the Spark SQL Programming Guide
>>>>>>>> <https://spark.apache.org/docs/latest/sql-programming-guide.html#inferring-the-schema-using-reflection>,
>>>>>>>> and I got a compile error when packaging with sbt:
>>>>>>>>
>>>>>>>> [error] myfile.scala:30: value toDF is not a member of 
>>>>>>>> org.apache.spark.rdd.RDD[Person]
>>>>>>>> [error] val people = 
>>>>>>>> sc.textFile("examples/src/main/resources/people.txt").map(_.split(",")).map(p
>>>>>>>>  => Person(p(0), p(1).trim.toInt)).toDF()
>>>>>>>> [error]                                                                
>>>>>>>>                                                               ^
>>>>>>>> [error] one error found
>>>>>>>> [error] (compile:compileIncremental) Compilation failed
>>>>>>>> [error] Total time: 3 s, completed May 12, 2015 4:11:53 PM
>>>>>>>>
>>>>>>>> I double-checked that my code includes import sqlContext.implicits._
>>>>>>>> after reading this post
>>>>>>>> <https://mail-archives.apache.org/mod_mbox/spark-user/201503.mbox/%3c1426522113299-22083.p...@n3.nabble.com%3E>
>>>>>>>> on the Spark mailing list, and even tried toDF("col1", "col2") as
>>>>>>>> suggested by Xiangrui Meng in that post, but got the same error.
>>>>>>>>
>>>>>>>> The Spark version is specified in the build.sbt file as follows:
>>>>>>>>
>>>>>>>> scalaVersion := "2.11.6"
>>>>>>>> libraryDependencies += "org.apache.spark" % "spark-core_2.11" % 
>>>>>>>> "1.3.1" % "provided"
>>>>>>>> libraryDependencies += "org.apache.spark" % "spark-mllib_2.11" % 
>>>>>>>> "1.3.1"
>>>>>>>>
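
For reference, the same build.sbt with the Spark SQL dependency that is suggested later in the thread would look roughly like this (toDF lives in the spark-sql module, so it must be on the compile classpath):

scalaVersion := "2.11.6"

libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.11"  % "1.3.1" % "provided",
  "org.apache.spark" % "spark-sql_2.11"   % "1.3.1",
  "org.apache.spark" % "spark-mllib_2.11" % "1.3.1"
)
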
>>>>>>>> Does anyone have an idea about the cause of this error?
>>>>>>>>
>>>>>>>> REGARDS,
>>>>>>>> Todd Leo
>>>>>>>>
>>>>>>>
>>>>
