RDDOperationScope is in the spark-core_2.1x jar file:

  7148 Mon Feb 29 09:21:32 PST 2016
org/apache/spark/rdd/RDDOperationScope.class

Can you check whether the spark-core jar is on the classpath?

FYI
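As an aside on the path forms discussed in the thread below: in a Hadoop-style URI, the component right after `//` is the authority (the namenode host), so `hdfs://user/taylor/...` asks HDFS for a namenode named "user" rather than the `/user` directory; a triple slash (`hdfs:///user/...`) leaves the authority empty and treats the rest as a path. Python's standard URL parser splits the same way, which makes this easy to see:

```python
from urllib.parse import urlparse

# "user" lands in the authority (host) slot, not in the path:
u = urlparse("hdfs://user/taylor/Spark/Warehouse.java")
print(u.netloc, u.path)   # user /taylor/Spark/Warehouse.java

# With a triple slash the authority is empty and the full path survives:
v = urlparse("hdfs:///user/taylor/Spark/Warehouse.java")
print(v.netloc, v.path)   # (empty) /user/taylor/Spark/Warehouse.java
```

The same applies to `file://` versus `file:///` for local paths.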

On Mon, Feb 29, 2016 at 1:40 PM, Taylor, Ronald C <ronald.tay...@pnnl.gov>
wrote:

> Hi Jules, folks,
>
>
>
> I have tried “hdfs://<HDFS filepath>” as well as “file://<local Linux
> filepath>”, and several variants. Every time, I get the same message,
> NoClassDefFoundError. See below. Why would I get such a message if the
> problem is simply that Spark cannot find the text file? Doesn’t the error
> message indicate some other source of the problem?
>
>
>
> I may be missing something in the error report; I am a Java person, not a
> Python programmer. But doesn’t it look like a call to a Java class,
> something associated with “o9.textFile”, is failing? If so, how do I
> fix this?
>
>
>
>   Ron
>
>
>
>
>
> "/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/spark/python/pyspark/context.py",
> line 451, in textFile
>
>     return RDD(self._jsc.textFile(name, minPartitions), self,
>
>   File
> "/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py",
> line 538, in __call__
>
>   File
> "/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/spark/python/pyspark/sql/utils.py",
> line 36, in deco
>
>     return f(*a, **kw)
>
>   File
> "/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py",
> line 300, in get_return_value
>
> py4j.protocol.Py4JJavaError: An error occurred while calling o9.textFile.
>
> : java.lang.NoClassDefFoundError: Could not initialize class
> org.apache.spark.rdd.RDDOperationScope$
>
>
>
> Ronald C. Taylor, Ph.D.
>
> Computational Biology & Bioinformatics Group
>
> Pacific Northwest National Laboratory (U.S. Dept of Energy/Battelle)
>
> Richland, WA 99352
>
> phone: (509) 372-6568,  email: ronald.tay...@pnnl.gov
>
> web page:  http://www.pnnl.gov/science/staff/staff_info.asp?staff_num=7048
>
>
>
> *From:* Jules Damji [mailto:dmat...@comcast.net]
> *Sent:* Sunday, February 28, 2016 10:07 PM
> *To:* Taylor, Ronald C
> *Cc:* user@spark.apache.org; ronald.taylo...@gmail.com
> *Subject:* Re: a basic question on first use of PySpark shell and
> example, which is failing
>
>
>
>
>
> Hello Ronald,
>
>
>
> Since you have placed the file under HDFS, you might want to change the
> path name to:
>
>
>
> val lines = sc.textFile("hdfs://user/taylor/Spark/Warehouse.java")
>
>
> Sent from my iPhone
>
> Pardon the dumb thumb typos :)
>
>
> On Feb 28, 2016, at 9:36 PM, Taylor, Ronald C <ronald.tay...@pnnl.gov>
> wrote:
>
>
>
> Hello folks,
>
>
>
> I am a newbie running Spark on a small Cloudera CDH 5.5.1 cluster at our
> lab. I am trying to use the PySpark shell for the first time, and am
> attempting to duplicate the documentation example of creating an RDD,
> which I called "lines", using a text file.
>
> I placed a text file called Warehouse.java in this HDFS location:
>
>
> [rtaylor@bigdatann ~]$ hadoop fs -ls /user/rtaylor/Spark
> -rw-r--r--   3 rtaylor supergroup    1155355 2016-02-28 18:09
> /user/rtaylor/Spark/Warehouse.java
> [rtaylor@bigdatann ~]$
>
> I then invoked sc.textFile() in the PySpark shell. That did not work. See
> below. Apparently a class is not found? I don't know why that would be the
> case. Any guidance would be very much appreciated.
>
> The Cloudera Manager for the cluster says that Spark is operating in the
> "green", for whatever that is worth.
>
>  - Ron Taylor
>
>
> >>> lines = sc.textFile("file:///user/taylor/Spark/Warehouse.java")
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File
> "/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/spark/python/pyspark/context.py",
> line 451, in textFile
>     return RDD(self._jsc.textFile(name, minPartitions), self,
>   File
> "/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py",
> line 538, in __call__
>   File
> "/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/spark/python/pyspark/sql/utils.py",
> line 36, in deco
>     return f(*a, **kw)
>   File
> "/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py",
> line 300, in get_return_value
> py4j.protocol.Py4JJavaError: An error occurred while calling o9.textFile.
> : java.lang.NoClassDefFoundError: Could not initialize class
> org.apache.spark.rdd.RDDOperationScope$
>     at org.apache.spark.SparkContext.withScope(SparkContext.scala:709)
>     at org.apache.spark.SparkContext.textFile(SparkContext.scala:825)
>     at
> org.apache.spark.api.java.JavaSparkContext.textFile(JavaSparkContext.scala:191)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
>     at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
>     at py4j.Gateway.invoke(Gateway.java:259)
>     at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
>     at py4j.commands.CallCommand.execute(CallCommand.java:79)
>     at py4j.GatewayConnection.run(GatewayConnection.java:207)
>     at java.lang.Thread.run(Thread.java:745)
>
> >>>
>
>
