Re: spark-submit question
I figured it out. I had to use pyspark.files.SparkFiles to get the locations of files loaded into Spark.

On Mon, Nov 17, 2014 at 1:26 PM, Sean Owen so...@cloudera.com wrote:

You are changing these paths and filenames to match your own actual scripts and file locations, right?

On Nov 17, 2014 4:59 AM, Samarth Mailinglist mailinglistsama...@gmail.com wrote:

I am trying to run a job written in Python with the following command:

bin/spark-submit --master spark://localhost:7077 /path/spark_solution_basic.py --py-files /path/*.py --files /path/config.properties

I always get an exception that config.properties is not found:

INFO - IOError: [Errno 2] No such file or directory: 'config.properties'

Why isn't this working?
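For reference, a minimal sketch of resolving a file shipped with --files. This uses the Scala API (org.apache.spark.SparkFiles); the PySpark counterpart is the SparkFiles class mentioned above. The file name matches the thread; everything else is illustrative:

import java.io.FileInputStream
import java.util.Properties
import org.apache.spark.{SparkConf, SparkContext, SparkFiles}

val sc = new SparkContext(new SparkConf().setAppName("config-example"))
// Files passed to spark-submit via --files are shipped to every node;
// SparkFiles.get resolves their local path on whichever machine runs this.
val configPath = SparkFiles.get("config.properties")
val props = new Properties()
props.load(new FileInputStream(configPath))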
RandomGenerator class not found exception
My sbt file for the project includes this:

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.1.0",
  "org.apache.spark" %% "spark-mllib" % "1.1.0",
  "org.apache.commons" % "commons-math3" % "3.3"
)

Still I am getting this error:

java.lang.NoClassDefFoundError: org/apache/commons/math3/random/RandomGenerator

The jar at ~/.m2/repository/org/apache/commons/commons-math3/3.3 contains the RandomGenerator class:

$ jar tvf commons-math3-3.3.jar | grep RandomGenerator
org/apache/commons/math3/random/RandomGenerator.class
org/apache/commons/math3/random/UniformRandomGenerator.class
org/apache/commons/math3/random/SynchronizedRandomGenerator.class
org/apache/commons/math3/random/AbstractRandomGenerator.class
org/apache/commons/math3/random/RandomGeneratorFactory$1.class
org/apache/commons/math3/random/RandomGeneratorFactory.class
org/apache/commons/math3/random/StableRandomGenerator.class
org/apache/commons/math3/random/NormalizedRandomGenerator.class
org/apache/commons/math3/random/JDKRandomGenerator.class
org/apache/commons/math3/random/GaussianRandomGenerator.class

Please help.
Re: RandomGenerator class not found exception
Add this jar (http://mvnrepository.com/artifact/org.apache.commons/commons-math3/3.3) while creating the SparkContext:

sc.addJar("/path/to/commons-math3-3.3.jar")

And make sure it is shipped and shows up in the Environment tab of the web UI (port 4040).

Thanks
Best Regards

On Mon, Nov 17, 2014 at 1:54 PM, Ritesh Kumar Singh riteshoneinamill...@gmail.com wrote:
> My sbt file for the project includes this: ... Please help
Re: RandomGenerator class not found exception
Include commons-math3 3.3 on the classpath while submitting the jar to the Spark cluster, like:

spark-submit --driver-class-path <commons-math3-3.3.jar> --class MainClass --master <spark cluster url> <app jar>

On Mon, Nov 17, 2014 at 1:55 PM, Ritesh Kumar Singh [via Apache Spark User List] ml-node+s1001560n19055...@n3.nabble.com wrote:
> My sbt file for the project includes this: ... Please help
Re: Functions in Spark
One rule of thumb is to use rdd.toDebugString and check the lineage for ShuffledRDD. As long as there's no need to restructure the RDD, operations can be pipelined on each partition. rdd.toDebugString is your friend :-)

-kr, Gerard.

On Mon, Nov 17, 2014 at 7:37 AM, Mukesh Jha me.mukesh@gmail.com wrote:

Thanks, I did go through the video and it was very informative, but I think I was looking for the Transformations section of the page https://spark.apache.org/docs/0.9.1/scala-programming-guide.html.

On Mon, Nov 17, 2014 at 10:31 AM, Samarth Mailinglist mailinglistsama...@gmail.com wrote:

Check this video out: https://www.youtube.com/watch?v=dmL0N3qfSc8&list=UURzsq7k4-kT-h3TDUBQ82-w

On Mon, Nov 17, 2014 at 9:43 AM, Deep Pradhan pradhandeep1...@gmail.com wrote:

Hi, is there any way to know which of my functions performs better in Spark? In other words, say I have achieved the same thing using two different implementations. How do I judge which implementation is better than the other? Is processing time the only metric we can use to claim the goodness of one implementation over the other? Can anyone please share some thoughts on this? Thank you.

--
Thanks & Regards,
Mukesh Jha
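As a hedged illustration of the suggestion (the data and operations are made up):

val counts = sc.textFile("data.txt")
  .flatMap(_.split(" "))
  .map(word => (word, 1))
  .reduceByKey(_ + _)
// Prints the lineage; a ShuffledRDD entry marks a stage boundary, while
// the map/flatMap steps stay pipelined within a single stage.
println(counts.toDebugString)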
Landmarks in GraphX section of Spark API
Hi, I was going through the GraphX section of the Spark API at https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.graphx.lib.ShortestPaths$

Here I find the word "landmark". Can anyone explain to me what landmark means? Is it being used as a plain English word, or does it mean something specific in GraphX? Thank You
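In the ShortestPaths API, the landmarks argument is the sequence of vertex IDs to which shortest-path distances are computed for every vertex. A minimal sketch (the graph and the vertex IDs are illustrative):

import org.apache.spark.graphx._
import org.apache.spark.graphx.lib.ShortestPaths

// Assume graph: Graph[Int, Int] was built elsewhere; 1L and 4L are the
// "landmark" vertices we want hop distances to.
val result = ShortestPaths.run(graph, landmarks = Seq(1L, 4L))
// Each vertex now carries a map from landmark vertex ID to hop distance.
result.vertices.collect().foreach(println)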
Re: How to incrementally compile spark examples using mvn
The downloads only happen once, so this is not a problem. If you are building just one module in a project, it needs a compiled copy of the other modules. It will either use your locally-built and locally-installed artifacts, or download them from the repo if possible. This isn't needed if you are compiling all modules at once. If you want to compile everything and reuse the local artifacts later, you need 'install', not 'package'.

On Mon, Nov 17, 2014 at 12:27 AM, Yiming (John) Zhang sdi...@gmail.com wrote:

Thank you Marcelo. I tried your suggestion (# mvn -pl :spark-examples_2.10 compile), but it required downloading many Spark components (as listed below), which I have already compiled on my server.

Downloading: https://repo1.maven.org/maven2/org/apache/spark/spark-core_2.10/1.1.0/spark-core_2.10-1.1.0.pom
...
Downloading: https://repo1.maven.org/maven2/org/apache/spark/spark-streaming_2.10/1.1.0/spark-streaming_2.10-1.1.0.pom
...
Downloading: https://repository.jboss.org/nexus/content/repositories/releases/org/apache/spark/spark-hive_2.10/1.1.0/spark-hive_2.10-1.1.0.pom
...

This problem didn't happen when I compiled the whole project using "mvn -DskipTests package". I guess some configuration has to be made to tell mvn the dependencies are local. Any idea? Thank you for your help!

Cheers, Yiming

-----Original Message-----
From: Marcelo Vanzin [mailto:van...@cloudera.com]
Sent: November 16, 2014 10:26
To: sdi...@gmail.com
Cc: user@spark.apache.org
Subject: Re: How to incrementally compile spark examples using mvn

I haven't tried scala:cc, but you can ask Maven to just build a particular sub-project. For example:

mvn -pl :spark-examples_2.10 compile

On Sat, Nov 15, 2014 at 5:31 PM, Yiming (John) Zhang sdi...@gmail.com wrote:

Hi, I have already successfully compiled and run the Spark examples. My problem is that if I make some modifications (e.g., on SparkPi.scala or LogQuery.scala) I have to use "mvn -DskipTests package" to rebuild the whole Spark project and wait a relatively long time. I also tried "mvn scala:cc" as described in http://spark.apache.org/docs/latest/building-with-maven.html, but I could only get an infinite wait like:

[INFO] --- scala-maven-plugin:3.2.0:cc (default-cli) @ spark-parent ---
[INFO] wait for files to compile...

Is there any method to incrementally compile the examples using mvn? Thank you!

Cheers, Yiming

--
Marcelo
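Putting Sean's advice together, a sketch of the workflow (the module name is taken from the thread):

# Build everything once and install the artifacts into the local
# repository, so later sub-module builds resolve them locally:
mvn -DskipTests clean install
# From then on, recompile just the examples module after editing it:
mvn -pl :spark-examples_2.10 compile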
Spark streaming batch overrun
Hi all, In this presentation (https://prezi.com/1jzqym68hwjp/spark-gotchas-and-anti-patterns/) it mentions that Spark Streaming's behaviour is undefined if a batch overruns the polling interval. Is this something that might be addressed in future, or is it fundamental to the design?
HDFS read text file
Hi,

JavaRDD<Student> studentsData = sc.parallelize(list); // list is student info, a List<Student>
studentsData.saveAsTextFile("hdfs://master/data/spark/instruments.txt");

The above statements save the student information in HDFS as a text file, with each object written as one line, as below. [Inline screenshot of the saved file contents omitted.]

How can I read that file back, so that each line becomes a Student object?

-Naveen
Building Spark for Hive The requested profile hadoop-1.2 could not be activated because it does not exist.
I am using Apache Hadoop 1.2.1 and I wanted to use Spark SQL with Hive, so I tried to build Spark like so:

mvn -Phive,hadoop-1.2 -Dhadoop.version=1.2.1 clean -DskipTests package

But I get the following error:

The requested profile "hadoop-1.2" could not be activated because it does not exist.

Is there some way to handle this, or do I have to downgrade Hadoop? Any help is appreciated.
How to measure communication between nodes in Spark Standalone Cluster?
Hello, I use a Spark standalone cluster and I want to measure inter-node communication somehow. As I understand it, GraphX transfers only vertex values; am I right? I do not just want the number of bytes transferred between any two nodes; rather, is there a way to measure how many vertex values were transferred among nodes? Thanks!

--
Regards,
Hlib Mykhailenko
PhD student at INRIA Sophia-Antipolis Méditerranée
2004 Route des Lucioles BP93, 06902 SOPHIA ANTIPOLIS cedex
Re: HDFS read text file
You can use sc.objectFile (https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.SparkContext) to read it. It will be of type RDD[Student].

Thanks
Best Regards

On Mon, Nov 17, 2014 at 4:03 PM, Naveen Kumar Pokala npok...@spcapitaliq.com wrote:
> Hi, ... How to read that file, I mean each line as Object of student. ...
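Note that sc.objectFile reads data written with saveAsObjectFile, not saveAsTextFile. A minimal sketch of the round trip (the class and paths are illustrative):

case class Student(id: Int, name: String)

val students = sc.parallelize(Seq(Student(1, "Alice"), Student(2, "Bob")))
// Write serialized objects rather than toString lines:
students.saveAsObjectFile("hdfs://master/data/spark/students")
// Read back; the element type is supplied explicitly:
val loaded = sc.objectFile[Student]("hdfs://master/data/spark/students")
loaded.collect().foreach(println)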
Re: How to measure communication between nodes in Spark Standalone Cluster?
You can use Ganglia to see the overall data transfer across the cluster/nodes. I don't think there's a direct way to get the vertices being transferred.

Thanks
Best Regards

On Mon, Nov 17, 2014 at 4:29 PM, Hlib Mykhailenko hlib.mykhaile...@inria.fr wrote:
> Hello, I use Spark Standalone Cluster and I want to measure somehow internode communication. ...
Re: HDFS read text file
Hello Naveen, I think you should first override the toString method of your sample.spark.test.Student class.

--
Regards,
Hlib Mykhailenko
PhD student at INRIA Sophia-Antipolis Méditerranée

----- Original Message -----
From: Naveen Kumar Pokala npok...@spcapitaliq.com
To: user@spark.apache.org
Sent: Monday, November 17, 2014 11:33:44 AM
Subject: HDFS read text file
> Hi, ... How to read that file, I mean each line as Object of student. ...
Re: Building Spark for Hive The requested profile hadoop-1.2 could not be activated because it does not exist.
Oops, I guess this is the right way to do it:

mvn -Phive -Dhadoop.version=1.2.1 clean -DskipTests package
Re: Returning breeze.linalg.DenseMatrix from method
This should fix it:

def func(str: String): DenseMatrix[Double] = {
  ...
}

So why is this required? Think of it like this: if you hadn't explicitly mentioned Double, the calling function might have expected a DenseMatrix[SomeOtherType] and performed a SomeOtherType-specific operation that is not supported by the returned DenseMatrix[Double]. (I'm also assuming that SomeOtherType has no subtype relation with Double.)

On 17 November 2014 00:14, Ritesh Kumar Singh riteshoneinamill...@gmail.com wrote:

Hi, I have a method that returns a DenseMatrix:

def func(str: String): DenseMatrix = {
  ...
}

But I keep getting this error:

class DenseMatrix takes type parameters

I tried this too:

def func(str: String): DenseMatrix(Int, Int, Array[Double]) = {
  ...
}

But this gives me this error:

'=' expected but '(' found

Any possible fixes?

--
Tribhuvanesh Orekondy
Building Spark with hive does not work
Hi, I am building Spark on the most recent master branch. I checked this page: https://github.com/apache/spark/blob/master/docs/sql-programming-guide.md

The command

./sbt/sbt -Phive -Phive-thirftserver clean assembly/assembly

works fine and a fat jar is created. However, when I start the SQL CLI, I encounter an exception:

Spark assembly has been built with Hive, including Datanucleus jars on classpath
java.lang.ClassNotFoundException: org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:270)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:337)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Failed to load main class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.
You need to build Spark with -Phive and -Phive-thriftserver.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties

It suggests building with -Phive and -Phive-thriftserver, which is exactly what I have done. Any idea?

Hao
Re: Returning breeze.linalg.DenseMatrix from method
Yeah, it works. Although when I try to declare a var of type DenseMatrix like this:

var mat1: DenseMatrix[Double]

it gives an error saying the matrix must be initialised at the point of declaration. I had to initialise it as:

var mat1: DenseMatrix[Double] = DenseMatrix.zeros[Double](1,1)

Anyway, it works now. Thanks for helping :)

On Mon, Nov 17, 2014 at 4:56 PM, tribhuvan...@gmail.com wrote:
> This should fix it: def func(str: String): DenseMatrix[Double] = { ... } ...
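For what it's worth, a sketch of two ways around that: local vars in Scala cannot be left uninitialised (unlike class fields), so either supply a placeholder value or defer with Option:

import breeze.linalg.DenseMatrix

// Placeholder value, as in the thread:
var mat1: DenseMatrix[Double] = DenseMatrix.zeros[Double](1, 1)
// Or defer with Option and avoid a dummy matrix:
var mat2: Option[DenseMatrix[Double]] = None
mat2 = Some(DenseMatrix.zeros[Double](2, 2))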
Re: How do you force a Spark Application to run in multiple tasks
I've never used Mesos, sorry.

On Fri, Nov 14, 2014 at 5:30 PM, Steve Lewis lordjoe2...@gmail.com wrote:

The cluster runs Mesos and I can see the tasks in the Mesos UI, but most are not doing much; any hints about that UI?

On Fri, Nov 14, 2014 at 11:39 AM, Daniel Siegmann daniel.siegm...@velos.io wrote:

Most of the information you're asking for can be found on the Spark web UI (see http://spark.apache.org/docs/1.1.0/monitoring.html). You can see which tasks are being processed by which nodes.

If you're using HDFS and your file size is smaller than the HDFS block size you will only have one partition (remember, there is exactly one task for each partition in a stage). If you want to force it to have more partitions, you can call RDD.repartition(numPartitions). Note that this will introduce a shuffle you wouldn't otherwise have. Also make sure your job is allocated more than one core in your cluster (you can see this on the web UI).

On Fri, Nov 14, 2014 at 2:18 PM, Steve Lewis lordjoe2...@gmail.com wrote:

I have instrumented word count to track how many machines the code runs on. I use an accumulator to maintain a set of MAC addresses. I find that everything is done on a single machine. This is probably optimal for word count, but not for the larger problems I am working on. How do I force processing to be split into multiple tasks? How do I access the task and attempt numbers, to track which processing happens in which attempt? Also, is using the MAC address a reasonable way to determine which machine is running the code? As far as I can tell, a simple word count runs in one thread on one machine while the remainder of the cluster does nothing. This is consistent with tests where I write to stdout from functions and see little output on most machines in the cluster.

--
Steven M. Lewis PhD
4221 105th Ave NE, Kirkland, WA 98033
206-384-1340 (cell) Skype lordjoe_com

--
Daniel Siegmann, Software Developer
Velos
Accelerating Machine Learning
54 W 40th St, New York, NY 10018
E: daniel.siegm...@velos.io W: www.velos.io
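As a hedged sketch of the repartition suggestion (the path and partition count are illustrative):

val lines = sc.textFile("hdfs:///path/to/input.txt")
// A file below the HDFS block size loads as a single partition;
// repartition forces parallelism at the cost of a shuffle.
val repartitioned = lines.repartition(16)
println(repartitioned.partitions.length) // 16 tasks per stage over this RDD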
Re: How to measure communication between nodes in Spark Standalone Cluster?
I am not sure there is a direct way (an API in GraphX, etc.) to measure the number of vertex values transferred among nodes during computation. It might depend on:
- the operations in your application, e.g. whether each vertex communicates only with its immediate neighbours
- the partition strategy you chose, w.r.t. the vertex replication factor
- the distribution of partitions on the cluster
...

Best,
Yifan LI
LIP6, UPMC, Paris

On 17 Nov 2014, at 11:59, Hlib Mykhailenko hlib.mykhaile...@inria.fr wrote:
> Hello, I use Spark Standalone Cluster and I want to measure somehow internode communication. ...
Re: same error of SPARK-1977 while using trainImplicit in mllib 1.0.2
Thanks. It works for me.

--
Aaron Lin

On Saturday, November 15, 2014 at 1:19 AM, Xiangrui Meng wrote:

If you use the Kryo serializer, you need to register mutable.BitSet and Rating: https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala#L102

The JIRA was marked resolved because chill resolved the problem in v0.4.0 and we have this workaround.

-Xiangrui

On Fri, Nov 14, 2014 at 12:41 AM, aaronlin aaron...@kkbox.com wrote:

Hi folks, although SPARK-1977 says that this problem is resolved in 1.0.2, I still have this problem while running the script on AWS EC2 via spark-ec2.py. I checked SPARK-1977 and found that twitter.chill resolved the problem in v0.4.0, not v0.3.6, but Spark depends on twitter.chill v0.3.6 based on the Maven page. For more information, you can check the following pages:

- https://github.com/twitter/chill
- http://mvnrepository.com/artifact/org.apache.spark/spark-core_2.10/1.0.2

Can anyone give me advice? Thanks

--
Aaron Lin
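For reference, a sketch of the registration workaround Xiangrui points to, following the linked MovieLensALS example (Spark 1.x KryoRegistrator API; treat the details as an approximation of that example):

import com.esotericsoftware.kryo.Kryo
import org.apache.spark.SparkConf
import org.apache.spark.mllib.recommendation.Rating
import org.apache.spark.serializer.{KryoRegistrator, KryoSerializer}
import scala.collection.mutable

class ALSRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(classOf[Rating])
    kryo.register(classOf[mutable.BitSet])
  }
}

val conf = new SparkConf()
  .set("spark.serializer", classOf[KryoSerializer].getName)
  .set("spark.kryo.registrator", classOf[ALSRegistrator].getName)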
Re: RDD.aggregate versus accumulables...
You should *never* use accumulators for this purpose because you may get incorrect answers. Accumulators can count the same thing multiple times; you cannot rely upon the correctness of the values they compute. See SPARK-732 (https://issues.apache.org/jira/browse/SPARK-732) for more info.

On Sun, Nov 16, 2014 at 10:06 PM, Segerlind, Nathan L nathan.l.segerl...@intel.com wrote:

Hi all. I am trying to get my head around why using accumulators and accumulables seems to be the most recommended method for accumulating running sums, averages, variances and the like, whereas the aggregate method seems to me to be the right one. I have no performance measurements as of yet, but it seems that aggregate is simpler and more intuitive (and it does what one might expect an accumulator to do), whereas the accumulators and accumulables seem to have some extra complications and overhead.

So... what's the real difference between an accumulator/accumulable and aggregating an RDD? When is one method of aggregation preferred over the other?

Thanks, Nate

--
Daniel Siegmann, Software Developer
Velos
Accelerating Machine Learning
54 W 40th St, New York, NY 10018
E: daniel.siegm...@velos.io W: www.velos.io
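To make the alternative concrete, a minimal sketch of a running sum and count via RDD.aggregate (the data is illustrative):

val data = sc.parallelize(1 to 100)
val (sum, count) = data.aggregate((0L, 0L))(
  (acc, x) => (acc._1 + x, acc._2 + 1), // fold values within a partition
  (a, b) => (a._1 + b._1, a._2 + b._2)) // merge per-partition results
val mean = sum.toDouble / count
// Unlike an accumulator, the result is produced by the action itself,
// so re-executed tasks cannot double-count.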
Re: Building Spark with hive does not work
Hey Hao,

Which commit are you using? Just tried 64c6b9b with exactly the same command line flags, couldn't reproduce this issue.

Cheng

On 11/17/14 10:02 PM, Hao Ren wrote:
> Hi, I am building Spark on the most recent master branch. ... Any idea? Hao
Re: RDD.aggregate versus accumulables...
We use Algebird for calculating things like min/max, stddev, variance, etc. https://github.com/twitter/algebird/wiki

-Suren

On Mon, Nov 17, 2014 at 11:32 AM, Daniel Siegmann daniel.siegm...@velos.io wrote:
> You should *never* use accumulators for this purpose because you may get incorrect answers. ...

--
SUREN HIRAMAN, VP TECHNOLOGY
Velos
Accelerating Machine Learning
440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001
O: (917) 525-2466 ext. 105
F: 646.349.4063
E: suren.hira...@velos.io
W: www.velos.io
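As a hedged sketch of that approach: Algebird's Moments monoid folds count/mean/variance in one pass over an RDD (names per the Algebird wiki; treat this as an approximation, not a definitive recipe):

import com.twitter.algebird.{Moments, MomentsGroup}

val nums = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
// Lift each value into a Moments and combine with the group's plus:
val moments = nums.map(Moments(_)).reduce(MomentsGroup.plus)
println((moments.count, moments.mean, moments.variance))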
How to broadcast a textFile?
I have a 1-million-row file that I'd like to read from my edge node, and then send a copy of it to each Hadoop machine's memory in order to run JOINs in my Spark Streaming code. I see examples in the docs of how to use broadcast() for a simple array, but how about when the data is in a textFile?
Re: RandomGenerator class not found exception
As you are using sbt, you need not put it in ~/.m2/repository for Maven. Include the jar explicitly using the --driver-class-path option while submitting the jar to the Spark cluster.

On Mon, Nov 17, 2014 at 7:41 PM, Ritesh Kumar Singh [via Apache Spark User List] wrote:

It's still not working; I keep getting the same error. I even deleted the commons-math3/* folder containing the jar, then made a folder called 'math3' under the directory org/apache/commons/ and put commons-math3-3.3.jar in it. Still it's not working. I also tried sc.addJar("/path/to/jar") within the spark-shell and in my project source file; it still didn't import the jar in either location. Any fixes? Please help.

On Mon, Nov 17, 2014 at 2:14 PM, Chitturi Padma wrote:
> Include the commons-math3/3.3 in class path while submitting jar to spark cluster. ...
Re: Building Spark with hive does not work
Sorry for spamming. Just after my previous post, I noticed that the command I used was:

./sbt/sbt -Phive -Phive-thirftserver clean assembly/assembly

It should be -Phive-thriftserver; the typo is the culprit. Silly me. I just copy-pasted it from somewhere else without checking, and since no error message such as "no such option" was displayed, I assumed the flags were correct. Sorry for the carelessness.

Hao
Re: Building Spark with hive does not work
Looks like this was where you got that command line: http://search-hadoop.com/m/JW1q5RlPrl

Cheers

On Mon, Nov 17, 2014 at 9:44 AM, Hao Ren inv...@gmail.com wrote:
> Just after my previous post, I noticed that the command used was ./sbt/sbt -Phive -Phive-thirftserver clean assembly/assembly ... the typo is the culprit. ...
How can I apply such an inner join in Spark Scala/Python
So let us say I have RDDs A and B with the following values:

A = [ (1, 2), (2, 4), (3, 6) ]
B = [ (1, 3), (2, 5), (3, 6), (4, 5), (5, 6) ]

I want to apply an inner join, such that I get the following as a result:

C = [ (1, (2, 3)), (2, (4, 5)), (3, (6, 6)) ]

That is, those keys which are not present in A should disappear after the inner join. How can I achieve that? I can see outer-join functions but no inner-join function in the Spark RDD class.
Exception in spark sql when running a group by query
While testing Spark SQL, we ran this GROUP BY with expression query and got an exception. The same query worked fine in Hive.

SELECT from_unixtime(floor(xyz.whenrequestreceived/1000.0 - 25200), 'yyyy/MM/dd') as pst_date,
       count(*) as num_xyzs
FROM all_matched_abc
GROUP BY from_unixtime(floor(xyz.whenrequestreceived/1000.0 - 25200), 'yyyy/MM/dd')

14/11/17 17:41:46 ERROR thriftserver.SparkSQLDriver: Failed in [the query above]
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Expression not in GROUP BY: HiveSimpleUdf#org.apache.hadoop.hive.ql.udf.UDFFromUnixTime(HiveGenericUdf#org.apache.hadoop.hive.ql.udf.generic.GenericUDFFloor(((CAST(xyz#183.whenrequestreceived AS whenrequestreceived#187L, DoubleType) / 1000.0) - CAST(25200, DoubleType))),yyyy/MM/dd) AS pst_date#179, tree:
Aggregate [HiveSimpleUdf#org.apache.hadoop.hive.ql.udf.UDFFromUnixTime(HiveGenericUdf#org.apache.hadoop.hive.ql.udf.generic.GenericUDFFloor(((CAST(xyz#183.whenrequestreceived, DoubleType) / 1000.0) - CAST(25200, DoubleType))),yyyy/MM/dd)], [HiveSimpleUdf#org.apache.hadoop.hive.ql.udf.UDFFromUnixTime(HiveGenericUdf#org.apache.hadoop.hive.ql.udf.generic.GenericUDFFloor(((CAST(xyz#183.whenrequestreceived AS whenrequestreceived#187L, DoubleType) / 1000.0) - CAST(25200, DoubleType))),yyyy/MM/dd) AS pst_date#179,COUNT(1) AS num_xyzs#180L]
 MetastoreRelation default, all_matched_abc, None

        at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckAggregation$$anonfun$apply$3$$anonfun$applyOrElse$6.apply(Analyzer.scala:127)
        at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckAggregation$$anonfun$apply$3$$anonfun$applyOrElse$6.apply(Analyzer.scala:125)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckAggregation$$anonfun$apply$3.applyOrElse(Analyzer.scala:125)
        at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckAggregation$$anonfun$apply$3.applyOrElse(Analyzer.scala:115)
        at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144)
        at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:135)
        at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckAggregation$.apply(Analyzer.scala:115)
        at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckAggregation$.apply(Analyzer.scala:113)
        at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
        at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
        at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:51)
        at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:60)
        at scala.collection.mutable.WrappedArray.foldLeft(WrappedArray.scala:34)
        at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
        at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
        at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:411)
        at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:411)
        at org.apache.spark.sql.SQLContext$QueryExecution.withCachedData$lzycompute(SQLContext.scala:412)
        at org.apache.spark.sql.SQLContext$QueryExecution.withCachedData(SQLContext.scala:412)
        at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan$lzycompute(SQLContext.scala:413)
        at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan(SQLContext.scala:413)
        at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:418)
        at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:416)
        at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:422)
        at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:422)
        at org.apache.spark.sql.hive.HiveContext$QueryExecution.stringResult(HiveContext.scala:425)
        at org.apache.spark.sql.hive.thriftserver.AbstractSparkSQLDriver.run(AbstractSparkSQLDriver.scala:59)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:276)
        at ...
How do I get the executor ID from running Java code
The Spark UI lists a number of executor IDs on the cluster. I would like to access both the executor ID and the task/attempt IDs from code inside a function running on a slave machine. My current motivation is to examine parallelism and locality, but in Hadoop this kind of information also lets code write non-overlapping temporary files.
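For what it's worth, a sketch of one way to surface this from inside a task. Note these APIs are assumptions for this thread's era: TaskContext.get() and attemptNumber arrived in later releases, and SparkEnv is a developer API, so verify against your Spark version:

val annotated = rdd.mapPartitions { iter =>
  // Executor ID of the JVM running this partition:
  val execId = org.apache.spark.SparkEnv.get.executorId
  val ctx = org.apache.spark.TaskContext.get()
  val tag = (execId, ctx.partitionId(), ctx.attemptNumber())
  iter.map(x => (tag, x))
}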
Spark streaming on Yarn
Hi, I have been using Spark Streaming in standalone mode and now I want to migrate to Spark running on YARN, but I am not sure how you would go about designating a specific node in the cluster to act as the Avro listener, since I am using the Flume-based push approach with Spark.
Re: How to broadcast a textFile?
OK, then I'd still need to write the code (within my Spark Streaming code, I'm guessing) to convert my text file into an object like a HashMap before broadcasting. How can I make sure only the HashMap is being broadcast, while all the pre-processing to create the HashMap is performed only once?
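A minimal sketch of the pattern (the file path and record format are illustrative): the file is read and the map built once on the driver, and only the resulting map is broadcast; executors then access it via .value.

val lookup: Map[String, String] =
  scala.io.Source.fromFile("/path/on/edge/node/data.txt").getLines().map { line =>
    val Array(k, v) = line.split("\t", 2)
    k -> v
  }.toMap                            // built once, on the driver
val lookupBc = sc.broadcast(lookup)  // only the map is shipped

val events = sc.textFile("hdfs:///events")
val joined = events.map(key => (key, lookupBc.value.get(key)))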
RE: RDD.aggregate versus accumulables...
Thanks for the link to the bug. Unfortunately, using accumulators like this is getting spread around as a recommended practice despite the bug.

From: Daniel Siegmann [mailto:daniel.siegm...@velos.io]
Sent: Monday, November 17, 2014 8:32 AM
To: Segerlind, Nathan L
Cc: user
Subject: Re: RDD.aggregate versus accumulables...

> You should never use accumulators for this purpose because you may get incorrect answers. ...
Re: How can I apply such an inner join in Spark Scala/Python
Just RDD.join() should be an inner join.

On Mon, Nov 17, 2014 at 5:51 PM, Blind Faith person.of.b...@gmail.com wrote:
> So let us say I have RDDs A and B with the following values. ... How can I achieve that?
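A minimal sketch with the question's data:

val a = sc.parallelize(Seq(1 -> 2, 2 -> 4, 3 -> 6))
val b = sc.parallelize(Seq(1 -> 3, 2 -> 5, 3 -> 6, 4 -> 5, 5 -> 6))
// join on pair RDDs is an inner join; keys present in only one side drop out
val c = a.join(b)
c.collect().foreach(println) // (1,(2,3)), (2,(4,5)), (3,(6,6))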
Re: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
"only option is to split your problem further by increasing parallelism"

My understanding is that this means increasing the number of partitions; is that right? That didn't seem to help, because the partitions are not uniformly sized. My observation is that when I increase the number of partitions, many empty partitions are created, and the larger partition is not broken down into smaller ones. Any hints on how I can get uniform partitions? I noticed many threads on this, but was not able to do anything effective from the Java API. I would appreciate any help/insight you can provide.
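One common approach, sketched here under the assumption that the skew comes from unevenly distributed keys: prepend a random salt and hash-partition on it, which spreads records across partitions roughly uniformly by count (Scala shown; the Java API has the same partitionBy/HashPartitioner calls):

import org.apache.spark.HashPartitioner
import scala.util.Random

val records = sc.textFile("hdfs:///input")  // illustrative source
val balanced = records
  .map(r => (Random.nextInt(64), r))        // random salt as the key
  .partitionBy(new HashPartitioner(64))     // roughly uniform partition sizes
  .values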
Re: Missing SparkSQLCLIDriver and Beeline drivers in Spark
Minor correction: there was a typo in the command line. hive-thirftserver should be hive-thriftserver.

Cheers

On Thu, Aug 7, 2014 at 6:49 PM, Cheng Lian lian.cs@gmail.com wrote:

Things have changed a bit in the master branch, and the SQL programming guide in the master branch actually doesn't apply to branch-1.0-jdbc.

In branch-1.0-jdbc, the Hive Thrift server and Spark SQL CLI are included in the hive profile and are thus not enabled by default. You need to either:
- pass -Phive to Maven to enable it, or
- use SPARK_HIVE=true ./sbt/sbt assembly

In the most recent master branch, however, the Hive Thrift server and Spark SQL CLI are moved into a separate hive-thriftserver profile, and our SBT build file now delegates to Maven. So, to build the master branch, you can either:
- ./sbt/sbt -Phive-thirftserver clean assembly/assembly, or
- mvn -Phive-thriftserver clean package -DskipTests

On Fri, Aug 8, 2014 at 6:12 AM, ajatix a...@sigmoidanalytics.com wrote:

Hi, I wish to migrate from Shark to the spark-sql shell, where I am facing some difficulties in setting up. I cloned branch-1.0-jdbc to test out the spark-sql shell, but I am unable to run it after building the source. I've tried two methods for building (with Hadoop 1.0.4): sbt/sbt assembly, and mvn -DskipTests clean package -X. Both build successfully, but when I run bin/spark-sql, I get the following error:

Exception in thread "main" java.lang.ClassNotFoundException: org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:270)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:311)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:73)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

and when I run bin/beeline, I get this:

Error: Could not find or load main class org.apache.hive.beeline.BeeLine

bin/spark-shell works fine. Is there something else I have to add to the build parameters? According to https://github.com/apache/spark/blob/master/docs/sql-programming-guide.md, I tried rebuilding with -Phive-thriftserver, but it failed to detect the library while building.

Thanks and Regards
Ajay
IOException: exception in uploadSinglePart
Spark 1.1.0, running on an AWS EMR cluster using yarn-client as master. I'm getting the following error when I attempt to save an RDD to S3. I've narrowed it down to a single partition that is ~150 MB in size (versus the other partitions, which are closer to 1 MB). I am able to work around this by saving to an HDFS file first, then using the hadoop distcp command to get the data up to S3, but it seems that Spark should be able to handle this. I've even tried writing to HDFS, then creating something like the following to get it up on S3:

sc.textFile(hdfsPath).saveAsTextFile(s3Path)

Here's the exception:

java.io.IOException: exception in uploadSinglePart
        com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream.uploadSinglePart(MultipartUploadOutputStream.java:164)
        com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream.close(MultipartUploadOutputStream.java:220)
        org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
        org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:105)
        org.apache.hadoop.io.compress.CompressorStream.close(CompressorStream.java:106)
        java.io.FilterOutputStream.close(FilterOutputStream.java:160)
        org.apache.hadoop.mapred.TextOutputFormat$LineRecordWriter.close(TextOutputFormat.java:109)
        org.apache.spark.SparkHadoopWriter.close(SparkHadoopWriter.scala:101)
        org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:994)
        org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:979)
        org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
        org.apache.spark.scheduler.Task.run(Task.scala:54)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
        at akka.actor.ActorCell.invoke(ActorCell.scala:456)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
        at akka.dispatch.Mailbox.run(Mailbox.scala:219)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Any ideas of what the underlying issue could be here? It feels like a timeout issue, but that's just a guess.

Thanks, Justin
Re: Spark streaming cannot receive any message from Kafka
Hi all, I found the reason for this issue. It seems that in the new version, if I do not specify spark.default.parallelism when using KafkaUtils.createStream, there will be an exception at the Kafka stream creation stage. In previous versions, Spark would use a default value.

Thanks! Bill

On Thu, Nov 13, 2014 at 5:00 AM, Helena Edelson helena.edel...@datastax.com wrote:

I encounter no issues with streaming from Kafka to Spark in 1.1.0. Do you perhaps have a version conflict?

Helena

On Nov 13, 2014 12:55 AM, Jay Vyas jayunit100.apa...@gmail.com wrote:

Yup, it's very important that n > 1 for Spark Streaming jobs; if local, use local[2]. The thing to remember is that your Spark receiver will take a thread to itself to produce data, so you need another thread to consume it. In a cluster manager like YARN or Mesos, the word "thread" is not used anymore; I guess it has a different meaning there: you need 2 or more free compute slots, and that should be guaranteed by looking to see how many free node managers are running, etc.

On Nov 12, 2014, at 7:53 PM, Shao, Saisai saisai.s...@intel.com wrote:

Did you configure the Spark master as local? It should be local[n], n > 1, for local mode. Besides, there's a Kafka word count example in the Spark Streaming examples; you can try that. I've tested with the latest master, it's OK.

Thanks
Jerry

From: Tobias Pfeiffer [mailto:t...@preferred.jp]
Sent: Thursday, November 13, 2014 8:45 AM
To: Bill Jay
Cc: u...@spark.incubator.apache.org
Subject: Re: Spark streaming cannot receive any message from Kafka

Bill,

> However, when I am currently using Spark 1.1.0, the Spark Streaming job cannot receive any messages from Kafka. I have not made any change to the code.

Do you see any suspicious messages in the log output?

Tobias
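For reference, a minimal sketch of the receiver-based API under discussion (connection details and topic names are illustrative); note the master is local[2] or more so the receiver does not starve the processing thread:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val conf = new SparkConf().setMaster("local[2]").setAppName("kafka-example")
val ssc = new StreamingContext(conf, Seconds(10))
// (ZooKeeper quorum, consumer group, map of topic -> receiver threads)
val messages = KafkaUtils.createStream(ssc, "zkhost:2181", "my-group", Map("my-topic" -> 1))
messages.map(_._2).count().print()
ssc.start()
ssc.awaitTermination()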
independent user sessions with a multi-user spark sql thriftserver (Spark 1.1)
Hello,

We're running a Spark SQL thriftserver that several users connect to with beeline. One limitation we've run into is that the current working database (set with "use db") is shared across all connections, so changing the database on one connection changes the database for all connections. This is also the case for Spark SQL settings, but that's less of an issue. Is there a way (or a hack) to make the current database selection independent for each beeline connection? I'm not afraid to hack into the source code if there's a straightforward fix/workaround, but I could use some guidance on what to hack on if that's required.

Thank you!

Michael
RE: RDD.aggregate versus accumulables...
I have been playing with using accumulators (despite the possible error with multiple attempts). These provide a convenient way to get some numbers while still performing business logic. I posted some sample code at http://lordjoesoftware.blogspot.com/.

Even if accumulators are not perfect today, future versions may improve them, and they are a great way to monitor execution and get a sense of performance on lazily-executed systems.
Re: Assigning input files to spark partitions
Hi Daniel,

Yes, that should work also. However, is it possible to set things up so that each RDD has exactly one partition, without repartitioning (and thus incurring extra cost)? Is there a mechanism similar to MR where we can ensure each partition is assigned some amount of data by size, by setting some block-size parameter?

On Thu, Nov 13, 2014 at 1:05 PM, Daniel Siegmann daniel.siegm...@velos.io wrote:

On Thu, Nov 13, 2014 at 3:24 PM, Pala M Muthaia mchett...@rocketfuelinc.com wrote:

> No, I don't want separate RDDs, because each of these partitions is processed the same way (in my case, each partition corresponds to HBase keys belonging to one region server, and I will do HBase lookups). After that I have aggregations too, hence all these partitions should be in the same RDD. The reason to follow the partition structure is to limit concurrent HBase lookups targeting a single region server.

Neither of these is necessarily a barrier to using separate RDDs. You can define the function you want to use and then pass it to multiple map methods. Then you could union all the RDDs to do your aggregations. For example, it might look something like this:

val paths: Seq[String] = ... // the paths to the files you want to load
def myFunc(t: T) = ...       // the function to apply to every RDD
val rdds = paths.map { path => sc.textFile(path).map(myFunc) }
val completeRdd = sc.union(rdds)

Does that make any sense?

--
Daniel Siegmann, Software Developer
Velos
Accelerating Machine Learning
54 W 40th St, New York, NY 10018
E: daniel.siegm...@velos.io W: www.velos.io
Re: Assigning input files to spark partitions
I'm not aware of any such mechanism. On Mon, Nov 17, 2014 at 2:55 PM, Pala M Muthaia mchett...@rocketfuelinc.com wrote: Hi Daniel, Yes that should work also. However, is it possible to set up so that each RDD has exactly one partition, without repartitioning (and thus incurring extra cost)? Is there a mechanism similar to MR where we can ensure each partition is assigned some amount of data by size, by setting some block size parameter? On Thu, Nov 13, 2014 at 1:05 PM, Daniel Siegmann daniel.siegm...@velos.io wrote: On Thu, Nov 13, 2014 at 3:24 PM, Pala M Muthaia mchett...@rocketfuelinc.com wrote: No i don't want separate RDD because each of these partitions are being processed the same way (in my case, each partition corresponds to HBase keys belonging to one region server, and i will do HBase lookups). After that i have aggregations too, hence all these partitions should be in the same RDD. The reason to follow the partition structure is to limit concurrent HBase lookups targeting a single region server. Neither of these is necessarily a barrier to using separate RDDs. You can define the function you want to use and then pass it to multiple map methods. Then you could union all the RDDs to do your aggregations. For example, it might look something like this: val paths: Seq[String] = ... // the paths to the files you want to load def myFunc(t: T) = ... // the function to apply to every RDD val rdds = paths.map { path => sc.textFile(path).map(myFunc) } val completeRdd = sc.union(rdds) Does that make any sense? -- Daniel Siegmann, Software Developer Velos Accelerating Machine Learning 54 W 40th St, New York, NY 10018 E: daniel.siegm...@velos.io W: www.velos.io -- Daniel Siegmann, Software Developer Velos Accelerating Machine Learning 54 W 40th St, New York, NY 10018 E: daniel.siegm...@velos.io W: www.velos.io
Re: independent user sessions with a multi-user spark sql thriftserver (Spark 1.1)
This is an unfortunate/known issue that we are hoping to address in the next release: https://issues.apache.org/jira/browse/SPARK-2087 I'm not sure how straightforward a fix would be, but it would involve keeping / setting the SessionState for each connection to the server. It would be great if you could share any findings on that JIRA. Thanks! On Mon, Nov 17, 2014 at 11:01 AM, Michael Allman mich...@videoamp.com wrote: Hello, We're running a spark sql thriftserver that several users connect to with beeline. One limitation we've run into is that the current working database (set with use db) is shared across all connections. So changing the database on one connection changes the database for all connections. This is also the case for spark sql settings, but that's less of an issue. Is there a way (or a hack) to make the current database selection independent for each beeline connection? I'm not afraid to hack into the source code if there's a straightforward fix/workaround, but I could use some guidance on what to hack on if that's required. Thank you! Michael
Re: Exception in spark sql when running a group by query
You are perhaps hitting an issue that was fixed by #3248 https://github.com/apache/spark/pull/3248? On Mon, Nov 17, 2014 at 9:58 AM, Sadhan Sood sadhan.s...@gmail.com wrote: While testing Spark SQL, we ran this group-by-with-expression query and got an exception; the same query works fine in Hive. SELECT from_unixtime(floor(xyz.whenrequestreceived/1000.0 - 25200), '/MM/dd') as pst_date, count(*) as num_xyzs FROM all_matched_abc GROUP BY from_unixtime(floor(xyz.whenrequestreceived/1000.0 - 25200), '/MM/dd') 14/11/17 17:41:46 ERROR thriftserver.SparkSQLDriver: Failed in [SELECT from_unixtime(floor(xyz.whenrequestreceived/1000.0 - 25200), '/MM/dd') as pst_date, count(*) as num_xyzs FROM all_matched_abc GROUP BY from_unixtime(floor(xyz.whenrequestreceived/1000.0 - 25200), '/MM/dd') ] org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Expression not in GROUP BY: HiveSimpleUdf#org.apache.hadoop.hive.ql.udf.UDFFromUnixTime(HiveGenericUdf#org.apache.hadoop.hive.ql.udf.generic.GenericUDFFloor(((CAST(xyz#183.whenrequestreceived AS whenrequestreceived#187L, DoubleType) / 1000.0) - CAST(25200, DoubleType))),/MM/dd) AS pst_date#179, tree: Aggregate [HiveSimpleUdf#org.apache.hadoop.hive.ql.udf.UDFFromUnixTime(HiveGenericUdf#org.apache.hadoop.hive.ql.udf.generic.GenericUDFFloor(((CAST(xyz#183.whenrequestreceived, DoubleType) / 1000.0) - CAST(25200, DoubleType))),/MM/dd)], [HiveSimpleUdf#org.apache.hadoop.hive.ql.udf.UDFFromUnixTime(HiveGenericUdf#org.apache.hadoop.hive.ql.udf.generic.GenericUDFFloor(((CAST(xyz#183.whenrequestreceived AS whenrequestreceived#187L, DoubleType) / 1000.0) - CAST(25200, DoubleType))),/MM/dd) AS pst_date#179,COUNT(1) AS num_xyzs#180L] MetastoreRelation default, all_matched_abc, None at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckAggregation$$anonfun$apply$3$$anonfun$applyOrElse$6.apply(Analyzer.scala:127) at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckAggregation$$anonfun$apply$3$$anonfun$applyOrElse$6.apply(Analyzer.scala:125) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckAggregation$$anonfun$apply$3.applyOrElse(Analyzer.scala:125) at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckAggregation$$anonfun$apply$3.applyOrElse(Analyzer.scala:115) at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144) at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:135) at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckAggregation$.apply(Analyzer.scala:115) at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckAggregation$.apply(Analyzer.scala:113) at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61) at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59) at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:51) at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:60) at scala.collection.mutable.WrappedArray.foldLeft(WrappedArray.scala:34) at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59) at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51) at scala.collection.immutable.List.foreach(List.scala:318) at
org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51) at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:411) at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:411) at org.apache.spark.sql.SQLContext$QueryExecution.withCachedData$lzycompute(SQLContext.scala:412) at org.apache.spark.sql.SQLContext$QueryExecution.withCachedData(SQLContext.scala:412) at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan$lzycompute(SQLContext.scala:413) at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan(SQLContext.scala:413) at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:418) at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:416) at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:422) at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:422) at
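If picking up that patch is not an option, one possible workaround (a sketch, not a confirmed fix) is to compute the expression once in a subquery so the outer GROUP BY references a plain column. This assumes a HiveContext named hiveContext, and the 'yyyy/MM/dd' format string is an assumption, since it appears truncated in the archived message:

val result = hiveContext.sql("""
  SELECT pst_date, COUNT(*) AS num_xyzs
  FROM (
    SELECT from_unixtime(floor(xyz.whenrequestreceived / 1000.0 - 25200),
                         'yyyy/MM/dd') AS pst_date
    FROM all_matched_abc
  ) t
  GROUP BY pst_date
""")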
Re: SparkSQL exception on spark.sql.codegen
What version of Spark SQL? On Sat, Nov 15, 2014 at 10:25 PM, Eric Zhen zhpeng...@gmail.com wrote: Hi all, We ran SparkSQL on TPCDS benchmark Q19 with spark.sql.codegen=true and got the exceptions below; has anyone else seen these before? java.lang.ExceptionInInitializerError at org.apache.spark.sql.execution.SparkPlan.newProjection(SparkPlan.scala:92) at org.apache.spark.sql.execution.Exchange$$anonfun$execute$1$$anonfun$1.apply(Exchange.scala:51) at org.apache.spark.sql.execution.Exchange$$anonfun$execute$1$$anonfun$1.apply(Exchange.scala:48) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:54) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.NullPointerException at scala.reflect.internal.Types$TypeRef.computeHashCode(Types.scala:2358) at scala.reflect.internal.Types$UniqueType.init(Types.scala:1304) at scala.reflect.internal.Types$TypeRef.init(Types.scala:2341) at scala.reflect.internal.Types$NoArgsTypeRef.init(Types.scala:2137) at scala.reflect.internal.Types$TypeRef$$anon$6.init(Types.scala:2544) at scala.reflect.internal.Types$TypeRef$.apply(Types.scala:2544) at scala.reflect.internal.Types$class.typeRef(Types.scala:3615) at scala.reflect.internal.SymbolTable.typeRef(SymbolTable.scala:13) at scala.reflect.internal.Symbols$TypeSymbol.newTypeRef(Symbols.scala:2752) at scala.reflect.internal.Symbols$TypeSymbol.typeConstructor(Symbols.scala:2806) at scala.reflect.internal.Symbols$SymbolContextApiImpl.toTypeConstructor(Symbols.scala:103) at scala.reflect.internal.Symbols$TypeSymbol.toTypeConstructor(Symbols.scala:2698) at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$typecreator1$1.apply(CodeGenerator.scala:46) at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:231) at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:231) at scala.reflect.api.TypeTags$class.typeOf(TypeTags.scala:335) at scala.reflect.api.Universe.typeOf(Universe.scala:59) at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.init(CodeGenerator.scala:46) at org.apache.spark.sql.catalyst.expressions.codegen.GenerateProjection$.init(GenerateProjection.scala:29) at org.apache.spark.sql.catalyst.expressions.codegen.GenerateProjection$.clinit(GenerateProjection.scala) ... 15 more -- Best Regards
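For anyone trying to reproduce this: codegen is off by default in 1.1 and is toggled per context; a minimal sketch, assuming an existing SQLContext (or HiveContext) named sqlContext:

sqlContext.setConf("spark.sql.codegen", "true")              // enable runtime code generation
println(sqlContext.getConf("spark.sql.codegen", "false"))    // verify the setting took effect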
RDD Blocks skewing to just few executors
Hi I'm running a standalone cluster with 8 worker servers. I'm developing a streaming app that is adding new lines of text to several different RDDs each batch interval. Each line has a well randomized unique identifier that I'm trying to use for partitioning, since the data stream does contain duplicate lines. I'm doing the partitioning with this: val eventsByKey = streamRDD.map { event => (getUID(event), event) } val partionedEventsRdd = sparkContext.parallelize(eventsByKey.toSeq).partitionBy(new HashPartitioner(numPartions)).map(e => e._2) I'm adding to the existing RDD with this: val mergedRDD = currentRDD.zipPartitions(partionedEventsRdd, true) { (currentIter, batchIter) => val uniqEvents = ListBuffer[String]() val uids = scala.collection.mutable.Map[String, Boolean]() Array(currentIter, batchIter).foreach { iter => iter.foreach { event => val uid = getUID(event) if (!uids.contains(uid)) { uids(uid) = true; uniqEvents += event } } } uniqEvents.iterator } val count = mergedRDD.count The reason I'm doing it this way is that when I was doing: val mergedRDD = currentRDD.union(batchRDD).coalesce(numPartions).distinct val count = mergedRDD.count it started taking a long time and doing a lot of shuffles. The zipPartitions approach does perform better, though after running an hour or so I start seeing this in the webUI. http://apache-spark-user-list.1001560.n3.nabble.com/file/n19112/Executors.png As you can see, most of the data is skewing to just 2 executors, with 1 getting more than half the blocks. These become a hotspot and eventually I start seeing OOM errors. I've tried this half a dozen times and the 'hot' executors change, but not the skewing behavior. Any idea what is going on here? Thanks, Mike
RE: Spark streaming cannot receive any message from Kafka
Hi Bill, Would you mind describing what you found a little more specifically? I'm not sure there's a parameter in KafkaUtils.createStream where you can specify the Spark parallelism; also, what are the exception stacks? Thanks Jerry From: Bill Jay [mailto:bill.jaypeter...@gmail.com] Sent: Tuesday, November 18, 2014 2:47 AM To: Helena Edelson Cc: Jay Vyas; u...@spark.incubator.apache.org; Tobias Pfeiffer; Shao, Saisai Subject: Re: Spark streaming cannot receive any message from Kafka Hi all, I find the reason of this issue. It seems in the new version, if I do not specify spark.default.parallelism in KafkaUtils.createstream, there will be an exception since the kakfa stream creation stage. In the previous versions, it seems Spark will use the default value. Thanks! Bill On Thu, Nov 13, 2014 at 5:00 AM, Helena Edelson helena.edel...@datastax.com wrote: I encounter no issues with streaming from kafka to spark in 1.1.0. Do you perhaps have a version conflict? Helena On Nov 13, 2014 12:55 AM, Jay Vyas jayunit100.apa...@gmail.com wrote: Yup, very important that n > 1 for spark streaming jobs. If local, use local[2]. The thing to remember is that your spark receiver will take a thread to itself and produce data, so you need another thread to consume it. In a cluster manager like yarn or mesos, the word thread is not used anymore, I guess has different meaning: you need 2 or more free compute slots, and that should be guaranteed by looking to see how many free node managers are running etc. On Nov 12, 2014, at 7:53 PM, Shao, Saisai saisai.s...@intel.com wrote: Did you configure Spark master as local? It should be local[n], n > 1, for local mode. Beside there’s a Kafka wordcount example in Spark Streaming example, you can try that. I’ve tested with latest master, it’s OK. Thanks Jerry From: Tobias Pfeiffer [mailto:t...@preferred.jp] Sent: Thursday, November 13, 2014 8:45 AM To: Bill Jay Cc: u...@spark.incubator.apache.org Subject: Re: Spark streaming cannot receive any message from Kafka Bill, However, when I am currently using Spark 1.1.0, the Spark streaming job cannot receive any messages from Kafka. I have not made any change to the code. Do you see any suspicious messages in the log output? Tobias
Re: Using data in RDD to specify HDFS directory to write to
Yes, thank you for the suggestion. The error I found below was in the worker logs. AssociationError [akka.tcp://sparkwor...@cloudera01.local.company.com:7078] - [akka.tcp://sparkexecu...@cloudera01.local.company.com:33329]: Error [Association failed with [akka.tcp://sparkexecu...@cloudera01.local.company.com:33329]] [ akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkexecu...@cloudera01.local.company.com:33329] Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: cloudera01.local.company.com/10.40.19.67:33329 ] I looked into suggestions for this type of error; before I found out the real cause, I upgraded my CDH to 5.2 so I could try setting the driver and executor ports rather than have Spark choose them at random. My boss later turned off iptables and I no longer get that error. I do get a different one, however. I have gone back into my project and changed my Hadoop version to 2.5.0-cdh5.2.0, so that should not be a problem. From the master logs: 2014-11-17 18:09:49,707 ERROR akka.remote.EndpointWriter: AssociationError [akka.tcp://sparkmas...@cloudera01.local.local.com:7077] - [akka.tcp://spark@localhost:38181]: Error [Association failed with [akka.tcp://spark@localhost:38181]] [ akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@localhost:38181] Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: localhost/127.0.0.1:38181 ] 2014-11-17 18:19:08,271 INFO akka.actor.LocalActorRef: Message [akka.remote.transport.AssociationHandle$Disassociated] from Actor[akka://sparkMaster/deadLetters] to Actor[akka://sparkMaster/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkMaster%4010.40.19.67%3A37795-29#-1248895472] was not delivered. [30] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
2014-11-17 18:19:28,251 ERROR Remoting: org.apache.spark.deploy.ApplicationDescription; local class incompatible: stream classdesc serialVersionUID = 583745679236071411, local class serialVersionUID = 7674242335164700840 java.io.InvalidClassException: org.apache.spark.deploy.ApplicationDescription; local class incompatible: stream classdesc serialVersionUID = 583745679236071411, local class serialVersionUID = 7674242335164700840 at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:617) at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622) at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) at akka.serialization.JavaSerializer$$anonfun$1.apply(Serializer.scala:136) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57) at akka.serialization.JavaSerializer.fromBinary(Serializer.scala:136) at akka.serialization.Serialization$$anonfun$deserialize$1.apply(Serialization.scala:104) at scala.util.Try$.apply(Try.scala:161) at akka.serialization.Serialization.deserialize(Serialization.scala:98) at akka.remote.serialization.MessageContainerSerializer.fromBinary(MessageContainerSerializer.scala:58) at akka.serialization.Serialization$$anonfun$deserialize$1.apply(Serialization.scala:104) at scala.util.Try$.apply(Try.scala:161) at akka.serialization.Serialization.deserialize(Serialization.scala:98) at akka.remote.MessageSerializer$.deserialize(MessageSerializer.scala:23) at akka.remote.DefaultMessageDispatcher.payload$lzycompute$1(Endpoint.scala:55) at akka.remote.DefaultMessageDispatcher.payload$1(Endpoint.scala:55) at akka.remote.DefaultMessageDispatcher.dispatch(Endpoint.scala:73) at akka.remote.EndpointReader$$anonfun$receive$2.applyOrElse(Endpoint.scala:764) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at
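The serialVersionUID mismatch on ApplicationDescription above is the classic symptom of the application being compiled against a different Spark version than the one the cluster is running. As a hedged sketch (not a confirmed diagnosis of this particular cluster), the sbt dependencies could be pinned to the cluster's versions and marked provided so the assembly does not ship a second, conflicting copy; the version strings here are placeholders to be matched against your install:

// build.sbt sketch -- match these versions to what the cluster actually runs
resolvers += "cloudera" at "https://repository.cloudera.com/artifactory/cloudera-repos/"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.1.0" % "provided",
  "org.apache.hadoop" % "hadoop-client" % "2.5.0-cdh5.2.0" % "provided"
)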
Re: Communication between Driver and Executors
Hi, so I didn't manage to get the Broadcast variable with a new value distributed to my executors in YARN mode. In local mode it worked fine, but when running on YARN either nothing happened (when unpersist() was called on the driver) or I got a TimeoutException (when called on the executor). I finally dropped the use of broadcast variables and added a HTTP polling mechanism from the executors to the driver. I find that a bit suboptimal, in particular since there is this whole Akka infrastructure already running and I should be able to just send messages around. However, Spark does not seem to encourage this. (In general I find that private is a bit overused in the Spark codebase...) Thanks Tobias
Re: Is there setup and cleanup function in spark?
Hi, On Fri, Nov 14, 2014 at 2:49 PM, Jianshi Huang jianshi.hu...@gmail.com wrote: Ok, then we need another trick. Let's have an implicit lazy var connection/context around our code. And setup() will trigger the eval and initialization. Due to lazy evaluation, I think having setup/teardown is a bit tricky. In particular teardown, because it is not easy to execute code after all computation is done. You can check http://apache-spark-user-list.1001560.n3.nabble.com/Keep-state-inside-map-function-tp10968p11009.html for an example of what worked for me. Tobias
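For reference, the pattern discussed in that linked thread amounts to doing setup and teardown inside mapPartitions; a minimal sketch, where rdd, createConnection and process are hypothetical placeholders, with the caveat that an empty partition never reaches the teardown branch:

rdd.mapPartitions { iter =>
  val conn = createConnection()        // setup, once per partition
  iter.map { x =>
    val out = process(conn, x)         // per-record business logic
    if (!iter.hasNext) conn.close()    // teardown after the last element
    out
  }
}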
Re: SparkSQL exception on spark.sql.codegen
Hi Michael, We use Spark v1.1.1-rc1 with jdk 1.7.0_51 and scala 2.10.4. On Tue, Nov 18, 2014 at 7:09 AM, Michael Armbrust mich...@databricks.com wrote: What version of Spark SQL? On Sat, Nov 15, 2014 at 10:25 PM, Eric Zhen zhpeng...@gmail.com wrote: Hi all, We run SparkSQL on TPCDS benchmark Q19 with spark.sql.codegen=true, we got exceptions as below, has anyone else saw these before? java.lang.ExceptionInInitializerError at org.apache.spark.sql.execution.SparkPlan.newProjection(SparkPlan.scala:92) at org.apache.spark.sql.execution.Exchange$$anonfun$execute$1$$anonfun$1.apply(Exchange.scala:51) at org.apache.spark.sql.execution.Exchange$$anonfun$execute$1$$anonfun$1.apply(Exchange.scala:48) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:54) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.NullPointerException at scala.reflect.internal.Types$TypeRef.computeHashCode(Types.scala:2358) at scala.reflect.internal.Types$UniqueType.init(Types.scala:1304) at scala.reflect.internal.Types$TypeRef.init(Types.scala:2341) at scala.reflect.internal.Types$NoArgsTypeRef.init(Types.scala:2137) at scala.reflect.internal.Types$TypeRef$$anon$6.init(Types.scala:2544) at scala.reflect.internal.Types$TypeRef$.apply(Types.scala:2544) at scala.reflect.internal.Types$class.typeRef(Types.scala:3615) at scala.reflect.internal.SymbolTable.typeRef(SymbolTable.scala:13) at scala.reflect.internal.Symbols$TypeSymbol.newTypeRef(Symbols.scala:2752) at scala.reflect.internal.Symbols$TypeSymbol.typeConstructor(Symbols.scala:2806) at scala.reflect.internal.Symbols$SymbolContextApiImpl.toTypeConstructor(Symbols.scala:103) at scala.reflect.internal.Symbols$TypeSymbol.toTypeConstructor(Symbols.scala:2698) at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$typecreator1$1.apply(CodeGenerator.scala:46) at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:231) at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:231) at scala.reflect.api.TypeTags$class.typeOf(TypeTags.scala:335) at scala.reflect.api.Universe.typeOf(Universe.scala:59) at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.init(CodeGenerator.scala:46) at org.apache.spark.sql.catalyst.expressions.codegen.GenerateProjection$.init(GenerateProjection.scala:29) at org.apache.spark.sql.catalyst.expressions.codegen.GenerateProjection$.clinit(GenerateProjection.scala) ... 15 more -- Best Regards -- Best Regards
Re: Status of MLLib exporting models to PMML
Hi Charles, I am not aware of other storage formats. Perhaps Sean or Sandy can elaborate more given their experience with Oryx. There is work by Smola et al at Google that talks about large scale model update and deployment. https://www.usenix.org/conference/osdi14/technical-sessions/presentation/li_mu -Manish On Sunday, November 16, 2014, Charles Earl charles.ce...@gmail.com wrote: Manish and others, A follow up question on my mind is whether there are protobuf (or other binary format) frameworks in the vein of PMML. Perhaps scientific data storage frameworks like netcdf, root are possible also. I like the comprehensiveness of PMML but as you mention the complexity of management for large models is a concern. Cheers On Fri, Nov 14, 2014 at 1:35 AM, Manish Amde manish...@gmail.com wrote: @Aris, we are closely following the PMML work that is going on and as Xiangrui mentioned, it might be easier to migrate models such as logistic regression and then migrate trees. Some of the models get fairly large (as pointed out by Sung Chung) with deep trees as building blocks and we might have to consider a distributed storage and prediction strategy. On Tuesday, November 11, 2014, Xiangrui Meng men...@gmail.com wrote: Vincenzo sent a PR and included k-means as an example. Sean is helping review it. PMML standard is quite large. So we may start with simple model export, like linear methods, then move forward to tree-based. -Xiangrui On Mon, Nov 10, 2014 at 11:27 AM, Aris arisofala...@gmail.com wrote: Hello Spark and MLLib folks, So a common problem in the real world of using machine learning is that some data analysis use tools like R, but the more data engineers out there will use more advanced systems like Spark MLLib or even Python Scikit Learn. In the real world, I want to have a system where multiple different modeling environments can learn from data / build models, represent the models in a common language, and then have a layer which just takes the model and run model.predict() all day long -- scores the models in other words. It looks like the project openscoring.io and jpmml-evaluator are some amazing systems for this, but they fundamentally use PMML as the model representation here. I have read some JIRA tickets that Xiangrui Meng is interested in getting PMML implemented to export MLLib models, is that happening? Further, would something like Manish Amde's boosted ensemble tree methods be representable in PMML? Thank you!! Aris -- - Charles
Re: SparkSQL exception on spark.sql.codegen
Yes, it always appears on a subset of the tasks in a stage (i.e. 100/100 (65 failed)), and sometimes causes the stage to fail. And there is another error; I'm not sure whether there is a correlation. java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$ at org.apache.spark.sql.execution.SparkPlan.newPredicate(SparkPlan.scala:114) at org.apache.spark.sql.execution.Filter.conditionEvaluator$lzycompute(basicOperators.scala:55) at org.apache.spark.sql.execution.Filter.conditionEvaluator(basicOperators.scala:55) at org.apache.spark.sql.execution.Filter$$anonfun$2.apply(basicOperators.scala:58) at org.apache.spark.sql.execution.Filter$$anonfun$2.apply(basicOperators.scala:57) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:54) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) On Tue, Nov 18, 2014 at 11:41 AM, Michael Armbrust mich...@databricks.com wrote: Interesting, I believe we have run that query with version 1.1.0 with codegen turned on and not much has changed there. Is the error deterministic? On Mon, Nov 17, 2014 at 7:04 PM, Eric Zhen zhpeng...@gmail.com wrote: Hi Michael, We use Spark v1.1.1-rc1 with jdk 1.7.0_51 and scala 2.10.4. On Tue, Nov 18, 2014 at 7:09 AM, Michael Armbrust mich...@databricks.com wrote: What version of Spark SQL? On Sat, Nov 15, 2014 at 10:25 PM, Eric Zhen zhpeng...@gmail.com wrote: Hi all, We ran SparkSQL on TPCDS benchmark Q19 with spark.sql.codegen=true and got the exceptions below; has anyone else seen these before?
java.lang.ExceptionInInitializerError at org.apache.spark.sql.execution.SparkPlan.newProjection(SparkPlan.scala:92) at org.apache.spark.sql.execution.Exchange$$anonfun$execute$1$$anonfun$1.apply(Exchange.scala:51) at org.apache.spark.sql.execution.Exchange$$anonfun$execute$1$$anonfun$1.apply(Exchange.scala:48) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:54) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.NullPointerException at scala.reflect.internal.Types$TypeRef.computeHashCode(Types.scala:2358) at scala.reflect.internal.Types$UniqueType.init(Types.scala:1304) at scala.reflect.internal.Types$TypeRef.init(Types.scala:2341) at scala.reflect.internal.Types$NoArgsTypeRef.init(Types.scala:2137) at scala.reflect.internal.Types$TypeRef$$anon$6.init(Types.scala:2544) at scala.reflect.internal.Types$TypeRef$.apply(Types.scala:2544) at scala.reflect.internal.Types$class.typeRef(Types.scala:3615) at scala.reflect.internal.SymbolTable.typeRef(SymbolTable.scala:13) at scala.reflect.internal.Symbols$TypeSymbol.newTypeRef(Symbols.scala:2752) at scala.reflect.internal.Symbols$TypeSymbol.typeConstructor(Symbols.scala:2806) at
Re: Status of MLLib exporting models to PMML
I'm just using PMML. I haven't hit any limitation of its expressiveness, for the model types it supports. I don't think there is a point in defining a new format for models, excepting that PMML can get very big. Still, just compressing the XML gets it down to a manageable size for just about any realistic model.* I can imagine some kind of translation from PMML-in-XML to PMML-in-something-else that is more compact. I've not seen anyone do this. * there still aren't formats for factored matrices and probably won't ever quite be, since they're just too large for a file format. On Tue, Nov 18, 2014 at 5:34 AM, Manish Amde manish...@gmail.com wrote: Hi Charles, I am not aware of other storage formats. Perhaps Sean or Sandy can elaborate more given their experience with Oryx. There is work by Smola et al at Google that talks about large scale model update and deployment. https://www.usenix.org/conference/osdi14/technical-sessions/presentation/li_mu -Manish
Re: Is there setup and cleanup function in spark?
I see. Agree that lazy eval is not suitable for proper setup and teardown. We also abandoned it due to the inherent incompatibility between implicit and lazy. It was fun to come up with this trick though. Jianshi On Tue, Nov 18, 2014 at 10:28 AM, Tobias Pfeiffer t...@preferred.jp wrote: Hi, On Fri, Nov 14, 2014 at 2:49 PM, Jianshi Huang jianshi.hu...@gmail.com wrote: Ok, then we need another trick. Let's have an implicit lazy var connection/context around our code. And setup() will trigger the eval and initialization. Due to lazy evaluation, I think having setup/teardown is a bit tricky. In particular teardown, because it is not easy to execute code after all computation is done. You can check http://apache-spark-user-list.1001560.n3.nabble.com/Keep-state-inside-map-function-tp10968p11009.html for an example of what worked for me. Tobias -- Jianshi Huang LinkedIn: jianshi Twitter: @jshuang Github Blog: http://huangjs.github.com/
Running PageRank in GraphX
Hi, I just ran the PageRank code in GraphX with some sample data. What I am seeing is that the total rank changes drastically if I change the number of iterations from 10 to 100. Why is that so? Thank You
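For reference, GraphX (as of 1.1) exposes both a fixed-iteration and a tolerance-based PageRank; a minimal sketch, with the edge-list path as a placeholder. One plausible explanation for the drastic change is that 10 iterations simply has not converged yet, which comparing against the tolerance-based variant can confirm:

import org.apache.spark.graphx.GraphLoader

val graph = GraphLoader.edgeListFile(sc, "hdfs:///path/to/edges.txt")
val fixed = graph.staticPageRank(10).vertices    // stops after exactly 10 iterations
val conv  = graph.pageRank(0.0001).vertices      // runs until ranks change by less than the tolerance
println(fixed.map(_._2).sum() + " vs " + conv.map(_._2).sum())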
Null pointer exception with larger datasets
Hi, I have a list of Students whose size is one lakh (100,000) and I am trying to save it to a file. It is throwing a null pointer exception. JavaRDD<Student> distData = sc.parallelize(list); distData.saveAsTextFile("hdfs://master/data/spark/instruments.txt"); 14/11/18 01:33:21 WARN scheduler.TaskSetManager: Lost task 5.0 in stage 0.0 (TID 5, master): java.lang.NullPointerException: org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply(RDD.scala:1158) org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply(RDD.scala:1158) scala.collection.Iterator$$anon$11.next(Iterator.scala:328) org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:984) org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:974) org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62) org.apache.spark.scheduler.Task.run(Task.scala:54) org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) java.lang.Thread.run(Thread.java:745) How do I handle this? -Naveen
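For what it's worth, the trace points at the element-to-text step inside saveAsTextFile, which is consistent with null entries inside the list (rather than the list reference itself being null). A hedged pre-check sketch in Scala, with students as a placeholder for the local collection:

val nullCount = students.count(_ == null)            // students: a local List[Student]
require(nullCount == 0, nullCount + " null entries in input list")
val distData = sc.parallelize(students)
distData.saveAsTextFile("hdfs://master/data/spark/instruments.txt")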
Probability in Naive Bayes
I am trying to use Naive Bayes for a project of mine in Python and I want to obtain the probability value after having built the model. Suppose I have two classes - A and B. Currently there is an API to find which class a sample belongs to (predict). Now, I want to find the probability of it belonging to Class A or Class B. Can you please provide any details about how I can obtain the probability values for it belonging to either class A or class B?
Is it safe to use Scala 2.11 for Spark build?
Any notable issues for using Scala 2.11? Is it stable now? Or can I use Scala 2.11 in my spark application and use a Spark dist built with 2.10? I'm looking forward to migrating to 2.11 for some quasiquote features. Couldn't make it run in 2.10... Cheers, -- Jianshi Huang LinkedIn: jianshi Twitter: @jshuang Github Blog: http://huangjs.github.com/
Re: Probability in Naive Bayes
This was recently discussed on this mailing list. You can't get the probabilities out directly now, but you can hack a bit to get the internal data structures of NaiveBayesModel and compute it from there. If you really mean the probability of either A or B, then if your classes are exclusive it is just the sum of the class probabilities. You won't be able to compute this otherwise from what Naive Bayes computes. On Nov 18, 2014 7:42 AM, Samarth Mailinglist mailinglistsama...@gmail.com wrote: I am trying to use Naive Bayes for a project of mine in Python and I want to obtain the probability value after having built the model. Suppose I have two classes - A and B. Currently there is an API to to find which class a sample belongs to (predict). Now, I want to find the probability of it belonging to Class A or Class B. Can you please provide any details about how I can obtain the probability values for it belonging to the either class A or class B?
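To make that hack concrete, here is a hedged Scala sketch of recomputing normalized class posteriors from the model internals; it assumes the 1.1-era NaiveBayesModel, whose pi (log class priors) and theta (log feature likelihoods) are reachable as plain arrays, which may vary by version:

import org.apache.spark.mllib.classification.NaiveBayesModel
import org.apache.spark.mllib.linalg.Vector

def classPosteriors(model: NaiveBayesModel, x: Vector): Array[Double] = {
  val features = x.toArray
  val logProbs = model.pi.zip(model.theta).map { case (logPrior, logTheta) =>
    logPrior + features.zip(logTheta).map { case (xi, ti) => xi * ti }.sum
  }
  val m = logProbs.max                              // subtract the max for numerical stability
  val unnormalized = logProbs.map(lp => math.exp(lp - m))
  val total = unnormalized.sum
  unnormalized.map(_ / total)
}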
Re: Is it safe to use Scala 2.11 for Spark build?
It is safe in the sense we would help you with the fix if you run into issues. I have used it, but since I worked on the patch the opinion can be biased. I am using scala 2.11 for day to day development. You should check out the build instructions here: https://github.com/ScrapCodes/spark-1/blob/patch-3/docs/building-spark.md Prashant Sharma On Tue, Nov 18, 2014 at 12:19 PM, Jianshi Huang jianshi.hu...@gmail.com wrote: Any notable issues for using Scala 2.11? Is it stable now? Or can I use Scala 2.11 in my spark application and use Spark dist build with 2.10 ? I'm looking forward to migrate to 2.11 for some quasiquote features. Couldn't make it run in 2.10... Cheers, -- Jianshi Huang LinkedIn: jianshi Twitter: @jshuang Github Blog: http://huangjs.github.com/
Re: Check your cluster UI to ensure that workers are registered and have sufficient memory
I ran into this issue with Spark on YARN version 1.0.2. Are there any hints?
Re: How can I apply such an inner join in Spark Scala/Python
A simple join would do it. val a: List[(Int, Int)] = List((1,2),(2,4),(3,6)) val b: List[(Int, Int)] = List((1,3),(2,5),(3,6), (4,5),(5,6)) val A = sparkContext.parallelize(a) val B = sparkContext.parallelize(b) val ac = new PairRDDFunctions[Int, Int](A) val C = ac.join(B) C.foreach(println) Thanks Best Regards On Mon, Nov 17, 2014 at 11:54 PM, Sean Owen so...@cloudera.com wrote: Just RDD.join() should be an inner join. On Mon, Nov 17, 2014 at 5:51 PM, Blind Faith person.of.b...@gmail.com wrote: So let us say I have RDDs A and B with the following values. A = [ (1, 2), (2, 4), (3, 6) ] B = [ (1, 3), (2, 5), (3, 6), (4, 5), (5, 6) ] I want to apply an inner join, such that I get the following as a result. C = [ (1, (2, 3)), (2, (4, 5)), (3, (6,6)) ] That is, those keys which are not present in A should disappear after the left inner join. How can I achieve that? I can see outerJoin functions but no innerJoin functions in the Spark RDD class.
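As a quick check on the example above, C.collect() should return the inner-joined pairs Array((1,(2,3)), (2,(4,5)), (3,(6,6))) (element order may vary across partitions), which matches the C the original question asked for.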
Re: Is it safe to use Scala 2.11 for Spark build?
Looks like sbt/sbt -Pscala-2.11 is broken by a recent patch for improving the Maven build. Prashant Sharma On Tue, Nov 18, 2014 at 12:57 PM, Prashant Sharma scrapco...@gmail.com wrote: It is safe in the sense we would help you with the fix if you run into issues. I have used it, but since I worked on the patch the opinion can be biased. I am using scala 2.11 for day to day development. You should checkout the build instructions here : https://github.com/ScrapCodes/spark-1/blob/patch-3/docs/building-spark.md Prashant Sharma On Tue, Nov 18, 2014 at 12:19 PM, Jianshi Huang jianshi.hu...@gmail.com wrote: Any notable issues for using Scala 2.11? Is it stable now? Or can I use Scala 2.11 in my spark application and use Spark dist build with 2.10 ? I'm looking forward to migrate to 2.11 for some quasiquote features. Couldn't make it run in 2.10... Cheers, -- Jianshi Huang LinkedIn: jianshi Twitter: @jshuang Github Blog: http://huangjs.github.com/
Re: Null pointer exception with larger datasets
Make sure your list is not null; if it is null then it's more like doing: JavaRDD<Student> distData = sc.parallelize(null) distData.foreach(println) Thanks Best Regards On Tue, Nov 18, 2014 at 12:07 PM, Naveen Kumar Pokala npok...@spcapitaliq.com wrote: Hi, I have a list of Students whose size is one lakh and I am trying to save it to a file. It is throwing a null pointer exception. JavaRDD<Student> distData = sc.parallelize(list); distData.saveAsTextFile("hdfs://master/data/spark/instruments.txt"); 14/11/18 01:33:21 WARN scheduler.TaskSetManager: Lost task 5.0 in stage 0.0 (TID 5, master): java.lang.NullPointerException: org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply(RDD.scala:1158) org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply(RDD.scala:1158) scala.collection.Iterator$$anon$11.next(Iterator.scala:328) org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:984) org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:974) org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62) org.apache.spark.scheduler.Task.run(Task.scala:54) org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) java.lang.Thread.run(Thread.java:745) How do I handle this? -Naveen
Re: Is it safe to use Scala 2.11 for Spark build?
Hi Prashant Sharma, It's not even OK to build with the scala-2.11 profile on my machine. Just check out master (c6e0c2ab1c29c184a9302d23ad75e4ccd8060242) and run sbt/sbt -Pscala-2.11 clean assembly: ... (skipping the normal part) [info] Resolving org.scalamacros#quasiquotes_2.11;2.0.1 ... [warn] module not found: org.scalamacros#quasiquotes_2.11;2.0.1 [warn] local: tried [warn] /Users/yexianjin/.ivy2/local/org.scalamacros/quasiquotes_2.11/2.0.1/ivys/ivy.xml [warn] public: tried [warn] https://repo1.maven.org/maven2/org/scalamacros/quasiquotes_2.11/2.0.1/quasiquotes_2.11-2.0.1.pom [warn] central: tried [warn] https://repo1.maven.org/maven2/org/scalamacros/quasiquotes_2.11/2.0.1/quasiquotes_2.11-2.0.1.pom [warn] apache-repo: tried [warn] https://repository.apache.org/content/repositories/releases/org/scalamacros/quasiquotes_2.11/2.0.1/quasiquotes_2.11-2.0.1.pom [warn] jboss-repo: tried [warn] https://repository.jboss.org/nexus/content/repositories/releases/org/scalamacros/quasiquotes_2.11/2.0.1/quasiquotes_2.11-2.0.1.pom [warn] mqtt-repo: tried [warn] https://repo.eclipse.org/content/repositories/paho-releases/org/scalamacros/quasiquotes_2.11/2.0.1/quasiquotes_2.11-2.0.1.pom [warn] cloudera-repo: tried [warn] https://repository.cloudera.com/artifactory/cloudera-repos/org/scalamacros/quasiquotes_2.11/2.0.1/quasiquotes_2.11-2.0.1.pom [warn] mapr-repo: tried [warn] http://repository.mapr.com/maven/org/scalamacros/quasiquotes_2.11/2.0.1/quasiquotes_2.11-2.0.1.pom [warn] spring-releases: tried [warn] https://repo.spring.io/libs-release/org/scalamacros/quasiquotes_2.11/2.0.1/quasiquotes_2.11-2.0.1.pom [warn] spark-staging: tried [warn] https://oss.sonatype.org/content/repositories/orgspark-project-1085/org/scalamacros/quasiquotes_2.11/2.0.1/quasiquotes_2.11-2.0.1.pom [warn] spark-staging-hive13: tried [warn] https://oss.sonatype.org/content/repositories/orgspark-project-1089/org/scalamacros/quasiquotes_2.11/2.0.1/quasiquotes_2.11-2.0.1.pom [warn] apache.snapshots: tried [warn] http://repository.apache.org/snapshots/org/scalamacros/quasiquotes_2.11/2.0.1/quasiquotes_2.11-2.0.1.pom [warn] Maven2 Local: tried [warn] file:/Users/yexianjin/.m2/repository/org/scalamacros/quasiquotes_2.11/2.0.1/quasiquotes_2.11-2.0.1.pom [info] Resolving jline#jline;2.12 ... [warn] :: [warn] :: UNRESOLVED DEPENDENCIES :: [warn] :: [warn] :: org.scalamacros#quasiquotes_2.11;2.0.1: not found [warn] :: [info] Resolving org.scala-lang#scala-library;2.11.2 ... [warn] [warn] Note: Unresolved dependencies path: [warn] org.scalamacros:quasiquotes_2.11:2.0.1 ((com.typesafe.sbt.pom.MavenHelper) MavenHelper.scala#L76) [warn] +- org.apache.spark:spark-catalyst_2.11:1.2.0-SNAPSHOT [info] Resolving jline#jline;2.12 ... [info] Done updating. [info] Updating {file:/Users/yexianjin/spark/}streaming-twitter... [info] Updating {file:/Users/yexianjin/spark/}streaming-zeromq... [info] Updating {file:/Users/yexianjin/spark/}streaming-flume... [info] Updating {file:/Users/yexianjin/spark/}streaming-mqtt... [info] Resolving jline#jline;2.12 ... [info] Done updating. [info] Resolving com.esotericsoftware.minlog#minlog;1.2 ... [info] Updating {file:/Users/yexianjin/spark/}streaming-kafka... [info] Resolving jline#jline;2.12 ... [info] Done updating. [info] Resolving jline#jline;2.12 ... [info] Done updating. [info] Resolving jline#jline;2.12 ... [info] Done updating. [info] Resolving org.apache.kafka#kafka_2.11;0.8.0 ...
[warn] module not found: org.apache.kafka#kafka_2.11;0.8.0 [warn] local: tried [warn] /Users/yexianjin/.ivy2/local/org.apache.kafka/kafka_2.11/0.8.0/ivys/ivy.xml [warn] public: tried [warn] https://repo1.maven.org/maven2/org/apache/kafka/kafka_2.11/0.8.0/kafka_2.11-0.8.0.pom [warn] central: tried [warn] https://repo1.maven.org/maven2/org/apache/kafka/kafka_2.11/0.8.0/kafka_2.11-0.8.0.pom [warn] apache-repo: tried [warn] https://repository.apache.org/content/repositories/releases/org/apache/kafka/kafka_2.11/0.8.0/kafka_2.11-0.8.0.pom [warn] jboss-repo: tried [warn] https://repository.jboss.org/nexus/content/repositories/releases/org/apache/kafka/kafka_2.11/0.8.0/kafka_2.11-0.8.0.pom [warn] mqtt-repo: tried [warn] https://repo.eclipse.org/content/repositories/paho-releases/org/apache/kafka/kafka_2.11/0.8.0/kafka_2.11-0.8.0.pom [warn] cloudera-repo: tried [warn] https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/kafka/kafka_2.11/0.8.0/kafka_2.11-0.8.0.pom [warn] mapr-repo: tried [warn] http://repository.mapr.com/maven/org/apache/kafka/kafka_2.11/0.8.0/kafka_2.11-0.8.0.pom [warn] spring-releases: tried [warn] https://repo.spring.io/libs-release/org/apache/kafka/kafka_2.11/0.8.0/kafka_2.11-0.8.0.pom [warn]
Spark On Yarn Issue: Initial job has not accepted any resources
Hi All: I was submitting a spark_program.jar to the `spark on yarn cluster` on a driver machine in yarn-client mode. Here is the spark-submit command I used: ./spark-submit --master yarn-client --class com.charlie.spark.grax.OldFollowersExample --queue dt_spark ~/script/spark-flume-test-0.1-SNAPSHOT-hadoop2.0.0-mr1-cdh4.2.1.jar The queue `dt_spark` was free, and the program was submitted successfully and ran on the cluster. But on the console, it repeatedly showed: 14/11/18 15:11:48 WARN YarnClientClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory Checking the cluster UI logs, I found no errors: SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/home/disk5/yarn/usercache/linqili/filecache/6957209742046754908/spark-assembly-1.0.2-hadoop2.0.0-cdh4.2.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.0.0-cdh4.2.1/share/hadoop/common/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] 14/11/18 14:28:16 INFO SecurityManager: Changing view acls to: hadoop,linqili 14/11/18 14:28:16 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop, linqili) 14/11/18 14:28:17 INFO Slf4jLogger: Slf4jLogger started 14/11/18 14:28:17 INFO Remoting: Starting remoting 14/11/18 14:28:17 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkyar...@longzhou-hdp3.lz.dscc:37187] 14/11/18 14:28:17 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkyar...@longzhou-hdp3.lz.dscc:37187] 14/11/18 14:28:17 INFO ExecutorLauncher: ApplicationAttemptId: appattempt_1415961020140_0325_01 14/11/18 14:28:17 INFO ExecutorLauncher: Connecting to ResourceManager at longzhou-hdpnn.lz.dscc/192.168.19.107:12032 14/11/18 14:28:17 INFO ExecutorLauncher: Registering the ApplicationMaster 14/11/18 14:28:18 INFO ExecutorLauncher: Waiting for spark driver to be reachable. 14/11/18 14:28:18 INFO ExecutorLauncher: Master now available: 192.168.59.90:36691 14/11/18 14:28:18 INFO ExecutorLauncher: Listen to driver: akka.tcp://spark@192.168.59.90:36691/user/CoarseGrainedScheduler 14/11/18 14:28:18 INFO ExecutorLauncher: Allocating 1 executors. 14/11/18 14:28:18 INFO YarnAllocationHandler: Allocating 1 executor containers with 1408 of memory each. 14/11/18 14:28:18 INFO YarnAllocationHandler: ResourceRequest (host : *, num containers: 1, priority = 1 , capability : memory: 1408) 14/11/18 14:28:18 INFO YarnAllocationHandler: Allocating 1 executor containers with 1408 of memory each. 14/11/18 14:28:18 INFO YarnAllocationHandler: ResourceRequest (host : *, num containers: 1, priority = 1 , capability : memory: 1408) 14/11/18 14:28:18 INFO RackResolver: Resolved longzhou-hdp3.lz.dscc to /rack1 14/11/18 14:28:18 INFO YarnAllocationHandler: launching container on container_1415961020140_0325_01_02 host longzhou-hdp3.lz.dscc 14/11/18 14:28:18 INFO ExecutorRunnable: Starting Executor Container 14/11/18 14:28:18 INFO ExecutorRunnable: Connecting to ContainerManager at longzhou-hdp3.lz.dscc:12040 14/11/18 14:28:18 INFO ExecutorRunnable: Setting up ContainerLaunchContext 14/11/18 14:28:18 INFO ExecutorRunnable: Preparing Local resources 14/11/18 14:28:18 INFO ExecutorLauncher: All executors have launched.
14/11/18 14:28:18 INFO ExecutorLauncher: Started progress reporter thread - sleep time : 5000 14/11/18 14:28:18 INFO YarnAllocationHandler: ResourceRequest (host : *, num containers: 0, priority = 1 , capability : memory: 1408) 14/11/18 14:28:18 INFO ExecutorRunnable: Prepared Local resources Map(__spark__.jar - resource {, scheme: hdfs, host: longzhou-hdpnn.lz.dscc, port: 11000, file: /user/linqili/.sparkStaging/application_1415961020140_0325/spark-assembly-1.0.2-hadoop2.0.0-cdh4.2.1.jar, }, size: 134859131, timestamp: 1416292093988, type: FILE, visibility: PRIVATE, ) 14/11/18 14:28:18 INFO ExecutorRunnable: Setting up executor with commands: List($JAVA_HOME/bin/java, -server, -XX:OnOutOfMemoryError='kill %p', -Xms1024m -Xmx1024m , -Djava.security.krb5.conf=/home/linqili/proc/spark_client/hadoop/kerberos5-client/etc/krb5.conf -Djava.library.path=/home/linqili/proc/spark_client/hadoop/lib/native/Linux-amd64-64, -Djava.io.tmpdir=$PWD/tmp, -Dlog4j.configuration=log4j-spark-container.properties, org.apache.spark.executor.CoarseGrainedExecutorBackend, akka.tcp://spark@192.168.59.90:36691/user/CoarseGrainedScheduler, 1, longzhou-hdp3.lz.dscc, 3, 1, LOG_DIR/stdout, 2, LOG_DIR/stderr) 14/11/18 14:28:23 INFO YarnAllocationHandler: ResourceRequest (host : *, num containers: 0, priority = 1 , capability : memory: 1408) 14/11/18 14:28:23 INFO YarnAllocationHandler: Completed container
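If it helps anyone hitting the same warning: it usually means YARN cannot actually grant the resources the application is asking for, or the executors are dying on startup. A hedged sketch of making the resource requests explicit and small enough to fit the queue (the flag values here are placeholder assumptions):

./spark-submit --master yarn-client \
  --num-executors 1 \
  --executor-memory 1g \
  --executor-cores 1 \
  --class com.charlie.spark.grax.OldFollowersExample \
  --queue dt_spark \
  ~/script/spark-flume-test-0.1-SNAPSHOT-hadoop2.0.0-mr1-cdh4.2.1.jar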