Hello Xiangrui, I am using this spark-submit command (as I do for all other jobs):
/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/spark/bin/spark-submit --class MLlib --master local[2] --jars $(echo /home/ec2-user/sparkApps/learning-spark/lib/*.jar | tr ' ' ',') /home/ec2-user/sparkApps/learning-spark/target/simple-project-1.1.jar

Thank you for the help!

Best,

Su

On Mon, Apr 27, 2015 at 9:58 AM, Xiangrui Meng <men...@gmail.com> wrote:
> How did you run the example app? Did you use spark-submit? -Xiangrui
>
> On Thu, Apr 23, 2015 at 2:27 PM, Su She <suhsheka...@gmail.com> wrote:
>> Sorry, accidentally sent the last email before finishing.
>>
>> I had asked this question before, but wanted to ask again as I think
>> it is now related to my pom file or project setup. Really appreciate the
>> help!
>>
>> I have been trying on and off for the past month to run this MLlib
>> example:
>> https://github.com/databricks/learning-spark/blob/master/src/main/scala/com/oreilly/learningsparkexamples/scala/MLlib.scala
>>
>> I am able to build the project successfully. When I run it, it returns:
>>
>> features in spam: 8
>> features in ham: 7
>>
>> and then freezes. According to the UI, the description of the job is
>> "count at DataValidators.scala:38". This corresponds to this line in
>> the code:
>>
>> val model = lrLearner.run(trainingData)
>>
>> I've tried just about everything I can think of: changed numFeatures
>> from 1 -> 10,000, set executor memory to 1g, set up a new cluster. At
>> this point I think I might have missed dependencies, as that has
>> usually been the problem in other Spark apps I have tried to run. This
>> is my pom file, which I have used for other successful Spark apps.
>> Please let me know if you think I need any additional dependencies, if
>> there are incompatibility issues, or if there is a better pom.xml to use.
>> Thank you!
>>
>> Cluster information:
>>
>> Spark version: 1.2.0-SNAPSHOT (in my older cluster it is 1.2.0)
>> java version "1.7.0_25"
>> Scala version: 2.10.4
>> hadoop version: hadoop 2.5.0-cdh5.3.3 (older cluster was 5.3.0)
>>
>>
>>
>> <project xmlns = "http://maven.apache.org/POM/4.0.0"
>> xmlns:xsi="http://w3.org/2001/XMLSchema-instance" xsi:schemaLocation
>> ="http://maven.apache.org/POM/4.0.0
>> http://maven.apache.org/maven-v4_0_0.xsd">
>>   <groupId> edu.berkely</groupId>
>>   <artifactId> simple-project </artifactId>
>>   <modelVersion> 4.0.0</modelVersion>
>>   <name> Simple Project </name>
>>   <packaging> jar </packaging>
>>   <version> 1.0 </version>
>>   <repositories>
>>     <repository>
>>       <id>cloudera</id>
>>       <url> http://repository.cloudera.com/artifactory/cloudera-repos/</url>
>>     </repository>
>>
>>     <repository>
>>       <id>scala-tools.org</id>
>>       <name>Scala-tools Maven2 Repository</name>
>>       <url>http://scala-tools.org/repo-releases</url>
>>     </repository>
>>
>>   </repositories>
>>
>>   <pluginRepositories>
>>     <pluginRepository>
>>       <id>scala-tools.org</id>
>>       <name>Scala-tools Maven2 Repository</name>
>>       <url>http://scala-tools.org/repo-releases</url>
>>     </pluginRepository>
>>   </pluginRepositories>
>>
>>   <build>
>>     <plugins>
>>       <plugin>
>>         <groupId>org.scala-tools</groupId>
>>         <artifactId>maven-scala-plugin</artifactId>
>>         <executions>
>>
>>           <execution>
>>             <id>compile</id>
>>             <goals>
>>               <goal>compile</goal>
>>             </goals>
>>             <phase>compile</phase>
>>           </execution>
>>           <execution>
>>             <id>test-compile</id>
>>             <goals>
>>               <goal>testCompile</goal>
>>             </goals>
>>             <phase>test-compile</phase>
>>           </execution>
>>           <execution>
>>             <phase>process-resources</phase>
>>             <goals>
>>               <goal>compile</goal>
>>             </goals>
>>           </execution>
>>         </executions>
>>       </plugin>
>>       <plugin>
>>         <artifactId>maven-compiler-plugin</artifactId>
>>         <configuration>
>>           <source>1.7</source>
>>           <target>1.7</target>
>>         </configuration>
>>       </plugin>
>>     </plugins>
>>   </build>
>>
>>
>>   <dependencies>
>>     <dependency> <!--Spark dependency -->
>>       <groupId> org.apache.spark</groupId>
>>       <artifactId>spark-core_2.10</artifactId>
>>       <version>1.2.0-cdh5.3.0</version>
>>     </dependency>
>>
>>     <dependency>
>>       <groupId>org.apache.hadoop</groupId>
>>       <artifactId>hadoop-client</artifactId>
>>       <version>2.5.0-mr1-cdh5.3.0</version>
>>     </dependency>
>>
>>     <dependency>
>>       <groupId>org.scala-lang</groupId>
>>       <artifactId>scala-library</artifactId>
>>       <version>2.10.4</version>
>>     </dependency>
>>
>>     <dependency>
>>       <groupId>org.scala-lang</groupId>
>>       <artifactId>scala-compiler</artifactId>
>>       <version>2.10.4</version>
>>     </dependency>
>>
>>     <dependency>
>>       <groupId>com.101tec</groupId>
>>       <artifactId>zkclient</artifactId>
>>       <version>0.3</version>
>>     </dependency>
>>
>>     <dependency>
>>       <groupId>com.yammer.metrics</groupId>
>>       <artifactId>metrics-core</artifactId>
>>       <version>2.2.0</version>
>>     </dependency>
>>
>>
>>     <dependency>
>>       <groupId>org.apache.hadoop</groupId>
>>       <artifactId>hadoop-yarn-server-web-proxy</artifactId>
>>       <version>2.5.0</version>
>>     </dependency>
>>
>>     <dependency>
>>       <groupId>org.apache.thrift</groupId>
>>       <artifactId>libthrift</artifactId>
>>       <version>0.9.2</version>
>>     </dependency>
>>
>>     <dependency>
>>       <groupId>com.google.guava</groupId>
>>       <artifactId>guava</artifactId>
>>       <version>18.0</version>
>>     </dependency>
>>
>>     <dependency>
>>       <groupId>junit</groupId>
>>       <artifactId>junit</artifactId>
>>       <version>3.8.1</version>
>>       <scope>test</scope>
>>     </dependency>
>>
>>     <dependency>
>>       <groupId>org.apache.spark</groupId>
>>       <artifactId>spark-mllib_2.10</artifactId>
>>       <version>1.2.0</version>
>>     </dependency>
>>
>>     <dependency>
>>       <groupId>org.scalanlp</groupId>
>>       <artifactId>breeze-math_2.10</artifactId>
>>       <version>0.4</version>
>>     </dependency>
>>
>>     <dependency>
>>       <groupId>com.googlecode.netlib-java</groupId>
>>       <artifactId>netlib-java</artifactId>
>>       <version>1.0</version>
>>     </dependency>
>>
>>     <dependency>
>>       <groupId>org.jblas</groupId>
>>       <artifactId>jblas</artifactId>
>>       <version>1.2.3</version>
>>     </dependency>
>>
>>   </dependencies>
>>
>> </project>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
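
One thing that stands out in the quoted POM is that spark-core is pinned to 1.2.0-cdh5.3.0 while spark-mllib is pinned to the upstream 1.2.0, and that breeze-math, netlib-java and jblas are declared by hand even though spark-mllib pulls in its own linear algebra libraries transitively. Whether this is what causes the hang is not established in this thread. The fragment below is only a sketch of how the two Spark artifacts could be kept on one version. The property name spark.cdh.version is hypothetical, and it is an assumption that spark-mllib_2.10 is published in the Cloudera repository under the same version string as spark-core, and that the cluster's Spark installation supplies these classes at runtime (the usual case when launching through spark-submit).

```xml
<!-- Sketch only. "spark.cdh.version" is a hypothetical property name. It is an
     assumption that spark-mllib_2.10 exists under the same CDH version as spark-core. -->
<properties>
  <spark.cdh.version>1.2.0-cdh5.3.0</spark.cdh.version>
</properties>

<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>${spark.cdh.version}</version>
    <!-- "provided": compile against Spark, but rely on the cluster's copy at runtime -->
    <scope>provided</scope>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-mllib_2.10</artifactId>
    <version>${spark.cdh.version}</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
```

With a layout like this, running mvn dependency:tree on the project shows which Breeze, netlib-java and jblas versions actually get resolved, which makes it easier to spot a conflict with the hand-declared ones.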