How did you run the example app? Did you use spark-submit?

-Xiangrui

On Thu, Apr 23, 2015 at 2:27 PM, Su She <suhsheka...@gmail.com> wrote:
> Sorry, accidentally sent the last email before finishing.
>
> I had asked this question before, but wanted to ask again as I think
> it is now related to my pom file or project setup. Really appreciate the help!
>
> I have been trying on and off for the past month to run this MLlib
> example:
> https://github.com/databricks/learning-spark/blob/master/src/main/scala/com/oreilly/learningsparkexamples/scala/MLlib.scala
>
> I am able to build the project successfully. When I run it, it returns:
>
> features in spam: 8
> features in ham: 7
>
> and then freezes. According to the UI, the description of the job is
> "count at DataValidators.scala:38". This corresponds to this line in
> the code:
>
> val model = lrLearner.run(trainingData)
>
> I've tried just about everything I can think of: changed numFeatures
> from 1 -> 10,000, set executor memory to 1g, set up a new cluster. At
> this point I think I might have missed dependencies, as that has
> usually been the problem in other Spark apps I have tried to run. This
> is my pom file, which I have used for other successful Spark apps.
> Please let me know if you think I need any additional dependencies, if
> there are incompatibility issues, or if there is a better pom.xml to use.
> Thank you!
>
> Cluster information:
>
> Spark version: 1.2.0-SNAPSHOT (in my older cluster it is 1.2.0)
> java version "1.7.0_25"
> Scala version: 2.10.4
> hadoop version: hadoop 2.5.0-cdh5.3.3 (older cluster was 5.3.0)
>
> <project xmlns="http://maven.apache.org/POM/4.0.0"
>          xmlns:xsi="http://w3.org/2001/XMLSchema-instance"
>          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
>   <groupId>edu.berkely</groupId>
>   <artifactId>simple-project</artifactId>
>   <modelVersion>4.0.0</modelVersion>
>   <name>Simple Project</name>
>   <packaging>jar</packaging>
>   <version>1.0</version>
>
>   <repositories>
>     <repository>
>       <id>cloudera</id>
>       <url>http://repository.cloudera.com/artifactory/cloudera-repos/</url>
>     </repository>
>
>     <repository>
>       <id>scala-tools.org</id>
>       <name>Scala-tools Maven2 Repository</name>
>       <url>http://scala-tools.org/repo-releases</url>
>     </repository>
>   </repositories>
>
>   <pluginRepositories>
>     <pluginRepository>
>       <id>scala-tools.org</id>
>       <name>Scala-tools Maven2 Repository</name>
>       <url>http://scala-tools.org/repo-releases</url>
>     </pluginRepository>
>   </pluginRepositories>
>
>   <build>
>     <plugins>
>       <plugin>
>         <groupId>org.scala-tools</groupId>
>         <artifactId>maven-scala-plugin</artifactId>
>         <executions>
>           <execution>
>             <id>compile</id>
>             <goals>
>               <goal>compile</goal>
>             </goals>
>             <phase>compile</phase>
>           </execution>
>           <execution>
>             <id>test-compile</id>
>             <goals>
>               <goal>testCompile</goal>
>             </goals>
>             <phase>test-compile</phase>
>           </execution>
>           <execution>
>             <phase>process-resources</phase>
>             <goals>
>               <goal>compile</goal>
>             </goals>
>           </execution>
>         </executions>
>       </plugin>
>       <plugin>
>         <artifactId>maven-compiler-plugin</artifactId>
>         <configuration>
>           <source>1.7</source>
>           <target>1.7</target>
>         </configuration>
>       </plugin>
>     </plugins>
>   </build>
>
>   <dependencies>
>     <dependency> <!--Spark dependency -->
>       <groupId>org.apache.spark</groupId>
>       <artifactId>spark-core_2.10</artifactId>
>       <version>1.2.0-cdh5.3.0</version>
>     </dependency>
>
>     <dependency>
>       <groupId>org.apache.hadoop</groupId>
>       <artifactId>hadoop-client</artifactId>
>       <version>2.5.0-mr1-cdh5.3.0</version>
>     </dependency>
>
>     <dependency>
>       <groupId>org.scala-lang</groupId>
>       <artifactId>scala-library</artifactId>
>       <version>2.10.4</version>
>     </dependency>
>
>     <dependency>
>       <groupId>org.scala-lang</groupId>
>       <artifactId>scala-compiler</artifactId>
>       <version>2.10.4</version>
>     </dependency>
>
>     <dependency>
>       <groupId>com.101tec</groupId>
>       <artifactId>zkclient</artifactId>
>       <version>0.3</version>
>     </dependency>
>
>     <dependency>
>       <groupId>com.yammer.metrics</groupId>
>       <artifactId>metrics-core</artifactId>
>       <version>2.2.0</version>
>     </dependency>
>
>     <dependency>
>       <groupId>org.apache.hadoop</groupId>
>       <artifactId>hadoop-yarn-server-web-proxy</artifactId>
>       <version>2.5.0</version>
>     </dependency>
>
>     <dependency>
>       <groupId>org.apache.thrift</groupId>
>       <artifactId>libthrift</artifactId>
>       <version>0.9.2</version>
>     </dependency>
>
>     <dependency>
>       <groupId>com.google.guava</groupId>
>       <artifactId>guava</artifactId>
>       <version>18.0</version>
>     </dependency>
>
>     <dependency>
>       <groupId>junit</groupId>
>       <artifactId>junit</artifactId>
>       <version>3.8.1</version>
>       <scope>test</scope>
>     </dependency>
>
>     <dependency>
>       <groupId>org.apache.spark</groupId>
>       <artifactId>spark-mllib_2.10</artifactId>
>       <version>1.2.0</version>
>     </dependency>
>
>     <dependency>
>       <groupId>org.scalanlp</groupId>
>       <artifactId>breeze-math_2.10</artifactId>
>       <version>0.4</version>
>     </dependency>
>
>     <dependency>
>       <groupId>com.googlecode.netlib-java</groupId>
>       <artifactId>netlib-java</artifactId>
>       <version>1.0</version>
>     </dependency>
>
>     <dependency>
>       <groupId>org.jblas</groupId>
>       <artifactId>jblas</artifactId>
>       <version>1.2.3</version>
>     </dependency>
>   </dependencies>
> </project>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>