RC3 works for me now. :)
-----Original Message-----
From: Pat Ferrel [mailto:p...@occamsmachete.com]
Sent: Monday, March 6, 2017 17:32
To: user@mahout.apache.org
Cc: Michael Müller <michael.muel...@condat.de>
Subject: Re: 0.13.0-RC not fully compatible with Spark 1.6.3?

Thanks for finding this. It appears to be because the jar passed to Spark with classes to be serialized was not updated when some code was refactored. We have a fix under test that will be in the next RC.

If you could test the next RC (maybe ready tomorrow) we’d be very grateful.

On Mar 3, 2017, at 12:58 PM, Michael Müller <michael.muel...@condat.de> wrote:

> So you are downloading the binary and running the Mahout spark-itemsimilarity
> driver from that binary?

Yes.

> You say “using the same Spark cluster”. How is this set up, an env var like
> MASTER=?
> Can you supply how you point to the cluster and your CLI for the job?

These are my environment settings for Spark and Mahout:

export MAHOUT_HOME=/home/aml/mahout/apache-mahout-distribution-0.13.0
#export MAHOUT_LOCAL=true
export SPARK_HOME=/home/aml/spark/spark-1.6.3-bin-hadoop2.6
export MASTER=spark://ubuntu:7077
export JAVA_HOME=/usr/lib/jvm/java-8-oracle/jre

I'm starting the job like this:

/home/aml/mahout/apache-mahout-distribution-0.13.0/bin/mahout spark-itemsimilarity --master spark://ubuntu:7077 --input ~/data/rating_200k.csv --output ~/data/rating_200k_output --itemIDColumn 1 --rowIDColumn 0 --sparkExecutorMem 6g

And when I change MAHOUT_HOME to point to my Mahout 0.12.2 installation (-> /home/aml/mahout/apache-mahout-distribution-0.12.2) and then start the job like this, it succeeds:

/home/aml/mahout/apache-mahout-distribution-0.12.2/bin/mahout spark-itemsimilarity --master spark://ubuntu:7077 --input ~/data/rating_200k.csv --output ~/data/rating_200k_output --itemIDColumn 1 --rowIDColumn 0 --sparkExecutorMem 6g

-----Original Message-----
From: Pat Ferrel [mailto:p...@occamsmachete.com]
Sent: Friday, March 3, 2017 20:49
To: Michael Müller
Cc: user@mahout.apache.org
Subject: Re: 0.13.0-RC not fully compatible with Spark 1.6.3?

Thanks, I’ll see if I can reproduce.

So you are downloading the binary and running the Mahout spark-itemsimilarity driver from that binary?

You say “using the same Spark cluster”. How is this set up, an env var like MASTER=?
Can you supply how you point to the cluster and your CLI for the job?

On Mar 3, 2017, at 1:26 AM, Michael Müller <michael.muel...@condat.de> wrote:

Hi all,

is Mahout 0.13.0 supposed to work with Spark 1.6.3? I would think so, as the master-pom.xml explicitly references Spark 1.6.3.

But when I run a spark-itemsimilarity command (on the 0.13.0-RC) against my Spark 1.6.3 standalone cluster, the command fails with:

17/03/03 10:08:40 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, reco-master): java.io.IOException: org.apache.spark.SparkException: Failed to register classes with Kryo
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1212)
    at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:165)
    ...
Caused by: java.lang.ClassNotFoundException: org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.serializer.KryoSerializer$$anonfun$newKryo$5.apply(KryoSerializer.scala:123)
    at org.apache.spark.serializer.KryoSerializer$$anonfun$newKryo$5.apply(KryoSerializer.scala:123)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:123)

When I run exactly the same command on the 0.12.2 release distribution against the same Spark cluster, the command completes successfully.

My Environment is:

* Ubuntu 14.04
* Oracle-JDK 1.8.0_121
* Spark standalone cluster using this distribution: http://d3kbcqa49mib13.cloudfront.net/spark-1.6.3-bin-hadoop2.6.tgz
* Mahout 0.13.0-RC: https://repository.apache.org/content/repositories/orgapachemahout-1034/org/apache/mahout/apache-mahout-distribution/0.13.0/apache-mahout-distribution-0.13.0.tar.gz

TIA
--
Michael Müller
Condat AG, Berlin
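
For anyone who runs into the same ClassNotFoundException: the diagnosis in this thread is that the jar passed to Spark did not contain the refactored classes. One way to see which jars in an unpacked Mahout distribution actually contain the registrator class from the stack trace is to scan them, roughly as in the Python sketch below. The function name jars_containing and the choice of MAHOUT_HOME as the search root are assumptions made for illustration. The script only reports where the class is packaged; it does not show which jar the driver actually ships to the executors.

```python
import os
import zipfile

# Class named in the stack trace, written as a jar entry path.
KRYO_REGISTRATOR = "org/apache/mahout/sparkbindings/io/MahoutKryoRegistrator.class"


def jars_containing(root, entry=KRYO_REGISTRATOR):
    """Return sorted paths of .jar files under root that contain the given entry.

    Hypothetical helper for illustration; not part of Mahout or Spark.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".jar"):
                continue
            path = os.path.join(dirpath, name)
            try:
                with zipfile.ZipFile(path) as jar:
                    if entry in jar.namelist():
                        hits.append(path)
            except zipfile.BadZipFile:
                # Skip files with a .jar suffix that are not valid archives.
                continue
    return sorted(hits)


if __name__ == "__main__":
    # Assumption: MAHOUT_HOME points at the unpacked distribution, as in the thread.
    root = os.environ.get("MAHOUT_HOME", ".")
    for jar_path in jars_containing(root):
        print(jar_path)
```

Running this once with MAHOUT_HOME set to the 0.12.2 directory and once with it set to the 0.13.0 RC directory gives two lists of jars that can be compared by hand.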