RC3 works for me now. :)

-----Original Message-----
From: Pat Ferrel [mailto:p...@occamsmachete.com] 
Sent: Monday, 6 March 2017 17:32
To: user@mahout.apache.org
Cc: Michael Müller <michael.muel...@condat.de>
Subject: Re: 0.13.0-RC not fully compatible with Spark 1.6.3?

Thanks for finding this.

It appears to be because the jar passed to Spark with classes to be serialized 
was not updated when some code was refactored. We have a fix under test that 
will be in the next RC. If you could test the next RC (maybe ready tomorrow) 
we’d be very grateful.
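
One way to check whether a given distribution ships the class named in the stack trace is to scan its jars for the matching entry. The following is a minimal sketch, assuming Python 3 is available; `jars_containing` is a hypothetical helper written for illustration and is not part of Mahout or Spark. It only reports which jars contain the class; it does not show which jar the driver actually passes to Spark.

```python
import sys
import zipfile
from pathlib import Path


def jars_containing(root, class_name):
    """Return the jars under `root` that contain the compiled class `class_name`."""
    # A class a.b.C is stored inside a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(root).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar)
        except zipfile.BadZipFile:
            # Skip files that are not valid zip archives.
            continue
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_class.py <directory> [fully.qualified.ClassName]
    default = "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator"
    name = sys.argv[2] if len(sys.argv) > 2 else default
    for jar in jars_containing(sys.argv[1], name):
        print(jar)
```

Running it once against each MAHOUT_HOME and comparing the output would show whether the class sits in different jars in the two distributions.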


On Mar 3, 2017, at 12:58 PM, Michael Müller <michael.muel...@condat.de> wrote:

> So you are downloading the binary and running the Mahout spark-itemsimilarity 
> driver from that binary?

yes


> You say “using the same Spark cluster”. How is this set up, an env var like 
> MASTER=?
> Can you supply how you point to the cluster and your CLI for the job?


These are my environment settings for Spark and Mahout:

export MAHOUT_HOME=/home/aml/mahout/apache-mahout-distribution-0.13.0
#export MAHOUT_LOCAL=true
export SPARK_HOME=/home/aml/spark/spark-1.6.3-bin-hadoop2.6
export MASTER=spark://ubuntu:7077
export JAVA_HOME=/usr/lib/jvm/java-8-oracle/jre

I'm starting the job like this:

/home/aml/mahout/apache-mahout-distribution-0.13.0/bin/mahout 
spark-itemsimilarity --master spark://ubuntu:7077 --input 
~/data/rating_200k.csv --output ~/data/rating_200k_output --itemIDColumn 1 
--rowIDColumn 0 --sparkExecutorMem 6g



When I change MAHOUT_HOME to point to my Mahout 0.12.2 installation (-> 
/home/aml/mahout/apache-mahout-distribution-0.12.2) and then start the job like 
this, it succeeds:

/home/aml/mahout/apache-mahout-distribution-0.12.2/bin/mahout 
spark-itemsimilarity --master spark://ubuntu:7077 --input 
~/data/rating_200k.csv --output ~/data/rating_200k_output --itemIDColumn 1 
--rowIDColumn 0 --sparkExecutorMem 6g




-----Original Message-----
From: Pat Ferrel [mailto:p...@occamsmachete.com]
Sent: Friday, 3 March 2017 20:49
To: Michael Müller
Cc: user@mahout.apache.org
Subject: Re: 0.13.0-RC not fully compatible with Spark 1.6.3?

Thanks, I’ll see if I can reproduce. 

So you are downloading the binary and running the Mahout spark-itemsimilarity 
driver from that binary? You say “using the same Spark cluster”. How is this 
set up, an env var like MASTER=? Can you supply how you point to the cluster and 
your CLI for the job?



On Mar 3, 2017, at 1:26 AM, Michael Müller <michael.muel...@condat.de> wrote:

Hi all,

is Mahout 0.13.0 supposed to work with Spark 1.6.3? I would think so as the 
master-pom.xml explicitly references Spark 1.6.3.
But when I run a spark-itemsimilarity command (on the 0.13.0-RC) against my 
Spark 1.6.3-standalone cluster, the command fails with:

17/03/03 10:08:40 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, reco-master): java.io.IOException: org.apache.spark.SparkException: Failed to register classes with Kryo
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1212)
        at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:165)
...
Caused by: java.lang.ClassNotFoundException: org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.apache.spark.serializer.KryoSerializer$$anonfun$newKryo$5.apply(KryoSerializer.scala:123)
        at org.apache.spark.serializer.KryoSerializer$$anonfun$newKryo$5.apply(KryoSerializer.scala:123)
        at scala.Option.map(Option.scala:145)
        at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:123)

When I run exactly the same command on the 0.12.2 release distribution against 
the same Spark cluster, the command completes successfully.

My environment is:
* Ubuntu 14.04
* Oracle-JDK 1.8.0_121
* Spark standalone cluster using this distribution: 
http://d3kbcqa49mib13.cloudfront.net/spark-1.6.3-bin-hadoop2.6.tgz
* Mahout 0.13.0-RC: 
https://repository.apache.org/content/repositories/orgapachemahout-1034/org/apache/mahout/apache-mahout-distribution/0.13.0/apache-mahout-distribution-0.13.0.tar.gz


TIA

--
Michael Müller
Condat AG, Berlin



