I think this might actually be a Spark bug, possibly related to: https://issues.apache.org/jira/browse/SPARK-2678
Even though we don't use spark-submit to start the Mahout shell, it seems that the CLI options are being dropped somewhere in SparkILoop.process(args). I get the same error:

bad option: '--driver-java-options=-Dspark.kryoserializer.buffer.mb=200'

running:

$ mahout spark-shell --driver-java-options="-Dspark.kryoserializer.buffer.mb=200"

and:

$ $SPARK_HOME/bin/spark-shell --driver-java-options="-Dspark.kryoserializer.buffer.mb=200"

> From: [email protected]
> To: [email protected]
> Subject: RE: setting spark config parameters for shell
> Date: Thu, 11 Sep 2014 13:09:21 -0400
>
> I'm just using it to test out some of the changes that I've made to NB at the
> math-scala level. It's great to test these abstract things out with.
>
> I'm going to look through the mahout script; I seem to remember that's around
> where I stopped looking a couple of months ago.
>
> > Date: Thu, 11 Sep 2014 09:58:53 -0700
> > Subject: Re: setting spark config parameters for shell
> > From: [email protected]
> > To: [email protected]
> >
> > Yeah, these things need to be tweaked for a particular application. Truth
> > is, I have not yet used the shell for anything formidable. For me, at
> > this point it is just a fine concept. I've been doing embedded Spark use
> > (at which point one of course has full control over SparkConf stuff).
> >
> > On Thu, Sep 11, 2014 at 9:55 AM, Andrew Palumbo <[email protected]> wrote:
> >
> > > Thanks. I was looking into this a while back but got sidetracked
> > > and am just coming back to it. But I do remember thinking that the
> > > arguments may have been dropped by /bin/mahout spark-shell.
> > >
> > > I tried: $ /bin/mahout spark-shell -Dspark.kryoserializer.buffer.mb=200
> > >
> > > but it's not showing up in the properties on the Environment tab of
> > > localhost:8080 -> "Mahout Spark Shell".
> > >
> > > I'll look back into /bin/mahout to see if there's a problem there.
> > >
> > > I'm getting the following error:
> > >
> > > com.esotericsoftware.kryo.KryoException: Buffer overflow. Available: 0,
> > > required: 1274
> > >
> > > after doing some work on a DRM containing the output of seq2sparse from
> > > the 20newsgroups example.
> > >
> > > It seems to be failing at .collect.
> > >
> > > > Date: Thu, 11 Sep 2014 09:39:32 -0700
> > > > Subject: Re: setting spark config parameters for shell
> > > > From: [email protected]
> > > > To: [email protected]
> > > >
> > > > I remember I had a good answer for this type of thing in the context
> > > > of the shell, but I have forgotten the answer... bummer :)
> > > >
> > > > In Spark, you can just pass them in with -Dname=value. It may need
> > > > tweaking of the bin/mahout script, though; that's the part I don't
> > > > remember.
> > > >
> > > > I thought we were setting a reasonable default, though.
> > > >
> > > > On Thu, Sep 11, 2014 at 9:22 AM, Andrew Palumbo <[email protected]>
> > > wrote:
> > > >
> > > > > Does anybody know of an easy way to set the config parameters for the
> > > > > mahout spark-shell?
> > > > >
> > > > > I need to adjust: spark.kryoserializer.buffer.mb
> > > > >
> > > > > I've been digging through the Spark docs but not having much luck.
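For anyone hitting the same buffer overflow: the embedded-Spark route mentioned above sidesteps the shell launcher entirely, since you can set the serializer properties on a SparkConf yourself before creating the context. A minimal sketch, not the mahout script's actual wiring; the app name and master here are placeholders:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical embedded setup: raise the Kryo buffer past the
// "Buffer overflow. Available: 0, required: 1274" limit seen above.
val conf = new SparkConf()
  .setAppName("mahout-embedded")  // placeholder name
  .setMaster("local[*]")          // placeholder master
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryoserializer.buffer.mb", "200") // the property under discussion

val sc = new SparkContext(conf)
```

Settings made this way show up on the Environment tab of the Spark UI, which is a quick way to confirm the property actually took effect.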
