Er, actually, the context is created through the bindings call:
// set various system properties here
sc = mahoutSparkContext(masterUrl = options.master, appName = "ItemSimilarityJob",
  customJars = Traversable.empty[String])
Since the context is created here, any extra conf properties must be defined
before the call via
System.setProperty("spark.executor.memory", "2g")
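So today the driver ends up doing roughly this (just a sketch; the second property is only an example of a standard Spark conf key, not something the job necessarily needs):

// extra conf has to go through system properties, and it has to happen
// before mahoutSparkContext() builds the context
System.setProperty("spark.executor.memory", "2g")
System.setProperty("spark.kryoserializer.buffer.mb", "200")  // example only

sc = mahoutSparkContext(masterUrl = options.master, appName = "ItemSimilarityJob",
  customJars = Traversable.empty[String])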
Seems like it would be better to split this into setup and context creation, so you could create the conf and access it via

sparkConf.set("spark.executor.memory", "2g").set(…)

Then, once all setup of the conf is done, get the created context, which actually starts the job.
It might look something like:

mc = mahoutContext(masterUrl = options.master, appName = "ItemSimilarityJob",
  customJars = Traversable.empty[String])
  .set("spark.executor.memory", "2g")
  .set(…)
  .start
So only start returns a context; the other methods return the current conf.
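To make that a bit more concrete, here's a rough sketch of what I mean (MahoutContextBuilder and mahoutContext are made-up names here, not existing Mahout API):

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical sketch only
class MahoutContextBuilder(masterUrl: String, appName: String,
                           customJars: Traversable[String]) {

  private val conf = new SparkConf()
    .setMaster(masterUrl)
    .setAppName(appName)
    .setJars(customJars.toSeq)

  // set() just accumulates conf and returns the builder for chaining
  def set(key: String, value: String): MahoutContextBuilder = {
    conf.set(key, value)
    this
  }

  // only start() creates the SparkContext, i.e. actually starts the job
  def start(): SparkContext = new SparkContext(conf)
}

def mahoutContext(masterUrl: String, appName: String,
                  customJars: Traversable[String] = Traversable.empty): MahoutContextBuilder =
  new MahoutContextBuilder(masterUrl, appName, customJars)

The Kryo setup the DSL needs could then just be more set() calls applied before start.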
I’m sure it could be done in other ways.
On May 28, 2014, at 9:07 AM, Pat Ferrel <[email protected]> wrote:
For purposes outside the DSL it seems we need to wrap the SparkContext with
something like a MahoutContext. The current sparkBindings object looks pretty
DSL-specific. It sets some Kryo properties, but these need to be accessible to
code outside the DSL. I’ve been creating a raw SparkContext and passing around
the ubiquitous “sc”, which works, but is this the way we should be doing this?