On Mon, Oct 20, 2014 at 11:44 AM, Pat Ferrel <[email protected]> wrote:

> Maybe a more fundamental issue is that we don’t know for sure whether we
> have missing classes or not. The job.jar at least used the pom dependencies
> to guarantee every needed class was present. So the job.jar seems to solve
> the problem but may ship some unnecessary duplicate code, right?
>

No, as I wrote, Spark doesn't work with the job.jar format. Neither, as it
turns out, does more recent Hadoop MR, by the way.

Yes, that is A LOT of duplicate code (it will normally take MINUTES to start
up tasks with all of it, just on copy time). This is absolutely not the way
to go.



> There may be any number of bugs waiting for the time we try running on a
> node machine that doesn’t have some class in its classpath.


No. Assuming any given method is tested on all its execution paths, there
will be no such bugs. Bugs of that sort will only appear if the user is
using algebra directly and calls something from a closure that is not on the
classpath. In that case our answer is the same as for the solver methodology
developers -- use a customized SparkConf while creating the context to
include the stuff you really want.
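
For example, a minimal sketch (the mahoutSparkContext() parameter names and
the jar path below are assumptions, for illustration only):

  import org.apache.spark.SparkConf
  import org.apache.mahout.sparkbindings._

  // list the extra jars the driver's closures will need; mahoutSparkContext()
  // merges the Mahout-specific settings and standard jars into this conf
  val conf = new SparkConf().setJars(Seq("/path/to/commons-math.jar"))

  implicit val ctx = mahoutSparkContext(masterUrl = "spark://localhost:7077",
    appName = "my-driver", sparkConf = conf)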

Another right answer is that we should probably provide a reasonable toolset
here ourselves -- for example, all the stats functionality found in R's base
and stats packages -- so the user is not compelled to go non-native.




> This is exactly what happened with RandomGenerator when it was dropped
> from Spark. If I hadn’t run a test by hand on the cluster it would never
> have shown up in the unit tests. I suspect that this may have led to other
> odd error reports.
>
> Would a script to run all unit tests on a cluster help find out whether we
> have missing classes or not? As I understand it without a job.jar we can’t
> really be sure.
>

This is probably a good idea, indeed. In fact, I may have introduced some of
those when I transitioned the stochastic stuff to Mahout random utils
without retesting it in a distributed setting.

But I would think all one needs is a small modification to the standard
Spark-based test trait that creates the context: check for something like
TEST_MASTER in the environment and use the $TEST_MASTER master instead of
local if one is found. Once that tweak is done, one can easily rerun the
unit tests simply by running

TEST_MASTER=spark://localhost:7077 mvn test

(similar to the way the master is overridden for the shell -- but of course
we don't want the tests to react to the global MASTER variable just in case
it is defined, so we need an aptly named but different one).
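
A minimal sketch of that tweak (the trait name and the mahoutSparkContext()
parameters here are assumptions, not the actual test code in the tree):

  import org.apache.mahout.sparkbindings._

  // hypothetical test helper trait
  trait DistributedSparkSuite {
    // prefer TEST_MASTER from the environment, fall back to local mode
    protected lazy val testMaster: String =
      sys.env.getOrElse("TEST_MASTER", "local[2]")

    protected def initTestContext() =
      mahoutSparkContext(masterUrl = testMaster, appName = "mahout-unit-tests")
  }

A plain "mvn test" then keeps running in local mode, since TEST_MASTER is
simply absent.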


>
> On Oct 20, 2014, at 11:16 AM, Dmitriy Lyubimov <[email protected]> wrote:
>
> PS
>
> All jar-finding routines rely on the MAHOUT_HOME variable to find jars, so
> if you add some logic to add a custom Mahout jar to the context, it should
> rely on it too.
>
> Perhaps the solution could be along the following lines.
>
> findMahoutJars() finds the minimally required set of jars to run. Perhaps
> we can add all of Mahout's transitive dependencies (bar stuff like Hadoop
> and HBase, which are already present in Spark) to some folder in the Mahout
> tree, say $MAHOUT_HOME/libManaged (similar to SBT).
>
> Knowing that, we could perhaps add a helper, findMahoutDependencyJars(),
> which would accept one or more artifact names and find the corresponding
> jars in $MAHOUT_HOME/libManaged, similarly to how findMahoutJars() does it.
>
> findMahoutDependencyJars() should assert that it found all jars requested.
>
> Then the driver code could use that helper to add additional jars to the
> SparkConf before requesting the Spark context.
>
> So, for example, in your case the driver would say
>
> findMahoutDependencyJars( "commons-math" :: Nil )
>
>
> and then add the result to SparkConf.
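>
> A rough sketch of what such a helper could look like (hypothetical code --
> findMahoutDependencyJars() does not exist yet, and the libManaged layout is
> only the proposal above):
>
>   // hypothetical sketch of the proposed helper -- not existing Mahout code
>   def findMahoutDependencyJars(artifacts: Seq[String]): Seq[String] = {
>     val libManaged = new java.io.File(System.getenv("MAHOUT_HOME"), "libManaged")
>     val jars = Option(libManaged.listFiles()).getOrElse(Array.empty[java.io.File])
>     artifacts.map { name =>
>       // fail fast if a requested artifact's jar is not found
>       jars.find(f => f.getName.startsWith(name) && f.getName.endsWith(".jar"))
>         .getOrElse(sys.error(s"jar for '$name' not found in $libManaged"))
>         .getAbsolutePath
>     }
>   }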
>
>
>
> On Mon, Oct 20, 2014 at 11:05 AM, Dmitriy Lyubimov <[email protected]>
> wrote:
>
> >
> >
> > On Mon, Oct 20, 2014 at 10:49 AM, Pat Ferrel <[email protected]>
> > wrote:
> >
> >> I agree; it’s just that different classes required by Mahout are missing
> >> from the environment, depending on what happens to be in Spark. These
> >> deps should be supplied in the job.jar assemblies, right?
> >>
> >
> > No. They should be physically available as jars somewhere, e.g. in the
> > compiled Mahout tree.
> >
> > the "job.xml" assembly in the "spark" module is but a left over from an
> > experiment i ran on job jars with Spark long ago. It's just hanging
> around
> > there but not actually being built. Sorry for confusion. DRM doesn't use
> > job jars. As far as I have established, Spark does not understand job
> jars
> > (it's purely a Hadoop notion -- but even there it has been unsupported or
> > depricated for a long time now).
> >
> > So we can, e.g., create a new assembly for Spark, such as an "optional
> > dependencies" set of jars, and put it somewhere into the compiled tree
> > (I guess similar to the "managed libraries" notion in SBT).
> >
> > Then, if you need any of those, your driver code needs to do the
> > following. The mahoutSparkContext() method accepts an optional SparkConf
> > parameter. Additional jars can be added to the SparkConf before passing
> > it on to mahoutSparkContext(). If you don't supply a SparkConf, the
> > method will create a default one. If you do, it will merge all
> > Mahout-specific settings and standard jars into the context information
> > you supply.
> >
> > As far as I can see, by default the context includes only the math,
> > math-scala, spark and mrlegacy jars. No third-party jars (line 212 in the
> > sparkbindings package). The test that checks this is in
> > SparkBindingsSuite.scala. (Yes, you are correct, the one you mentioned.)
> >
> >
> >
> >
> >
> >
> >>
> >> Trying out the
> >>  test("context jars") {
> >>  }
> >>
> >> findMahoutContextJars(closeables) gets the .jars, and seems to explicitly
> >> filter out the job.jars. The job.jars include needed dependencies so for
> >> a clustered environment shouldn’t these be the only ones used?
> >>
> >>
> >> On Oct 20, 2014, at 10:39 AM, Dmitriy Lyubimov <[email protected]>
> >> wrote:
> >>
> >> Either way, I don't believe there is something specific to 1.0.1, 1.0.2
> >> or 1.1.0 that is causing/not causing classpath errors. It's just that
> >> jars are picked by an explicitly hardcoded artifact "opt-in" policy, not
> >> the other way around.
> >>
> >> It is not enough just to modify the pom in order for something to appear
> >> in the task classpath.
> >>
> >> On Mon, Oct 20, 2014 at 9:35 AM, Dmitriy Lyubimov <[email protected]>
> >> wrote:
> >>
> >>> Note that classpaths for a "cluster" environment can be tested trivially
> >>> by starting 1-2 workers and a standalone Spark master process locally.
> >>> No need to build anything "real". The workers would not know anything
> >>> about Mahout, so unless the proper jars are exposed in the context, they
> >>> would have no way of "faking" access to the classes.
> >>>
> >>> On Mon, Oct 20, 2014 at 9:28 AM, Pat Ferrel <[email protected]>
> >>> wrote:
> >>>
> >>>> Yes, asap.
> >>>>
> >>>> To test this right it has to run on a cluster, so I’m upgrading. When
> >>>> ready it will just be a "mvn clean install" if you already have Spark
> >>>> 1.1.0 running.
> >>>>
> >>>> I would have only expected errors on the CLI drivers, so if anyone else
> >>>> sees runtime errors please let us know. Some errors are very hard to
> >>>> unit test since the environment is different for local (unit tests) and
> >>>> cluster execution.
> >>>>
> >>>>
> >>>> On Oct 20, 2014, at 9:14 AM, Mahesh Balija <[email protected]>
> >>>> wrote:
> >>>>
> >>>> Hi Pat,
> >>>>
> >>>> Can you please give detailed steps to build Mahout against Spark 1.1.0?
> >>>> I built against 1.1.0 but still had class-not-found errors; that's why I
> >>>> reverted back to Spark 1.0.2. Even though the first few steps are
> >>>> successful, I am still facing some issues running the Mahout spark-shell
> >>>> sample commands; (drmData) throws some errors even on 1.0.2.
> >>>>
> >>>> Best,
> >>>> Mahesh.B.
> >>>>
> >>>> On Mon, Oct 20, 2014 at 1:46 AM, peng <[email protected]> wrote:
> >>>>
> >>>>> From my experience 1.1.0 is quite stable, plus it has some performance
> >>>>> improvements that are totally worth the effort.
> >>>>>
> >>>>>
> >>>>> On 10/19/2014 06:30 PM, Ted Dunning wrote:
> >>>>>
> >>>>>> On Sun, Oct 19, 2014 at 1:49 PM, Pat Ferrel <[email protected]>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Getting off the dubious Spark 1.0.1 version is turning out to be a
> >>>>>>> bit of work. Does anyone object to upgrading our Spark dependency?
> >>>>>>> I’m not sure if Mahout built for Spark 1.1.0 will run on 1.0.1 so it
> >>>>>>> may mean upgrading your Spark cluster.
> >>>>>>>
> >>>>>>
> >>>>>> It is going to have to happen sooner or later.
> >>>>>>
> >>>>>> Sooner may actually be less total pain.
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >
>
>
