There's some discussion here as well on just using the Scala REPL for 2.11:
http://apache-spark-developers-list.1001551.n3.nabble.com/Spark-on-Scala-2-11-td6506.html#a6523

Matei's response mentions the features we needed to change from the Scala
REPL (class-based wrappers and where to output the generated classes),
which were added as options to the 2.11 REPL, so we may be able to trim
down a bunch of the custom REPL code once 2.11 becomes standard.
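
For anyone following along, the two options Matei refers to are, as far as I
can tell, the 2.11 flags -Yrepl-class-based and -Yrepl-outdir. A minimal
sketch of passing them to an embedded 2.11 interpreter (the output directory
is just an example path):

    import scala.tools.nsc.Settings
    import scala.tools.nsc.interpreter.IMain

    val settings = new Settings
    settings.usejavacp.value = true
    // -Yrepl-class-based wraps each REPL line in a class instead of an object;
    // -Yrepl-outdir controls where the generated wrapper classes are written,
    // so they can be served to executors.
    settings.processArguments(
      List("-Yrepl-class-based", "-Yrepl-outdir", "/tmp/repl-classes"),
      processAll = true)

    val intp = new IMain(settings)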


On Fri, May 30, 2014 at 4:16 AM, Kan Zhang <kzh...@apache.org> wrote:

> One reason is that the standard Scala REPL uses object-based wrappers, and
> their static initializers will be run on remote worker nodes, which may fail
> due to differences between the driver and worker nodes. See the discussion
> here: https://groups.google.com/d/msg/scala-internals/h27CFLoJXjE/JoobM6NiUMQJ
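
To make the failure mode concrete, here is a deliberately simplified sketch of
the two wrapping strategies. It is not the code either REPL actually
generates, and the config file path is invented:

    // Object-based wrapping (roughly what the stock 2.10 REPL does): the
    // user's statements live in the object's initializer. When a closure
    // referencing `threshold` is deserialized on a worker, loading this
    // object re-runs the whole initializer there, and the file read fails
    // if the file only exists on the driver.
    object LineWrappedAsObject {
      val config    = scala.io.Source.fromFile("/etc/driver-only.conf").mkString
      val threshold = config.trim.toInt
    }

    // Class-based wrapping (what Spark's REPL needs): the statements run
    // once, in an instance created on the driver, and only the captured
    // field values travel with the serialized closure.
    class LineWrappedAsClass extends Serializable {
      val config    = scala.io.Source.fromFile("/etc/driver-only.conf").mkString
      val threshold = config.trim.toInt
    }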
>
>
> On Fri, May 30, 2014 at 1:12 AM, Aniket <aniket.bhatna...@gmail.com>
> wrote:
>
> > My apologies in advance if this is a dev mailing list topic. I am working
> > on a small project to provide a web interface to the Spark REPL. The
> > interface will allow people to use the Spark REPL and perform exploratory
> > analysis on the data. I already have a Play application running that
> > provides a web interface to the standard Scala REPL, and I am just looking
> > to extend it to optionally include support for the Spark REPL. My initial
> > idea was to include the Spark dependencies in the project, create a new
> > instance of SparkContext and bind it to a variable (let's say 'sc') using
> > imain.bind("sc", sparkContext). While theoretically this may work, I am
> > trying to understand why the Spark REPL takes a different path by creating
> > its own SparkILoop, SparkIMain, etc. Can anyone help me understand why
> > there was a need to provide custom versions of IMain, ILoop, etc. instead
> > of embedding the standard Scala REPL and binding a SparkContext instance?
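
For reference, the embedding approach described above looks roughly like the
following. This is only a minimal sketch; the app name and master URL are made
up for illustration:

    import scala.tools.nsc.Settings
    import scala.tools.nsc.interpreter.IMain
    import org.apache.spark.{SparkConf, SparkContext}

    val settings = new Settings
    settings.usejavacp.value = true

    // SparkContext created and owned by the host (web) application.
    val sparkContext = new SparkContext(
      new SparkConf().setAppName("web-repl").setMaster("local[*]"))

    // Stock interpreter, with the existing context bound under the name `sc`.
    val imain = new IMain(settings)
    imain.bind("sc", "org.apache.spark.SparkContext", sparkContext)
    imain.interpret("""val n = sc.parallelize(1 to 100).count()""")

As the replies above explain, this tends to work only until the wrapper
classes generated for each interpreted line have to reach remote executors.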
> >
> > Here is my analysis so far:
> > 1. ExecutorClassLoader - I understand this is needed to load classes from
> > HDFS. Perhaps this could have been plugged into the standard Scala REPL
> > using settings.embeddedDefaults(classLoaderInstance); see the sketch after
> > this list. Also, it's not clear what ConstructorCleaner does.
> >
> > 2. SparkCommandLine & SparkRunnerSettings - Allow for providing an extra
> > -i file argument to the REPL. The standard sourcepath wouldn't have
> > sufficed?
> >
> > 3. SparkExprTyper - The only difference between standard ExprTyper and
> > SparkExprTyper is that repldbg is replaced with logDebug. Not sure if
> > this was intentional/needed.
> >
> > 4. SparkILoop - Has a few deviations from the standard ILoop class but
> > this could have been managed by extending or wrapping ILoop or using
> > settings. Not sure what triggered the need to copy the source code and
> > make edits.
> >
> > 5. SparkILoopInit - Changes the welcome message and binds the spark
> > context in the interpreter. Welcome message could have been changed by
> > extending ILoopInit.
> >
> > 6. SparkIMain - Contains quite a few changes around class
> > loading/logging/etc. but I found it very hard to figure out if extension
> > of IMain was an option and what exactly didn't work/will not work with
> > IMain.
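
On points 1, 4 and 5, the "extend instead of fork" idea would look roughly
like the sketch below. The class names are made up, and whether
ExecutorClassLoader's remote class fetching can really be reduced to
embeddedDefaults is exactly the open question:

    import scala.tools.nsc.Settings
    import scala.tools.nsc.interpreter.ILoop

    // Hypothetical subclass of the stock ILoop.
    class EmbeddedSparkILoop extends ILoop {
      // Point 5: the welcome banner is just an overridable method on ILoop.
      override def printWelcome(): Unit =
        echo("Welcome to an embedded Spark REPL (illustrative banner).")
    }

    object WebReplLauncher {
      def main(args: Array[String]): Unit = {
        val settings = new Settings
        settings.usejavacp.value = true
        // Point 1: seed the interpreter's classpath and parent class loader
        // from an existing loader, the usual hook when embedding the REPL.
        settings.embeddedDefaults(getClass.getClassLoader)
        new EmbeddedSparkILoop().process(settings)
      }
    }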
> >
> > The rest of the classes seem very similar to their standard counterparts.
> > I have a feeling the Spark REPL can be refactored to embed the standard
> > Scala REPL. I know such refactoring would not help the Spark project as
> > such, but it would help people embed the Spark REPL in much the same way
> > it's done with the standard Scala REPL. Thoughts?
> >
