Hi,

We use spark-shell heavily for ad-hoc data analysis as well as iterative
development of the analytics code. A common workflow consists of the
following steps (sketched concretely below the list):

   1. Write a small Scala module, assemble the fat jar
   2. Start spark-shell with the assembly jar file
   3. Try out some ideas in the shell, then capture the code back into the
   module
   4. Go back to step 1 and restart the shell
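
Concretely, one round of the loop looks roughly like this (module, class,
and path names are just placeholders, and I'm assuming sbt-assembly builds
the fat jar):

   # step 1: rebuild the assembly after editing the module
   sbt assembly

   # step 2: restart the shell with the new jar on the classpath
   spark-shell --jars target/scala-2.10/analytics-assembly-0.1.jar

   // step 3: try things out against the packaged code
   scala> import com.example.analytics.SessionStats
   scala> SessionStats.compute(sc.textFile("hdfs:///data/events")).take(10)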

This is very similar to web-app development, and the pain point is the same:
there, a lot of time is spent waiting for new code to be deployed; here, a
lot of time is spent waiting for Spark to restart. Having the ability to
hot-deploy code in the REPL would help a lot, just as hot-deploying in
containers like Play, or using JRebel, has boosted productivity tremendously.
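
To be clear about the gap today: as far as I know, the closest built-in
option is SparkContext.addJar, which ships a rebuilt jar to the executors
but does not reload classes already defined on the driver/REPL side, so a
full restart is still needed (the jar path below is just a placeholder):

   scala> sc.addJar("/path/to/analytics-assembly-0.2.jar")
   // future tasks on the executors can load classes from the new jar,
   // but the REPL on the driver keeps the old class definitions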

I do have code that works with the 1.5.2 release.  Is this something that's
interesting enough to be included in Spark proper?  If so, should I create a
JIRA ticket or a GitHub PR against the master branch?


Cheers,

Kai
