I've found that, in general, it's a pain to build and run Spark inside IntelliJ IDEA.
I guess most people resort to this approach so that they can leverage
the integrated debugger to debug and/or learn Spark internals. A more
convenient way I've been using recently is the remote debugging
feature. In
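The remote-debugging approach mentioned above can be sketched roughly as follows. This is a minimal example, not the author's exact setup: the port (5005) and the example class are arbitrary choices, and it assumes a local Spark checkout with the examples built.

```shell
# Launch the Spark driver JVM with a JDWP agent so an IDE can attach.
# suspend=y makes the JVM wait at startup until the debugger connects,
# so breakpoints set early in driver code are not missed.
./bin/spark-submit \
  --driver-java-options "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005" \
  --class org.apache.spark.examples.SparkPi \
  examples/target/scala-2.11/spark-examples_*.jar
```

In IntelliJ you would then create a "Remote" run configuration pointing at localhost:5005 and start it; the suspended driver resumes once the debugger attaches. The advantage over building inside the IDE is that Spark itself is compiled with the normal Maven/SBT build, so none of the fragile in-IDE project setup is needed.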
+1 (non-binding, of course)
1. Compiled on OS X 10.10 (Yosemite) OK. Total time: 15:04 min
mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4
-Dhadoop.version=2.6.0 -Phive -DskipTests -Dscala-2.11
2. Tested pyspark and MLlib - ran jobs and compared results with 1.3.0;
pyspark works
Thanks Cheng. Yes, the problem is that the setup needed to run inside
IntelliJ changes very frequently. It is unfortunately not simply a one-time
investment to get IntelliJ debugging working properly: the steps required are a
moving target, shifting roughly every one to two months.
Doing remote debugging is