Github user elbamos commented on the pull request:
https://github.com/apache/incubator-zeppelin/pull/463#issuecomment-162751848
@jongyoul
I think I was not clear. My position is not that we must treat Spark just
like any other interpreter. I think we should either treat it as just another
interpreter, or treat it as special. We should pick one of those and then
commit to that choice fully, not be half one and half the other.
(If we were to vote, I would probably vote for Spark to be "special.")
BUT -- either way, whether it is special or not has no effect on this PR.
Whichever way Zeppelin treats Spark, it should not be trying to install
Spark, and the SparkInterpreter should not try to connect to Spark if
SPARK_HOME is not set. Even if Spark is special, it should not be installed
by Zeppelin!
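To make the proposal concrete, here is a minimal sketch (not Zeppelin's actual code; the class and method names are hypothetical) of the guard being argued for: the interpreter checks for an external SPARK_HOME and refuses to connect when it is absent, rather than falling back to a bundled "manual Spark."

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a SPARK_HOME guard for a Spark interpreter.
// The env map stands in for System.getenv() so the logic is testable.
public class SparkHomeGuard {

    /** Returns the configured SPARK_HOME, or null when none is set. */
    static String sparkHome(Map<String, String> env) {
        String home = env.get("SPARK_HOME");
        return (home == null || home.trim().isEmpty()) ? null : home;
    }

    /**
     * True only when an external Spark installation is configured.
     * With no SPARK_HOME, the interpreter should refuse to connect
     * instead of attempting to install or embed Spark itself.
     */
    static boolean shouldConnect(Map<String, String> env) {
        return sparkHome(env) != null;
    }

    public static void main(String[] args) {
        Map<String, String> env = new HashMap<>();
        System.out.println(shouldConnect(env)); // prints "false": no SPARK_HOME

        env.put("SPARK_HOME", "/opt/spark");
        System.out.println(shouldConnect(env)); // prints "true": external Spark configured
    }
}
```

The design point is simply that the absence of SPARK_HOME becomes an explicit, early refusal rather than a trigger for Zeppelin to install Spark on its own.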
The use of "manual Spark" does not work reliably. It breaks the build
process consistently, it produces failures that are very hard to trace, and
it breaks whenever a new version of Spark comes out. I suspect it is also
the cause of many CI failures.
The installation of Spark by Zeppelin does not work reliably. It often
produces installations that fail because of second-, third-, or fourth-degree
transitive dependency conflicts. For example, if you build against Spark 1.5
and then run against 1.4, you will often get runtime errors about
incompatible versions of akka.
(Here's something to try -- set SPARK_HOME, spark.home, and the Spark build
version to three different versions of Spark, and watch what happens.)
"Manual spark" has proven to not be maintainable.
In addition: I have been told that the decision has **already been made** to
remove support for running Spark without SPARK_HOME, and that there is
*already* a pending PR to remove this "feature."
In that case, this PR is pointless. It does not matter if SparkInterpreter
is special or not.
But **this** PR would be a bad idea even if "manual Spark" were staying.
If "manual Spark" were staying, the correct plan now would be to fix the
dependency architecture so we do not have to change 2000 lines of code and
break *everything* whenever a new Spark version comes out.
So -- no matter what about "special" Spark, and no matter about "manual
Spark," **this** PR is a bad idea.