Github user heathermiller commented on the pull request:
https://github.com/apache/spark/pull/1929#issuecomment-52683367
Yep, so shortly after I posted my last message, I had a conference call
with @gkossakowski and then chatted briefly with @mateiz. Here's an attempt to
summarize our discussions. Important note: @gkossakowski is currently
recovering from surgery on his wrists, so written responses from him ATM are
precious :) (and we shouldn't push him for too many...)
### RE: point 2)
@gkossakowski and I discussed the issue of shadowing other jars on the
classpath.
    He explained a bit about the history of `invalidateClassPathEntries`: he
was previously tasked with working on the resident compiler to enable faster
incremental compilation, usable by both the Scala IDE and sbt. The
functionality he was working on thus had to manage large and complicated
builds, where shadowing and managing multiple versions of the same library
would be a real challenge (he explained a bit about the sort of maintenance
that'd have to be done on Symbols, and all of the things that could go wrong;
it's really a nightmare). In the end, the approach was abandoned due to its
complexity. `invalidateClassPathEntries` is a not-completely-baked artifact of
this effort, and as far as I understand, no, it doesn't support any of the
scenarios you list.
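    For concreteness, here's a rough sketch of how a REPL could lean on that
artifact when a jar is appended at runtime. The helper name is made up; the
only real API here is `Global.invalidateClassPathEntries`, as it exists in
2.10.x:
```scala
import scala.tools.nsc.Global

// Hypothetical helper: after appending a jar to the REPL's classpath,
// ask the resident compiler to re-scan that entry so its classes
// become visible to subsequently compiled lines.
def refreshCompilerClassPath(global: Global, jarPath: String): Unit =
  global.invalidateClassPathEntries(jarPath)
```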
    However, my intuition (which could be wrong) is that, **for now**, this is
likely less important for users of spark-shell. Rationale: you're manually
adding jars to the classpath, and it's unlikely that the list of jars one would
add would be so numerous that you'd shadow something else you've added. The
real concern is that someone adds a different version of a library that's
already on the classpath, and you're right: for this we have no support, as far
as I can tell. I agree with you that if folks are going to import jars at
runtime, they should just _be aware_ of conflicting classes (at least importing
jars at runtime would now once again be a possibility).
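    For anyone following along, the "first entry wins" behavior that makes
shadowing possible is just standard `URLClassLoader` ordering; the paths and
class names below are made up:
```scala
import java.net.{URL, URLClassLoader}

// URLClassLoader searches its URLs in order, so a class present in both
// jars resolves from the FIRST one; the later jar's version is shadowed.
val loader = new URLClassLoader(Array(
  new URL("file:/tmp/mylib-1.0.jar"), // wins for classes present in both jars
  new URL("file:/tmp/mylib-2.0.jar")
))
val cls = loader.loadClass("com.example.Util") // loads the 1.0 definition
```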
    @gkossakowski also assured me that the logic this PR touches most likely
won't change for the remainder of the 2.10.x series, so we shouldn't worry
about any of this breaking.
    The particularly important bit of my conversation with @gkossakowski,
though, centered on what spark-shell will do when migrating to 2.11.x, when you
can't even rely on a partial solution like `invalidateClassPathEntries`. In the
end, the consensus was that, on the scala/scalac side, we need at minimum some
lightweight fix for [SI-6502](https://issues.scala-lang.org/browse/SI-6502)
that's at least sufficient for spark-shell.
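    For reference, the symptom of SI-6502 looks roughly like this (illustrative
session; the jar and class names are made up): `:cp` reports success, but the
compiler never actually sees the new classes:
```
scala> :cp /tmp/mylib.jar
Added '/tmp/mylib.jar'.  Your new classpath is:
".:/tmp/mylib.jar"

scala> new com.example.Util
<console>:8: error: object example is not a member of package com
              new com.example.Util
                      ^
```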
    Right now, the idea is to write a specification of the bare minimum that
spark-shell/the Scala REPL should do when there are conflicts on the classpath.
And, ideally, we could try to get this fix into one of the [next two releases
of the 2.11.x series](https://github.com/scala/scala/milestones) (due dates:
Sept 19, Nov 28), so that the fix doesn't have to live in Spark and depend on
delicate scalac internals.
### RE: point 1)
    After chatting briefly with @mateiz, we determined that later-appended jars
should take precedence (right, Matei? Or do I have that backwards?)
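    A minimal sketch of "later wins" semantics, assuming nothing about this
PR's actual implementation: each newly appended jar can be wrapped in a
child-first loader that consults its own jar before delegating to the loader
holding the earlier jars.
```scala
import java.net.{URL, URLClassLoader}

// Child-first loader: classes in this (later-appended) jar shadow
// earlier definitions; everything else falls through to the parent.
class ChildFirstLoader(url: URL, parent: ClassLoader)
    extends URLClassLoader(Array(url), parent) {
  override def loadClass(name: String, resolve: Boolean): Class[_] = {
    val c = Option(findLoadedClass(name)).getOrElse {
      try findClass(name)                // try this jar first...
      catch {
        case _: ClassNotFoundException =>
          super.loadClass(name, resolve) // ...then earlier jars / defaults
      }
    }
    if (resolve) resolveClass(c)
    c
  }
}

// Usage sketch: fold each added jar on top of the current loader, so the
// most recently added jar is consulted first.
var current: ClassLoader = getClass.getClassLoader
def addJar(url: URL): Unit = current = new ChildFirstLoader(url, current)
```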