[ https://issues.apache.org/jira/browse/SPARK-16725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15394587#comment-15394587 ]

Sean Owen commented on SPARK-16725:
-----------------------------------

Thanks for the reminder [~vanzin]; I've caught up again on the current state here.

There's a long story here, but no: Spark shades Guava and therefore doesn't 
leak it. However, before 2.x it did have to 'leak' Guava's Optional class, 
because Optional appeared in Spark's public Java API.

Spark depends on Hadoop, and Hadoop depends on unshaded Guava, so any assembly 
containing Spark also needs Hadoop and some unshaded Guava. That's what the 
Guava 14 dependency is about, IIUC. I suspect that in principle Spark could 
both include Guava 14 for other dependencies and use a shaded Guava 19 
internally, but that could be tricky. I suppose there just hasn't been a 
compelling reason.
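For what it's worth, that dual-version setup would presumably rest on class 
relocation, which is how shading works today: Spark's own classes get rewritten 
to reference a renamed, bundled copy of Guava, while Hadoop keeps resolving 
plain com.google.common from whatever unshaded Guava is on the classpath. A 
minimal maven-shade-plugin sketch (the relocated package name here is 
illustrative, not necessarily what Spark's build actually uses):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <relocations>
      <relocation>
        <!-- Rewrite Spark-internal references to a private, bundled copy of
             Guava 19; Hadoop's references to com.google.common still resolve
             to the unshaded Guava 14 on the classpath. -->
        <pattern>com.google.common</pattern>
        <shadedPattern>org.spark_project.guava</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
</plugin>
```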

Guava isn't backwards compatible across more than a few versions, actually. 
Fair enough, they're all major releases. But moving the dependency forward 
does, in general, break things for downstream users.

The bad news is that, whatever Spark does, you still face Guava leakage from 
Hadoop or other projects. Hence the advice to shield yourself by shading your 
own copy is pretty good, downsides notwithstanding, because this problem crops 
up in more than just Spark.
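If you do go that route in your own application, the shading looks much the 
same from the user side: bundle the Guava version you need under a private 
package name in your fat jar, so whatever Spark and Hadoop put on the 
classpath can't clash with it. A sketch (the myapp.shaded.guava name is just 
a placeholder, pick your own):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- Your code is rewritten to use this private copy of Guava,
                 e.g. 19, independent of the Guava 14 Spark ships. -->
            <pattern>com.google.common</pattern>
            <shadedPattern>myapp.shaded.guava</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```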

> Migrate Guava to 16+?
> ---------------------
>
>                 Key: SPARK-16725
>                 URL: https://issues.apache.org/jira/browse/SPARK-16725
>             Project: Spark
>          Issue Type: Improvement
>          Components: Build
>    Affects Versions: 2.0.1
>            Reporter: Min Wei
>            Priority: Minor
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> Currently Spark depends on an old version of Guava, version 14. However, the 
> Spark Cassandra connector asserts on Guava version 16 and above. 
> It would be great to update the Guava dependency to version 16+.
> diff --git a/core/src/main/scala/org/apache/spark/SecurityManager.scala b/core/src/main/scala/org/apache/spark/SecurityManager.scala
> index f72c7de..abddafe 100644
> --- a/core/src/main/scala/org/apache/spark/SecurityManager.scala
> +++ b/core/src/main/scala/org/apache/spark/SecurityManager.scala
> @@ -23,7 +23,7 @@ import java.security.{KeyStore, SecureRandom}
>  import java.security.cert.X509Certificate
>  import javax.net.ssl._
>  
> -import com.google.common.hash.HashCodes
> +import com.google.common.hash.HashCode
>  import com.google.common.io.Files
>  import org.apache.hadoop.io.Text
>  
> @@ -432,7 +432,7 @@ private[spark] class SecurityManager(sparkConf: SparkConf)
>          val secret = new Array[Byte](length)
>          rnd.nextBytes(secret)
>  
> -        val cookie = HashCodes.fromBytes(secret).toString()
> +        val cookie = HashCode.fromBytes(secret).toString()
>          SparkHadoopUtil.get.addSecretKeyToUserCredentials(SECRET_LOOKUP_KEY, cookie)
>          cookie
>        } else {
> diff --git a/core/src/main/scala/org/apache/spark/SparkEnv.scala b/core/src/main/scala/org/apache/spark/SparkEnv.scala
> index af50a6d..02545ae 100644
> --- a/core/src/main/scala/org/apache/spark/SparkEnv.scala
> +++ b/core/src/main/scala/org/apache/spark/SparkEnv.scala
> @@ -72,7 +72,7 @@ class SparkEnv (
>  
>    // A general, soft-reference map for metadata needed during HadoopRDD split computation
>    // (e.g., HadoopFileRDD uses this to cache JobConfs and InputFormats).
> -  private[spark] val hadoopJobMetadata = new MapMaker().softValues().makeMap[String, Any]()
> +  private[spark] val hadoopJobMetadata = new MapMaker().weakValues().makeMap[String, Any]()
>  
>    private[spark] var driverTmpDir: Option[String] = None
>  
> diff --git a/pom.xml b/pom.xml
> index d064cb5..7c3e036 100644
> --- a/pom.xml
> +++ b/pom.xml
> @@ -368,8 +368,7 @@
>        <dependency>
>          <groupId>com.google.guava</groupId>
>          <artifactId>guava</artifactId>
> -        <version>14.0.1</version>
> -        <scope>provided</scope>
> +        <version>19.0</version>
>        </dependency>
>        <!-- End of shaded deps -->
>        <dependency>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
