[ https://issues.apache.org/jira/browse/SPARK-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15299793#comment-15299793 ]
Steve Loughran commented on SPARK-13599:
----------------------------------------
Sorry to hear about this: I know precisely how frustrating JAR problems can be.
# I don't know if/when a 1.6.2 will ship: you'll have to raise that on the
Spark developer list.
# I do know that the version of Spark shipped by Hortonworks does have Groovy
stripped from the assembly. I don't know about CDH; [~srowen] should know about that.
I was about to point you at Hadoop's FindClass entry point, whose aim in life
is to show where a class is surfacing from, but HADOOP-9044 isn't in any
shipping Hadoop version. You could look at the patch and see how to
re-implement it yourself; it's pretty simple.
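As a rough illustration of the idea (this is not the HADOOP-9044 patch, and the class name {{WhereIsClass}} is made up for this example), a minimal sketch using only standard JDK calls could look like the following. It asks the class's protection domain for its code source and, if that is unavailable (as it is for JDK core classes), falls back to resolving the .class file as a resource.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical helper, not part of Hadoop or Spark: reports where a class was loaded from.
class WhereIsClass {

    /** Returns the location (typically a JAR or directory URL) that supplied the named class. */
    static String locate(String className) throws ClassNotFoundException {
        // Load without initializing, so no static initializers run as a side effect.
        Class<?> clazz = Class.forName(className, false, WhereIsClass.class.getClassLoader());

        // First choice: the code source recorded for the class.
        CodeSource source = clazz.getProtectionDomain().getCodeSource();
        if (source != null && source.getLocation() != null) {
            return source.getLocation().toString();
        }

        // Fallback: look up the .class file as a resource. Classes loaded by the
        // bootstrap loader report a null class loader, so use the system lookup.
        String resource = className.replace('.', '/') + ".class";
        ClassLoader loader = clazz.getClassLoader();
        URL url = (loader != null)
                ? loader.getResource(resource)
                : ClassLoader.getSystemResource(resource);
        return (url != null) ? url.toString() : "(unknown source)";
    }

    public static void main(String[] args) throws Exception {
        for (String name : args) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Running it with the same classpath as the failing job and a class name such as groovy.lang.GroovyObject should print the JAR that class is coming from, or fail with a ClassNotFoundException if it is not on the classpath at all.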
Finally, note that Groovy was pulled to address a security issue when
deserializing Java- or Kryo-serialized objects. If you add Groovy to the
classpath, you may restore that vulnerability.
> Groovy-all ends up in spark-assembly if hive profile set
> --------------------------------------------------------
>
> Key: SPARK-13599
> URL: https://issues.apache.org/jira/browse/SPARK-13599
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 1.5.0, 1.6.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Fix For: 1.6.2, 2.0.0
>
>
> If you do a build with {{-Phive,hive-thriftserver}} then the contents of
> {{org.codehaus.groovy:groovy-all}} get into the spark-assembly.jar.
> This is bad because
> * it makes the JAR bigger
> * it makes the build longer
> * it's an uber-JAR itself, so it can include other things (maybe even
> conflicting things)
> * it's something else that needs to be kept up to date security-wise