Steve Loughran commented on HADOOP-13692:

Just an FYI: this broke *my own* unit tests on the SPARK-1481 branch.

2016-10-13 18:51:13,888 [ScalaTest-main] INFO  cloud.CloudSuite (Logging.scala:logInfo(54)) - Loading configuration from ../../cloud.xml
2016-10-13 18:51:14,214 [ScalaTest-main] WARN  util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
  at com.fasterxml.jackson.databind.ObjectMapper.<init>(ObjectMapper.java:537)
  at com.fasterxml.jackson.databind.ObjectMapper.<init>(ObjectMapper.java:448)
  at com.amazonaws.util.json.Jackson.<clinit>(Jackson.java:32)
  at com.amazonaws.internal.config.InternalConfig.load(InternalConfig.java:247)
  at com.amazonaws.util.VersionInfoUtils.userAgent(VersionInfoUtils.java:139)
  at com.amazonaws.util.VersionInfoUtils.getUserAgent(VersionInfoUtils.java:95)
  at com.amazonaws.ClientConfiguration.<clinit>(ClientConfiguration.java:42)
[INFO] ------------------------------------------------------------------------

But this problem goes away in spark-assembly, the Spark release, etc.; it is 
purely in this module. That is why I didn't catch it earlier: the system 
integration tests were all happy.


# there's a newer version of jackson in use in spark (2.6.5)
# which overrides the declarations of {{jackson-annotations}} and 
{{jackson-databind}} under hadoop-aws
# and which have transitive dependencies on jackson-common.
# the explicit declaration of jackson-common has pulled that reference one step 
up the dependency graph (i.e. from under 
spark-cloud/hadoop-aws/amazon-aws/jackson-common.jar) to 
# which gives the hadoop-aws version precedence over the one transitively 
referenced by the (overridden) jackson-annotations, pulled in directly from 
the spark-core JAR.
# so creating a version inconsistency which surfaces during test runs.

The problem isn't in spark-assembly.jar as it refers to the spark-core JAR 
directly, picking that version up instead.

Essentially: Maven's version resolution policy is nearest-definition-first, 
so the depth of a transitive dependency controls which version wins, and 
hence whether things run or not. The explicit declaration of the dependency 
was enough to cause this to surface.
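
To make the depth argument concrete, here is a rough Python sketch of nearest-wins mediation. It is a simplified model, not Maven's actual resolver: the two trees are hypothetical reconstructions of the spark-cloud module's dependencies, the {{dep}} and {{resolve}} helpers are invented for illustration, "n/a" stands in for versions that don't matter here, and it assumes spark-core is declared before hadoop-aws so that it wins ties at equal depth.

```python
from collections import deque

def dep(name, version, *children):
    """Build one node of a (hypothetical) dependency tree."""
    return (name, version, list(children))

def resolve(direct_deps):
    """Nearest-wins mediation: walk breadth-first, keep the first version seen."""
    chosen = {}
    queue = deque(direct_deps)
    while queue:
        name, version, children = queue.popleft()
        if name in chosen:
            continue  # a nearer or earlier-declared version already won; skip its subtree
        chosen[name] = version
        queue.extend(children)
    return chosen

# Shared by both trees: spark-core brings in Jackson 2.6.5.
spark_core = dep("spark-core", "n/a",
                 dep("jackson-databind", "2.6.5",
                     dep("jackson-common", "2.6.5")))

# The AWS SDK brings in Jackson 2.5.3 transitively.
aws_sdk = dep("aws-sdk", "n/a",
              dep("jackson-databind", "2.5.3",
                  dep("jackson-common", "2.5.3")))

# Before: hadoop-aws only reaches Jackson through the AWS SDK.
before = [spark_core, dep("hadoop-aws", "n/a", aws_sdk)]

# After: hadoop-aws declares the Jackson artifacts itself, one level higher.
after = [spark_core,
         dep("hadoop-aws", "n/a",
             dep("jackson-databind", "2.2.3"),
             dep("jackson-common", "2.2.3"),
             aws_sdk)]

for label, tree in (("before", before), ("after", after)):
    picked = resolve(tree)
    print(label, picked["jackson-databind"], picked["jackson-common"])
```

In the "before" tree both Jackson artifacts resolve to 2.6.5. In the "after" tree jackson-databind stays at 2.6.5 but jackson-common drops to 2.2.3, because the explicit declaration under hadoop-aws now sits nearer the root than the copy under spark-core's jackson-databind.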

Fix: explicitly exclude the hadoop-aws jackson dependencies, as was already 
done for hadoop-azure. 
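
For reference, the exclusion looks roughly like the fragment below. This is a sketch of the downstream module's POM, not the actual change: the {{hadoop.version}} property is a placeholder, and it assumes the "jackson-common" above corresponds to the {{jackson-core}} artifact under the {{com.fasterxml.jackson.core}} group.

```xml
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-aws</artifactId>
  <!-- hadoop.version is a placeholder property, assumed to be defined elsewhere -->
  <version>${hadoop.version}</version>
  <exclusions>
    <!-- Drop the Jackson 2 artifacts hadoop-aws now declares,
         so the versions coming in via spark-core are the only ones resolved. -->
    <exclusion>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-core</artifactId>
    </exclusion>
    <exclusion>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-databind</artifactId>
    </exclusion>
    <exclusion>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-annotations</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Running {{mvn dependency:tree -Dverbose}} before and after should show which Jackson versions were omitted in favour of nearer ones.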

This is not me faulting my own work (how would I!), only showing that you do 
need to be careful across projects as to what transitive stuff you pull in, as 
it turns out to be incredibly brittle. We didn't change the jackson version 
here, only made that choice explicit, and a downstream test suite fails.

> hadoop-aws should declare explicit dependency on Jackson 2 jars to prevent 
> classpath conflicts.
> -----------------------------------------------------------------------------------------------
>                 Key: HADOOP-13692
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13692
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>            Priority: Minor
>             Fix For: 2.8.0
>         Attachments: HADOOP-13692-branch-2.001.patch
> If an end user's application has a dependency on hadoop-aws and no other 
> Hadoop artifacts, then it picks up a transitive dependency on Jackson 2.5.3 
> jars through the AWS SDK.  This can cause conflicts at deployment time, 
> because Hadoop has a dependency on version 2.2.3, and the 2 versions are not 
> compatible with one another.  We can prevent this problem by changing 
> hadoop-aws to declare explicit dependencies on the Jackson artifacts, at the 
> version Hadoop wants.
