[ 
https://issues.apache.org/jira/browse/SPARK-20573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-20573:
---------------------------------
    Labels: bulk-closed  (was: )

> --packages fails when transitive dependency can only be resolved from 
> repository specified in POM's <repositories> tag
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-20573
>                 URL: https://issues.apache.org/jira/browse/SPARK-20573
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Submit
>    Affects Versions: 2.0.0, 2.1.0
>            Reporter: Josh Rosen
>            Priority: Major
>              Labels: bulk-closed
>
> With a clean Ivy cache, run the following command:
> {code}
> ./bin/spark-shell --packages com.twitter.elephantbird:elephant-bird-core:4.4
> {code}
> This will fail with {{unresolved dependency: 
> com.hadoop.gplcompression#hadoop-lzo;0.4.16: not found}}.
> If you look at the elephant-bird-core POM (at 
> http://central.maven.org/maven2/com/twitter/elephantbird/elephant-bird-core/4.4/elephant-bird-core-4.4.pom)
> you'll see a direct dependency on hadoop-lzo. This library is only present 
> in Twitter's public Maven repository, hosted at http://maven.twttr.com. The 
> elephant-bird-core POM does not directly declare Twitter's external 
> repository. Instead, that external repository is inherited from 
> elephant-bird-core's parent POM (at 
> http://central.maven.org/maven2/com/twitter/elephantbird/elephant-bird/4.4/elephant-bird-4.4.pom).
> From the Ivy output, it looks like Ivy didn't even attempt to resolve from 
> the Twitter repo:
> {code}
> :: problems summary ::
> :::: WARNINGS
>               module not found: com.hadoop.gplcompression#hadoop-lzo;0.4.16
>       ==== local-m2-cache: tried
>         file:/Users/joshrosen/.m2/repository/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.pom
>         -- artifact com.hadoop.gplcompression#hadoop-lzo;0.4.16!hadoop-lzo.jar:
>         file:/Users/joshrosen/.m2/repository/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.jar
>       ==== local-ivy-cache: tried
>         /Users/joshrosen/.ivy2/local/com.hadoop.gplcompression/hadoop-lzo/0.4.16/ivys/ivy.xml
>         -- artifact com.hadoop.gplcompression#hadoop-lzo;0.4.16!hadoop-lzo.jar:
>         /Users/joshrosen/.ivy2/local/com.hadoop.gplcompression/hadoop-lzo/0.4.16/jars/hadoop-lzo.jar
>       ==== central: tried
>         https://repo1.maven.org/maven2/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.pom
>         -- artifact com.hadoop.gplcompression#hadoop-lzo;0.4.16!hadoop-lzo.jar:
>         https://repo1.maven.org/maven2/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.jar
>       ==== spark-packages: tried
>         http://dl.bintray.com/spark-packages/maven/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.pom
>         -- artifact com.hadoop.gplcompression#hadoop-lzo;0.4.16!hadoop-lzo.jar:
>         http://dl.bintray.com/spark-packages/maven/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.jar
>               ::::::::::::::::::::::::::::::::::::::::::::::
>               ::          UNRESOLVED DEPENDENCIES         ::
>               ::::::::::::::::::::::::::::::::::::::::::::::
>               :: com.hadoop.gplcompression#hadoop-lzo;0.4.16: not found
>               ::::::::::::::::::::::::::::::::::::::::::::::
> {code}
> If you manually specify the Twitter repository as an additional external 
> repository then everything works fine.
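> For example, assuming the workaround uses spark-submit's {{--repositories}} 
> option (the report does not say which mechanism was used), the invocation 
> might look like this sketch:
> {code}
> # Sketch only: pass Twitter's repository explicitly so that Ivy also
> # searches it when resolving the transitive hadoop-lzo dependency.
> ./bin/spark-shell \
>   --packages com.twitter.elephantbird:elephant-bird-core:4.4 \
>   --repositories http://maven.twttr.com
> {code}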
> This is a somewhat frustrating behavior from an end-user's point of view 
> because unless they dig through the POMs themselves it is not obvious why 
> things are broken or how to fix them. When Maven resolves this coordinate it 
> properly fetches the transitive dependencies from the additional repositories 
> specified in the referencing POMs. My hunch is that this behavior is caused 
> by either a bug in Ivy itself or a bug in Spark's usage / configuration of 
> the embedded Ivy resolver.
> It would be great to see if we can find other test-cases to narrow down the 
> scope of the bug. I'm wondering whether POM-specified repositories will work 
> if they're specified in the POM of the top-level dependency being resolved. 
> It would also be useful to determine whether Ivy handles additional 
> repositories in the top-level of transitive dependencies' POMs: maybe the 
> problem is the specific combination of transitive dep + repository inherited 
> from that dep's parent POM.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

