[
https://issues.apache.org/jira/browse/IMPALA-10455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joe McDonnell reassigned IMPALA-10455:
--------------------------------------
Assignee: Joe McDonnell
> Reorder Maven repositories to have cleaner mirror semantics
> -----------------------------------------------------------
>
> Key: IMPALA-10455
> URL: https://issues.apache.org/jira/browse/IMPALA-10455
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend, Infrastructure
> Affects Versions: Impala 4.0
> Reporter: Joe McDonnell
> Assignee: Joe McDonnell
> Priority: Major
>
> Using a Maven mirror to replace Maven Central can speed up the Impala build
> substantially. However, the artifacts that are present in the toolchain s3
> bucket are unlikely to be resolved by the mirror, because they are
> not in Maven Central or other repositories. If the Maven mirror has a long
> list of source repositories, a miss can be expensive, because it may try each
> of the mirror's source repositories. It would be useful to exclude the s3
> bucket Maven repositories from the mirroring. For example, this settings.xml
> would do that:
> {noformat}
> <settings>
>   <mirrors>
>     <mirror>
>       <mirrorOf>external:*,!impala.cdp.repo</mirrorOf>
>       <name>mirror-repo</name>
>       <url>http://url.to.the.mirror/</url>
>       <id>mirror-repo</id>
>     </mirror>
>   </mirrors>
> </settings>{noformat}
> It mirrors everything that is not local and not from impala.cdp.repo (which
> points to an S3 bucket).
> Unfortunately, this rule doesn't work. Everything still tries the mirror.
> Maven is trying repositories in the order that they are specified in the
> pom.xml, and it sees cdh.rcs.releases.repo before it sees impala.cdp.repo
> (https://github.com/apache/impala/blob/master/java/pom.xml#L150). It also
> sees multiple banned repos (i.e. repos where both snapshots and releases are
> disabled). Based on my testing, seeing the cdh.rcs.releases.repo causes it to
> try the mirror, because it matches the mirrorOf conditions. It seems like the
> banned repositories may also be a problem, depending on how smart Maven is.
> Reordering the repositories can fix these semantics. If the impala.cdp.repo
> comes first (along with the impala.toolchain.kudu.repo), then anything that
> matches that would avoid hitting the mirror. Specifically, it seems like the
> best ordering would be impala.toolchain.kudu.repo (a local filesystem repo),
> impala.cdp.repo (an s3 repo), then the normal server repos, and lastly the
> banned repositories.
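> As a rough sketch of that ordering, the <repositories> section of the
> pom.xml might look something like the following. All URLs are placeholders,
> and example.banned.repo is a hypothetical id standing in for the banned
> repositories; only the relative order is the point here.
> {noformat}
> <repositories>
>   <!-- Local filesystem repo first; the URL is a placeholder. -->
>   <repository>
>     <id>impala.toolchain.kudu.repo</id>
>     <url>file:///path/to/local/kudu/repo</url>
>   </repository>
>   <!-- S3-backed repo, excluded from the mirror; placeholder URL. -->
>   <repository>
>     <id>impala.cdp.repo</id>
>     <url>https://example-bucket.s3.amazonaws.com/maven/</url>
>   </repository>
>   <!-- Normal server repos, which match the mirrorOf rule. -->
>   <repository>
>     <id>cdh.rcs.releases.repo</id>
>     <url>https://repo.example.com/releases/</url>
>   </repository>
>   <!-- Banned repos last; this id is hypothetical. -->
>   <repository>
>     <id>example.banned.repo</id>
>     <url>https://repo.example.com/banned/</url>
>     <releases><enabled>false</enabled></releases>
>     <snapshots><enabled>false</enabled></snapshots>
>   </repository>
> </repositories>
> {noformat}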
--
This message was sent by Atlassian Jira
(v8.3.4#803005)