[ 
http://jira.codehaus.org/browse/MINDEXER-19?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=261910#action_261910
 ] 

Tamás Cservenák commented on MINDEXER-19:
-----------------------------------------

For history's sake, here is the description of the problem:


The JBoss Nexus introduces audit information files stored as 
"artifactId-version.ext.audit.json". The problem is that these "metafiles" 
_violate_ the M2 repository layout: repositories carrying this audit 
information now have _more than one main artifact_ per GAV. As proof, here is 
an example. In the case of [Apache Avalon Framework 
4.1.5|https://repository.jboss.org/nexus/content/repositories/thirdparty-releases/apache-avalon/avalon-framework/4.1.5/],
where the _real_ main artifact is of type "jar", you can address both the 
"real" and the "fake" main artifact as shown below, making Maven download and 
consume them:

{noformat}
  <dependency>
    <groupId>apache-avalon</groupId>
    <artifactId>avalon-framework</artifactId>
    <version>4.1.5</version>
    <type>jar</type>
  </dependency>
{noformat}

Note: the "type" element is redundant here ("jar" is the default dependency 
type in Maven); I added it only for clarity.

With this dependency in a project, the build produces the expected output:

{noformat}
cstamas@marvin test$ mvn -s settings.xml clean install
[INFO] Scanning for projects...
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building test Maven Mojo 1.0+-SNAPSHOT
[INFO] ------------------------------------------------------------------------
Downloading: 
https://repository.jboss.org/nexus/content/repositories/thirdparty-releases/apache-avalon/avalon-framework/4.1.5/avalon-framework-4.1.5.jar
Downloaded: 
https://repository.jboss.org/nexus/content/repositories/thirdparty-releases/apache-avalon/avalon-framework/4.1.5/avalon-framework-4.1.5.jar
 (72 KB at 21.8 KB/sec)
....
{noformat}

But alas, modifying the "type" element of the dependency clearly shows that 
Maven treats the "jar.audit.json" the same way as the jar. In your "test" 
project, just change the dependency's type from "jar" to "jar.audit.json":

{noformat}
  <dependency>
    <groupId>apache-avalon</groupId>
    <artifactId>avalon-framework</artifactId>
    <version>4.1.5</version>
    <type>jar.audit.json</type>
  </dependency>
{noformat}

And build it:

{noformat}
cstamas@marvin test$ mvn -s settings.xml clean install
[INFO] Scanning for projects...
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building test Maven Mojo 1.0+-SNAPSHOT
[INFO] ------------------------------------------------------------------------
Downloading: 
https://repository.jboss.org/nexus/content/repositories/thirdparty-releases/apache-avalon/avalon-framework/4.1.5/avalon-framework-4.1.5.jar.audit.json
[WARNING] Checksum validation failed, no checksums available from the 
repository for 
https://repository.jboss.org/nexus/content/repositories/thirdparty-releases/apache-avalon/avalon-framework/4.1.5/avalon-framework-4.1.5.jar.audit.json
Downloaded: 
https://repository.jboss.org/nexus/content/repositories/thirdparty-releases/apache-avalon/avalon-framework/4.1.5/avalon-framework-4.1.5.jar.audit.json
 (189 B at 0.1 KB/sec)
....
{noformat}

This proves that the GAV apache-avalon:avalon-framework:4.1.5 in the JBoss 
repository _has two main artifacts_ with different types.
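
To illustrate why both requests resolve, here is a minimal sketch of how a 
dependency's coordinates map to a path in the M2 layout. It is not Maven's 
actual code: the function name m2_path is hypothetical, and it assumes the 
dependency "type" is used directly as the file extension, which matches the 
two download URLs shown above.

```python
def m2_path(group_id, artifact_id, version, extension, classifier=None):
    """Build a repository-relative M2 layout path (simplified sketch)."""
    file_name = f"{artifact_id}-{version}"
    if classifier:
        # Classified artifacts carry "-classifier" before the extension.
        file_name += f"-{classifier}"
    group_dir = group_id.replace(".", "/")
    return f"{group_dir}/{artifact_id}/{version}/{file_name}.{extension}"


real = m2_path("apache-avalon", "avalon-framework", "4.1.5", "jar")
fake = m2_path("apache-avalon", "avalon-framework", "4.1.5", "jar.audit.json")
```

Neither path contains a classifier, so both files sit in the "main artifact" 
slot of the same GAV.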

Note: the same holds for the POM; an "artifactId-version.pom.audit.json" file 
is present as well. This is important in the discussion below.

---

Now, this causes the following problem in Maven Indexer. Indexer assumes that 
a GAV directory (the ["version 
directory"|https://repository.jboss.org/nexus/content/repositories/thirdparty-releases/apache-avalon/avalon-framework/4.1.5/])
contains _one main artifact, and 0 or more classified artifacts (artifacts 
with a classifier)_.

What happens is that Indexer finds multiple extensions ("pom.audit.json", 
"jar" and "jar.audit.json") for the same GAV 
(apache-avalon:avalon-framework:4.1.5). Indexer also maintains _index 
uniqueness_ based on GAV. Hence, the _first "artifact suspect" file it stumbles 
upon_ gets indexed, and any later one is simply skipped (the uniqueness check 
on GAV fails). Here, the "pom.audit.json" file is stumbled upon first and is 
taken as the main artifact: it is not ".pom", nor is it ".pom.sha1" or 
".pom.md5", which are recognized as the artifact POM and its checksums under 
the M2 repository layout and filtered out.
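
The "first one wins" behaviour can be sketched as below. This is an 
illustration only, not Indexer code: the function name, the (gav, extension) 
input shape and the scan order are assumptions made for the example.

```python
def index_main_artifacts(scanned):
    """Keep one main artifact per GAV; the first candidate seen wins."""
    index = {}
    for gav, extension in scanned:
        # The POM itself and checksum files are recognized and filtered out.
        if extension == "pom" or extension.endswith((".sha1", ".md5")):
            continue
        # Uniqueness check on GAV: later candidates are skipped.
        if gav in index:
            continue
        index[gav] = extension
    return index


GAV = "apache-avalon:avalon-framework:4.1.5"
scanned = [
    (GAV, "pom"),
    (GAV, "pom.audit.json"),
    (GAV, "pom.sha1"),
    (GAV, "jar"),
    (GAV, "jar.audit.json"),
]
indexed = index_main_artifacts(scanned)
```

With this scan order the index ends up holding "pom.audit.json" for the GAV, 
and the real "jar" is never recorded.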

Indexer _intentionally_ does not take the POM -> artifact matching route, since 
_figuring out the artifact extension_ from the POM "packaging" is not always 
possible (think of 3rd party extensions: the "nexus-plugin" packaging actually 
has the "jar" extension, etc.). Hence, it goes the artifact -> POM route, which 
is trivial: strip off the extension and replace it with ".pom". This is one of 
the oldest limitations in the Indexer code.
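
A minimal sketch of the artifact -> POM route for a main artifact (no 
classifier), assuming the extension has already been determined from the file 
name. The helper name pom_path_for is hypothetical.

```python
def pom_path_for(artifact_path, extension):
    """Strip the artifact's extension and replace it with "pom"."""
    suffix = "." + extension
    if not artifact_path.endswith(suffix):
        raise ValueError(f"{artifact_path!r} does not end with {suffix!r}")
    return artifact_path[: -len(extension)] + "pom"


base = "apache-avalon/avalon-framework/4.1.5/avalon-framework-4.1.5"
pom_from_jar = pom_path_for(base + ".jar", "jar")
pom_from_audit = pom_path_for(base + ".pom.audit.json", "pom.audit.json")
```

Both calls yield the same POM path, so the audit file looks like a perfectly 
valid artifact with a matching POM.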

But the "heuristic" fails here, since the GAV parser (which parses a file's 
path to reverse-engineer its GAV) _succeeds_ in parsing the "pom.audit.json" 
file path, yielding GAV "apache-avalon:avalon-framework:4.1.5" and extension 
"pom.audit.json". Since the parsed GAV has no classifier (it would contain one 
if a classifier were present), the assumption is made that _this is the main 
artifact_.
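
The following simplified parser shows why the path parses cleanly. It is a 
sketch under stated assumptions, not the Indexer's real GAV parser: the Gav 
type and parse_gav function are hypothetical, snapshot versions are ignored, 
and everything after the first dot following "artifactId-version" is treated 
as the extension.

```python
from typing import NamedTuple, Optional


class Gav(NamedTuple):
    group_id: str
    artifact_id: str
    version: str
    classifier: Optional[str]
    extension: str


def parse_gav(path):
    """Reverse-engineer a GAV from an M2 layout path, or return None."""
    parts = path.strip("/").split("/")
    if len(parts) < 4:
        return None
    *group, artifact_id, version, file_name = parts
    prefix = f"{artifact_id}-{version}"
    if not file_name.startswith(prefix):
        return None
    rest = file_name[len(prefix):]
    if rest.startswith("."):
        # No classifier: this file is taken as the main artifact.
        classifier, extension = None, rest[1:]
    elif rest.startswith("-"):
        classifier, _, extension = rest[1:].partition(".")
    else:
        return None
    if not extension:
        return None
    return Gav(".".join(group), artifact_id, version, classifier or None, extension)


directory = "apache-avalon/avalon-framework/4.1.5/"
audit = parse_gav(directory + "avalon-framework-4.1.5.pom.audit.json")
```

Here audit.classifier is None and audit.extension is "pom.audit.json", which 
is exactly the shape of a main artifact.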

Clearly, this is where Indexer fails: in this very case the packaging set in 
the POM is known ("jar"), but due to the initial requirements when Indexer was 
implemented, this check (to "crank up the POM and parse it") was sacrificed 
for _speed_ of scanning. Reading the POM would still not offer a "full 
solution"; again, think of non-core packagings and 3rd party build extensions.

This again raises the general question: "What is the extension of packaging 
FOO?", to be answered without ArtifactHandler access (this happens in Nexus, 
not in Maven). The other direction ("What is the packaging of extension FOO?") 
is even trickier, and impossible in general, since the packaging to extension 
mapping is not _bijective_ (i.e. packagings "ejb" and "nexus-plugin" both 
produce the extension "jar", just like packaging "jar" does).
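
A small sketch of why the reverse lookup is ambiguous. The table is 
illustrative and incomplete: "jar", "ejb", "maven-plugin", "war" and "pom" are 
core Maven packagings, "nexus-plugin" is the 3rd party example from the text, 
and the names PACKAGING_TO_EXTENSION and packagings_for are hypothetical.

```python
# Forward direction: each packaging has exactly one extension.
PACKAGING_TO_EXTENSION = {
    "jar": "jar",
    "ejb": "jar",
    "maven-plugin": "jar",
    "nexus-plugin": "jar",
    "war": "war",
    "pom": "pom",
}


def packagings_for(extension):
    """Reverse direction: all packagings that produce the given extension."""
    return sorted(
        packaging
        for packaging, ext in PACKAGING_TO_EXTENSION.items()
        if ext == extension
    )


jar_candidates = packagings_for("jar")
```

Several packagings come back for "jar", so the extension alone cannot tell 
which one produced the file.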


Possible solutions:

a) Stop using "artifactId-version.ext.audit.json" as the audit JSON filename, 
and change it to something that "breaks" the M2 layout, such as 
".artifactId-version.ext.audit.json" (prepended with a dot). This will make 
Indexer skip the file (it will not be considered an artifact), but Nexus will 
also _hide it_ while browsing the repository (a direct request for the file 
will still work; only "browsing" the repository will not show it).

b) Introduce some "skip it" extension in Indexer (this needs to be added to 
Maven Indexer), like ".noindex". If the filename _ends_ with exactly this 
string, Indexer should skip it and not consider it for indexing (so the audit 
JSON filename would be "artifactId-version.ext.audit.json.noindex").

c) Move the audit information off to attributes.
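
Options (a) and (b) both boil down to a file name filter applied before a file 
is considered for indexing. A minimal sketch, assuming a hypothetical 
is_index_candidate helper; the ".noindex" suffix is only the proposal from (b) 
and does not exist in Maven Indexer today.

```python
import posixpath


def is_index_candidate(path, skip_suffix=".noindex"):
    """Return False for files that options (a) or (b) would exclude."""
    name = posixpath.basename(path)
    # Option (a): dot-prefixed names no longer match the M2 layout.
    if name.startswith("."):
        return False
    # Option (b): an explicit "skip it" suffix at the very end of the name.
    if name.endswith(skip_suffix):
        return False
    return True


directory = "apache-avalon/avalon-framework/4.1.5/"
results = {
    "current": is_index_candidate(directory + "avalon-framework-4.1.5.jar.audit.json"),
    "option_a": is_index_candidate(directory + ".avalon-framework-4.1.5.jar.audit.json"),
    "option_b": is_index_candidate(directory + "avalon-framework-4.1.5.jar.audit.json.noindex"),
}
```

Only the current naming passes the filter; either renaming scheme keeps the 
audit file out of the index.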


> Make ArtifactContextProduces smarter about main artifact selection
> ------------------------------------------------------------------
>
>                 Key: MINDEXER-19
>                 URL: http://jira.codehaus.org/browse/MINDEXER-19
>             Project: Maven Indexer
>          Issue Type: Improvement
>    Affects Versions: 4.0.0
>            Reporter: Tamás Cservenák
>
> Make ArtifactContextProduces smarter about main artifact selection.
> See https://issues.sonatype.org/browse/NEXUS-4187
> Main problem here is that repository in issue above is _not obeying_ M2 
> repository layout.
> At least an improvement could be, to prefer "core packaging" (those known by 
> Maven Core) over others, but this will not fix the issue with new packagings 
> introduced by extensions.


