So, for any incoming request we can decide which layout to apply based on
the number of "/" in the path.
For the default layout, all the required information is cleanly split out in
the path.
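To make the counting rule concrete, here is a rough sketch. The class and method names are made up for illustration and are not existing Archiva code. It only applies the thresholds discussed in this thread (more than 3 parts = default, exactly 3 = possible legacy, fewer = invalid) and assumes a repository-relative path:

```java
import java.util.Arrays;

public class LayoutGuess {
    enum Layout { DEFAULT, LEGACY_CANDIDATE, INVALID }

    /** Classifies a repository-relative path purely by its number of "/"-separated parts. */
    static Layout guess(String path) {
        // Drop empty segments so a leading or doubled "/" does not skew the count.
        long parts = Arrays.stream(path.split("/")).filter(s -> !s.isEmpty()).count();
        if (parts > 3) {
            return Layout.DEFAULT;          // group dirs / artifactId / version / filename
        }
        if (parts == 3) {
            return Layout.LEGACY_CANDIDATE; // groupId / type + "s" / filename
        }
        return Layout.INVALID;
    }

    public static void main(String[] args) {
        System.out.println(guess("junit/junit/3.8.1/junit-3.8.1.jar"));      // DEFAULT
        System.out.println(guess("commons-lang/jars/commons-lang-2.1.jar")); // LEGACY_CANDIDATE
        System.out.println(guess("dir/filename.jar"));                       // INVALID
    }
}
```

A 3-part path is only a candidate here: the check on the type directory (ends in "s", matches the extension) would still have to confirm it.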
For the legacy one, your proposal of loading the POM to read the artifactId
is broken:
- if the managed/proxied repository also uses a legacy layout (so that
replacing the extension with "pom" gets the POM), we must still remove any
classifier (jaxen-1.0-FCS-full.jar for example -> jaxen-1.0-FCS.pom)... and
the POM must exist!
- if the target (proxied) repository uses the default layout, we need the
artifactId + version (+ classifier) to build the POM path to request...
So in both cases we cannot get the POM before knowing the
artifactId:version:classifier from the incoming request.
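A small sketch of the first case, to show where the naive swap goes wrong. The helper below is hypothetical (not Archiva code); it assumes a three-part legacy path and a single-dot extension, and simply points at the "poms" directory with a ".pom" extension:

```java
public class LegacyPomPath {
    /**
     * Naive guess at the POM location for a legacy path of the form
     * groupId/types/filename.ext: keep the groupId, use "poms", swap the extension.
     */
    static String naivePomPath(String legacyPath) {
        String[] parts = legacyPath.split("/");
        if (parts.length != 3) {
            throw new IllegalArgumentException("not a 3-part legacy path: " + legacyPath);
        }
        String filename = parts[2];
        int dot = filename.lastIndexOf('.');
        String base = dot < 0 ? filename : filename.substring(0, dot);
        return parts[0] + "/poms/" + base + ".pom";
    }

    public static void main(String[] args) {
        // Works when the filename carries no classifier.
        System.out.println(naivePomPath("commons-lang/jars/commons-lang-2.1.jar"));
        // prints commons-lang/poms/commons-lang-2.1.pom

        // With a classifier the guess keeps "-full", so it does not match jaxen-1.0-FCS.pom.
        System.out.println(naivePomPath("jaxen/jars/jaxen-1.0-FCS-full.jar"));
        // prints jaxen/poms/jaxen-1.0-FCS-full.pom
    }
}
```

Stripping "-full" correctly requires knowing where the version ends, which is exactly the information we hoped to read from the POM.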
Your idea of moving this problem to the web layer, and of making
BidirectionalLayouts less central, is interesting. I don't know
enough about the impact on consumers and other Archiva components to give an
opinion. I just tried to refactor this interface and ended up with so many
changes...
If the new RepositoryContent interface works in front of a repository
(including proxy connectors), it can check whether artifacts/POMs exist. So
it can build the list of all possible artifactId:version:classifier
combinations from an incoming legacy request, search for the ones that
actually exist, and then pick the expected one before returning the
ArtifactReference. The current RepositoryLayoutUtils / VersionUtils could be
used to avoid too much network traffic, as they already resolve many
artifactIds/versions well.
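Roughly what I have in mind, as a sketch only. The class, the "artifactId:version[:classifier]" string format and the exists predicate are all assumptions for illustration; the predicate stands in for whatever existence check the repository (or proxy connector) would offer:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class LegacyCandidates {
    /**
     * Lists every way to cut a legacy base name (filename without extension) at "-"
     * into artifactId, version and an optional classifier.
     */
    static List<String> candidates(String baseName) {
        String[] t = baseName.split("-");
        List<String> out = new ArrayList<>();
        // i = first token of the version, j = first token of the classifier
        // (j == t.length means "no classifier").
        for (int i = 1; i < t.length; i++) {
            for (int j = i + 1; j <= t.length; j++) {
                String id = join(t, 0, i);
                String version = join(t, i, j);
                String classifier = join(t, j, t.length);
                out.add(classifier.isEmpty()
                        ? id + ":" + version
                        : id + ":" + version + ":" + classifier);
            }
        }
        return out;
    }

    /** Keeps only the candidates the repository reports as existing. */
    static List<String> existing(List<String> candidates, Predicate<String> exists) {
        List<String> found = new ArrayList<>();
        for (String c : candidates) {
            if (exists.test(c)) {
                found.add(c);
            }
        }
        return found;
    }

    private static String join(String[] tokens, int from, int to) {
        return String.join("-", Arrays.copyOfRange(tokens, from, to));
    }

    public static void main(String[] args) {
        // Six readings, one of which is jaxen:1.0-FCS:full
        for (String c : candidates("jaxen-1.0-FCS-full")) {
            System.out.println(c);
        }
    }
}
```

Each candidate costs one lookup in the worst case, which is why ordering them with the existing parsing heuristics first would matter for proxied repositories.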
Nico.
2007/10/6, Joakim Erdfelt <[EMAIL PROTECTED]>:
>
> Hmm, You are correct.
>
> Shortest path I can think of is junit/junit/3.8.1/junit-3.8.1.jar
> That would be 4 parts, no?
>
> > 4. If 3 parts ( dir/dir/filename ) then
> > 4.1. If part 2 name ends in "s" then test for potential legacy layout.
> > 4.1.1. Identify filename extension.
> > 4.1.2. Get potential list of artifact types for extension.
> > 4.1.3. If part 2 (minus the end "s") is in the list of
> > artifact types == legacy layout
> > 4.2. Can't be legacy, then hand off to default layout.
>
> Let's change 4.2 to read ...
> 4.2. Invalid legacy layout.
>
> - Joakim
>
> nicolas de loof wrote:
> > Just one question about the proposed logic for detecting layout.
> >
> > 4 If 3 parts ( dir/dir/filename ) then
> > ...
> >
> > Is there any case where a 3 part path can be a maven2 path ???
> >
> > Nico.
> >
> > 2007/10/5, Joakim Erdfelt <[EMAIL PROTECTED]>:
> >
> >> This is a long email, read it all before commenting, and you'll likely
> >> see a response to your earlier questions. :-)
> >>
> >> I'm currently working on MRM-432 and MRM-519, and I'm in the middle of an
> >> important change to how Archiva handles Layout detection, interaction,
> >> and parsing.
> >>
> >> :Background:
> >>
> >> Layouts in Archiva have 2 main purposes.
> >>
> >> 1. to convert a path to an artifact reference.
> >> 2. to convert an artifact reference to a path.
> >>
> >> Layouts are used by the following.
> >>
> >> 1. The "/repository/${repoid}/" urls use layouts to determine the
> >> Artifact Reference that the client is requesting.
> >> The "/repository/" url is layout neutral, and can have maven 1
> >> clients ask for content in legacy format, or maven 2 clients ask
> >> for content in default layout.
> >> 2. Proxy requests out to remote repositories utilize layouts to take
> >> an internal Artifact Reference, convert it to a path appropriate
> >> to the remote layout configuration and obtain the content that is
> >> desired.
> >> 3. Simple Consumers utilize layouts to obtain File references, and
> >> Artifact References to the repository content for purposes of
> >> operating on the content in a way that they desire.
> >> 4. Complex consumers (such as metadata updater, and snapshots purge)
> >> utilize layouts to obtain lists of versions and artifacts.
> >>
> >> What Works.
> >>
> >> * Converting an Artifact Reference to a path.
> >> * Discovering Versions in a default layout.
> >> (needed by metadata update / snapshot purge)
> >> * Converting a default layout path to an Artifact Reference correctly.
> >>
> >> What Doesn't Work.
> >>
> >> * Detecting the layout in use 100% of the time.
> >> * Converting a legacy layout path to an Artifact Reference 100% of
> >> the time.
> >> * Discovering versions in a legacy layout.
> >> (do we need metadata update / snapshot purge here?)
> >> * Reporting problems correctly.
> >>
> >> :The Problem:
> >>
> >> We are unable to parse useful information in a consistent way for all
> >> provided paths.
> >> We need to glean the following information from the path.
> >>
> >> * Layout Type (default / legacy)
> >> * Group ID
> >> * Artifact ID
> >> * Version (Deployed version & Base version)
> >> * Classifier (Not applicable in legacy layout)
> >> * Type (Not the same as Extension)
> >>
> >> Example Paths: (included in this email for discussion, actual list
> >> from test cases)
> >>
> >> groupId/jars/-1.0.jar
> >> org.apache.maven.test/jars/artifactId-1.0.war
> >> ch.ethz.ganymed/jars/ganymed-ssh2-build210.jar
> >> javax/jars/comm-3.0-u1.jar
> >> javax.persistence/jars/ejb-3.0-public_review.jar
> >> maven/jars/maven-test-plugin-1.8.2.jar
> >> commons-lang/jars/commons-lang-2.1.jar
> >> org.apache.derby/jars/derby-10.2.2.0.jar
> >> com.foo/ejbs/foo-client-1.0.jar
> >> com.foo.lib/javadoc.jars/foo-lib-2.1-alpha-1-javadoc.jar
> >> com.foo.lib/java-sources/foo-lib-2.1-alpha-1-sources.jar
> >> com.foo/jars/foo-tool-1.0.jar
> >> org.apache.geronimo.specs/jars/geronimo-ejb_2.1_spec-1.0.1.jar
> >> directory-clients/poms/ldap-clients-0.9.1-SNAPSHOT.pom
> >> org.apache.archiva.test/jars/redonkulous-3.1-beta-1-20050831.101112-42.jar
> >> invalid/invalid/1.0-20050611.123456-1/invalid-1.0-20050611.123456-1.jar
> >> ch/ethz/ganymed/ganymed-ssh2/build210/ganymed-ssh2-build210.jar
> >> javax/comm/3.0-u1/comm-3.0-u1.jar
> >> javax/persistence/ejb/3.0-public_review/ejb-3.0-public_review.jar
> >> maven/maven-test-plugin/1.8.2/maven-test-plugin-1.8.2.pom
> >> test/maven-arch/test-arch/2.0.3-SNAPSHOT/test-arch-2.0.3-SNAPSHOT.pom
> >> com/company/department/com.company.department/0.2/com.company.department-0.2.pom
> >> com/company/department/com.company.department.project/0.3/com.company.department.project-0.3.pom
> >> com/foo/foo-tool/1.0/foo-tool-1.0.jar
> >> commons-lang/commons-lang/2.1/commons-lang-2.1.jar
> >> com/foo/foo-client/1.0/foo-client-1.0.jar
> >> com/foo/lib/foo-lib/2.1-alpha-1/foo-lib-2.1-alpha-1-sources.jar
> >> org/apache/archiva/test/redonkulous/3.1-beta-1-SNAPSHOT/redonkulous-3.1-beta-1-20050831.101112-42.jar
> >>
> >> :Proposal:
> >>
> >> The proposed logic for detecting layout.
> >>
> >> 1. Split path by directory separators.
> >> 2. If more than 3 parts ( dir/dir/dir/dir/filename ) == default layout.
> >> 3. If less than 3 parts ( dir/filename ) == invalid path.
> >> 4. If 3 parts ( dir/dir/filename ) then
> >> 4.1. If part 2 name ends in "s" then test for potential legacy layout.
> >> 4.1.1. Identify filename extension.
> >> 4.1.2. Get potential list of artifact types for extension.
> >> 4.1.3. If part 2 (minus the end "s") is in the list of
> >> artifact types == legacy layout
> >> 4.2. Can't be legacy, then hand off to default layout.
> >>
> >> The problem with this approach is maintaining the list of extensions to
> >> artifact type. The artifact type is arbitrary, and can be expanded
> >> upon by the user to include types that we can't even imagine today.
> >> (See MRM-481: issue with extension .xml.zip)
> >>
> >> The proposed logic for parsing default layout paths.
> >>
> >> This one is easy. paths are in the following format ...
> >>
> >> ${groupId}/${artifactId}/${baseVersion}/${artifactId}-${version}-${classifier}.${type}
> >>
> >> Once we separate out the directories from the filename, we get the
> >> following order (zero-based indices).
> >>
> >> dirs[dirs.length-1] = base version.
> >> dirs[dirs.length-2] = artifact Id.
> >> dirs[0] thru dirs[dirs.length-3] = groupId.
> >>
> >> That gives us the crucial pieces in the filename
> >> ${artifactId}-${version}, which makes detecting the classifier and
> >> type easy enough.
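
Inline, to make the split concrete: a minimal sketch with hypothetical names (not the existing layout code), using zero-based indices so that the last directory is dirs[dirs.length - 1]. It assumes a repository-relative path with at least groupId, artifactId, base version and a filename:

```java
import java.util.Arrays;

public class DefaultLayoutPath {
    final String groupId;
    final String artifactId;
    final String baseVersion;
    final String filename;

    DefaultLayoutPath(String groupId, String artifactId, String baseVersion, String filename) {
        this.groupId = groupId;
        this.artifactId = artifactId;
        this.baseVersion = baseVersion;
        this.filename = filename;
    }

    /** Splits a default-layout path into its directory-derived pieces. */
    static DefaultLayoutPath parse(String path) {
        String[] parts = path.split("/");
        if (parts.length < 4) {
            throw new IllegalArgumentException("too short for default layout: " + path);
        }
        // Everything but the last part is a directory.
        String[] dirs = Arrays.copyOfRange(parts, 0, parts.length - 1);
        String baseVersion = dirs[dirs.length - 1];
        String artifactId = dirs[dirs.length - 2];
        // The remaining leading directories form the groupId, joined with dots.
        String groupId = String.join(".", Arrays.copyOfRange(dirs, 0, dirs.length - 2));
        return new DefaultLayoutPath(groupId, artifactId, baseVersion, parts[parts.length - 1]);
    }

    public static void main(String[] args) {
        DefaultLayoutPath p = parse("javax/persistence/ejb/3.0-public_review/ejb-3.0-public_review.jar");
        System.out.println(p.groupId + " : " + p.artifactId + " : " + p.baseVersion);
        // prints javax.persistence : ejb : 3.0-public_review
    }
}
```

The filename still has to be checked against artifactId + "-" before reading the version, classifier and type from the remainder.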
> >>
> >> The proposed logic for parsing legacy layout paths.
> >>
> >> Legacy layouts are tricky. It is nearly impossible to detect, using
> >> the path alone, the correct artifactId or version. So the process
> >> will need to read the pom file associated with the artifact Id in order
> >> to determine the correct Artifact Reference pieces.
> >>
> >> The problem with this approach is that we now need 2 pieces of
> >> information, the repository root (location or url) and the path.
> >> Plus we incur a hit / read of the pom file.
> >>
> >> So, if we use the pseudo-pattern ...
> >> [:groupId:]/[:type:]s/[:filename:].[:ext:]
> >> as a starting point, swap out the [:type:] and [:ext:] for "pom" and
> >> load the pom from the actual repository to determine the groupId,
> >> artifactId, and version, we can then have a valid Artifact Reference.
> >>
> >> The problem with relying on the pom is that it is now required for
> >> legacy layout "from path" logic, this changes the assumption that poms
> >> are optional and not required, as well as changing the interface
> >> to the layout objects to needing a repository as well.
> >>
> >> The proposed changes to the codebase.
> >>
> >> * Eliminate RepositoryLayoutUtils, roll layout specific filename
> >> parsing routines into their respective layouts.
> >> * Eliminate direct usage of BidirectionalRepositoryLayout by
> >> consumers.
> >> * Create RepositoryContentRequest that takes the freeform requests
> >> arriving in from the "/repository/" urls and puts it through
> >> the logic as outlined above.
> >> * Rename BidirectionalRepositoryLayout interface to RepositoryContent
> >> to simplify name and represent new role of accessing repository
> >> content that requires a repository reference.
> >> * Create DefaultRepositoryContent and LegacyRepositoryContent
> >> implementations, that utilize techniques described above, and
> >> logic already present in DefaultBidirectionalRepositoryLayout and
> >> LegacyBidirectionalRepositoryLayout.
> >> * Create AnonymousProjectReader that takes a File object pointing to
> >> a pom, reads the <pomVersion> or <modelVersion> elements and loads
> >> the pom information as appropriate.
> >> * Create RepositoryContentFactory that returns a RepositoryContent
> >> implementation for the provided repository id.
> >>
> >> Example of new RepositoryContent interface.
> >>
> >> --(snip)--
> >> package org.apache.maven.archiva.repository;
> >>
> >> /*
> >> * Licensed to the Apache Software Foundation (ASF) under one
> >> * or more contributor license agreements. See the NOTICE file
> >> * distributed with this work for additional information
> >> * regarding copyright ownership. The ASF licenses this file
> >> * to you under the Apache License, Version 2.0 (the
> >> * "License"); you may not use this file except in compliance
> >> * with the License. You may obtain a copy of the License at
> >> *
> >> * http://www.apache.org/licenses/LICENSE-2.0
> >> *
> >> * Unless required by applicable law or agreed to in writing,
> >> * software distributed under the License is distributed on an
> >> * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
> >> * KIND, either express or implied. See the License for the
> >> * specific language governing permissions and limitations
> >> * under the License.
> >> */
> >>
> >> import org.apache.maven.archiva.model.ArtifactReference;
> >> import org.apache.maven.archiva.model.ProjectReference;
> >> import org.apache.maven.archiva.model.VersionedReference;
> >> import org.apache.maven.archiva.repository.layout.LayoutException;
> >>
> >> import java.io.File;
> >> import java.net.URL;
> >> import java.util.List;
> >>
> >> /**
> >> * RepositoryContent interface for interacting with a managed repository
> >> * in an abstract way, without the need for processing based on
> >> * filesystem paths, or working with the database.
> >> *
> >> * @author <a href="mailto:[EMAIL PROTECTED]">Joakim Erdfelt</a>
> >> * @version $Id$
> >> */
> >> public interface RepositoryContent
> >> {
> >> /**
> >> * Determines if the project referenced exists in the repository.
> >> *
> >> * @param reference the project reference to check for.
> >> * @return true if the project referenced exists.
> >> */
> >> public boolean hasContent( ProjectReference reference );
> >>
> >> /**
> >> * Determines if the version reference exists in the repository.
> >> *
> >> * @param reference the version reference to check for.
> >> * @return true if the version referenced exists.
> >> */
> >> public boolean hasContent( VersionedReference reference );
> >>
> >> /**
> >> * Determines if the artifact referenced exists in the repository.
> >> *
> >> * @param reference the artifact reference to check for.
> >> * @return true if the artifact referenced exists.
> >> */
> >> public boolean hasContent( ArtifactReference reference );
> >>
> >> /**
> >> * Given a repository relative path to a filename, return the
> >> * {@link ArtifactReference} object suitable for the path.
> >> *
> >> * @param path the path relative to the repository base dir for
> >> * the artifact.
> >> * @return the {@link ArtifactReference} representing the path.
> >> * (or null if path cannot be converted to a
> >> * {@link ArtifactReference})
> >> * @throws LayoutException if there was a problem converting the
> >> * path to an artifact.
> >> */
> >> public ArtifactReference toArtifactReference( String path )
> >> throws LayoutException;
> >>
> >> /**
> >> * Given an ArtifactReference, return the relative path to the
> >> * artifact.
> >> *
> >> * @param reference the artifact reference to use.
> >> * @return the relative path to the artifact.
> >> */
> >> public String toPath( ArtifactReference reference );
> >>
> >> /**
> >> * Given an ArtifactReference, return the file reference to the
> >> * artifact.
> >> *
> >> * @param reference the artifact reference to use.
> >> * @return the file reference to the artifact.
> >> */
> >> public File toFile( ArtifactReference reference );
> >>
> >> /**
> >> * Given an ArtifactReference, return the url to the artifact.
> >> *
> >> * @param reference the artifact reference to use.
> >> * @return the url to the artifact.
> >> */
> >> public URL toURL( ArtifactReference reference );
> >>
> >>
> >> /**
> >> * Gather up the list of related artifacts to the ArtifactReference
> >> * provided. This typically includes the pom files, and those things
> >> * with classifiers (such as doc, source code, test libs, etc...)
> >> *
> >> * NOTE: Some layouts (such as maven 1 "legacy"), and remote
> >> * repositories are not compatible with this query.
> >> *
> >> * @param reference the reference to work off of.
> >> * @return the list of ArtifactReferences for related artifacts.
> >> * @throws ContentNotFoundException if the initial artifact
> >> * reference does not exist within the repository.
> >> */
> >> public List<ArtifactReference> getRelatedArtifacts(
> >> ArtifactReference reference )
> >> throws ContentNotFoundException, NotSupportedException;
> >>
> >> /**
> >> * Given a specific VersionedReference, return the list of available
> >> * versions for that versioned reference.
> >> *
> >> * NOTE: This is really only useful when working with SNAPSHOTs.
> >> * Not compatible with remote repositories.
> >> *
> >> * @param reference the versioned reference to work off of.
> >> * @return the list of versions found.
> >> * @throws ContentNotFoundException if the versioned reference does
> >> * not exist within the repository.
> >> */
> >> public List<String> getVersions( VersionedReference reference )
> >> throws ContentNotFoundException, NotSupportedException;
> >>
> >> /**
> >> * Given a specific ProjectReference, return the list of available
> >> * versions for that project reference.
> >> *
> >> * @param reference the project reference to work off of.
> >> * @return the list of versions found for that project reference.
> >> * @throws ContentNotFoundException if the project reference does
> >> * not exist within the repository.
> >> */
> >> public List<String> getVersions( ProjectReference reference )
> >> throws ContentNotFoundException, NotSupportedException;
> >> }
> >> --(snip)--
> >>
> >> I feel this is a better long term solution for the persistent layout
> >> parsing issues we have within Archiva. However not all of the problems
> >> have been solved. I've outlined the ones that need help above in this
> >> email, but I'm sure there are ones that have been overlooked.
> >>
> >> Disclaimer: Yes, this is in the form of a proposal, but I'm already
> >> working on this, and will continue down this path unless
> >> someone here has a strong objection about this approach.
> >>
> >> --
> >> - Joakim Erdfelt
> >> Committer and PMC Member, Apache Maven
> >> Archiva Developer
> >> [EMAIL PROTECTED]
> >>
> >>
> >>
> >
> >
>
>
> --
> - Joakim Erdfelt
> [EMAIL PROTECTED]
> Open Source Software (OSS) Developer
>
>