Just one question about the proposed logic for detecting layout.
4 If 3 parts ( dir/dir/filename ) then
...
Is there any case where a 3-part path can be a Maven 2 path?
Nico.
2007/10/5, Joakim Erdfelt <[EMAIL PROTECTED]>:
>
>
> This is a long email, read it all before commenting, and you'll likely
> see a response to your earlier questions. :-)
>
> I'm currently working on MRM-432 and MRM-519, and I'm in the middle of an
> important change to how Archiva handles Layout detection, interaction,
> and parsing.
>
> :Background:
>
> Layouts in Archiva have 2 main purposes.
>
> 1. to convert a path to an artifact reference.
> 2. to convert an artifact reference to a path.
>
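To make the "artifact reference to path" direction concrete for myself, here is a minimal Java sketch. It assumes the default layout pattern and the legacy pseudo-pattern quoted further down in this mail; the class and method names (LayoutPaths, toDefaultPath, toLegacyPath) are made up for illustration and are not Archiva API.

```java
// Hypothetical helper, not part of Archiva: builds repository-relative paths
// from the individual pieces of an artifact reference.
class LayoutPaths {
    // Default layout:
    // groupId (dots become slashes)/artifactId/baseVersion/artifactId-version[-classifier].ext
    static String toDefaultPath(String groupId, String artifactId, String baseVersion,
                                String version, String classifier, String ext) {
        StringBuilder sb = new StringBuilder();
        sb.append(groupId.replace('.', '/')).append('/');
        sb.append(artifactId).append('/');
        sb.append(baseVersion).append('/');
        sb.append(artifactId).append('-').append(version);
        if (classifier != null && !classifier.isEmpty()) {
            sb.append('-').append(classifier); // the classifier is optional
        }
        sb.append('.').append(ext);
        return sb.toString();
    }

    // Legacy layout: groupId/<type>s/artifactId-version.ext (no classifier).
    static String toLegacyPath(String groupId, String artifactId, String version,
                               String type, String ext) {
        return groupId + "/" + type + "s/" + artifactId + "-" + version + "." + ext;
    }
}
```

With the example paths from the list below, toLegacyPath("commons-lang", "commons-lang", "2.1", "jar", "jar") should give commons-lang/jars/commons-lang-2.1.jar.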
> Layouts are used by the following.
>
> 1. The "/repository/${repoid}/" urls use layouts to determine the
> Artifact Reference that the client is requesting.
> The "/repository/" url is layout neutral, and can have maven 1
> clients ask for content in legacy format, or maven 2 clients ask
> for content in default layout.
> 2. Proxy requests out to remote repositories utilize layouts to take
> an internal Artifact Reference, convert it to a path appropriate
> to the remote layout configuration and obtain the content that is
> desired.
> 3. Simple Consumers utilize layouts to obtain File references, and
> Artifact References to the repository content for purposes of
> operating on the content in a way that they desire.
> 4. Complex consumers (such as metadata updater, and snapshots purge)
> utilize layouts to obtain lists of versions and artifacts.
>
> What Works.
>
> * Converting an Artifact Reference to a path.
> * Discovering Versions in a default layout.
> (needed by metadata update / snapshot purge)
> * Converting a default layout path to an Artifact Reference correctly.
>
> What Doesn't Work.
>
> * Detecting the layout in use 100% of the time.
> * Converting a legacy layout path to an Artifact Reference 100% of
> the time.
> * Discovering versions in a legacy layout.
> (do we need metadata update / snapshot purge here?)
> * Reporting problems correctly.
>
> :The Problem:
>
> The inability to parse useful information in a consistent way for all
> provided paths.
> Gleaning the following information from the path.
>
> * Layout Type (default / legacy)
> * Group ID
> * Artifact ID
> * Version (Deployed version & Base version)
> * Classifier (Not applicable in legacy layout)
> * Type (Not the same as Extension)
>
> Example Paths: (included in this email for discussion, actual list
> from test cases)
>
> groupId/jars/-1.0.jar
> org.apache.maven.test/jars/artifactId-1.0.war
> ch.ethz.ganymed/jars/ganymed-ssh2-build210.jar
> javax/jars/comm-3.0-u1.jar
> javax.persistence/jars/ejb-3.0-public_review.jar
> maven/jars/maven-test-plugin-1.8.2.jar
> commons-lang/jars/commons-lang-2.1.jar
> org.apache.derby/jars/derby-10.2.2.0.jar
> com.foo/ejbs/foo-client-1.0.jar
> com.foo.lib/javadoc.jars/foo-lib-2.1-alpha-1-javadoc.jar
> com.foo.lib/java-sources/foo-lib-2.1-alpha-1-sources.jar
> com.foo/jars/foo-tool-1.0.jar
> org.apache.geronimo.specs/jars/geronimo-ejb_2.1_spec-1.0.1.jar
> directory-clients/poms/ldap-clients-0.9.1-SNAPSHOT.pom
> org.apache.archiva.test/jars/redonkulous-3.1-beta-1-20050831.101112-42.jar
> invalid/invalid/1.0-20050611.123456-1/invalid-1.0-20050611.123456-1.jar
> ch/ethz/ganymed/ganymed-ssh2/build210/ganymed-ssh2-build210.jar
> javax/comm/3.0-u1/comm-3.0-u1.jar
> javax/persistence/ejb/3.0-public_review/ejb-3.0-public_review.jar
> maven/maven-test-plugin/1.8.2/maven-test-plugin-1.8.2.pom
> test/maven-arch/test-arch/2.0.3-SNAPSHOT/test-arch-2.0.3-SNAPSHOT.pom
> com/company/department/com.company.department/0.2/com.company.department-0.2.pom
> com/company/department/com.company.department.project/0.3/com.company.department.project-0.3.pom
> com/foo/foo-tool/1.0/foo-tool-1.0.jar
> commons-lang/commons-lang/2.1/commons-lang-2.1.jar
> com/foo/foo-client/1.0/foo-client-1.0.jar
> com/foo/lib/foo-lib/2.1-alpha-1/foo-lib-2.1-alpha-1-sources.jar
> org/apache/archiva/test/redonkulous/3.1-beta-1-SNAPSHOT/redonkulous-3.1-beta-1-20050831.101112-42.jar
>
> :Proposal:
>
> The proposed logic for detecting layout.
>
> 1. Split path by directory separators.
> 2. If more than 3 parts ( dir/dir/dir/dir/filename ) == default layout.
> 3. If less than 3 parts ( dir/filename ) == invalid path.
> 4. If 3 parts ( dir/dir/filename ) then
> 4.1. If part 2 name ends in "s" then test for potential legacy layout.
> 4.1.1. Identify filename extension.
> 4.1.2. Get potential list of artifact types for extension.
> 4.1.3. If part 2 (minus the end "s") is in the list of
> artifact types == legacy layout
> 4.2. Can't be legacy, then hand off to default layout.
>
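To check my reading of steps 1 to 4.2, here is a minimal Java sketch of the detection logic. The extension-to-type table is a small hypothetical sample (the real mapping would be user-extensible), the names LayoutDetector and detect are made up, and the path is assumed to be repository-relative with no leading slash.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the proposed detection steps; not Archiva code.
class LayoutDetector {
    enum Layout { DEFAULT, LEGACY, INVALID }

    // Assumed sample mapping of filename extension -> possible artifact types.
    private static final Map<String, Set<String>> TYPES_BY_EXTENSION = new HashMap<>();
    static {
        TYPES_BY_EXTENSION.put("jar", new HashSet<>(
                Arrays.asList("jar", "ejb", "plugin", "javadoc.jar", "java-source")));
        TYPES_BY_EXTENSION.put("war", new HashSet<>(Arrays.asList("war")));
        TYPES_BY_EXTENSION.put("pom", new HashSet<>(Arrays.asList("pom")));
    }

    static Layout detect(String path) {
        String[] parts = path.split("/");                 // step 1
        if (parts.length > 3) return Layout.DEFAULT;      // step 2
        if (parts.length < 3) return Layout.INVALID;      // step 3

        String typeDir = parts[1];                        // step 4
        String filename = parts[2];
        if (typeDir.endsWith("s")) {                      // step 4.1
            int dot = filename.lastIndexOf('.');
            String ext = dot < 0 ? "" : filename.substring(dot + 1);   // 4.1.1
            Set<String> types = TYPES_BY_EXTENSION
                    .getOrDefault(ext, Collections.<String>emptySet()); // 4.1.2
            String type = typeDir.substring(0, typeDir.length() - 1);
            if (types.contains(type)) return Layout.LEGACY;            // 4.1.3
        }
        return Layout.DEFAULT;                            // step 4.2
    }
}
```

Note that with this sample table, org.apache.maven.test/jars/artifactId-1.0.war is handed off to the default layout in step 4.2, because "jar" is not a known type for the "war" extension. Taking the last dot as the extension boundary would also mishandle compound extensions such as .xml.zip.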
> The problem with this approach is maintaining the mapping from extensions
> to artifact types. The artifact type is arbitrary, and can be expanded
> upon by the user to include types that we can't even imagine today.
> (See MRM-481: issue with extension .xml.zip)
>
> The proposed logic for parsing default layout paths.
>
> This one is easy. Paths are in the following format ...
>
>
>
> ${groupId}/${artifactId}/${baseVersion}/${artifactId}-${version}-${classifier}.${type}
>
> Once we separate out the directories from the filename, we get the
> following order.
>
> dirs[dirs.length-1] = base version.
> dirs[dirs.length-2] = artifact Id.
> dirs[0] thru dirs[dirs.length-3] = groupId.
>
> That gives us the crucial pieces in the filename
> ${artifactId}-${version}, which makes detecting the classifier and
> type easy enough.
>
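Here is how I read the directory split for default layout paths, as a minimal Java sketch. DefaultLayoutPath and parse are hypothetical names; the sketch only extracts groupId, artifactId and base version from the directories and leaves the filename untouched, so classifier and type detection is not covered.

```java
import java.util.Arrays;

// Hypothetical holder for the pieces taken from the directory part of a
// default layout path; not Archiva code.
class DefaultLayoutPath {
    final String groupId;
    final String artifactId;
    final String baseVersion;
    final String filename;

    private DefaultLayoutPath(String groupId, String artifactId,
                              String baseVersion, String filename) {
        this.groupId = groupId;
        this.artifactId = artifactId;
        this.baseVersion = baseVersion;
        this.filename = filename;
    }

    static DefaultLayoutPath parse(String path) {
        String[] parts = path.split("/");
        // At least one groupId directory, then artifactId, base version, filename.
        if (parts.length < 4) {
            throw new IllegalArgumentException("Not a default layout path: " + path);
        }
        int dirCount = parts.length - 1;               // everything but the filename
        String filename = parts[dirCount];
        String baseVersion = parts[dirCount - 1];      // last directory
        String artifactId = parts[dirCount - 2];       // second to last directory
        // The remaining leading directories form the groupId, joined with dots.
        String groupId = String.join(".", Arrays.copyOfRange(parts, 0, dirCount - 2));
        return new DefaultLayoutPath(groupId, artifactId, baseVersion, filename);
    }
}
```

For javax/comm/3.0-u1/comm-3.0-u1.jar this should yield groupId javax, artifactId comm and base version 3.0-u1.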
> The proposed logic for parsing legacy layout paths.
>
> Legacy layouts are tricky. It is nearly impossible to detect, using
> the path alone, the correct artifactId or version. So the process
> will need to read the pom file associated with the artifact Id in order
> to determine the correct Artifact Reference pieces.
>
> The problem with this approach is that we now need 2 pieces of
> information, the repository root (location or url) and the path.
> Plus we incur a hit / read of the pom file.
>
> So, if we use the pseudo-pattern ...
> [:groupId:]/[:type:]s/[:filename:].[:ext:]
> as a starting point, swap out the [:type:] and [:ext:] for "pom" and
> load the pom from the actual repository to determine the groupId,
> artifactId, and version, we can then have a valid Artifact Reference.
>
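If I understand the swap correctly, locating the pom for a legacy path could look roughly like the Java sketch below. LegacyPomLocator and its methods are hypothetical names, and the sketch assumes a local repository root directory; reading the pom itself is left out.

```java
import java.io.File;

// Hypothetical sketch of the "swap type and extension for pom" idea;
// not Archiva code.
class LegacyPomLocator {
    // Turns groupId/<type>s/<filename>.<ext> into groupId/poms/<filename>.pom.
    static String toPomPath(String legacyPath) {
        String[] parts = legacyPath.split("/");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Not a legacy layout path: " + legacyPath);
        }
        String filename = parts[2];
        // Naive assumption: the extension is whatever follows the last dot.
        int dot = filename.lastIndexOf('.');
        String base = dot < 0 ? filename : filename.substring(0, dot);
        return parts[0] + "/poms/" + base + ".pom";
    }

    // The pom can only be found relative to a repository root, which is why
    // the layout would need a repository in addition to the path.
    static File toPomFile(File repositoryRoot, String legacyPath) {
        return new File(repositoryRoot, toPomPath(legacyPath));
    }
}
```

This naive swap keeps any classifier-like suffix in the filename, so com.foo.lib/javadoc.jars/foo-lib-2.1-alpha-1-javadoc.jar would map to com.foo.lib/poms/foo-lib-2.1-alpha-1-javadoc.pom, which is probably not the pom you want. It also mishandles compound extensions like .xml.zip.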
> The problem with relying on the pom is that it is now required for
> legacy layout "from path" logic. This changes the assumption that poms
> are optional, and it changes the interface to the layout objects so
> that they need a repository as well.
>
> The proposed changes to the codebase.
>
> * Eliminate RepositoryLayoutUtils, roll layout specific filename
> parsing routines into their respective layouts.
> * Eliminate direct usage of BidirectionalRepositoryLayout by
> consumers.
> * Create RepositoryContentRequest that takes the freeform requests
> arriving in from the "/repository/" urls and puts it through
> the logic as outlined above.
> * Rename BidirectionalRepositoryLayout interface to RepositoryContent
> to simplify name and represent new role of accessing repository
> content that requires a repository reference.
> * Create DefaultRepositoryContent and LegacyRepositoryContent
> implementations, that utilize techniques described above, and
> logic already present in DefaultBidirectionalRepositoryLayout and
> LegacyBidirectionalRepositoryLayout.
> * Create AnonymousProjectReader that takes a File object pointing to
> a pom, read the <pomVersion> or <modelVersion> elements and load
> the pom information as appropriate.
> * Create RepositoryContentFactory that returns a RepositoryContent
> implementation for the provided repository id.
>
> Example of new RepositoryContent interface.
>
> --(snip)--
> package org.apache.maven.archiva.repository;
>
> /*
> * Licensed to the Apache Software Foundation (ASF) under one
> * or more contributor license agreements. See the NOTICE file
> * distributed with this work for additional information
> * regarding copyright ownership. The ASF licenses this file
> * to you under the Apache License, Version 2.0 (the
> * "License"); you may not use this file except in compliance
> * with the License. You may obtain a copy of the License at
> *
> * http://www.apache.org/licenses/LICENSE-2.0
> *
> * Unless required by applicable law or agreed to in writing,
> * software distributed under the License is distributed on an
> * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
> * KIND, either express or implied. See the License for the
> * specific language governing permissions and limitations
> * under the License.
> */
>
> import org.apache.maven.archiva.model.ArtifactReference;
> import org.apache.maven.archiva.model.ProjectReference;
> import org.apache.maven.archiva.model.VersionedReference;
> import org.apache.maven.archiva.repository.layout.LayoutException;
>
> import java.io.File;
> import java.net.URL;
> import java.util.List;
>
> /**
> * RepositoryContent interface for interacting with a managed repository
> * in an abstract way, without the need for processing based on
> * filesystem paths, or working with the database.
> *
> * @author <a href="mailto:[EMAIL PROTECTED]">Joakim Erdfelt</a>
> * @version $Id$
> */
> public interface RepositoryContent
> {
> /**
> * Determines if the project referenced exists in the repository.
> *
> * @param reference the project reference to check for.
> * @return true if the project referenced exists.
> */
> public boolean hasContent( ProjectReference reference );
>
> /**
> * Determines if the version reference exists in the repository.
> *
> * @param reference the version reference to check for.
> * @return true if the version referenced exists.
> */
> public boolean hasContent( VersionedReference reference );
>
> /**
> * Determines if the artifact referenced exists in the repository.
> *
> * @param reference the artifact reference to check for.
> * @return true if the artifact referenced exists.
> */
> public boolean hasContent( ArtifactReference reference );
>
> /**
> * Given a repository relative path to a filename, return the
> * {@link ArtifactReference} object suitable for the path.
> *
> * @param path the path relative to the repository base dir for
> * the artifact.
> * @return the {@link ArtifactReference} representing the path.
> * (or null if path cannot be converted to a
> * {@link ArtifactReference})
> * @throws LayoutException if there was a problem converting the
> * path to an artifact.
> */
> public ArtifactReference toArtifactReference( String path )
> throws LayoutException;
>
> /**
> * Given an ArtifactReference, return the relative path to the
> * artifact.
> *
> * @param reference the artifact reference to use.
> * @return the relative path to the artifact.
> */
> public String toPath( ArtifactReference reference );
>
> /**
> * Given an ArtifactReference, return the file reference to the
> * artifact.
> *
> * @param reference the artifact reference to use.
> * @return the file reference to the artifact.
> */
> public File toFile( ArtifactReference reference );
>
> /**
> * Given an ArtifactReference, return the url to the artifact.
> *
> * @param reference the artifact reference to use.
> * @return the url to the artifact.
> */
> public URL toURL( ArtifactReference reference );
>
>
> /**
> * Gather up the list of related artifacts to the ArtifactReference
> * provided. This typically includes the pom files, and those things
> * with classifiers (such as doc, source code, test libs, etc...)
> *
> * NOTE: Some layouts (such as maven 1 "legacy"), and remote
> * repositories are not compatible with this query.
> *
> * @param reference the reference to work off of.
> * @return the list of ArtifactReferences for related artifacts.
> * @throws ContentNotFoundException if the initial artifact reference
> * does not exist within the repository.
> */
> public List<ArtifactReference> getRelatedArtifacts(
> ArtifactReference reference )
> throws ContentNotFoundException, NotSupportedException;
>
> /**
> * Given a specific VersionedReference, return the list of available
> * versions for that versioned reference.
> *
> * NOTE: This is really only useful when working with SNAPSHOTs.
> * Not compatible with remote repositories.
> *
> * @param reference the versioned reference to work off of.
> * @return the list of versions found.
> * @throws ContentNotFoundException if the versioned reference does
> * not exist within the repository.
> */
> public List<String> getVersions( VersionedReference reference )
> throws ContentNotFoundException, NotSupportedException;
>
> /**
> * Given a specific ProjectReference, return the list of available
> * versions for that project reference.
> *
> * @param reference the project reference to work off of.
> * @return the list of versions found for that project reference.
> * @throws ContentNotFoundException if the project reference does not
> * exist within the repository.
> */
> public List<String> getVersions( ProjectReference reference )
> throws ContentNotFoundException, NotSupportedException;
> }
> --(snip)--
>
> I feel this is a better long term solution for the persistent layout
> parsing issues we have within Archiva. However not all of the problems
> have been solved. I've outlined the ones that need help above in this
> email, but I'm sure there are ones that have been overlooked.
>
> Disclaimer: Yes, this is in the form of a proposal, but I'm already
> working on this, and will continue down this path unless
> someone here has a strong objection about this approach.
>
> --
> - Joakim Erdfelt
> Committer and PMC Member, Apache Maven
> Archiva Developer
> [EMAIL PROTECTED]
>
>