Re: [DISCUSS] Incorporating an ArchitectureId into the GAVCT of the repository

Christofer Dutz Thu, 01 Sep 2016 04:10:00 -0700

We are having exactly the same problems in the Apache FlexJS project. There we 
have libraries that compile to Flash (swc), libraries that compile to 
JavaScript (js), and libraries that compile to both (jswc). It would be great 
to have different dependency trees depending on the target "architecture".



But I guess your problem could be solved (and has been for SWT deps) by using 
architecture-activated profiles and using classifiers for the architecture.


I know this doesn't work transitively though.


Chris

________________________________
From: Stephen Connolly <stephen.alan.conno...@gmail.com>
Sent: Thursday, 1 September 2016 12:07:18
To: Maven Developers List
Subject: [DISCUSS] Incorporating an ArchitectureId into the GAVCT of the 
repository

One of the things I feel is necessary to grow Maven in the modelVersion
5.0.0 world is to start taking account of architecture specific artifacts.

Currently, the Maven repository layout does not handle architecture
specific dependencies well.

So, for example:

Say I have a foo.jar that depends on a native library... bar.dll /
libbar.so / etc

Ideally we'd like to say that foo just depends on bar...

A consumer of foo running on, say, my local machine could then see
that I am running on os-x-x86_64 and, because I want to run tests,
look for bar with the architecture `os-x-x86_64` to get the
native library for me.

When I am building the installer for windows on my os-x machine (using say
.NET and the WiX toolchain), the corresponding maven plugin (a future one,
it does not exist yet) could request the win-x86 architecture of the
dependency, and the rpm plugin could request the linux-ppc, linux-arm64,
linux-x86 and linux-x86_64 artifacts in order to produce the corresponding
rpm architecture artifacts.

So when I think about this concept... I feel it is important that we find a
way to introduce the architectureId into the GAVCT of the repository.

When we do this, to my mind, we need to be mindful that modelVersion 4.0.0
consumers would like to be able to consume these architecture specific
dependencies also... and the 4.0.0 GAV constraints will constrain the
possible solutions that we can pick if we value letting 4.0.0 consumers
access these architecture specific artifacts via the `default` layout we
currently employ for the maven repository.

So, first things first... our current `default` layout transforms the
GroupId:ArtifactId:Version:Classifier:Type into a repository URL of

`${groupId.replace('.','/')}/${artifactId}/${version}/${artifactId}-${version}${classifier==null?'':'-'+classifier}.${type}`
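
As a rough illustration, that transformation can be written as a small Java helper. This is only a sketch: the names `DefaultLayout` and `pathOf` are made up for this mail and are not Maven API, and an empty classifier is treated the same as a missing one.

```java
/** Hypothetical helper (not part of Maven): builds the `default` layout path for a GAVCT. */
final class DefaultLayout {
    private DefaultLayout() {}

    static String pathOf(String groupId, String artifactId, String version,
                         String classifier, String type) {
        // Literal replacement of '.' by '/'. String.replace is used on purpose:
        // the regex-based replaceAll would treat '.' as "any character".
        String dir = groupId.replace('.', '/') + "/" + artifactId + "/" + version;
        // Assumption: a null or empty classifier contributes nothing to the file name.
        String suffix = (classifier == null || classifier.isEmpty()) ? "" : "-" + classifier;
        return dir + "/" + artifactId + "-" + version + suffix + "." + type;
    }
}
```

For example, `DefaultLayout.pathOf("com.example", "foobar", "1.0", "sources", "jar")` yields `com/example/foobar/1.0/foobar-1.0-sources.jar`. Note that there is no slot for an architecture anywhere in that path.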

If we want to add architectureId into that URL Path and still have that
resolvable by GAVCT at a modelVersion 4.0.0 coordinate, we are basically
left with stuffing the architectureId into one of the existing components...

Now when we think about an architecture specific artifact, the first thing
that comes to mind is that each architecture specific artifact most likely
has different dependencies... hopefully the .pdt file (that would be
deployed at the GAV without an architecture... modulo multi-machine builds)
would provide the architecture specific dependency trees so that
modelVersion 5.0.0 aware consumers would - just naturally - be aware of
those differences in dependencies

But - if we want to give the modelVersion 4.0.0 consumers our best effort -
we probably need to give each architectureId its own modelVersion 4.0.0
pom.

In other words, I do not think we should try to munge the architectureId
into either classifier or type, as both of those would force the
architecture specific artifacts to be viewed as having the same
dependencies in the modelVersion 4.0.0 world.

So that leaves us with groupId, artifactId and version...

I personally think version is a non-runner. In modelVersion 4.0.0 you can
only depend on one version of a dependency at a time... version ranges
would become completely and utterly unusable (never mind that they are
unusable now)... plus my gut tells me that it would be a total mess!

So that leaves groupId and artifactId... our choices basically boil down to

legacyGroupId == '${groupId}'; legacyArtifactId == '${architectureId}.${artifactId}'
legacyGroupId == '${groupId}'; legacyArtifactId == '${architectureId}-${artifactId}'
legacyGroupId == '${groupId}'; legacyArtifactId == '${artifactId}.${architectureId}'
legacyGroupId == '${groupId}'; legacyArtifactId == '${artifactId}-${architectureId}'
legacyGroupId == '${groupId}.${architectureId}'; legacyArtifactId == '${artifactId}'
legacyGroupId == '${groupId}.${artifactId}'; legacyArtifactId == '${architectureId}'

I personally think that the ones that place `architectureId` lexically
before `artifactId` are not "right"... the most important coordinate is the
groupId, the next most is the artifactId, then the architecture, then the
version, etc

So to my mind that leaves us with:

legacyGroupId == '${groupId}'; legacyArtifactId == '${artifactId}.${architectureId}'
legacyGroupId == '${groupId}'; legacyArtifactId == '${artifactId}-${architectureId}'
legacyGroupId == '${groupId}.${artifactId}'; legacyArtifactId == '${architectureId}'

Now when we look at how, say, a modelVersion 4.0.0 consumer would use these
dependencies... the variant where we shift the artifactId into the groupId
would mean that you would end up with loads of `linux-arm`
"legacyArtifactId" dependencies in your modelVersion 4.0.0 consumer...
which would presumably be ugly (just like now if you have two matching
`artifactId` dependencies in your .war which forces us to disambiguate by
prefixing the groupId when copying into WEB-INF/lib)... so I am going to
reject that one also.

The convention seems to be that the artifactId does not contain a `.` with
most artifacts that I am aware of using `-` as the separator... this could
be used to argue either way... my preference is to run with `-` as the
separator... though I am open to using `.` to provide a convention that
architecture is distinguished using a `.`

So how would this work...

Ok, I have my foobar project that builds a .jar and the native libraries
that are required by that .jar

So from the reactor for that project we want to deploy

com.example:foobar:::1.0:pom (the legacy pom for the .jar to allow
modelVersion 4.0.0 consumption of the jar)
com.example:foobar:::1.0:pdt (the modern project dependency trees for all
attached artifacts)
com.example:foobar:::1.0:jar (the jar)
com.example:foobar::javadoc:1.0:jar (the javadoc jar)
com.example:foobar::sources:1.0:jar (the source jar)
com.example:foobar:win_x86::1.0:pom (the legacy pom for the 32-bit DLL)
com.example:foobar:win_x86::1.0:dll (the 32-bit DLL... alternatively the
type might be `native-library` or `lib` but let's assume DLL)
com.example:foobar:win_x86_64::1.0:pom (the legacy pom for the 64-bit DLL)
com.example:foobar:win_x86_64::1.0:dll (the 64-bit DLL)
com.example:foobar:osx_x86_64::1.0:pom (the legacy pom for the 64-bit OS-X
.dylib)
com.example:foobar:osx_x86_64::1.0:dylib (the 64-bit .dylib...
alternatively the type might be `native-library` or `lib` but let's assume
dylib)
com.example:foobar:elf_arm::1.0:pom (the legacy pom for the linux ARM .so)
com.example:foobar:elf_arm::1.0:so (the ARM .so ... alternatively the type
might be `native-library` or `lib` but let's assume so)
com.example:foobar:elf_x86::1.0:pom (the legacy pom for the linux x86
32-bit .so)
com.example:foobar:elf_x86::1.0:so (the x86-32-bit .so)
com.example:foobar:elf_x86_64::1.0:pom (the legacy pom for the linux x86
64-bit .so)
com.example:foobar:elf_x86_64::1.0:so (the x86 64-bit .so)

My main build machine cannot cross-compile for PPC or ARM64... so we have
two other build machines that will want to produce the extra architecture
specific artifacts...

com.example:foobar:elf_ppc::1.0:pom (the legacy pom for the linux PPC .so)
com.example:foobar:elf_ppc::1.0:so (the PPC .so)

and

com.example:foobar:elf_arm_64::1.0:pom (the legacy pom for the linux ARM
64-bit .so)
com.example:foobar:elf_arm_64::1.0:so (the ARM 64-bit .so)

In order to accommodate delayed deployment, I am going to suggest that the
PPC and ARM64 deployments should publish their *supplemental* pdts at their
coordinates, e.g.

com.example:foobar:elf_ppc::1.0:pdt (the supplemental project dependency
trees for the PPC reactor artifacts)

and

com.example:foobar:elf_arm_64::1.0:pdt (the supplemental project dependency
trees for the ARM64 reactor artifacts)

So ultimately we would end up with the following files being deployed (in
three "atomic" deployments):

com/example/foobar/1.0/foobar-1.0.pom
com/example/foobar/1.0/foobar-1.0.pdt
com/example/foobar/1.0/foobar-1.0.jar
com/example/foobar/1.0/foobar-1.0-javadoc.jar
com/example/foobar/1.0/foobar-1.0-sources.jar
com/example/foobar-win_x86/1.0/foobar-win_x86-1.0.pom
com/example/foobar-win_x86/1.0/foobar-win_x86-1.0.dll
com/example/foobar-win_x86_64/1.0/foobar-win_x86_64-1.0.pom
com/example/foobar-win_x86_64/1.0/foobar-win_x86_64-1.0.dll
com/example/foobar-osx_x86_64/1.0/foobar-osx_x86_64-1.0.pom
com/example/foobar-osx_x86_64/1.0/foobar-osx_x86_64-1.0.dylib
com/example/foobar-elf_arm/1.0/foobar-elf_arm-1.0.pom
com/example/foobar-elf_arm/1.0/foobar-elf_arm-1.0.so
com/example/foobar-elf_x86/1.0/foobar-elf_x86-1.0.pom
com/example/foobar-elf_x86/1.0/foobar-elf_x86-1.0.so
com/example/foobar-elf_x86_64/1.0/foobar-elf_x86_64-1.0.pom
com/example/foobar-elf_x86_64/1.0/foobar-elf_x86_64-1.0.so

com/example/foobar-elf_ppc/1.0/foobar-elf_ppc-1.0.pom
com/example/foobar-elf_ppc/1.0/foobar-elf_ppc-1.0.pdt
com/example/foobar-elf_ppc/1.0/foobar-elf_ppc-1.0.so

com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.pom
com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.pdt
com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.so
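
To make the mapping from the six-part coordinates to these files explicit, here is a minimal Java sketch. It assumes the field order `groupId:artifactId:architectureId:classifier:version:type` used in the listings above and the `-` separator between artifactId and architectureId. `ArchCoordinate` and `toLegacyPath` are hypothetical names, not existing Maven classes.

```java
/** Hypothetical helper (not Maven API): maps an architecture-aware coordinate to a legacy path. */
final class ArchCoordinate {
    private ArchCoordinate() {}

    static String toLegacyPath(String coordinate) {
        // A limit of -1 keeps empty fields, e.g. "com.example:foobar:::1.0:pom".
        String[] f = coordinate.split(":", -1);
        if (f.length != 6) {
            throw new IllegalArgumentException(
                "expected groupId:artifactId:architectureId:classifier:version:type, got " + coordinate);
        }
        String groupId = f[0], artifactId = f[1], architectureId = f[2];
        String classifier = f[3], version = f[4], type = f[5];

        // Assumed convention: the architectureId is appended to the artifactId with '-'.
        String legacyArtifactId =
            architectureId.isEmpty() ? artifactId : artifactId + "-" + architectureId;
        String fileName = legacyArtifactId + "-" + version
            + (classifier.isEmpty() ? "" : "-" + classifier) + "." + type;
        return groupId.replace('.', '/') + "/" + legacyArtifactId + "/" + version + "/" + fileName;
    }
}
```

With this, `ArchCoordinate.toLegacyPath("com.example:foobar:win_x86::1.0:dll")` gives `com/example/foobar-win_x86/1.0/foobar-win_x86-1.0.dll`, and a coordinate with an empty architectureId falls through to the plain 4.0.0 path.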

When a modelVersion 5.0.0 consumer does something like:

compile: {
  dependencies: ["com.example:foobar:1.0:jar"]
}
test: {
  dependencies: ["org.junit:junit:5.0:jar"]
}

and wants to run its tests on linux ARM64, it will start by resolving
`com/example/foobar/1.0/foobar-1.0.pdt`. This will give it the dependency
tree of the `.jar`, which will declare an architecture dependent native
library dependency (somehow or other... this is why we may use
`native-library` as the "type"). Because it knows that it is running on
the ARM64 architecture, it will then know that it needs
`com.example:foobar:elf_arm_64::1.0:so`. Since this is not available in the
`com/example/foobar/1.0/foobar-1.0.pdt` trees, it will then attempt to
download `com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.pdt`. If
that exists, it will use that tree... if it doesn't exist, we fail the
build. (Technically we could fall back to checking for
`com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.pom` and
`com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.so` before failing
the build... but having resolved `com/example/foobar/1.0/foobar-1.0.pdt`, we
know the artifacts were produced by a 5.0.0 aware producer, so the
supplemental .pdt should have been deployed as well.)
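
The lookup order described above could be sketched roughly as follows. The repository is modelled as a plain set of paths, and the content of the main `.pdt` is reduced to "which architectureIds does it already carry trees for", since the real schema is not defined here. All names (`PdtResolver`, `resolveTreePath`) are invented for illustration.

```java
import java.util.Set;

/** Hypothetical sketch (not Maven API) of how a 5.0.0 consumer might pick the .pdt to read. */
final class PdtResolver {
    private PdtResolver() {}

    /**
     * @param repoPaths      paths known to exist in the repository (stand-in for remote lookups)
     * @param archsInMainPdt architectureIds whose trees are in the main .pdt (assumed input)
     * @return the path of the .pdt expected to hold the tree for the requested architecture
     */
    static String resolveTreePath(Set<String> repoPaths, Set<String> archsInMainPdt,
                                  String groupId, String artifactId, String version,
                                  String architectureId) {
        String base = groupId.replace('.', '/');
        String mainPdt = base + "/" + artifactId + "/" + version + "/"
            + artifactId + "-" + version + ".pdt";
        if (!repoPaths.contains(mainPdt)) {
            // Not covered by this proposal: without a main .pdt the producer was
            // probably not 5.0.0 aware, so this sketch simply gives up.
            throw new IllegalStateException("no main .pdt at " + mainPdt);
        }
        if (archsInMainPdt.contains(architectureId)) {
            return mainPdt; // tree was deployed with the main reactor build
        }
        // Otherwise look for the supplemental .pdt from a delayed deployment.
        String legacyId = artifactId + "-" + architectureId;
        String supplemental = base + "/" + legacyId + "/" + version + "/"
            + legacyId + "-" + version + ".pdt";
        if (repoPaths.contains(supplemental)) {
            return supplemental;
        }
        throw new IllegalStateException("no dependency tree for " + architectureId);
    }
}
```

The final `throw` corresponds to failing the build. The optional fallback to the legacy `.pom` and `.so` is left out on purpose.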

A modelVersion 4.0.0 consumer is not really going to be able to have as
flexible a build... but at least they can - through declarations such as

<dependency>
  <groupId>com.example</groupId>
  <artifactId>foobar-elf_arm_64</artifactId>
  <version>1.0</version>
  <type>so</type>
</dependency>

grab the .so to bundle into a .zip or installer, and if they want to write a
pom with architecture based profile activation injecting test scoped
dependencies, they can do that also.

WDYT?

If anyone has any experience from the NMaven experiments, or learnings from
.deb or .rpm attempts to solve architecture dependent artifacts mixed with
noarch artifacts... please step forward and join the discussion.

-Stephen

Notes:

1. I am not saying what conventions will be used to define the
`architectureId` values here
2. I am not discussing the schema for the .pdt files here... other than the
general principle that they will contain multiple dependency trees for each
artifact produced by the project
3. I am not discussing how a modelVersion 5.0.0 build would be invoked or
detect that it should just do the PPC deployment
4. This proposal does not include the new metadata schema that we would
likely require to assist with such a deployment format
5. I am not discussing or proposing a modelVersion 5.0.0 schema here... I
use a non-XML format to help people mentally disassociate thinking about
the architectureId specific things from the current 4.0.0 way of doing
things
