Re: [gradle-dev] some thoughts on the dsl for multiple outputs for jvm based projects

Adam Murdoch Mon, 04 Feb 2013 14:50:28 -0800

On 05/02/2013, at 5:12 AM, Daz DeBoer wrote:

> On 4 February 2013 00:07, Adam Murdoch wrote:
>> Hi,
>> 
>> So, we're planning to have a bunch of 'jvm binaries' that can be built from
>> various language source sets and other things. There will be a few different
>> types of binaries, such as class directory binaries and jar binaries,
>> possibly some others.
>> 
>> Something we need to sort out is how to structure the DSL for these
>> executable things. The current plan is to have a single container that owns
>> all of these jvm binaries, so you might declare something like this:
>> 
>> jvm {
>>    binaries {
>>        mainClasses(ClassesDirectoryBinary) {
>>            … some inputs and other configuration ...
>>        }
>>        mainJar(JarBinary) {
>>            … some inputs and other configuration …
>>        }
>>    }
>> }
>> 
>> There might be a similar container for native binaries:
>> 
>> native {
>>    binaries {
>>        windowsX86DebugShared(SharedLibraryBinary) {
>>            … some inputs and other configuration …
>>        }
>>        windowsX86DebugStatic(StaticLibraryBinary) {
>>            ...
>>        }
>>        windowsX86DebugExe(ExecutableBinary) {
>>            …
>>        }
>>    }
>> }
>> 
>> Some questions:
>> 
>> * Is using a flat name the best way to identify these things? Once you add a
>> few dimensions, the names start to get awkward. This certainly can be the
>> case for native binaries, and can also be the case for jvm binaries. For
>> example, I might have (feature, binary type, groovy version, jvm version) as
>> relevant dimensions for a Groovy library that targets multiple groovy
>> versions and jvm versions.
> 
> Are the names of these things important at all? Or in general are we
> just forcing users to come up with a name that adds little value?


I think it varies for different types of things. For some things, a name is a 
natural way of identifying the thing. For other things (most things?) it makes 
more sense to identify a thing by its type and some attributes about the thing.

The complication is that the set of attributes that identify a thing varies based 
on what I'm building. For example:

* If I have a single publication, then I want to refer to it as 'the 
publication'. The other stuff (type, groupId, artefactId, version) are just 
attributes of the publication.
* If I publish 2 maven modules, then I want to refer to them as the 'api 
publication' and the 'impl publication', say.
* If I build debug and release variants of my windows executable, then I want 
to refer to them as the 'debug executable' and the 'release executable'. All 
the other stuff (windows, amd64, multi-threaded, visual-c++ compiler, 
optimisation-level) are just attributes of the executable.
* If I build debug and release variants on windows and linux for x86 and amd64, 
then I want to refer to them using a tuple such as (windows, amd64, release).

That is, a thing often just has a bunch of attributes, any of which could be 
used to identify it, and it's how the thing is different to the others that is 
useful for identifying it.
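
A rough sketch of that idea, in plain Groovy rather than any real Gradle API. 
The attribute names (os, arch, buildType) and the `select` and 
`distinguishing` helpers are made up for illustration. Each thing is just a 
bag of attributes, a query picks out the things that match, and the attributes 
that actually differ across a set are the ones worth using as the identifier:

```groovy
// Hypothetical model: each binary is just a map of attributes.
def binaries = [
    [os: 'windows', arch: 'x86',   buildType: 'debug'],
    [os: 'windows', arch: 'x86',   buildType: 'release'],
    [os: 'windows', arch: 'amd64', buildType: 'debug'],
    [os: 'windows', arch: 'amd64', buildType: 'release'],
    [os: 'linux',   arch: 'amd64', buildType: 'release'],
]

// Select the binaries whose attributes match every entry in the query.
def select = { Map query ->
    binaries.findAll { binary ->
        query.every { key, value -> binary[key] == value }
    }
}

// The attributes that take more than one value across the given things.
// These are the ones that are useful for telling the things apart.
def distinguishing = { List<Map> things ->
    things.collectMany { it.keySet() }.unique().findAll { key ->
        things.collect { it[key] }.unique().size() > 1
    }
}

// A full tuple identifies a single binary.
def release = select([os: 'windows', arch: 'amd64', buildType: 'release'])
println release

// Among the windows/amd64 binaries, only the build type differs.
println distinguishing(select([os: 'windows', arch: 'amd64']))
```

With something like this, 'the release executable' is just a query that 
happens to match one element in a small build, and the same query would need 
more attributes in a build with more variants.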

One nice aspect of ditching the name is that a thing can more naturally live in 
different containers and be grouped in different ways. Which would mean that 
some of these questions about how things are grouped become less important - 
just group them whichever way you like.


> How
> often does a user need to differentiate between them by name?

There are a few main reasons, I think:

1. To configure something that some other logic (a plugin, say) has already 
defined.
2. To configure the tasks that do work with the thing (compile it, generate the 
pom.xml for it, publish it).
3. To find the thing to use it as input for some other thing.
4. To refer to the thing before the 'identifying' attributes have been 
calculated. For example, to refer to a publication before the version has been 
calculated.

None of this necessarily requires a name - this is just what the name is used for 
at the moment.


> 
> We could consider a DSL similar to the repositories syntax:
> 
> jvm {
>    binaries {
>        classes {
>            name "main" // optional
>            … some inputs and other configuration ...
>        }
>        jar {
>            ... we generate a sensible name ...
>            … some inputs and other configuration …
>        }
>    }
> }
> 
> It's possible that we treat this as a standard pattern, whereby a
> NamedDomainObjectContainer could support both with some sort of DSL
> magic:
> 
> container {
>      name(Type) {}
>      subtype { /* generated name */ }
> }
> 
> Or maybe get rid of the 'name' method altogether, and go with:
> 
> // In all cases the added element must provide a unique name, which
> // may or may not be configured explicitly.
> container {
>       // eg 'publication' for 'publications' container, or 'dependency'
>       // for 'dependencies' container.
>       generalType(SubType) {}
>       // eg 'ivy' for 'publications' or 'project' for 'dependencies'
>       subType { }
> }

These are both interesting options for defining things. One question is how do 
I get something out again, to either configure it or use it?
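
To make the question concrete, here is a minimal sketch in plain Groovy. It 
is not the NamedDomainObjectContainer API: `SketchContainer`, `Element`, and 
the numeric-suffix naming rule are all assumptions for illustration. Elements 
are added with an optional name, a name is generated from the type when none 
is given, and things come out again either by type or by name:

```groovy
// Hypothetical element: a type plus a (possibly generated) name.
class Element {
    String type
    String name
}

// Hypothetical container, for illustration only.
class SketchContainer {
    private final List<Element> elements = []

    // Add an element. An explicit name must be unique. When no name is
    // given, one is derived from the type, with a numeric suffix if needed.
    Element add(String type, String name = null) {
        if (name != null && elements.any { it.name == name }) {
            throw new IllegalArgumentException("Duplicate name: " + name)
        }
        String unique = name ?: type
        int suffix = 2
        while (elements.any { it.name == unique }) {
            unique = type + suffix++
        }
        def element = new Element(type: type, name: unique)
        elements << element
        element
    }

    // Retrieve by type: no need to know any generated name.
    List<Element> withType(String type) {
        elements.findAll { it.type == type }
    }

    // Retrieve by name: returns null when there is no such element.
    Element getByName(String name) {
        elements.find { it.name == name }
    }
}

def container = new SketchContainer()
container.add('classes', 'main')   // explicit name
container.add('jar')               // generated name
container.add('jar')               // generated name, made unique

println container.withType('jar')*.name
println container.getByName('main').type
```

Looking up by type works without knowing the generated names, but it stops 
being enough as soon as there are two elements of the same type, which is 
where some other identifying attribute would be needed.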

> 
> 
>> * What do we do with specialised types of jvm binaries, that run on the jvm
>> but which require a certain runtime and that are packaged in a certain way:
>> a WAR or exploded J2EE web app or an OSGi bundle or Gradle plugin?
>> 
>> * Is the separation between jvm binaries and native binaries useful? Should
>> there be a single `binaries` container? Or should it be finer-grained to
>> include type, so that there is a `jvm.binaries.classes` and a
>> `jvm.binaries.jar` container and a `native.binaries.staticLibs` container?
>> Is the type of runtime actually less important than the type of thing, so
>> that it should be `binaries.jvm` and `binaries.native`?
> 
> Where would combined-and-optimised javascript fit into this model?
> What about shell-scripts that are tailored for a runtime?

If we consider these things as binaries (and we might), then the answer to this 
depends somewhat on the question about specialised binaries, above. Javascript 
'binaries' would target a different type of runtime, just like jvm and native 
binaries target different types of runtimes. Shell scripts might be better 
treated as a way of packaging a command-line application, as either a 'native' 
or maybe a more specialised 'shell' binary.

I think a question we need to answer is whether there is something common here 
between all these things, either as an abstraction or a pattern, or whether 
it's all just coincidence.

So far, we've been using the term 'binary' in a pretty abstract way, to mean 
'something that can run on a particular runtime', where 'runtime' is some 
abstract environment or container. The idea is that both binaries and runtimes 
will be typed in some way.

If we think that the abstract model is a good idea, then do we jam all binaries 
into the same container? Do we group them by runtime? By role? By runtime 
'family' (e.g. 'jvm', 'native', 'javascript')? By type? Something else?
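
As a small illustration in plain Groovy (the `family` and `type` attributes 
and their values are assumptions, not part of any proposed API), if each 
binary carries these as attributes, then the grouping does not have to be 
baked into the container structure. The same flat collection can be viewed by 
runtime family or by type:

```groovy
// Hypothetical flat collection of binaries, each described by attributes.
def binaries = [
    [name: 'mainClasses',           family: 'jvm',    type: 'classes'],
    [name: 'mainJar',               family: 'jvm',    type: 'jar'],
    [name: 'windowsX86DebugShared', family: 'native', type: 'sharedLibrary'],
    [name: 'windowsX86DebugExe',    family: 'native', type: 'executable'],
]

// Two different views over the same elements.
def byFamily = binaries.groupBy { it.family }
def byType = binaries.groupBy { it.type }

println byFamily['jvm']*.name
println byFamily['native']*.name
println byType.keySet()
```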


> 
> Maybe we need a few more use cases to flesh out the DSL.

There are plenty. Any ideas what would be useful?


> Or would
> these be declared in a different container?
> 
>> 
>> Very similar questions to source sets, re. how to arrange them and which
>> dimension wins over the others and which need to be encoded in the name and
>> which are encoded in the structure. Maybe we should rethink our container
>> DSL a bit more deeply. The publications would also benefit from having a
>> composite identifier (e.g. groupId, artifactId, version).
> 
> Yes I think this could benefit from a re-think - the current proposed DSL is:
> 
>    publications {
>        myPublication(IvyPublication) {
>            organisation 'my-organisation'
>            module 'my-module'
>            revision '1.2'
>        }
>    }
> 
> The only thing the name "myPublication" is currently used for is
> generating task names. Other than that, it adds little value. We will
> be enforcing that the org:module:revision is unique, and this is
> really how the publication is identified.
> 
> In that case, something like the last option above might work better:
> 
>    publications {
>        ivy {
>            organisation 'my-organisation'
>            module 'my-module'
>            revision '1.2'
>        }

For most publications, these attributes can be inferred. Is the idea that you 
define the full identifier, or just describe how it's different to the default?


--
Adam Murdoch
Gradle Co-founder
http://www.gradle.org
VP of Engineering, Gradleware Inc. - Gradle Training, Support, Consulting
http://www.gradleware.com
