Re: [gradle-dev] some thoughts on the dsl for multiple outputs for jvm based projects

Luke Daley Wed, 06 Feb 2013 02:10:59 -0800

On 06/02/2013, at 12:57 AM, Adam Murdoch <[email protected]> wrote:


> 
> On 06/02/2013, at 10:45 AM, Luke Daley wrote:
> 
>> 
>> 
>> On 05/02/2013, at 23:08, Adam Murdoch <[email protected]> wrote:
>> 
>>> 
>>> On 06/02/2013, at 2:27 AM, Daz DeBoer wrote:
>>> 
>>>> On 4 February 2013 15:50, Adam Murdoch <[email protected]> wrote:
>>>> 
>>>> On 05/02/2013, at 5:12 AM, Daz DeBoer wrote:
>>>> 
>>>>> On 4 February 2013 00:07, Adam Murdoch <[email protected]> wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> So, we're planning to have a bunch of 'jvm binaries' that can be built
>>>>>> from various language source sets and other things. There will be a few
>>>>>> different types of binaries, such as class directory binaries and jar
>>>>>> binaries, possibly some others.
>>>>>> 
>>>>>> Something we need to sort out is how to structure the DSL for these
>>>>>> executable things. The current plan is to have a single container that
>>>>>> owns all of these jvm binaries, so you might declare something like this:
>>>>>> 
>>>>>> jvm {
>>>>>>    binaries {
>>>>>>        mainClasses(ClassesDirectoryBinary) {
>>>>>>            … some inputs and other configuration ...
>>>>>>        }
>>>>>>        mainJar(JarBinary) {
>>>>>>            … some inputs and other configuration …
>>>>>>        }
>>>>>>    }
>>>>>> }
>>>>>> 
>>>>>> There might be a similar container for native binaries:
>>>>>> 
>>>>>> native {
>>>>>>    binaries {
>>>>>>        windowsX86DebugShared(SharedLibraryBinary) {
>>>>>>            … some inputs and other configuration …
>>>>>>        }
>>>>>>        windowsX86DebugStatic(StaticLibraryBinary) {
>>>>>>            ...
>>>>>>        }
>>>>>>        windowsX86DebugExe(ExecutableBinary) {
>>>>>>            …
>>>>>>        }
>>>>>>    }
>>>>>> }
>>>>>> 
>>>>>> Some questions:
>>>>>> 
>>>>>> * Is using a flat name the best way to identify these things? Once you
>>>>>> add a few dimensions, the names start to get awkward. This can certainly
>>>>>> be the case for native binaries, and can also be the case for jvm
>>>>>> binaries. For example, I might have (feature, binary type, groovy
>>>>>> version, jvm version) as relevant dimensions for a Groovy library that
>>>>>> targets multiple groovy versions and jvm versions.
>>>>> 
>>>>> Are the names of these things important at all? Or in general are we
>>>>> just forcing users to come up with a name that adds little value?
>>>> 
>>>> I think it varies for different types of things. For some things, a name 
>>>> is a natural way of identifying the thing. For other things (most things?) 
>>>> it makes more sense to identify a thing by its type and some attributes 
>>>> about the thing.
>>>> 
>>>> The complication is that the set of attributes that identify a thing 
>>>> varies based on what I'm building. For example:
>>>> 
>>>> * If I have a single publication, then I want to refer to it as 'the 
>>>> publication'. The other stuff (type, groupId, artefactId, version) are 
>>>> just attributes of the publication.
>>>> * If I publish 2 maven modules, then I want to refer to them as the 'api 
>>>> publication' and the 'impl publication', say.
>>>> * If I build debug and release variants of my windows executable, then I 
>>>> want to refer to them as the 'debug executable' and the 'release 
>>>> executable'. All the other stuff (windows, amd64, multi-threaded, 
>>>> visual-c++ compiler, optimisation-level) are just attributes of the 
>>>> executable.
>>>> * If I build debug and release variants on windows and linux for x86 and 
>>>> amd64, then I want to refer to them using a tuple such as (windows, amd64, 
>>>> release).
>>>> 
>>>> That is, a thing often just has a bunch of attributes, any of which could 
>>>> be used to identify it, and it's how the thing is different to the others 
>>>> that is useful for identifying it.
>>>> 
>>>> Right, so is "name" just another one of those ways of identifying? 
>>>> Sometimes I want to give something a meaningful name, sometimes forcing me 
>>>> to come up with a name is a pain in the ass.
>>>>  
>>>> One nice aspect of ditching the name is that a thing can more naturally 
>>>> live in different containers and be grouped in different ways. Which would 
>>>> mean that some of these questions about how things are grouped become less 
>>>> important - just group them whichever way you like.
>>>> 
>>>> 
>>>>> How
>>>>> often does a user need to differentiate between them by name?
>>>> 
>>>> There are a few main reasons, I think:
>>>> 
>>>> 1. To configure something that some other logic (a plugin, say) has 
>>>> already defined.
>>>> 2. To configure the tasks that do work with the thing (compile it, 
>>>> generate the pom.xml for it, publish it).
>>>> 3. To find the thing to use it as input for some other thing.
>>>> 4. To refer to the thing before the 'identifying' attributes have been 
>>>> calculated. For example, to refer to a publication before the version has 
>>>> been calculated.
>>>> 
>>>> None of this necessarily requires a name - this is just what the name is used 
>>>> for at the moment.
>>>> 
>>>> And I'm not sure any of these are the 'standard' case either. Again I 
>>>> refer to repositories: imagine that we used the new "name(Type)" syntax. 
>>>> Users would be forced to come up with a name for each of their 
>>>> repositories, which would likely not be used elsewhere. Instead, we give 
>>>> the ability to supply a name _if_ they want to refer to the repository 
>>>> elsewhere.
>>>> 
>>>> One thing that concerns me about the "name(Type) {}" syntax is that it's 
>>>> possibly trickier to document, and trickier for users to grok what's going 
>>>> on. In some cases it might make for a cleaner DSL, but I'm not certain 
>>>> it's worth the cost.
>>>>> We could consider a DSL similar to the repositories syntax:
>>>>> 
>>>>> jvm {
>>>>>    binaries {
>>>>>        classes {
>>>>>            name "main" // optional
>>>>>            … some inputs and other configuration ...
>>>>>        }
>>>>>        jar {
>>>>>            ... we generate a sensible name ...
>>>>>            … some inputs and other configuration …
>>>>>        }
>>>>>    }
>>>>> }
>>>>> 
>>>>> It's possible that we treat this as a standard pattern, whereby a
>>>>> NamedDomainObjectContainer could support both with some sort of DSL
>>>>> magic:
>>>>> 
>>>>> container {
>>>>>      name(Type) {}
>>>>>      subtype { /* generated name */ }
>>>>> }
>>>>> 
>>>>> Or maybe get rid of the 'name' method altogether, and go with:
>>>>> 
>>>>> // In all cases the added element must provide a unique name, which
>>>>> // may or may not be configured explicitly.
>>>>> container {
>>>>>       // eg 'publication' for 'publications' container,
>>>>>       // or 'dependency' for 'dependencies' container.
>>>>>       generalType(SubType) {}
>>>>>       // eg 'ivy' for 'publications' or 'project' for 'dependencies'
>>>>>       subType { }
>>>>> }
>>>> 
>>>> These are both interesting options for defining things. One question is 
>>>> how do I get something out again, to either configure it or use it?
>>>> 
>>>> There would be options:
>>>> container.findOne({attrib == "value"})
>>>> container.findOne(attrib1: "value", attrib2: "value")
>>>> container['name']
>>>> container.name
>>>> 
>>>> Note that I'm not suggesting doing away with "name" altogether, but 
>>>> instead making it optional.
>>> 
>>> It might be interesting to push this further, and make name a decoration of 
>>> some kind. We've already discussed here a few cases where sometimes name is 
>>> relevant and sometimes it's not. This isn't a function of the type of thing, 
>>> but it is instead a function of how the thing is used. Here are some other 
>>> cases:
>>> 
>>> * Sometimes a piece of code is used as a task and sometimes as an action. A 
>>> task is really just an action with a name. The name allows us to do some 
>>> useful stuff with the piece of code (e.g. track its history, declare 
>>> dependencies and so on), but sometimes we don't care about this useful 
>>> stuff.
>> 
>> The task name is also the primary interface between the user and Gradle.
> 
> Indeed. This is part of the 'useful stuff'.

Point taken, but I think it's worth pointing out that this is beyond 
fundamental to the way that Gradle works currently.

>>> * When using, say, a JavaSourceSet as an input, we don't care about the 
>>> name of the source set. We just care that it can describe some source files 
>>> and compile dependencies. If we keep name off JavaSourceSet, we allow other 
>>> interesting implementations that can be used as input (but not necessarily 
>>> output) without forcing each one to have an arbitrary name.
>> 
>> How do we require names for this now?
> 
> Because these things (sometimes) need to be buildable, and to build something 
> we currently need a name for it. Whereas to consume something, we don't need 
> an identity if we have an object reference to the thing.

I still don't get it. There are all kinds of unnamed buildable things, e.g. 
file collections.

>>> * Coming from the other direction: Some of our domain objects are defined 
>>> using attributes other than a name. For example, dependencies are defined 
>>> using (group, module, version). However, these are treated as the 
>>> identifier of the dependency and cannot be changed, even though it's quite 
>>> ok that these are changed, up to the point that they are consumed.  In 
>>> other words, they're just attributes of the dependency. Having a consistent 
>>> way to define domain objects in terms of their attributes, and making 
>>> identity a decoration, would mean dependencies and publish artefacts can be 
>>> defined and used in the same way as everything else.
>> 
>> 
>> 
>>> 
>>> Putting together a few ideas from this thread (this DSL isn't quite right, 
>>> but should give the idea):
>>> 
>>> // defines a NativeExecutable, with a generated name. With some AST magic
>>> // the name might be 'someNativeBinary'
>>> def someNativeBinary = items.nativeExecutable {
>>>     os 'windows'; architecture 'amd64'; debug true
>>> }
>> 
>> I know it's not the point but we should be _very_ careful about introducing
>> any more AST transforms. Their use can be very confusing for users and could
>> make IDE support even more difficult.
> 
> Absolutely. It needs to be worth it.
> 
> I don't see this particular transform as overly risky. The IDE can infer the 
> return type of items.nativeExecutable()

How could it infer it? 

I can see how it might be possible with sophisticated flow analysis. But that 
would mean the IDE needs to know which plugins have been applied at which point 
in the script and which factories they add to “items”.

> and hence the type of someNativeBinary just fine. It doesn't introduce any 
> new syntax. It just takes advantage of an otherwise quite natural syntax, ie 
> this statement would work just fine without the transform.

I'm not convinced, but I don't think it matters right now.

>>> // defines an IvyPublication, with a provided name
>>> def myPublication = items.ivyPublication {
>>>     name 'main'; organisation 'my-org'; module 'my-module'
>>> }
>> 
>> So items is just a factory?
> 
> Maybe. There are 2 parts: creating things and finding things. Maybe `items` 
> can do both, maybe there are 2 separate things.

It at least needs to be the graph. I would think factory like behaviour would 
be a convenience and not fundamental. Actually, more correctly, it needs to be 
a query engine for the graph. The graph is already there in the connections 
between objects, we just need a way to dig out parts.
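
To illustrate the "query engine for the graph" idea, here is a minimal sketch
in plain Groovy. It assumes every object exposes a type, some attributes and
its links to other objects. Item, links and query are names made up for this
sketch; nothing here is an existing Gradle API.

```groovy
// Hypothetical model of a domain object: not a Gradle class.
class Item {
    String type
    Map<String, Object> attributes = [:]
    List<Item> links = []   // connections to other objects in the graph
}

// Walks everything reachable from the given roots and returns the items
// whose type and attributes match. The 'seen' set guards against cycles.
List<Item> query(Collection<Item> roots, String type, Map<String, Object> where = [:]) {
    def seen = new LinkedHashSet<Item>()
    def pending = new ArrayDeque<Item>(roots)
    while (!pending.isEmpty()) {
        def item = pending.removeFirst()
        if (seen.add(item)) {
            pending.addAll(item.links)
        }
    }
    seen.findAll { item ->
        item.type == type && where.every { key, value -> item.attributes[key] == value }
    }.toList()
}

// A tiny graph: project -> publication -> repository
def repo = new Item(type: 'ivyRepository', attributes: [url: 'http://repo.example.org'])
def publication = new Item(type: 'ivyPublication', attributes: [module: 'my-module'], links: [repo])
def project = new Item(type: 'project', links: [publication])

def ivyRepos = query([project], 'ivyRepository')
def myModule = query([project], 'ivyPublication', [module: 'my-module'])
```

The point of the sketch is that nothing needs to be registered anywhere: the
repository is found only because the publication links to it.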

>>> // do some things with the publication
>>> myPublication.revision = '1.2'
>>> publishing.publications << myPublication
>> 
>> Why would there even be a publications container? Couldn't you just query 
>> the items graph for all of the publications?
> 
> Good question. Currently, the publications container declares the purpose or 
> role of a publication. When it's in the container, it's a public output of the 
> project. When it's not, it's a publication used for some other (undisclosed) 
> purpose and we can't infer anything about it beyond how to build it.

This could be a characteristic of the publication itself, not of its context. 
Then finding the “public” publications just becomes a more refined query.
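
As a rough sketch of what that could look like, in plain Groovy: the
Publication class and its 'role' attribute below are invented for
illustration and are not part of any Gradle API.

```groovy
// Hypothetical model: 'role' is an assumed attribute that records whether
// a publication is a public output of the project.
class Publication {
    String name
    String type
    String role = 'internal'
}

def publications = [
    new Publication(name: 'api', type: 'ivy', role: 'public'),
    new Publication(name: 'impl', type: 'ivy', role: 'public'),
    new Publication(name: 'testFixtures', type: 'ivy')
]

// No separate container is needed: being "public" is one more predicate.
def allIvy = publications.findAll { it.type == 'ivy' }
def publicIvy = allIvy.findAll { it.role == 'public' }
```

Here the purpose of a publication travels with the object, so the same
publication can be found by any query that cares about it.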

>>> // creates a CompositeSourceSet with name `main` and implicitly adds it to
>>> // the `sources` container
>>> source {
>>>     main { … } 
>>> }
>>> 
>>> // which is the same as
>>> source.add(items.compositeSourceSet(name: 'main', { … }))
>>> 
>>> // creates an IvyRepository with generated name and implicitly adds it to
>>> // the `repositories` container
>>> repositories {
>>>     ivy { … }
>>> }
>> 
>> Off the point again, but…
>> 
>> If we are considering heavy DSL changes, helping IDEs understand should be 
>> high priority. Type tokens would go a long way (combined with DSLD).
> 
> I don't think there is any real difference between
> 
> name(SomeClassName) { … }
> 
> and
> 
> someTypeName { … }
> 
> as far as inference goes. Both are static, in that I don't need to run the 
> script in order to infer the type of the closure delegate or the return 
> value, and both require some additional meta-data, such as the default 
> imports or the name -> type mapping.

Class literals don't require default imports at all, that's just a 
“convenience” we provide. 

> Using class literals has its own rather large downside, of course, in that 
> they need to be resolvable at compile time, whether they are required or not.

Agreed, this is a big problem. However, if we could solve this in one place,
there would not be much for the IDE to do. Using a “smarter” approach means a
per-IDE solution, or some new standard that they all support, which is unlikely.

>>> // finds all Ivy repositories, regardless of their purpose, and does
>>> // something with them
>>> items.withType.ivy {
>>>     credentials.userName 'my-user'; credentials.password 'my-password'
>>> }
>>> 
>>> // finds all Ivy repositories used for publishing and does something with
>>> // them
>>> publishing.repositories.withType.ivy { … }
>>> 
>>> // finds all dependency declarations on junit
>>> def junitDependencies = items.withType.dependency(group: 'junit', module: 
>>> 'junit')
>>> 
>>> // add all runtime dependencies on a group to another configuration
>>> def deps = configurations.runtime.allDependencies(group: 'my-group')
>>> configurations.otherConfig << deps
>>> 
>>> // specify a version for all dependencies on junit
>>> items.withType.dependency(group: 'junit', module: 'junit') { version '4.11' }
>>> 
>>> // probably some way to define default values to be applied before the
>>> // config closure is executed
>>> // probably some way to listen for the creation of objects
>> 
>> There are some interesting base ideas here, but as long as we need a flat 
>> namespace for tasks, and derive those names from items, I don't see how it 
>> solves the problem. 
> 
> It moves the name out of the DSL, which means things don't have to be given a 
> name if no-one cares or when a name is not relevant, and I can deal with 
> things the same way regardless of whether they do or don't have names. Which 
> makes for a more flexible world and avoids having to answer questions like 
> 'what should we call the production linux amd64 debug static library built 
> with gcc 4.5'? If you care, give it a name. If not, we'll deal.

Right. 

> For things for which there are tasks, only the public tasks need to have a 
> human-consumable name. The other tasks might have some assigned name, or 
> possibly even no name. The name for the public tasks does not necessarily 
> need to be generated from the name of the thing, they might instead use some 
> attributes of the thing.

So how would I ask Gradle to build the “production linux amd64 debug static 
library built with gcc 4.5” ? 

If my project has one publication, being published to one destination, how 
would I perform that (i.e. what would the task be called)? Having to say 
publish[foo:bar:1.0] seems a bit much when there is only one. I'm not sure it's 
that much better when there is more than one either. 
publish[foo:bar:1.0-groovy-1.x] && publish[foo:bar:1.0-groovy-2.x]. I'd 
probably prefer more precise names that highlight the exact difference, or main 
defining characteristic. Point is, I don't think we've solved this awkward 
problem that we've encountered. 

What it does seem to solve nicely though is the graph of variants case. In the 
publication case I used above, we could say that this fits into the graph of 
variants concept as well but I'm not comfortable that that would always be the 
case with multiple publications.

It feels like we are heading down the road of you asking Gradle to perform an 
action related to a thing, instead of asking it to just take an action. 

-- 
Luke Daley
Principal Engineer, Gradleware 
http://gradleware.com


---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email
