Re: [gradle-dev] some thoughts on the dsl for multiple outputs for jvm based projects

Luke Daley Tue, 05 Feb 2013 02:34:49 -0800

On 04/02/2013, at 10:50 PM, Adam Murdoch wrote:


> 
> On 05/02/2013, at 5:12 AM, Daz DeBoer wrote:
> 
>> On 4 February 2013 00:07, Adam Murdoch wrote:
>>> Hi,
>>> 
>>> So, we're planning to have a bunch of 'jvm binaries' that can be built from
>>> various language source sets and other things. There will be a few different
>>> types of binaries, such as class directory binaries and jar binaries,
>>> possibly some others.
>>> 
>>> Something we need to sort out is how to structure the DSL for these
>>> executable things. The current plan is to have a single container that owns
>>> all of these jvm binaries, so you might declare something like this:
>>> 
>>> jvm {
>>>    binaries {
>>>        mainClasses(ClassesDirectoryBinary) {
>>>            … some inputs and other configuration ...
>>>        }
>>>        mainJar(JarBinary) {
>>>            … some inputs and other configuration …
>>>        }
>>>    }
>>> }
>>> 
>>> There might be a similar container for native binaries:
>>> 
>>> native {
>>>    binaries {
>>>        windowsX86DebugShared(SharedLibraryBinary) {
>>>            … some inputs and other configuration …
>>>        }
>>>        windowsX86DebugStatic(StaticLibraryBinary) {
>>>            ...
>>>        }
>>>        windowsX86DebugExe(ExecutableBinary) {
>>>            …
>>>        }
>>>    }
>>> }
>>> 
>>> Some questions:
>>> 
>>> * Is using a flat name the best way to identify these things?

It's pretty easy to see the above as a graph rather than a flat space. Graphs 
are notoriously difficult to declaratively navigate though. It would have to 
become predicate based I think…

binaries {
        find(StaticLibraryBinary, { platform == "windows" && debug == true && arch == "x86" }) {

        } 
}

Hard to see that catching on. Another option would be to force arranging as a 
tree and path down…

binaries {
        windows.x86.debug {

        }
}

I don't see that working out though.
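For what it's worth, here is a rough Java sketch of the predicate approach.
Nothing in it is Gradle API: `Binary`, its fields and `select` are made up for
illustration. The point is only that a predicate can pick on any subset of the
dimensions, whereas a path like windows.x86.debug fixes the order up front.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for a native binary; not a Gradle type.
final class Binary {
    final String kind;      // e.g. "static", "shared", "exe"
    final String platform;  // e.g. "windows", "linux"
    final String arch;      // e.g. "x86", "amd64"
    final boolean debug;

    Binary(String kind, String platform, String arch, boolean debug) {
        this.kind = kind;
        this.platform = platform;
        this.arch = arch;
        this.debug = debug;
    }
}

final class PredicateSelection {
    // Returns every binary that matches the predicate, in declaration order.
    static List<Binary> select(List<Binary> all, Predicate<Binary> spec) {
        List<Binary> out = new ArrayList<>();
        for (Binary b : all) {
            if (spec.test(b)) {
                out.add(b);
            }
        }
        return out;
    }

    // A small made-up set of variants to select from.
    static List<Binary> sample() {
        return Arrays.asList(
            new Binary("static", "windows", "x86", true),
            new Binary("static", "windows", "x86", false),
            new Binary("shared", "windows", "amd64", true),
            new Binary("static", "linux", "x86", true));
    }

    public static void main(String[] args) {
        List<Binary> all = sample();
        // Every dimension pinned down: exactly one match.
        System.out.println(select(all, b -> b.kind.equals("static")
            && b.platform.equals("windows") && b.arch.equals("x86") && b.debug).size());
        // Only one dimension given: no fixed path order is needed.
        System.out.println(select(all, b -> b.debug).size());
    }
}
```

The same `select` serves both the fully specified and the partially specified
query, which is what a tree of nested names cannot do without picking a
winning dimension.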

>>> Once you add a
>>> few dimensions, the names start to get awkward. This certainly can be the
>>> case for native binaries, and can also be the case for jvm binaries. For
>>> example, I might have (feature, binary type, groovy version, jvm version) as
>>> relevant dimensions for a Groovy library that targets multiple groovy
>>> versions and jvm versions.
>> 
>> Are the names of these things important at all? Or in general are we
>> just forcing users to come up with a name that adds little value?
> 
> I think it varies for different types of things. For some things, a name is a 
> natural way of identifying the thing. For other things (most things?) it 
> makes more sense to identify a thing by its type and some attributes about 
> the thing.
> 
> The complication is that the set of attributes that identify a thing vary 
> based on what I'm building. For example:
> 
> * If I have a single publication, then I want to refer to it as 'the 
> publication'. The other stuff (type, groupId, artifactId, version) are just 
> attributes of the publication.
> * If I publish 2 maven modules, then I want to refer to them as the 'api 
> publication' and the 'impl publication', say.
> * If I build debug and release variants of my windows executable, then I want 
> to refer to them as the 'debug executable' and the 'release executable'. All 
> the other stuff (windows, amd64, multi-threaded, visual-c++ compiler, 
> optimisation-level) are just attributes of the executable.
> * If I build debug and release variants on windows and linux for x86 and 
> amd64, then I want to refer to them using a tuple such as (windows, amd64, 
> release).

I actually like the “main” paradigm for dealing with the > 1 boundary for 
naming or for irrelevant names. That accurately captures the reality.

> That is, a thing often just has a bunch of attributes, any of which could be 
> used to identify it, and it's how the thing is different to the others that 
> is useful for identifying it.

Which makes selecting via predicate appealing.
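A minimal Java sketch of "identify it by how it differs", assuming each thing
is just a map of attributes. The class and method names are invented here;
this is not an existing Gradle facility. It works out which attributes
actually vary across a set of things, which are the only ones a user would
need to mention when selecting one of them.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DistinguishingAttributes {
    // Returns the attribute keys whose value is not the same for every item.
    // Keys are taken from the first item; all items are assumed to share them.
    static List<String> varyingKeys(List<Map<String, String>> items) {
        List<String> keys = new ArrayList<>();
        if (items.isEmpty()) {
            return keys;
        }
        for (String key : items.get(0).keySet()) {
            Set<String> values = new HashSet<>();
            for (Map<String, String> item : items) {
                values.add(item.get(key));
            }
            if (values.size() > 1) {
                keys.add(key);
            }
        }
        return keys;
    }

    // Convenience: builds an insertion-ordered map from key/value pairs.
    static Map<String, String> attrs(String... kv) {
        Map<String, String> m = new LinkedHashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            m.put(kv[i], kv[i + 1]);
        }
        return m;
    }

    public static void main(String[] args) {
        List<Map<String, String>> exes = new ArrayList<>();
        exes.add(attrs("os", "windows", "arch", "amd64", "buildType", "debug"));
        exes.add(attrs("os", "windows", "arch", "amd64", "buildType", "release"));
        System.out.println(varyingKeys(exes)); // prints [buildType]
    }
}
```

With only debug and release variants, `buildType` is the sole identifying
attribute; add linux and x86 variants and the result grows into the tuple
from the last bullet above.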

> One nice aspect of ditching the name is that a thing can more naturally live 
> in different containers and be grouped in different ways. Which would mean 
> that some of these questions about how things are grouped become less 
> important - just group them whichever way you like.
> 
> 
>> How
>> often does a user need to differentiate between them by name?
> 
> There are a few main reasons, I think:
> 
> 1. To configure something that some other logic (a plugin, say) has already 
> defined.
> 2. To configure the tasks that do work with the thing (compile it, generate 
> the pom.xml for it, publish it).
> 3. To find the thing to use it as input for some other thing.
> 4. To refer to the thing before the 'identifying' attributes have been 
> calculated. For example, to refer to a publication before the version has 
> been calculated.
> 
> None of this necessarily requires a name - this is just what the name is used 
> for at the moment.

I can kind of see how we could avoid naming if we completely flipped around our 
current model to be build item based instead of task based (i.e. vertices are 
inputs/outputs and edges are tasks). That's probably too big a change to even 
entertain the idea of at this point though.
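To illustrate what I mean by flipping the model, a toy Java sketch where the
vertices are build items and each task is an edge from its input to its
output. All names are hypothetical, items are plain strings for brevity (in
practice they would be selected by type and attributes), and each task is
simplified to a single input and a single output.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class ItemGraph {
    // An edge: a task that turns one build item into another.
    static final class TaskEdge {
        final String task;
        final String from;
        final String to;

        TaskEdge(String task, String from, String to) {
            this.task = task;
            this.from = from;
            this.to = to;
        }
    }

    private final List<TaskEdge> edges = new ArrayList<>();

    // Declares that `task` produces item `to` from item `from`.
    ItemGraph task(String task, String from, String to) {
        edges.add(new TaskEdge(task, from, to));
        return this;
    }

    // Tasks needed to produce `item`, upstream tasks first. Assumes no cycles.
    List<String> tasksFor(String item) {
        Set<String> ordered = new LinkedHashSet<>();
        collect(item, ordered);
        return new ArrayList<>(ordered);
    }

    private void collect(String item, Set<String> ordered) {
        for (TaskEdge e : edges) {
            if (e.to.equals(item)) {
                collect(e.from, ordered); // produce the input first
                ordered.add(e.task);
            }
        }
    }

    public static void main(String[] args) {
        ItemGraph g = new ItemGraph()
            .task("compileJava", "mainSource", "mainClasses")
            .task("jar", "mainClasses", "mainJar");
        System.out.println(g.tasksFor("mainJar")); // prints [compileJava, jar]
    }
}
```

The user asks for an item and the tasks fall out of the graph, so the task
names stop being something anyone has to know or invent.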

>> We could consider a DSL similar to the repositories syntax:
>> 
>> jvm {
>>    binaries {
>>        classes {
>>            name "main" // optional
>>            … some inputs and other configuration ...
>>        }
>>        jar {
>>            ... we generate a sensible name ...
>>            … some inputs and other configuration …
>>        }
>>    }
>> }

Not a deal breaker, but the assumption that Named things have an immutable name 
runs pretty deep right now.
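One way to keep names immutable and still make them optional would be to fix
the name at the moment the element is added, either from an explicit argument
or by generating one. A small Java sketch under that assumption follows;
`NamingContainer` is hypothetical and is not how NamedDomainObjectContainer
works today.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class NamingContainer {
    // Maps each element name to its type, in declaration order.
    private final Map<String, String> typeByName = new LinkedHashMap<>();

    // Adds an element of the given type. The name is decided here, once:
    // either the explicit one, or a generated one such as "jar", "jar2", ...
    String add(String type, String explicitName) {
        String name = explicitName != null ? explicitName : generate(type);
        if (typeByName.containsKey(name)) {
            throw new IllegalArgumentException("Duplicate name: " + name);
        }
        typeByName.put(name, type);
        return name;
    }

    // Picks the first unused name derived from the type.
    private String generate(String type) {
        String candidate = type;
        for (int i = 2; typeByName.containsKey(candidate); i++) {
            candidate = type + i;
        }
        return candidate;
    }

    public static void main(String[] args) {
        NamingContainer binaries = new NamingContainer();
        System.out.println(binaries.add("classes", "main")); // prints main
        System.out.println(binaries.add("jar", null));       // prints jar
        System.out.println(binaries.add("jar", null));       // prints jar2
    }
}
```

The catch with the `name "main"` form quoted above is that the name arrives
inside the configuration closure, after the element already exists, which is
exactly where the immutability assumption bites.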

>> It's possible that we treat this as a standard pattern, whereby a
>> NamedDomainObjectContainer could support both with some sort of DSL
>> magic:
>> 
>> container {
>>      name(Type) {}
>>      subtype { /* generated name */ }
>> }
>> 
>> Or maybe get rid of the 'name' method altogether, and go with:
>> 
>> // In all cases the added element must provide a unique name, which
>> // may or may not be configured explicitly.
>> container {
>>       generalType(SubType) {} // eg 'publication' for 'publications'
>>                               // container, or 'dependency' for
>>                               // 'dependencies' container.
>>       subType { } // eg 'ivy' for 'publications' or 'project' for
>>                   // 'dependencies'
>> }
> 
> These are both interesting options for defining things. One question is how 
> do I get something out again, to either configure it or use it?
> 
>> 
>> 
>>> * What do we do with specialised types of jvm binaries, that run on the jvm
>>> but which require a certain runtime and that are packaged in a certain way:
>>> a WAR or exploded J2EE web app or an OSGi bundle or Gradle plugin?
>>> 
>>> * Is the separation between jvm binaries and native binaries useful? Should
>>> there be a single `binaries` container? Or should it be finer-grained to
>>> include type, so that there is a `jvm.binaries.classes` and a
>>> `jvm.binaries.jar` container and a `native.binaries.staticLibs` container?
>>> Is the type of runtime actually less important than the type of thing, so
>>> that it should be `binaries.jvm` and `binaries.native`?
>> 
>> Where would combined-and-optimised javascript fit into this model?
>> What about shell-scripts that are tailored for a runtime?
> 
> If we consider these things as binaries (and we might), then the answer to 
> this depends somewhat on the question about specialised binaries, above. 
> Javascript 'binaries' would target a different type of runtime, just like jvm 
> and native binaries target different types of runtimes. Shell scripts might 
> be better treated as a way of packaging a command-line application, as either 
> a 'native' or maybe a more specialised 'shell' binary.
> 
> I think a question we need to answer is whether there is something common 
> here between all these things, either as an abstraction or a pattern, or 
> whether it's all just coincidence.
> 
> So far, we've been using the term 'binary' in a pretty abstract way, to mean 
> 'something that can run on a particular runtime', where 'runtime' is some 
> abstract environment or container. The idea is that both binaries and 
> runtimes will be typed in some way.
> 
> If we think that the abstract model is a good idea, then do we jam all 
> binaries into the same container? Do we group them by runtime? By role? By 
> runtime 'family' (e.g. 'jvm', 'native', 'javascript')? By type? Something 
> else?

What does having a “binaries” container actually give us? How many such 
containers are we going to end up with? 

Probably going too far again…

If we could make it work, it seems more appealing to simply have one graph of 
things that we can create views of…

items(SourceSet).main {
        
}

items(StaticLibraryBinary, { platform == "windows" }) {
        
}

items(JavaScriptBundle, { compressed == true }) {
        
}

something like…

interface BuildItemContainer<E extends NamedBuildable> extends DomainObjectGraph<E> {
  <T extends E> BuildItemContainer<T> find(Class<T> targetType);
  <T extends E> BuildItemContainer<T> find(Class<T> targetType, Spec<? super T> predicate);
  <T extends E> BuildItemContainer<T> find(Class<T> targetType, String name);
}

(Where project has items(*) methods that delegate to project.items.find(*) in 
this case). Everything has a name, but it only needs to be unique in certain 
contexts. One thing about the above is that it would make IDE support a bit 
simpler as we could cut the width of the API that they need to know about right 
down.

Come to think of it, I'm not sure that solves anything that we are talking 
about.
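For the record, a self-contained Java sketch of how such typed views could
behave. It is not a proposal for the real implementation: `Named`, `ItemView`,
`SourceSetItem` and `StaticLib` are invented for the example, and
java.util.function.Predicate stands in for Spec.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

interface Named {
    String getName();
}

// Hypothetical item types, only here to have something to filter.
final class SourceSetItem implements Named {
    private final String name;
    SourceSetItem(String name) { this.name = name; }
    public String getName() { return name; }
}

final class StaticLib implements Named {
    private final String name;
    final String platform;
    StaticLib(String name, String platform) { this.name = name; this.platform = platform; }
    public String getName() { return name; }
}

// An immutable view over a set of items; each find() returns a narrower view.
final class ItemView<E extends Named> {
    private final List<E> items;

    ItemView(List<E> items) {
        this.items = Collections.unmodifiableList(new ArrayList<>(items));
    }

    List<E> asList() {
        return items;
    }

    <T extends E> ItemView<T> find(Class<T> type) {
        return find(type, t -> true);
    }

    <T extends E> ItemView<T> find(Class<T> type, Predicate<? super T> spec) {
        List<T> matches = new ArrayList<>();
        for (E item : items) {
            if (type.isInstance(item)) {
                T typed = type.cast(item);
                if (spec.test(typed)) {
                    matches.add(typed);
                }
            }
        }
        return new ItemView<>(matches);
    }

    <T extends E> ItemView<T> find(Class<T> type, String name) {
        return find(type, t -> t.getName().equals(name));
    }

    public static void main(String[] args) {
        List<Named> all = Arrays.asList(
            new SourceSetItem("main"),
            new StaticLib("hello", "windows"),
            new StaticLib("hello", "linux"));
        ItemView<Named> items = new ItemView<>(all);
        // Two libraries share the name "hello"; the platform tells them apart.
        System.out.println(items.find(StaticLib.class, "hello").asList().size());
        System.out.println(items.find(StaticLib.class, l -> l.platform.equals("windows")).asList().size());
    }
}
```

Note that the two "hello" libraries coexist: the name is only unique once the
view has been narrowed by platform.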

> 
> 
>> 
>> Maybe we need a few more use cases to flesh out the DSL.
> 
> There are plenty. Any ideas what would be useful?
> 
> 
>> Or would
>> these be declared in a different container?
>> 
>>> 
>>> Very similar questions to source sets, re. how to arrange them and which
>>> dimension wins over the others and which need to be encoded in the name and
>>> which are encoded in the structure. Maybe we should rethink our container
>>> DSL a bit more deeply. The publications would also benefit from having a
>>> composite identifier (e.g. groupId, artifactId, version).
>> 
>> Yes I think this could benefit from a re-think - the current proposed DSL is:
>> 
>>    publications {
>>        myPublication(IvyPublication) {
>>            organisation 'my-organisation'
>>            module 'my-module'
>>            revision '1.2'
>>        }
>>    }
>> 
>> The only thing the name "myPublication" is currently used for is
>> generating task names. Other than that, it adds little value. We will
>> be enforcing that the org:module:revision is unique, and this is
>> really how the publication is identified.
>> 
>> In that case, something like the last option above might work better:
>> 
>>    publications {
>>        ivy {
>>            organisation 'my-organisation'
>>            module 'my-module'
>>            revision '1.2'
>>        }
> 
> For most publications, these attributes can be inferred. Is the idea that you 
> define the full identifier, or just describe how it's different to the 
> default?
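A sketch of the "describe only the difference" reading, in Java. It assumes
the defaults come from somewhere like the project, and that uniqueness is
enforced on the resolved org:module:revision. `Coordinates` and
`PublicationRegistry` are made-up names, not the publishing plugin's types.

```java
import java.util.HashSet;
import java.util.Set;

final class Coordinates {
    final String organisation;
    final String module;
    final String revision;

    Coordinates(String organisation, String module, String revision) {
        this.organisation = organisation;
        this.module = module;
        this.revision = revision;
    }

    // Returns a copy in which every non-null argument replaces the default.
    Coordinates override(String organisation, String module, String revision) {
        return new Coordinates(
            organisation != null ? organisation : this.organisation,
            module != null ? module : this.module,
            revision != null ? revision : this.revision);
    }

    String id() {
        return organisation + ":" + module + ":" + revision;
    }
}

final class PublicationRegistry {
    private final Coordinates defaults;
    private final Set<String> ids = new HashSet<>();

    PublicationRegistry(Coordinates defaults) {
        this.defaults = defaults;
    }

    // Registers a publication described only by how it differs from the
    // defaults (null means "inferred"), and rejects a duplicate identifier.
    String publish(String organisation, String module, String revision) {
        String id = defaults.override(organisation, module, revision).id();
        if (!ids.add(id)) {
            throw new IllegalStateException("Duplicate publication: " + id);
        }
        return id;
    }

    public static void main(String[] args) {
        PublicationRegistry publications =
            new PublicationRegistry(new Coordinates("my-organisation", "my-module", "1.2"));
        System.out.println(publications.publish(null, null, null));
        System.out.println(publications.publish(null, "my-module-api", null));
    }
}
```

A second publication that overrides nothing would collide with the first,
which is where some name or other distinguishing attribute has to come back in.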

-- 
Luke Daley
Principal Engineer, Gradleware 
http://gradleware.com

