Re: [gradle-dev] some thoughts on the dsl for multiple outputs for jvm based projects

Adam Murdoch Tue, 05 Feb 2013 11:44:16 -0800

On 05/02/2013, at 9:33 PM, Luke Daley wrote:

> 
> On 04/02/2013, at 10:50 PM, Adam Murdoch wrote:
> 
>> 
>> On 05/02/2013, at 5:12 AM, Daz DeBoer wrote:
>> 
>>> On 4 February 2013 00:07, Adam Murdoch wrote:
>>>> Hi,
>>>> 
>>>> So, we're planning to have a bunch of 'jvm binaries' that can be built from
>>>> various language source sets and other things. There will be a few different
>>>> types of binaries, such as class directory binaries and jar binaries,
>>>> possibly some others.
>>>> 
>>>> Something we need to sort out is how to structure the DSL for these
>>>> executable things. The current plan is to have a single container that owns
>>>> all of these jvm binaries, so you might declare something like this:
>>>> 
>>>> jvm {
>>>>   binaries {
>>>>       mainClasses(ClassesDirectoryBinary) {
>>>>           … some inputs and other configuration ...
>>>>       }
>>>>       mainJar(JarBinary) {
>>>>           … some inputs and other configuration …
>>>>       }
>>>>   }
>>>> }
>>>> 
>>>> There might be a similar container for native binaries:
>>>> 
>>>> native {
>>>>   binaries {
>>>>       windowsX86DebugShared(SharedLibraryBinary) {
>>>>           … some inputs and other configuration …
>>>>       }
>>>>       windowsX86DebugStatic(StaticLibraryBinary) {
>>>>           ...
>>>>       }
>>>>       windowsX86DebugExe(ExecutableBinary) {
>>>>           …
>>>>       }
>>>>   }
>>>> }
>>>> 
>>>> Some questions:
>>>> 
>>>> * Is using a flat name the best way to identify these things?
> 
> It's pretty easy to see the above as a graph rather than a flat space. Graphs 
> are notoriously difficult to declaratively navigate though. It would have to 
> become predicate based I think…
> 
> binaries {
>       find(StaticLibraryBinary, { platform == "windows" && debug == true && arch == "x86" }) {
> 
>       } 
> }
> 
> Hard to see that catching on.


Because the dsl is kind of awkward? Or some other reason?
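
To make the question a bit more concrete, here is a rough sketch of what predicate-based selection could look like underneath the DSL. It is plain Java and not written against any Gradle API: `Binary`, `BinaryIndex`, `PredicateSelectionDemo` and the attribute names are all made up for illustration, and attributes are just strings here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical model: a binary is a type label plus a bag of attributes.
// None of these names come from Gradle; they are assumptions for this sketch.
final class Binary {
    final String type;
    final Map<String, String> attributes;

    Binary(String type, Map<String, String> attributes) {
        this.type = type;
        this.attributes = attributes;
    }

    // True when the attribute is present and has exactly this value.
    boolean has(String key, String value) {
        return value.equals(attributes.get(key));
    }
}

final class BinaryIndex {
    private final List<Binary> binaries = new ArrayList<>();

    void add(Binary binary) {
        binaries.add(binary);
    }

    // Select by type plus an arbitrary predicate over the attributes,
    // instead of by a flat name such as 'windowsX86DebugStatic'.
    List<Binary> find(String type, Predicate<Binary> predicate) {
        List<Binary> matches = new ArrayList<>();
        for (Binary b : binaries) {
            if (b.type.equals(type) && predicate.test(b)) {
                matches.add(b);
            }
        }
        return matches;
    }
}

class PredicateSelectionDemo {
    static BinaryIndex sample() {
        BinaryIndex index = new BinaryIndex();
        index.add(new Binary("StaticLibraryBinary",
                Map.of("platform", "windows", "arch", "x86", "debug", "true")));
        index.add(new Binary("StaticLibraryBinary",
                Map.of("platform", "windows", "arch", "x86", "debug", "false")));
        index.add(new Binary("SharedLibraryBinary",
                Map.of("platform", "windows", "arch", "x86", "debug", "true")));
        return index;
    }

    public static void main(String[] args) {
        List<Binary> found = sample().find("StaticLibraryBinary",
                b -> b.has("platform", "windows") && b.has("debug", "true") && b.has("arch", "x86"));
        System.out.println(found.size() + " match(es)");
    }
}
```

The lookup itself is trivial. The awkward part is that a predicate can match zero, one or many things, so anything configured through it is really configuring a set.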

> Another option would be to force arranging as a tree and path down…
> 
> binaries {
>       windows.x86.debug {
> 
>       }
> }
> 
> I don't see that working out though.

Why's that?

> 
>>>> Once you add a
>>>> few dimensions, the names start to get awkward. This can certainly be the
>>>> case for native binaries, and can also be the case for jvm binaries. For
>>>> example, I might have (feature, binary type, groovy version, jvm version) as
>>>> relevant dimensions for a Groovy library that targets multiple groovy
>>>> versions and jvm versions.
>>> 
>>> Are the names of these things important at all? Or in general are we
>>> just forcing users to come up with a name that adds little value?
>> 
>> I think it varies for different types of things. For some things, a name is 
>> a natural way of identifying the thing. For other things (most things?) it 
>> makes more sense to identify a thing by its type and some attributes about 
>> the thing.
>> 
>> The complication is that the set of attributes that identify a thing vary 
>> based on what I'm building. For example:
>> 
>> * If I have a single publication, then I want to refer to it as 'the 
>> publication'. The other stuff (type, groupId, artifactId, version) are just 
>> attributes of the publication.
>> * If I publish 2 maven modules, then I want to refer to them as the 'api 
>> publication' and the 'impl publication', say.
>> * If I build debug and release variants of my windows executable, then I 
>> want to refer to them as the 'debug executable' and the 'release 
>> executable'. All the other stuff (windows, amd64, multi-threaded, visual-c++ 
>> compiler, optimisation-level) are just attributes of the executable.
>> * If I build debug and release variants on windows and linux for x86 and 
>> amd64, then I want to refer to them using a tuple such as (windows, amd64, 
>> release).
> 
> I actually like the “main” paradigm for dealing with the > 1 boundary for 
> naming or for irrelevant names. That accurately captures the reality.
> 
>> That is, a thing often just has a bunch of attributes, any of which could be 
>> used to identify it, and it's how the thing is different to the others that 
>> is useful for identifying it.
> 
> Which makes selecting via predicate appealing.
> 
>> One nice aspect of ditching the name is that a thing can more naturally live 
>> in different containers and be grouped in different ways. Which would mean 
>> that some of these questions about how things are grouped become less 
>> important - just group them whichever way you like.
>> 
>> 
>>> How
>>> often does a user need to differentiate between them by name?
>> 
>> There are a few main reasons, I think:
>> 
>> 1. To configure something that some other logic (a plugin, say) has already 
>> defined.
>> 2. To configure the tasks that do work with the thing (compile it, generate 
>> the pom.xml for it, publish it).
>> 3. To find the thing to use it as input for some other thing.
>> 4. To refer to the thing before the 'identifying' attributes have been 
>> calculated. For example, to refer to a publication before the version has 
>> been calculated.
>> 
>> None of this necessarily requires a name - this is just what the name is used 
>> for at the moment.
> 
> I can kind of see how we could avoid naming if we completely flipped around 
> our current model to be build item based instead of task based (i.e. vertices 
> are inputs/outputs and edges are tasks). That's probably too big a change 
> to even entertain the idea of at this point though.

This is exactly what we're planning to do. There will be a graph of things and 
a graph of tasks. The task graph we have to leave as is, but for the thing 
graph pretty much anything is an option. And one question there is how we 
identify the things in the thing graph.
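
As a very rough illustration of the "vertices are inputs/outputs, edges are tasks" idea, here is a minimal sketch in plain Java. Nothing in it is a Gradle API: `TaskEdge`, `ItemGraph` and `ItemGraphDemo` are hypothetical names, items are identified by plain strings only to keep the example short (which sidesteps the identity question rather than answering it), and the graph is assumed to be acyclic.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: build items are vertices and tasks are edges.
// An edge says "this task turns the input item into the output item".
final class TaskEdge {
    final String task;
    final String input;
    final String output;

    TaskEdge(String task, String input, String output) {
        this.task = task;
        this.input = input;
        this.output = output;
    }
}

final class ItemGraph {
    private final List<TaskEdge> edges = new ArrayList<>();

    ItemGraph edge(String task, String input, String output) {
        edges.add(new TaskEdge(task, input, output));
        return this;
    }

    // Tasks needed to produce 'item', ordered so that producers of inputs come first.
    // Assumes the graph is acyclic and that each task feeds a single output item.
    List<String> tasksFor(String item) {
        Set<String> ordered = new LinkedHashSet<>();
        collect(item, ordered);
        return new ArrayList<>(ordered);
    }

    private void collect(String item, Set<String> ordered) {
        List<String> producers = new ArrayList<>();
        for (TaskEdge e : edges) {
            if (e.output.equals(item)) {
                collect(e.input, ordered); // whatever builds the input goes first
                producers.add(e.task);
            }
        }
        ordered.addAll(producers); // only then the tasks that produce this item
    }
}

class ItemGraphDemo {
    static ItemGraph sample() {
        return new ItemGraph()
                .edge("compileJava", "main source set", "main classes")
                .edge("jar", "main classes", "main jar");
    }

    public static void main(String[] args) {
        System.out.println(sample().tasksFor("main jar"));
    }
}
```

In this shape the task graph falls out of the thing graph: you ask for an item and the tasks are derived from the edges that lead to it.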


> 
>>> We could consider a DSL similar to the repositories syntax:
>>> 
>>> jvm {
>>>   binaries {
>>>       classes {
>>>           name "main" // optional
>>>           … some inputs and other configuration ...
>>>       }
>>>       jar {
>>>           ... we generate a sensible name ...
>>>           … some inputs and other configuration …
>>>       }
>>>   }
>>> }
> 
> Not a deal breaker, but the assumption that Named things have an immutable 
> name runs pretty deep right now.
> 
>>> It's possible that we treat this as a standard pattern, whereby a
>>> NamedDomainObjectContainer could support both with some sort of DSL
>>> magic:
>>> 
>>> container {
>>>     name(Type) {}
>>>     subtype { /* generated name */ }
>>> }
>>> 
>>> Or maybe get rid of the 'name' method altogether, and go with:
>>> 
>>> // In all cases the added element must provide a unique name, which
>>> // may or may not be configured explicitly.
>>> container {
>>>      // eg 'publication' for 'publications' container,
>>>      // or 'dependency' for 'dependencies' container.
>>>      generalType(SubType) {}
>>>      // eg 'ivy' for 'publications' or 'project' for 'dependencies'
>>>      subType { }
>>> }
>> 
>> These are both interesting options for defining things. One question is how 
>> do I get something out again, to either configure it or use it?
>> 
>>> 
>>> 
>>>> * What do we do with specialised types of jvm binaries, that run on the jvm
>>>> but which require a certain runtime and that are packaged in a certain way:
>>>> a WAR or exploded J2EE web app or an OSGi bundle or Gradle plugin?
>>>> 
>>>> * Is the separation between jvm binaries and native binaries useful? Should
>>>> there be a single `binaries` container? Or should it be finer-grained to
>>>> include type, so that there is a `jvm.binaries.classes` and a
>>>> `jvm.binaries.jar` container and a `native.binaries.staticLibs` container?
>>>> Is the type of runtime actually less important than the type of thing, so
>>>> that it should be `binaries.jvm` and `binaries.native`?
>>> 
>>> Where would combined-and-optimised javascript fit into this model?
>>> What about shell-scripts that are tailored for a runtime?
>> 
>> If we consider these things as binaries (and we might), then the answer to 
>> this depends somewhat on the question about specialised binaries, above. 
>> Javascript 'binaries' would target a different type of runtime, just like 
>> jvm and native binaries target different types of runtimes. Shell scripts 
>> might be better treated as a way of packaging a command-line application, as 
>> either a 'native' or maybe a more specialised 'shell' binary.
>> 
>> I think a question we need to answer is whether there is something common 
>> here between all these things, either as an abstraction or a pattern, or 
>> whether it's all just coincidence.
>> 
>> So far, we've been using the term 'binary' in a pretty abstract way, to mean 
>> 'something that can run on a particular runtime', where 'runtime' is some 
>> abstract environment or container. The idea is that both binaries and 
>> runtimes will be typed in some way.
>> 
>> If we think that the abstract model is a good idea, then do we jam all 
>> binaries into the same container? Do we group them by runtime? By role? By 
>> runtime 'family' (e.g. 'jvm', 'native', 'javascript')? By type? Something 
>> else?
> 
> What does having a “binaries” container actually give us? How many such 
> containers are we going to end up with? 

Right now, the containers give us 2 things: a way to define a new thing, and a way 
to find things to do stuff with them. Which means that the containers need to 
be a balance between too concrete (makes it harder to find things) and too 
abstract (makes it harder to understand and can't infer as much).

If we were to think about separating these concerns, then we have some other 
options. Such as what you've got below.
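
A minimal sketch of what separating the two concerns might look like, in plain Java: one place where things get defined, and typed queries to get them back out. `BuildItem`, `SourceSetItem`, `JarItem` and `BuildItems` are hypothetical names for this example only, not existing Gradle types, and `find` returns a snapshot rather than a live view.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: defining items and finding items are separate operations.
interface BuildItem {
    String getName();
}

final class SourceSetItem implements BuildItem {
    private final String name;

    SourceSetItem(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }
}

final class JarItem implements BuildItem {
    private final String name;
    final String jvmVersion;

    JarItem(String name, String jvmVersion) {
        this.name = name;
        this.jvmVersion = jvmVersion;
    }

    public String getName() {
        return name;
    }
}

final class BuildItems {
    private final List<BuildItem> items = new ArrayList<>();

    // Defining: everything goes into one collection, whatever its type.
    <T extends BuildItem> T define(T item) {
        items.add(item);
        return item;
    }

    // Finding: filter by type and an optional predicate.
    // Returns a snapshot list, not a live view of the collection.
    <T extends BuildItem> List<T> find(Class<T> type, Predicate<? super T> spec) {
        List<T> result = new ArrayList<>();
        for (BuildItem item : items) {
            if (type.isInstance(item)) {
                T typed = type.cast(item);
                if (spec.test(typed)) {
                    result.add(typed);
                }
            }
        }
        return result;
    }

    <T extends BuildItem> List<T> find(Class<T> type) {
        return find(type, item -> true);
    }
}

class BuildItemsDemo {
    public static void main(String[] args) {
        BuildItems items = new BuildItems();
        items.define(new SourceSetItem("main"));
        items.define(new JarItem("mainJar", "1.6"));
        items.define(new JarItem("legacyJar", "1.5"));
        for (JarItem jar : items.find(JarItem.class, j -> j.jvmVersion.equals("1.5"))) {
            System.out.println(jar.getName());
        }
    }
}
```

With this split, how abstract the defining side is no longer decides how easy things are to find, because the grouping happens at query time.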

> 
> Probably going too far again…
> 
> If we could make it work, it seems more appealing to simply have one graph of 
> things that we can create views of…
> 
> items(SourceSet).main {
>       
> }
> 
> items(StaticLibraryBinary, { platform == "windows" }) {
>       
> }
> 
> items(JavaScriptBundle, { compressed  == true }) {
>       
> }
> 
> something like…
> 
> interface BuildItemContainer<E extends NamedBuildable> extends DomainObjectGraph<E> {
>  <T extends E> BuildItemContainer<T> find(Class<T> targetType);
>  <T extends E> BuildItemContainer<T> find(Class<T> targetType, Spec<? super T> predicate);
>  <T extends E> BuildItemContainer<T> find(Class<T> targetType, String name);
> }
> 
> (Where project has items(*) methods that delegate to project.items.find(*) in 
> this case). Everything has a name, but it only needs to be unique in certain 
> contexts. One thing about the above is that it would make IDE support a bit 
> simpler as we could cut the width of the API that they need to know about 
> right down.
> 
> Come to think of it, I'm not sure that solves anything that we are talking 
> about.
> 
>> 
>> 
>>> 
>>> Maybe we need a few more use cases to flesh out the DSL.
>> 
>> There are plenty. Any ideas what would be useful?
>> 
>> 
>>> Or would
>>> these be declared in a different container?
>>> 
>>>> 
>>>> Very similar questions to source sets, re. how to arrange them and which
>>>> dimension wins over the others and which need to be encoded in the name and
>>>> which are encoded in the structure. Maybe we should rethink our container
>>>> DSL a bit more deeply. The publications would also benefit from having a
>>>> composite identifier (e.g. groupId, artifactId, version).
>>> 
>>> Yes I think this could benefit from a re-think - the current proposed DSL is:
>>> 
>>>   publications {
>>>       myPublication(IvyPublication) {
>>>           organisation 'my-organisation'
>>>           module 'my-module'
>>>           revision '1.2'
>>>       }
>>>   }
>>> 
>>> The only thing the name "myPublication" is currently used for is
>>> generating task names. Other than that, it adds little value. We will
>>> be enforcing that the org:module:revision is unique, and this is
>>> really how the publication is identified.
>>> 
>>> In that case, something like the last option above might work better:
>>> 
>>>   publications {
>>>       ivy {
>>>           organisation 'my-organisation'
>>>           module 'my-module'
>>>           revision '1.2'
>>>       }
>>>   }
>> 
>> For most publications, these attributes can be inferred. Is the idea that 
>> you define the full identifier, or just describe how it's different to the 
>> default?
> 
> -- 
> Luke Daley
> Principal Engineer, Gradleware 
> http://gradleware.com
> 


--
Adam Murdoch
Gradle Co-founder
http://www.gradle.org
VP of Engineering, Gradleware Inc. - Gradle Training, Support, Consulting
http://www.gradleware.com
