Re: [gradle-dev] producing multiple outputs from jvm languages

Luke Daley Mon, 28 Jan 2013 02:54:58 -0800

On 24/01/2013, at 4:17 AM, Adam Murdoch wrote:


> 
> On 24/01/2013, at 12:57 AM, Luke Daley wrote:
> 
>> 
>> On 17/01/2013, at 11:54 PM, Adam Murdoch wrote:
>> 
>>> 
>>> On 17/01/2013, at 11:20 PM, Luke Daley wrote:
>>> 
>>>> What's the relationship between a component and a “functional source set”?
>>> 
>>> It varies. The model would be something like this:
>>> 
>>> - A component is physically represented using one or more packagings.
>>> - A packaging is built from one or more input build items.
>>> - A packaging is a build item.
>>> - A (functional) source set is a build item.
>>> 
>>> So, for a Java library, it'd look like this:
>>> 
>>> production source set ---> production class packaging ---> production jar packaging
>>> 
>>> Add in some test fixtures:
>>> 
>>> production class packaging ---+
>>>                               +---> test fixture class packaging ---> test fixture jar packaging
>>> test fixture source set ------+
>>> 
>>> Maybe add some source and docs:
>>> 
>>>                        +---> api doc packaging
>>> production source set --+
>>>                        +---> source packaging
>>> 
>>> The production jar, test fixture jar, api doc and source packagings are all 
>>> aspects of the Java library component.
>>> 
>>> For a C library, it might look like this:
>>> 
>>> production source set --+--> windows 32bit shared lib packaging
>>>                        +--> windows 32bit static lib packaging
>>>                        +--> linux 64bit shared lib packaging
>>>                        +--> …
>>> 
>>> These platform-specific packagings, along with the API docs and source 
>>> packagings, are all aspects of the component.
>> 
>> The term “packaging” really starts to break down here. It seems intuitive to 
>> say that a classes dir and a jar are the same thing packaged differently, 
>> but if you try and say that javadoc is another type of packaging it doesn't 
>> feel natural. 
>> 
>> I originally took you to mean that different packagings were functionally 
>> equivalent, but required different methods of consumption.
> 
> That's what I meant. The stuff above isn't quite right. All of the things 
> above are build items. Some of them are ways of packaging a component (ie a 
> packaging is-a build item).
> 
>> It seems that you're using it in a more general sense, something closer to 
>> “facet”. The javadoc and the class files are different facets of the same 
>> logical entity.
>> 
>> So maybe components have facets, and a facet can be packaged in different 
>> ways.
> 
> We've been calling this a 'usage'. That is, there are a number of ways you 
> can use a given type of component, and a given usage implies one or more 
> (mutually exclusive) packagings:
> 
> * One such usage might be to read the API documentation for the component, 
> where the API docs can be packaged as a directory of HTML, or a ZIP file or a 
> PDF.
> * Another might be to compile some source against it (to build a windows 32 
> bit debug binary), where the headers can be packaged as a directory of 
> headers, as a ZIP, or in a distribution.
> * Another might be to link a binary against it (to build a windows 32 bit 
> debug binary), where the library is packaged as a .lib or a .so file, 
> depending on platform.
> * Another might be to link it at runtime (into a windows 32 bit debug 
> executable), where the library is packaged as a .dll or a .so, depending on 
> the platform.

This doesn't get around the problem that you are calling the API docs a 
“packaging”, and that in that case two different packagings of the same logical 
entity are not functionally equivalent.

It seems we are missing a term for auxiliary outputs (e.g. javadoc). Or, we 
stretch what we mean by “consume” something and also define that not all 
packagings of a thing are functionally equivalent.
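
A rough sketch of the usage/packaging split discussed above, in Python purely for brevity. Every name here (Component, Usage, Packaging, "mylib") is made up for illustration and is not Gradle API. The assumption it encodes is that packagings are only interchangeable within a single usage.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Packaging:
    """One concrete way of delivering a usage, e.g. a directory of HTML or a ZIP."""
    name: str


@dataclass
class Usage:
    """A way of consuming a component; its packagings are mutually exclusive alternatives."""
    name: str
    packagings: list = field(default_factory=list)


@dataclass
class Component:
    """Hypothetical component model: a named thing with several usages."""
    name: str
    usages: dict = field(default_factory=dict)

    def add_usage(self, usage: Usage) -> None:
        self.usages[usage.name] = usage

    def packagings_for(self, usage_name: str) -> list:
        """Names of the alternative packagings that satisfy one usage."""
        return [p.name for p in self.usages[usage_name].packagings]


lib = Component("mylib")
lib.add_usage(Usage("api-docs", [Packaging("html-dir"), Packaging("zip"), Packaging("pdf")]))
lib.add_usage(Usage("link", [Packaging(".lib"), Packaging(".so")]))

print(lib.packagings_for("api-docs"))
print(lib.packagings_for("link"))
```

In this shape, the API docs are a separate usage, not another packaging of the binaries, so "functionally equivalent" only has to hold between the packagings of one usage.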

>>> So far, a given source set ends up in a single component. But that doesn't 
>>> necessarily need to be the case:
>>> 
>>> For an Android app, the graph might look like this:
>>> 
>>> production source set --------------+
>>> 'lite' product flavour source set --+--> 'lite release' class packaging --> 'lite release' apk packaging
>>> 'release' build type source set ----+
>>> 
>>> production source set --------------+
>>> 'lite' product flavour source set --+--> 'lite debug' class packaging --> 'lite debug' apk packaging
>>> 'debug' build type source set ------+
>>> 
>>> production source set --------------+
>>> 'pro' product flavour source set ---+--> 'pro debug' class packaging --> 'pro debug' apk packaging
>>> 'debug' build type source set ------+
>>> 
>>> Here, there are 2 components: the 'lite' and the 'pro' edition of the app 
>>> (*). Each component has 2 packagings: a 'release' and a 'debug' packaging. 
>>> A given source set can end up in multiple packagings for multiple 
>>> components, and a given component is built from multiple source sets.
>> 
>> Seems solid.
>> 
>> One question for me is whether the graph from component back to the source 
>> (or really, the first inputs that Gradle knows about) is captured anywhere. 
>> At the moment we don't really capture this. We go as far as sourceSet → 
>> class files, but that's it.
>> 
>> More on this below…
>> 
>>>> What if the definition of a component included the source? Or maybe, a 
>>>> certain kind of “buildable” component.
>>> 
>>> I think pretty much every component is buildable in some way, but not 
>>> necessarily buildable by Gradle. It makes sense to have some kind of link 
>>> back to the source that the component is assembled from. We might add 
>>> 'source packaging' as a first-class concept, where one way to package up a 
>>> component is to simply provide an archive containing its source files. For 
>>> some components - e.g. a C header-file only library or a javascript library 
>>> - this may be the only way that the component is packaged.
>> 
>> Would we infer the source? Or require manual specification (even if that's 
>> done conventionally via a plugin)?
>> 
>> There's potentially a complex graph from one or more source sets, connected 
>> to one or more “packagings” to the final component. It seems like it's  
>> tempting to try and infer it, but I'm not sure this is scalable to complex 
>> scenarios. Probably better to just use conventions to hide this and if you 
>> stray off that path you're responsible for matching things up.
> 
> I agree.
> 
> We might offer some way to back chain from a given build item, and infer its 
> transitive inputs, similar to make. So, if I say 'there is a java library 
> component called main', we might have some rules that say:
> - A java library 'n' can be packaged as a jar binary called 'n'.
> - A jar binary called 'n' can be built from a class directory binary called 
> 'n'
> - A class directory binary called 'n' can be built from a source set called 
> 'n'.
> - A source set called 'n' includes java source from 'src/n/java'.
> 
> So, we can work backwards through these rules, and infer from the presence 
> of Java source files in 'src/main/java' how to build a java library called 
> 'main'. There's still a graph of build items here, so that the java source 
> set is a transitive input into the final binary.
> 
> Our conventions might just be a set of these kinds of rules, and the 
> statement 'this project produces a java library called main'.
> 
> The rules for how to infer the inputs for a given build item would:
> - Allow a given rule to be replaced. So, in the above, I might state: A java 
> library 'n' can be packaged as an API jar binary called 'n-api' and an 
> implementation jar binary called 'n-impl', but keep the remaining rules.
> - Be lazy, so that we trigger them only for those build items that are 
> required for the given build.
> 
> So we might end up with:
> 
> - Build items have inputs and are built from their inputs.
> - Rules can be used to infer the inputs for a given build item.
> - Rules can be used to infer how to build a given build item from its inputs.
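
The back-chaining described above could be sketched roughly as follows, in Python purely for brevity. The "kind:name" string encoding, the rule table and the resolve function are all assumptions made for this illustration; nothing here is an existing Gradle API.

```python
from typing import Callable, Dict, List

# A rule maps a build item's name to the items it is built from (its direct inputs).
Rule = Callable[[str], List[str]]

# Hypothetical rule table keyed by build item kind, mirroring the four rules listed above.
rules: Dict[str, Rule] = {
    "java-library": lambda n: [f"jar:{n}"],
    "jar": lambda n: [f"classes:{n}"],
    "classes": lambda n: [f"source-set:{n}"],
    "source-set": lambda n: [f"dir:src/{n}/java"],
}


def resolve(item: str, rules: Dict[str, Rule]) -> List[str]:
    """Back-chain from an item to its transitive inputs, nearest input first."""
    kind, name = item.split(":", 1)
    rule = rules.get(kind)
    if rule is None:
        # No rule for this kind: treat it as a leaf (e.g. a source directory).
        return []
    found: List[str] = []
    for direct_input in rule(name):
        found.append(direct_input)
        found.extend(resolve(direct_input, rules))
    return found


# Conventional case: 'this project produces a java library called main'.
print(resolve("java-library:main", rules))

# Replace one rule (API jar plus implementation jar) and keep the rest.
custom = dict(rules)
custom["java-library"] = lambda n: [f"jar:{n}-api", f"jar:{n}-impl"]
print(resolve("java-library:main", custom))
```

Rules are plain functions that only run when resolve reaches an item of their kind, which is one simple way to get the laziness mentioned above. A real implementation would also need cycle detection and typed items in place of strings.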

I think there's going to be a (not insurmountable) challenge here in providing 
real models for this. SourceSets are so useful because you can hang so many 
conventions off them (at least in theory). This works because they model so 
much (as it turns out, too much). If all of these rules/conventions are more 
emergent from small bits and pieces in different plugins, it might be harder 
for plugins to decorate/enhance existing conventions because you can't actually 
get hold of them.

Put another way, who has the whole picture? e.g. What part of the model can I 
query to determine what the sources are for a particular component? Because we 
use “dumb” types at each step (e.g. FileCollection over SourceSet) we lose 
information as we move down the transformation graph. Or more correctly, it 
becomes difficult to reverse engineer the graph of things from the outputs back 
(unless you resort to a lot of reflecting). We know that having the inputs 
describe the outputs (e.g. SourceSets as they are now) doesn't work. Having the 
outputs describe all of their inputs looks problematic to me because of the 
information loss along the way. 

It starts to look like we need a higher level construct that models the inputs 
to outputs graph for a particular convention. But, maybe this isn't necessary. 
Maybe all of the decoration/enhancement can be done with the lower level (e.g. 
source set, component) things and it's not necessary to have a complete view. 
Given that it's a kind of meta-model, it's nothing we'd really have to do up 
front and we could wait to see if it's needed at all.
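
To make the "who has the whole picture?" question concrete, here is a minimal sketch (Python purely for brevity) of build items that keep typed links to their inputs, so the sources of a component can be queried without reflection. BuildItem, source_sets and treating a component as an item whose inputs are its packagings are all assumptions for this illustration, not Gradle API.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class BuildItem:
    """Hypothetical typed node in the build item graph."""
    kind: str                               # e.g. "source-set", "classes", "apk", "component"
    name: str
    inputs: Tuple["BuildItem", ...] = ()    # direct inputs, kept as typed items


def source_sets(item: BuildItem) -> Set[str]:
    """Walk the input edges back from an item and collect source set names."""
    if item.kind == "source-set":
        return {item.name}
    found: Set[str] = set()
    for direct_input in item.inputs:
        found |= source_sets(direct_input)
    return found


# The 'lite' edition from the Android example earlier in the thread.
production = BuildItem("source-set", "production")
lite = BuildItem("source-set", "lite")
debug = BuildItem("source-set", "debug")
release = BuildItem("source-set", "release")

lite_debug_apk = BuildItem(
    "apk", "lite debug",
    (BuildItem("classes", "lite debug", (production, lite, debug)),),
)
lite_release_apk = BuildItem(
    "apk", "lite release",
    (BuildItem("classes", "lite release", (production, lite, release)),),
)
lite_component = BuildItem("component", "lite", (lite_debug_apk, lite_release_apk))

print(sorted(source_sets(lite_component)))
```

The query only works because every edge keeps the typed item, not a flattened set of files, which is exactly the information that goes missing when a step hands on a plain file collection.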

-- 
Luke Daley
Principal Engineer, Gradleware 
http://gradleware.com


