Re: [gradle-dev] producing multiple outputs from jvm languages

Adam Murdoch Wed, 23 Jan 2013 19:43:10 -0800

On 23/01/2013, at 10:26 PM, Luke Daley wrote:

> 
> On 20/01/2013, at 9:51 PM, Adam Murdoch <[email protected]> wrote:
> 
>> 
>> On 17/01/2013, at 12:10 PM, Adam Murdoch wrote:
>> 
>>> 
>>> Another question is how to group source sets and packagings.
>>> 
>>> For source sets, we currently use two approaches:
>>> 
>>> - For Java, Scala, Groovy, ANTLR and resources, we use the functional 
>>> source sets, adding the source to `sourceSets.${function}.${language}`.
>>> - For C++, we use language specific source sets, adding the source to 
>>> `cpp.sourceSets.${function}.source` and 
>>> `cpp.sourceSets.${function}.exportedHeaders`.
>>> 
>>> I'd like to come up with a consistent pattern here, which can allow 
>>> arbitrary groupings of source files by language and by function. I can see 
>>> four options. For all these options, assume that all source sets are 
>>> composable to some degree, so, for example, you can add a given Java source 
>>> set to another Java source set, or you can add a given Java source set to a 
>>> composite source set.
>>> 
>>> 1. Java plugin style, where the primary grouping is functional: 
>>> `sourceSets.${function}.${language}`:
>>> 
>>>     sourceSets {
>>>             main {
>>>                     cpp { srcDirs = '…' }
>>>                     cppHeaders { srcDirs = '…' }
>>>                     javaScript { srcDirs = '…' }
>>>             }
>>>     }
>>> 
>>> 2. C++ plugin style, where the primary grouping is by language: 
>>> `${language}.sourceSets.${function}`
>>> 
>>>     java { 
>>>             sourceSets {
>>>                     main { srcDirs = '…' }
>>>             }
>>>     }
>>>     groovy {
>>>             sourceSets {
>>>                     main { srcDirs = '…' }
>>>             }
>>>     }
>>>     resources {
>>>             sourceSets {
>>>                     main { srcDirs = '…' }
>>>             }
>>>     }
>>>     javaScript {
>>>             sourceSets {
>>>                     main { srcDirs = '…' }
>>>             }
>>>     }
>>> 
>>> 3. Both, where defining `sourceSets.${function}.${language}` also defines 
>>> `${language}.sourceSets.${function}` and vice versa.
>>> 
>>> 4. A polymorphic container of source sets. You use whatever groupings you 
>>> like, and can add language specific or composite source sets to this 
>>> container. The opinionated language plugins would continue to group by 
>>> function and add a `main` and `test` composite source set.
>>> 
>>>     sourceSets {
>>>             main(CompositeSourceSet) {  // possibly the default type
>>>                     java { srcDirs = '…' }
>>>                     cpp { srcDirs = '…' }
>>>                     resources { srcDirs = '…' }
>>>             }
>>>             test(GroovySourceSet) {
>>>                     srcDirs = '...'
>>>             }
>>>     }
>>> 
> 
> I should have read this before my previous email; I'm largely just agreeing 
> with you.
> 
>> My preference at this stage is to go with option #1. Let's dig into this a 
>> bit more.
> 
> Same.
> 
>> The goal, regardless of whichever grouping option we choose, is to introduce 
>> language specific source sets as a layer underneath this. So, we'd add types 
>> like JavaSourceSet, GroovySourceSet, CppSourceSet, and so on. These types 
>> would have some stuff in common - mainly just a set of source files - and some 
>> meta-data about the source files:
>> 
>> - For a Java source set, this would include the Java language level and Java 
>> API that the source is written against, and the compile and runtime 
>> dependencies of the source.
>> - Same for Groovy and Scala source sets, except with the Groovy and Scala 
>> languages and runtimes. The Java API is also relevant for this source, I 
>> guess. These source sets also have some way to declare the compiler 
>> macros/AST transforms that the source expects to be available, probably as a 
>> set of dependencies on the implementation libraries.
>> - For a C/C++ source set, this would include the language dialect that the 
>> source is written against, and the compile, link and runtime dependencies of 
>> the source.
>> - For an ANTLR source set, this would include the ANTLR language version 
>> that the grammars are written to.
>> - For a Javascript source set, this would include the runtime dependencies 
>> of the source.
>> 
>> One question is how to model source sets that are related to each other at 
>> the language level:
>> 
>> - C/C++ source files and their public headers and private headers.
>> - Java (Scala/Groovy) source and their resource source files.
>> - Jointly compiled Java/Scala/Groovy source files.
>> 
>> Currently, the C++ plugin groups the C++ source files and headers into a 
>> CppSourceSet, with separate SourceDirectorySets for the source and for the 
>> headers. The Jvm language plugins group the Java/Scala/Groovy and resources 
>> into a SourceSet, with separate SourceDirectorySets for each language. So, 
>> these plugins are effectively grouping the source based on the target 
>> platform, or by target output component, depending on your view.
>> 
>> Do we keep these typed groupings (or something similar), or do we model this 
>> as an untyped composite source set that contains a bunch of atomic language 
>> source sets? There are some problems with the current groupings:
>> 
>> - The ANTLR source is currently attached to the 'jvm' group, but ANTLR can 
>> generate C, C#, and a bunch of other languages.
>> - Java source and JVM byte code can be compiled to native code.
>> - Groovy, Scala and ANTLR source are added in dynamically via a convention 
>> object, so don't make use of the typing anyway. It would be the same for 
>> native languages other than C/C++ as well. So you've got this first- and 
>> second-class thing going on. It feels like a good solution should not treat 
>> certain languages specially.
> 
> I'd say it's a requirement that we don't do this.
> 
>> - Sometimes multiple groups of source in a given language make up a logical 
>> group. E.g. API + impl, Java 5 + Java 6, Windows + POSIX, etc.
>> 
>> Let's say we remove the typed groupings (in a backwards compatible way, as 
>> always). It might work something like this:
>> 
>> - Some basic 'language' plugin adds the concept of (composite) source sets. 
>> You can define source sets and add whichever language source sets you like.
>> - A 'jvm-language' plugin adds a rule that adds a resources source set to 
>> each source set, adds the concept of JVM packagings, and a rule that knows 
>> how to copy the resource files that are inputs to a JVM packaging.
>> - A 'java-language' plugin adds a rule that adds a Java source set to each 
>> source set, and a rule that knows how to compile the Java source that are 
>> inputs to a JVM packaging.
> 
> Is this per language source set? Or is it per source set (i.e. compile all 
> java language source sets in the source set)? Seems like it has to be the 
> former (i.e. each java language source set is compiled individually).


I'm not entirely sure what you're asking here.

The rule is: when a class directory binary has one or more java language source 
sets as input, a single `JavaCompile` is added that compiles these java source 
sets in a single batch.

In the case where you want to compile some java source sets in a different way, 
you'd be responsible for wiring up an intermediate class directory that would 
form input to the final class directory (but the rule would take care of adding 
a compile task for that intermediate class directory), perhaps something like:

main java source --- <compileJava> --------------------------+
                                                             +---> main classes
custom compile source --- <compileJava> --> custom classes --+
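
To make the rule above concrete, here is a rough sketch in plain Java. It is
only an illustration under assumptions: `JavaSources`, `ClassDir`,
`CompileTask` and `BatchCompileRule` are made-up names, not Gradle API, and
the `compile<Name>` task naming is invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical model types for illustration; none of these are Gradle API.
class JavaSources {
    final String name;
    JavaSources(String name) { this.name = name; }
}

class ClassDir {
    final String name;
    // Inputs are either JavaSources or other ClassDir instances.
    final List<Object> inputs = new ArrayList<>();
    ClassDir(String name, Object... inputs) {
        this.name = name;
        for (Object input : inputs) {
            this.inputs.add(input);
        }
    }
}

class CompileTask {
    final String name;
    final List<String> sourceSetNames;
    CompileTask(String name, List<String> sourceSetNames) {
        this.name = name;
        this.sourceSetNames = sourceSetNames;
    }
}

class BatchCompileRule {
    // At most one task per class directory, covering all of its Java source
    // set inputs. Inputs that are not Java source (for example an
    // intermediate class directory) are left to their own rule application.
    static Optional<CompileTask> apply(ClassDir binary) {
        List<String> names = new ArrayList<>();
        for (Object input : binary.inputs) {
            if (input instanceof JavaSources) {
                names.add(((JavaSources) input).name);
            }
        }
        if (names.isEmpty()) {
            return Optional.empty();
        }
        String taskName = "compile"
            + binary.name.substring(0, 1).toUpperCase()
            + binary.name.substring(1);
        return Optional.of(new CompileTask(taskName, names));
    }
}

class BatchCompileDemo {
    public static void main(String[] args) {
        // Mirrors the diagram: custom classes feed into main classes.
        ClassDir custom = new ClassDir("customClasses", new JavaSources("customCompile"));
        ClassDir mainDir = new ClassDir("mainClasses", new JavaSources("mainJava"), custom);
        for (ClassDir dir : List.of(custom, mainDir)) {
            BatchCompileRule.apply(dir).ifPresent(t ->
                System.out.println(t.name + " <- " + t.sourceSetNames));
        }
    }
}
```

In this sketch each class directory gets its own compile task, so the
intermediate class directory is compiled separately from the main one without
any special casing in the rule.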



> 
>> - Groovy and Scala language plugins, as above.
>> - A 'standard-source-sets' plugin adds the 'main' and 'test' source sets.
> 
> Unsure if we really need this. Calling this “standard” is a bit much for me. 
> It's fine for certain plugins to use this as a convention, but saying this is 
> “standard” for all cases goes too far.

It doesn't matter too much what the name is. We can give it some other name. 
The key is that it's an opinionated plugin that you can choose to apply, or 
not. That is, it's a convention, as you say.

> 
>> - The 'java-base' plugin adds a rule that adds a classes dir packaging for 
>> each source set, and wires up the source set as input to the packaging.
>> - The 'java', 'groovy' and 'scala' plugins just apply various combinations 
>> of the above plugins.
>> - The 'android' plugin adds an Android resources source set to each source 
>> set (these are different to the JVM resource files above), and APK 
>> packaging, and a rule that knows how to compile the resources and classes 
>> packaging into an APK packaging.
>> - A 'native-language' plugin adds the concept of native packagings.
>> - A 'cpp-language' plugin adds a C/C++ source set, a public headers source 
>> set and a private headers source set to each source set, and a rule that 
>> compiles C/C++ source that are inputs to a native packaging.
>> - The 'cpp-lib' and 'cpp-exe' plugins just apply various combinations of the 
>> above plugins.
>> - An 'assembly-language' plugin adds an assembly source set to each source 
>> set and a rule that compiles assembly source that are inputs to a native 
>> packaging.
>> - A 'javascript-language' plugin adds a Javascript source set to each source 
>> set, and the concept of a Javascript packaging.
> 
> This all sounds good, except I'm wondering about how we model what happens to 
> these source sets.
> 
> At the moment, SourceSet models the transformation (i.e. compilation) that 
> happens on it. I think you are proposing that we keep this. If you are, I'm 
> not sure that this is going to work in this more heterogeneous, complex world.

I'm not proposing that. We'd be back where we started if we did this. The key 
is that a source set should model just a set of source plus some meta-data 
about that source. Anything else is not its business. A given set of source 
may be used as input to some other build item, or multiple other build items, 
or none at all. A given set of source may be built (e.g. generated), or may not 
be.

The transformations hang off the build item that the source set is input to. 
So, when I add a source set as input to a classes dir binary, a transformation 
is added to take care of compiling this source.
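
As a rough illustration of that split, a minimal sketch in plain Java follows.
The names `SourceSet`, `BuildItem` and `input` are hypothetical, chosen for
the example, and are not existing Gradle API; the transformation is
represented by a descriptive string rather than a real task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical types for illustration only; not Gradle API.
final class SourceSet {
    final String name;
    final String language;
    final Map<String, String> metadata; // e.g. language level

    SourceSet(String name, String language, Map<String, String> metadata) {
        this.name = name;
        this.language = language;
        this.metadata = metadata;
    }
    // Deliberately no compile task and no output directory here.
}

class BuildItem {
    final String name;
    final List<String> transformations = new ArrayList<>();

    BuildItem(String name) { this.name = name; }

    // Wiring a source set in as an input is what adds the transformation,
    // and the transformation is owned by this build item.
    void input(SourceSet source) {
        transformations.add("compile " + source.language + " source '"
            + source.name + "' into " + name);
    }
}

class SourceSetDemo {
    public static void main(String[] args) {
        SourceSet mainJava = new SourceSet("main", "java", Map.of("languageLevel", "1.6"));
        BuildItem classes = new BuildItem("mainClasses");
        BuildItem nativeLib = new BuildItem("mainNative");
        classes.input(mainJava);   // the same source set feeds two build items
        nativeLib.input(mainJava);
        System.out.println(classes.transformations);
        System.out.println(nativeLib.transformations);
    }
}
```

The source set carries no reference to either build item, so it can be used
as input to one, several, or none of them.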

> I don't think “packaging” is quite it either.
> 
> Do we need to consider how we model the operations that turn source sets into 
> something useful? I think we do.

I think it would be useful. I'd like to see something like this:

- Once a build item has been configured, some rules are fired which end up 
creating a bunch of tasks (or not) based on the configuration of the build item 
and its inputs.
- The rules for a build item are fired only after the rules for its inputs have 
been fired.
- A build item cannot be mutated after the rules have fired for it (with some 
leniency for existing types that turn into build items). This means you can't 
change the configuration, inputs, or rules for a given build item.
- It should be possible to override a given rule, to replace the built-in 
behaviour. We'd deprecate `task(overwrite: true)` once we had this.
- It should be possible to attach a rule to a class of build items, or to an 
individual build item instance.
- The rules should be specified in such a way that we can later:
        - Skip firing the rules for build items that aren't needed for the 
current build.
        - Replace a buildable thing with a pre-built thing.
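
The ordering and immutability points above could be sketched roughly as
follows, in plain Java. `ModelItem`, `addInput`, `addRule` and `fire` are
hypothetical names for illustration, not Gradle API, and cycle detection is
left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical model; the names are illustrative and not Gradle API.
class ModelItem {
    final String name;
    private final List<ModelItem> inputs = new ArrayList<>();
    private final List<Consumer<ModelItem>> rules = new ArrayList<>();
    private boolean fired;

    ModelItem(String name) { this.name = name; }

    void addInput(ModelItem input) {
        checkMutable();
        inputs.add(input);
    }

    void addRule(Consumer<ModelItem> rule) {
        checkMutable();
        rules.add(rule);
    }

    // Fires the rules of all inputs first, then this item's own rules.
    // Each item fires at most once. Cycles between items are not detected.
    void fire() {
        if (fired) {
            return;
        }
        for (ModelItem input : inputs) {
            input.fire();
        }
        fired = true; // frozen from here on, including while rules run
        for (Consumer<ModelItem> rule : rules) {
            rule.accept(this);
        }
    }

    private void checkMutable() {
        if (fired) {
            throw new IllegalStateException(
                name + " cannot be changed after its rules have fired");
        }
    }
}

class RuleOrderDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        ModelItem source = new ModelItem("mainJava");
        ModelItem classes = new ModelItem("mainClasses");
        classes.addInput(source);
        source.addRule(item -> log.add("rules for " + item.name));
        classes.addRule(item -> log.add("rules for " + item.name));
        classes.fire();
        System.out.println(log);
    }
}
```

Because firing is driven from the item that is asked for, an item that nothing
requests never has its rules fired, which is one possible way to get the
"skip what isn't needed" behaviour.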


--
Adam Murdoch
Gradle Co-founder
http://www.gradle.org
VP of Engineering, Gradleware Inc. - Gradle Training, Support, Consulting
http://www.gradleware.com
