Re: [gradle-dev] producing multiple outputs from jvm languages

Luke Daley Mon, 28 Jan 2013 03:30:32 -0800

On 24/01/2013, at 2:36 AM, Adam Murdoch wrote:


> 
> On 23/01/2013, at 8:43 PM, Luke Daley wrote:
> 
>> 
>> On 21/01/2013, at 12:24 AM, Adam Murdoch wrote:
>> 
>>> 
>>> On 21/01/2013, at 8:51 AM, Adam Murdoch wrote:
>>> 
>>>> 
>>>> On 17/01/2013, at 12:10 PM, Adam Murdoch wrote:
>>>> 
>>>>> 
>>>>> Another question is how to group source sets and packagings.
>>>>> 
>>>>> For source sets, we currently use two approaches:
>>>>> 
>>>>> - For Java, Scala, Groovy, ANTLR and resources, we use the functional 
>>>>> source sets, adding the source to `sourceSets.${function}.${language}`.
>>>>> - For C++, we use language specific source sets, adding the source to 
>>>>> `cpp.sourceSets.${function}.source` and 
>>>>> `cpp.sourceSets.${function}.exportedHeaders`.
>>>>> 
>>>>> I'd like to come up with a consistent pattern here, which can allow 
>>>>> arbitrary groupings of source files by language and by function. I can 
>>>>> see four options. For all these options, assume that all source sets are 
>>>>> composable to some degree, so, for example, you can add a given Java 
>>>>> source set to another Java source set, or you can add a given Java source 
>>>>> set to a composite source set.
>>>>> 
>>>>> 1. Java plugin style, where the primary grouping is functional: 
>>>>> `sourceSets.${function}.${language}`:
>>>>> 
>>>>>   sourceSets {
>>>>>           main {
>>>>>                   cpp { srcDirs = '…' }
>>>>>                   cppHeaders { srcDirs = '…' }
>>>>>                   javaScript { srcDirs = '…' }
>>>>>           }
>>>>>   }
>>>>> 
>>>>> 2. C++ plugin style, where the primary grouping is by language: 
>>>>> `${language}.sourceSets.${function}`
>>>>> 
>>>>>   java { 
>>>>>           sourceSets {
>>>>>                   main { srcDirs = '…' }
>>>>>           }
>>>>>   }
>>>>>   groovy {
>>>>>           sourceSets {
>>>>>                   main { srcDirs = '…' }
>>>>>           }
>>>>>   }
>>>>>   resources {
>>>>>           sourceSets {
>>>>>                   main { srcDirs = '…' }
>>>>>           }
>>>>>   }
>>>>>   javaScript {
>>>>>           sourceSets {
>>>>>                   main { srcDirs = '…' }
>>>>>           }
>>>>>   }
>>>>> 
>>>>> 3. Both, where defining `sourceSets.${function}.${language}` also defines 
>>>>> `${language}.sourceSets.${function}` and vice versa.
>>>>> 
>>>>> 4. A polymorphic container of source sets. You use whatever groupings you 
>>>>> like, and can add language specific or composite source sets to this 
>>>>> container. The opinionated language plugins would continue to group by 
>>>>> function and add a `main` and `test` composite source set.
>>>>> 
>>>>>   sourceSets {
>>>>>           main(CompositeSourceSet) {  // possibly the default type
>>>>>                   java { srcDirs = '…' }
>>>>>                   cpp { srcDirs = '…' }
>>>>>                   resources { srcDirs = '…' }
>>>>>           }
>>>>>           test(GroovySourceSet) {
>>>>>                   srcDirs = '...'
>>>>>           }
>>>>>   }
>>>>> 
>>>> 
>>>> My preference at this stage is to go with option #1. Let's dig into this a 
>>>> bit more.
>>>> 
>>>> The goal, regardless of whichever grouping option we choose, is to 
>>>> introduce language specific source sets as a layer underneath this. So, 
>>>> we'd add types like JavaSourceSet, GroovySourceSet, CppSourceSet, and so 
>>>> on. These types would have some stuff in common - mainly just a set of source 
>>>> files - and some meta-data about the source files:
>>>> 
>>>> - For a Java source set, this would include the Java language level and 
>>>> Java API that the source is written against, and the compile and runtime 
>>>> dependencies of the source.
>>>> - Same for Groovy and Scala source sets, except with the Groovy and Scala 
>>>> languages and runtimes. The Java API is also relevant for this source, I 
>>>> guess. These source sets also have some way to declare the compiler 
>>>> macros/AST transforms that the source expects to be available, probably as 
>>>> a set of dependencies on the implementation libraries.
>>>> - For a C/C++ source set, this would include the language dialect that the 
>>>> source is written against, and the compile, link and runtime dependencies 
>>>> of the source.
>>>> - For an ANTLR source set, this would include the ANTLR language version 
>>>> that the grammars are written to.
>>>> - For a Javascript source set, this would include the runtime dependencies 
>>>> of the source.
>>>> 
>>>> One question is how to model source sets that are related to each other at 
>>>> the language level:
>>>> 
>>>> - C/C++ source files and their public headers and private headers.
>>>> - Java (Scala/Groovy) source and their resource source files.
>>>> - Jointly compiled Java/Scala/Groovy source files.
>>>> 
>>>> Currently, the C++ plugin groups the C++ source files and headers into a 
>>>> CppSourceSet, with separate SourceDirectorySets for the source and for the 
>>>> headers. The Jvm language plugins group the Java/Scala/Groovy and 
>>>> resources into a SourceSet, with separate SourceDirectorySets for each 
>>>> language. So, these plugins are effectively grouping the source based on 
>>>> the target platform, or by target output component, depending on your view.
>>>> 
>>>> Do we keep these typed groupings (or something similar), or do we model 
>>>> this as an untyped composite source set that contains a bunch of atomic 
>>>> language source sets? There are some problems with the current groupings:
>>>> 
>>>> - The ANTLR source is currently attached to the 'jvm' group, but ANTLR can 
>>>> generate C, C#, and a bunch of other languages.
>>>> - Java source and JVM byte code can be compiled to native code.
>>>> - Groovy, Scala and ANTLR source are added in dynamically via a convention 
>>>> object, so don't make use of the typing anyway. It would be the same for 
>>>> native languages other than C/C++ as well. So you've got this first- and 
>>>> second-class thing going on. It feels like a good solution should not 
>>>> treat certain languages specially.
>>>> - Sometimes multiple groups of source in a given language make up a 
>>>> logical group. Eg API + impl, java 5 + java 6, windows + posix, etc.
>>>> 
>>>> Let's say we remove the typed groupings (in a backwards compatible way, as 
>>>> always). It might work something like this:
>>>> 
>>>> - Some basic 'language' plugin adds the concept of (composite) source 
>>>> sets. You can define source sets and add whichever language source sets 
>>>> you like.
>>>> - A 'jvm-language' plugin adds a rule that adds a resources source set to 
>>>> each source set, adds the concept of JVM packagings, and a rule that knows 
>>>> how to copy the resource files that are inputs to a JVM packaging.
>>>> - A 'java-language' plugin adds a rule that adds a Java source set to each 
>>>> source set, and a rule that knows how to compile the Java source that are 
>>>> inputs to a JVM packaging.
>>>> - Groovy and Scala language plugins, as above.
>>>> - A 'standard-source-sets' plugin adds the 'main' and 'test' source sets.
>>>> - The 'java-base' plugin adds a rule that adds a classes dir packaging for 
>>>> each source set, and wires up the source set as input to the packaging.
>>>> - The 'java', 'groovy' and 'scala' plugins just apply various combinations 
>>>> of the above plugins.
>>>> - The 'android' plugin adds an Android resources source set to each source 
>>>> set (these are different to the JVM resource files above), and APK 
>>>> packaging, and a rule that knows how to compile the resources and classes 
>>>> packaging into an APK packaging.
>>>> - A 'native-language' plugin adds the concept of native packagings.
>>>> - A 'cpp-language' plugin adds a C/C++ source set, a public headers source 
>>>> set and a private headers source set to each source set, and a rule that 
>>>> compiles C/C++ source that are inputs to a native packaging.
>>>> - The 'cpp-lib' and 'cpp-exe' plugins just apply various combinations of 
>>>> the above plugins.
>>>> - An 'assembly-language' plugin adds an assembly source set to each source 
>>>> set and a rule that compiles assembly source that are inputs to a native 
>>>> packaging.
>>>> - A 'javascript-language' plugin adds a Javascript source set to each 
>>>> source set, and the concept of a Javascript packaging.
>>> 
>>> Let's tweak this a bit. In the above, adding support for a language also 
>>> adds one or more language source sets to every source set, which isn't 
>>> quite right. Some common cases where that doesn't reflect reality:
>>> 
>>> - A Java project uses Groovy for testing.
>>> - A project uses ANTLR to generate production code but not test code.
>>> - A Java project bundles a JNI library, so that it has Java and C 
>>> production code, but all the tests are written in Java.
>>> 
>>> It might be better to flip things around and infer the language source sets 
>>> based on what we need to build, and on convention:
>>> 
>>> - Each of the JVM language plugins adds rules that can compile the language 
>>> source that forms inputs to a JVM packaging. And that can generate the API 
>>> docs from the language source that forms inputs to an API documentation 
>>> packaging.
>>> - Each of the native language plugins adds rules that can compile the 
>>> language source that forms inputs to a native packaging.
>>> - The antlr-language plugin adds rules that can generate source from ANTLR 
>>> source sets that form inputs to a language source set.
>>> - An opinionated 'jvm-library' plugin states that for each JVM library 
>>> component 'n', there is a jar packaging 'n', which has an input classes 
>>> packaging 'n', which has as input a production source set 'n', which 
>>> includes a source set for each supported JVM language.
>>>     - This plugin may defer adding the language source set until it is 
>>> either referenced (to be configured) or the conventional source dir is not 
>>> empty.
>>>     - This would mean, in turn, that we wouldn't add compile tasks for 
>>> languages that are not required to build the classes packaging.
>>> - An opinionated 'jvm-unit-tests' plugin that states that each project has 
>>> a single 'test' classes packaging, which has as input a test source set 
>>> 'test', which includes a source set for each supported JVM language. Plus 
>>> adds the appropriate test task.
>>> - The java plugin applies the java-language, jvm-library and jvm-unit-tests 
>>> plugins, and defines a single 'main' JVM library.
>>> - Similarly, an opinionated 'native-component' plugin states that for each 
>>> native component 'n', there is a binary packaging 'n', which has as input an 
>>> object file packaging 'n', which has as input a production source set 'n', 
>>> which includes a source set for each supported native language.
>>> - An opinionated 'native-unit-tests' plugin, does the same kind of thing as 
>>> the 'jvm-unit-tests' plugin.
>>> - The cpp-lib and cpp-exe plugins apply the cpp-language, native-component 
>>> and native-unit-tests plugins and define a 'main' library or executable.
>>> 
>>> In other words:
>>> 
>>> - The 'capability' plugins add classes of things, and rules to build a 
>>> thing from its input things.
>>> - The 'opinionated' plugins add instances of things, and rules that 
>>> state what a given thing's inputs are.
>> 
>> That seems quite right to me.
>> 
>> Could we build on the mutable polymorphic container idea for this? 
>> 
>> Ignore backwards compatibility for a second…
>> 
>> A project has a…
>> 
>> interface SourceSetContainer extends NamedDomainObjectCollection<SourceSet> 
>> {}
>> 
>> interface SourceSet extends Named {
>>      LanguageSourceSetContainer getLanguages()
>> }
>> 
>> interface LanguageSourceSetContainer extends 
>> NamedDomainObjectCollection<LanguageSourceSet> {}
>> 
>> interface LanguageSourceSet extends Named {}
>> 
>> So…
>> 
>> sourceSets {
>>      main {
>>              languages {
>>                      java  { }
>>              }
>>      }
>> }
>> 
>> If that's too much nesting, we could always offer other views, but I think 
>> that's the data structure. 
> 
> This is exactly what I went with in the latest revision of the spec (not 
> pushed yet). A SourceSet has-a container of LanguageSourceSet instances of 
> various types.
> 
> I'm not completely happy with the name 'languages' for this container:
> 
> * 'languages.resources' doesn't feel quite right for the resource source 
> files. That is, 'resources' is not really a language. And resources might 
> include Java source files, and so on.
> * We need separate 'languages' for public c++ headers, private c++ headers 
> and c++ source files. The distinction is important because each group of 
> source needs to be treated separately: headers and source files are passed to 
> the compiler in different ways, and public headers need to travel with the 
> binaries.
> * We need separate 'languages' for source files that are generated and not 
> generated. This can combine with the above, so some public c++ headers might 
> be generated (from a midl file, say) and some might not. And private c++ 
> headers might be generated (using javah, say). This distinction is important 
> for static analysis (e.g. don't run Checkstyle over generated source files), 
> for bundling the source, and for building the IDE model.
> 
> That is, there are a few different dimensions here, and implementation 
> language is one. We can either model each dimension, or we can give each 
> group of source a name and use some conventions for the names.
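
For comparison, modelling each dimension explicitly might look roughly like the sketch below. It is plain Java and uses no Gradle types. Everything in it is made up for illustration: the enums, the SourceGroup class, the group names (midlHeaders, javahHeaders) and the two query methods are assumptions, not anything in the spec.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SourceDimensionsSketch {
    // Hypothetical dimensions, taken from the three points quoted above.
    enum Language { CPP, NONE }   // NONE: resources are not really a language
    enum Role { SOURCE, PUBLIC_HEADER, PRIVATE_HEADER, RESOURCE }
    enum Origin { HAND_WRITTEN, GENERATED }

    // One group of source files, described along every dimension.
    static final class SourceGroup {
        final String name;
        final Language language;
        final Role role;
        final Origin origin;

        SourceGroup(String name, Language language, Role role, Origin origin) {
            this.name = name;
            this.language = language;
            this.role = role;
            this.origin = origin;
        }
    }

    // Illustrative groups only; the names are assumptions.
    static List<SourceGroup> exampleGroups() {
        return Arrays.asList(
            new SourceGroup("cpp", Language.CPP, Role.SOURCE, Origin.HAND_WRITTEN),
            new SourceGroup("cppHeaders", Language.CPP, Role.PUBLIC_HEADER, Origin.HAND_WRITTEN),
            new SourceGroup("midlHeaders", Language.CPP, Role.PUBLIC_HEADER, Origin.GENERATED),
            new SourceGroup("javahHeaders", Language.CPP, Role.PRIVATE_HEADER, Origin.GENERATED),
            new SourceGroup("resources", Language.NONE, Role.RESOURCE, Origin.HAND_WRITTEN));
    }

    // Static analysis skips generated source.
    static List<String> namesForStaticAnalysis(List<SourceGroup> groups) {
        List<String> names = new ArrayList<>();
        for (SourceGroup group : groups) {
            if (group.origin != Origin.GENERATED) {
                names.add(group.name);
            }
        }
        return names;
    }

    // Public headers travel with the binaries, generated or not.
    static List<String> namesToShipWithBinaries(List<SourceGroup> groups) {
        List<String> names = new ArrayList<>();
        for (SourceGroup group : groups) {
            if (group.role == Role.PUBLIC_HEADER) {
                names.add(group.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<SourceGroup> groups = exampleGroups();
        System.out.println("static analysis: " + namesForStaticAnalysis(groups));
        System.out.println("ship with binaries: " + namesToShipWithBinaries(groups));
    }
}
```

The queries become trivial, but every new dimension means another property on every group of source, which is the cost of this approach.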

What about …

interface SourceSetContainerContainer extends 
NamedDomainObjectCollection<SourceSetContainer> {}
interface SourceSetContainer extends NamedDomainObjectCollection<SourceSet>, 
Named {}
interface SourceSet extends Named {}

So…

sourceSets {
        main {
                java  { }
                resources { }
                …
        }
}

SourceSetContainerContainer is a little awkward, but it's honest and unassuming.
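
To make the shape concrete, here is a rough sketch in plain Java. It does not use the real Gradle types: NamedCollection is a hypothetical, much-simplified stand-in for NamedDomainObjectCollection, and its getOrCreate() method, the concrete classes and SourceSetsSketch are made up purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical stand-in for Gradle's Named.
interface Named {
    String getName();
}

// Hypothetical, much-simplified stand-in for NamedDomainObjectCollection<T>.
class NamedCollection<T extends Named> {
    private final Map<String, T> elements = new LinkedHashMap<>();
    private final Function<String, T> factory;

    NamedCollection(Function<String, T> factory) {
        this.factory = factory;
    }

    // Returns the element with the given name, creating it on first use.
    T getOrCreate(String name) {
        return elements.computeIfAbsent(name, factory);
    }

    // Names in the order the elements were first created.
    Set<String> getNames() {
        return elements.keySet();
    }
}

// A leaf group of source, e.g. 'java' or 'resources'. Nothing here says "language".
class SourceSet implements Named {
    private final String name;

    SourceSet(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }
}

// A named group of source sets, e.g. 'main' or 'test'.
class SourceSetContainer extends NamedCollection<SourceSet> implements Named {
    private final String name;

    SourceSetContainer(String name) {
        super(SourceSet::new);
        this.name = name;
    }

    public String getName() {
        return name;
    }
}

// The top-level 'sourceSets' object.
class SourceSetContainerContainer extends NamedCollection<SourceSetContainer> {
    SourceSetContainerContainer() {
        super(SourceSetContainer::new);
    }
}

class SourceSetsSketch {
    // Mirrors: sourceSets { main { java { } resources { } } }
    static SourceSetContainerContainer example() {
        SourceSetContainerContainer sourceSets = new SourceSetContainerContainer();
        SourceSetContainer main = sourceSets.getOrCreate("main");
        main.getOrCreate("java");
        main.getOrCreate("resources");
        return sourceSets;
    }

    public static void main(String[] args) {
        SourceSetContainerContainer sourceSets = example();
        for (String group : sourceSets.getNames()) {
            System.out.println(group + " -> " + sourceSets.getOrCreate(group).getNames());
        }
    }
}
```

The point of the sketch is only that there are two levels of naming and nothing else: the inner names ('java', 'resources') are plain names, so whether they denote a language is a matter of convention rather than of the type.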

> 
> 
> --
> Adam Murdoch
> Gradle Co-founder
> http://www.gradle.org
> VP of Engineering, Gradleware Inc. - Gradle Training, Support, Consulting
> http://www.gradleware.com
> 

-- 
Luke Daley
Principal Engineer, Gradleware 
http://gradleware.com


