On Nov 2, 2012, at 10:46 PM, "Chen, Pei" <[email protected]> wrote:
> I think we postponed this topic previously and since the ASF code seems to be 
> in decent shape now, I think it's time to revisit this discussion for the 
> longer term.
> Currently, we have the below resources bundled with our source code and 
> distribution:
> 
> -          UMLS dictionaries (hsqldb format and in lucene indexes)
> 
> -          Models (which were okay to release as open source) that have 
> been trained on various clinical data
> 
> -          Wikipedia index
> 
> What are our options, given that ASF source code, binaries, models, and 
> dependencies all need to be compliant with ASL 2.0 
> (http://www.apache.org/legal/3party.html)?
> 
> 1)      Leave things as they are, but we need to confirm with the sources and 
> also will probably need to seek approval from Apache Legal for each of the 
> resources
> 
> 2)      Host the resources externally, e.g. on SourceForge, similar to the 
> OpenNLP models (http://opennlp.sourceforge.net/models-1.5/)
> 
> a.       Single zip per release for users to download?
> 
> Option 2 seems the least painful in terms of compliance.
> Since 3.0.0-incubating, each resource has a fully qualified name/path and is 
> read from the classpath so it should be fairly easy if we decided to pull it 
> in from external sources.

My vote would be that, for each resource of any significant size, we create a 
separate module. So for example, we might have a 
ctakes-dictionary-lookup-umls-index module, a 
ctakes-dependency-parser-clearparser-model module, a 
ctakes-dependency-parser-clearparser-srl-model [1], etc.

These modules would be `mvn release`d like other modules, except that we'd 
release them with their own licenses not to the Apache repository but to Maven 
Central via Sonatype OSS 
(https://docs.sonatype.org/display/Repository/Sonatype+OSS+Maven+Repository+Usage+Guide).

This would mean that people could still declare normal Maven dependencies on 
the dictionaries, models, etc.
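
As a rough sketch, a downstream project's pom.xml might then pull in the UMLS 
index like this. The groupId and version below are placeholders made up for 
illustration; only the artifactId comes from the module names suggested above.

```xml
<!-- Hypothetical coordinates: groupId and version are placeholders, not decided -->
<dependency>
  <groupId>org.example.ctakes</groupId>
  <artifactId>ctakes-dictionary-lookup-umls-index</artifactId>
  <version>0.0.1</version>
</dependency>
```

Since the resources are already read from the classpath, a jar resolved this 
way should be picked up the same way as a bundled one.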

And if we ever resolved the licensing issues with any of these, we could simply 
add the module back to the regular Apache distribution.

Steve

[1] These names are a bit confusing because cTAKES conflates dependency parsing 
and semantic role labeling, but that's a totally different issue.
