Dear All,

I am coming back to this topic one year later...
As a reminder, we used the French Treebank corpus (http://www.llf.cnrs.fr/Gens/Abeille/French-Treebank-fr.php) to train models for processing French with the HMM tagger add-on. I first contacted you for advice because we did not own the resource we used and were not sure we were allowed to distribute our models under the Apache License. We discussed this with the resource owner, and we thought that an alternative way to distribute the models we trained could be to submit them jointly. Eventually, the owner granted us permission to distribute the models we built under the Apache License v2.

In short, we built French models for part-of-speech (pos), morphological (mph) and grammatical function (fct) tagging, as well as lemmatization (lemma). We use the HMM tagger to perform the various tagging tasks. A patch has recently been submitted to make the HMM tagger less dependent on the type system. See https://issues.apache.org/jira/browse/UIMA-2110

Before submitting the models to the project, I have some new questions.

As researchers, it is important for us that our work be cited by other researchers. In addition, the models are only a few files, but they represent a substantial contribution for the French Natural Language Processing community. So I was wondering whether you still advise me to go through the IP clearance procedure, or just to add a specific mention in the NOTICE file. In the first case, could you find me an "appropriate volunteer" to carry out the IP clearance process?

Another "substantial" question: our model files take about 5 MB each (pos, mph and fct), except the lemma model file, which takes 24 MB. Alternatively, we built a merged model for pos, mph and fct, which takes 6.9 MB. Do you think it may cause a problem if we submit all of them?

Best regards
/Nicolas

---------- Forwarded message ----------
From: Nicolas Hernandez <[email protected]>
Date: Thu, Nov 4, 2010 at 11:28 AM
Subject: Re: Guidelines for a mutual contribution
To: [email protected]

Thilo,
we would like to submit a language model, trained on a French Treebank corpus, for the tagger add-on. We do not own the treebank corpus we used. We are in discussion with its owner to find out whether distributing a model built on it under the Apache License still complies with the treebank license. We thought that an alternative way to distribute the model we trained could be to submit it jointly with the owner of the treebank.

Marshall,
I will consult all the links you mention and come back if necessary.

Thanks

On Thu, Nov 4, 2010 at 11:06 AM, Marshall Schor <[email protected]> wrote:
>
>
> On 11/4/2010 5:06 AM, Nicolas Hernandez wrote:
>> Hi
>>
>> Can someone tell me where to find guidelines for committing a
>> mutual contribution? In other words, how to proceed when there are two
>> developers or corporations involved in a work they would like to
>> commit?
>> I do not find any information on this subject on
>> http://www.apache.org/licenses/ nor on
>> http://uima.apache.org/contribution-policy.html
>>
>> Do we each have to submit an "Individual Contributor License
>> Agreement" to the ASF
>
> Each person has to have an "Individual Contributor License Agreement" on file
> with the ASF (and, if appropriate, a Corporate Contribution License Agreement;
> see http://www.apache.org/licenses/ and search for Corporate CLA).
>
> When you post the contribution, attach it to a Jira and state in the Jira
> itself what you are doing (including granting the ASF a license under the
> Apache Software License version 2.0).
>
> If the contribution represents "substantial" work developed outside of the
> ASF's normal process, it will need to go through the IP clearance process, as
> Tommaso described.
>
>> and specify clearly in the NOTICE file of our
>> contribution the complete attribution?
>
> Here's info on what goes in the Notice file:
>
> http://www.apache.org/legal/src-headers.html#notice
>
> and here's a link which says that the ASF prefers that contributors do not
> put individual copyright statements into the file:
>
> http://www.apache.org/dev/apply-license.html#contributor-copyright
>
> which links to this, in particular about moving existing copyright from
> source into the Notice file:
>
> http://www.apache.org/legal/src-headers.html#header-existingcopyright
>
> Does this answer your question?
>
> -Marshall Schor
>> Thanks in advance
>>
>> /Nicolas
>>
>

--
[email protected]
# http://enicolashernandez.blogspot.com http://www.univ-nantes.fr/hernandez-n
# Laboratoire LINA-TALN CNRS UMR 6241 tel. +33 (0)2 51 12 58 55
# Université de Nantes - Institut Universitaire de Technologie - Département Informatique tel. +33 (0)2 40 30 60 67
