Hello Nicolas,

2011/5/18 Nicolas Hernandez <[email protected]>
> Dear All,
>
> I come back one year later...
>
> To remind you, we used a French Treebank corpus
> (http://www.llf.cnrs.fr/Gens/Abeille/French-Treebank-fr.php) to train
> models for processing French with the HMM tagger addon.
> I first contacted you for some advice since we did not own the
> resource we used and we were not sure we were allowed to distribute our
> models under the Apache license. We discussed this with the
> resource owner and we thought that an alternative way to distribute the
> models we trained could be to jointly submit the models.
>
> Eventually, we got permission from the owner to distribute the models
> we built under the Apache License v2.
>
> In short, we built French models for part of speech (pos),
> morphological (mph) and grammatical function (fct) tagging, as well as
> lemmatization (lemma). We use the Hmm tagger to perform the various
> tagging tasks. A recent patch has been submitted to make the Hmm tagger
> less dependent on the type system.
> See https://issues.apache.org/jira/browse/UIMA-2110

As I said in my last comment, it'd also be nice to see some documentation on how you created the models so that more users can create models with it.

>
>
> Before submitting the models to the project, I have some new
> questions. As researchers, it is important for us that our work be
> cited by other researchers. In addition, the models are only a few
> files but they represent a substantial contribution for the French
> Natural Language Processing community.
>
> So I was wondering whether you still advise me to perform the IP
> clearance procedure or just to add a specific mention in the NOTICE
> file.
>

If you also plan to donate the models I think the IP clearance is the right way both for UIMA and for you as a researcher.

>
> In the first case, could you find me an "appropriate volunteer" for
> executing the IP Clearance process?
>

I am working on the UIMA Addons RC2 so can't do it right now but, if no one is available before then, I could help you once the UIMA Addons release is done.

>
> Another "substantial" question... our model files take about 5 MB
> each (pos, mph and fct) except the lemma model file which takes 24 MB.
> Alternatively we built a merged model for pos, mph and fct which
> takes 6.9 MB. Do you think it may cause a problem if we submit all of
> them?
>

I don't see any issue with those sizes so, in my opinion, the models can all be submitted separately.

Regards,
Tommaso

>
> Best regards
>
> /Nicolas
>
> ---------- Forwarded message ----------
> From: Nicolas Hernandez <[email protected]>
> Date: Thu, Nov 4, 2010 at 11:28 AM
> Subject: Re: Guidelines for a mutual contribution
> To: [email protected]
>
>
> Thilo, we would like to submit a language model which was trained on a
> French Treebank corpus for the tagger addon. We do not own the
> treebank corpus we used. We are in discussion with its owner to know
> if we still respect the treebank License by distributing a model built
> on it under the Apache License.
> We thought that an alternative way to distribute the model we trained
> could be to jointly submit the model with the owner of the treebank.
>
> Marshall, I will consult all the links you mention and come back if
> necessary
>
> Thanks
>
> On Thu, Nov 4, 2010 at 11:06 AM, Marshall Schor <[email protected]> wrote:
> >
> >
> > On 11/4/2010 5:06 AM, Nicolas Hernandez wrote:
> >> Hi
> >>
> >> Can someone point me to some guidelines for committing a
> >> mutual contribution? In other words, how to proceed when there are two
> >> developers or corporations involved in a work they would like to
> >> commit?
> >> I do not find any information on this subject on
> >> http://www.apache.org/licenses/ nor on
> >> http://uima.apache.org/contribution-policy.html
> >>
> >> Do we each have to submit an "Individual Contributor License
> >> Agreement" to the ASF
> >
> > Each person has to have an "Individual Contributor License Agreement" on file
> > with the ASF (and, if appropriate, a Corporate Contributor License Agreement
> > (see http://www.apache.org/licenses/ and search for Corporate CLA)).
> >
> > When you post the contribution, attach it to a Jira and state in the Jira itself
> > what you are doing (including granting the ASF a license under the Apache
> > Software License version 2.0).
> >
> > If the contribution represents "substantial" work developed outside of the ASF's
> > normal process, it will need to go through the IP clearance process, as Tommaso
> > described.
> >> and specify clearly in the NOTICE file of our
> >> contribution the complete attribution ?
> >
> > Here's info on what goes in the Notice file:
> >
> > http://www.apache.org/legal/src-headers.html#notice
> >
> > and here's a link which says that the ASF prefers that contributors do not put
> > individual copyright statements into the file:
> >
> > http://www.apache.org/dev/apply-license.html#contributor-copyright - linking to
> > this in particular about moving existing copyright from source into the Notice file:
> >
> > http://www.apache.org/legal/src-headers.html#header-existingcopyright
> >
> > Does this answer your question?
> >
> > -Marshall Schor
> >> Thanks in advance
> >>
> >> /Nicolas
> >>
> >
> >
>
> --
> [email protected]
> --
> http://enicolashernandez.blogspot.com
> http://www.univ-nantes.fr/hernandez-n
> --
> # Laboratoire LINA-TALN CNRS UMR 6241
> tel. +33 (0)2 51 12 58 55
> # Université de Nantes - Institut Universitaire de Technologie -
> Département Informatique
> tel.
+33 (0)2 40 30 60 67
>
>
>
> --
> [email protected]
> #
> http://enicolashernandez.blogspot.com
> http://www.univ-nantes.fr/hernandez-n
> #
> Laboratoire LINA-TALN CNRS UMR 6241
> tel. +33 (0)2 51 12 58 55
> #
> Université de Nantes - Institut Universitaire de Technologie -
> Département Informatique
> tel. +33 (0)2 40 30 60 67
>
