I think we need documentation showing that the donor named in the SLA (University of Nantes?) obtained the rights to license the language models built from the treebank under the ASL v2.0. The treebank website lists a different license which (it seems to me) is not compatible with ASL v2.0: it restricts use of the treebank to "research" only, etc.
-Marshall

On 6/17/2011 6:40 AM, Tommaso Teofili wrote:
> Hello Nicolas, all,
> I've updated your patch with some more details, can you review it and see if
> it sounds good to you?
> In particular I've set the 'University of Nantes' as holder of distribution
> rights, can you confirm that is correct?
> If any other UIMA PMC member would like to review it too that'd be welcome
> :)
> After this has been fixed I plan to call a vote here to accept this IP
> Clearance.
> Regards,
> Tommaso
>
>
> 2011/6/17 Tommaso Teofili <[email protected]>
>
>> Thanks Nicolas,
>> I'll review it as soon as possible.
>> Tommaso
>>
>>
>> 2011/6/17 Nicolas Hernandez <[email protected]>
>>
>>> Patch submitted. Let me know if I have to be more precise.
>>>
>>> Regards,
>>>
>>> /Nicolas
>>>
>>> On Fri, Jun 17, 2011 at 8:38 AM, Tommaso Teofili
>>> <[email protected]> wrote:
>>>> Hello Nicolas,
>>>> did you already prepare the IP clearance template?
>>>> We usually maintain those files under our SVN website, see [1] as an
>>>> example of a former IP Clearance (see also the related thread [2]).
>>>> If the IP is ready you can add a patch to the website in the same Jira
>>>> issue [3] so that we can review it.
>>>> Regards,
>>>> Tommaso
>>>> [1] : http://svn.apache.org/repos/asf/uima/site/trunk/uima-website/xdocs/ip-clearances/
>>>> [2] : http://markmail.org/message/gbtph456u3445yfn
>>>> [3] : https://issues.apache.org/jira/browse/UIMA-2146
>>>>
>>>> 2011/6/16 Nicolas Hernandez <[email protected]>
>>>>> Please tell me what I can do for it
>>>>>
>>>>> On Wed, Jun 15, 2011 at 5:58 PM, Tommaso Teofili
>>>>> <[email protected]> wrote:
>>>>>> 2011/6/15 Tommaso Teofili <[email protected]>
>>>>>>> Nicolas,
>>>>>>> your post on opennlp-user@ made me realize we didn't take care of
>>>>>>> helping you here yet.
>>>>>>> Did you get the ACK for your SGA?
>>>>>> I see it's been recorded, so I think we can proceed.
>>>>>> Tommaso
>>>>>>
>>>>>>> Regards,
>>>>>>> Tommaso
>>>>>>> 2011/5/26 Nicolas Hernandez <[email protected]>
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> French data models for the Apache UIMA Sandbox HMM Tagger have been
>>>>>>>> submitted via the jira issue
>>>>>>>> https://issues.apache.org/jira/browse/UIMA-2146
>>>>>>>>
>>>>>>>> Documentation on the procedure to build the models from the French
>>>>>>>> Treebank can be found here (accidentally it is in French...)
>>>>>>>>
>>>>>>>> http://enicolashernandez.blogspot.com/2011/05/construire-des-modelisations-du-french.html
>>>>>>>>
>>>>>>>> The SLA has been sent and we are waiting to receive the ack.
>>>>>>>>
>>>>>>>> I have prepared an IP form but do not have the right to commit it...
>>>>>>>>
>>>>>>>> Finally, is there an "appropriate volunteer" for executing the IP
>>>>>>>> Clearance processing?
>>>>>>>>
>>>>>>>> I hope I have not forgotten anything.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>> /Nicolas
>>>>>>>>
>>>>>>>> On Thu, May 19, 2011 at 3:47 PM, Thilo Götz <[email protected]> wrote:
>>>>>>>>> On 5/19/2011 15:04, Nicolas Hernandez wrote:
>>>>>>>>>> Hello Everyone
>>>>>>>>>>
>>>>>>>>>> Jörn, yes it (training MaxEnt models for OpenNLP from the French
>>>>>>>>>> Treebank) is actually part of our plan (building a French-Speaking
>>>>>>>>>> UIMA Community). We wanted also to contribute to the OpenNLP
>>>>>>>>>> project since no models were available for French processing!
>>>>>>>>>>
>>>>>>>>>> About the right to train models on this data set and then
>>>>>>>>>> distribute them under Apache License 2: It took time for us to get
>>>>>>>>>> the right to do it, but I think it was because we were the first to
>>>>>>>>>> ask for it. Now they know about it. I know that the maltparser team
>>>>>>>>>> (http://maltparser.org/) would be also interested in the grant. You
>>>>>>>>>> may ask the French Treebank authors.
>>>>>>>>>> I can also ask them to put an explicit mention of the right to do
>>>>>>>>>> it on their web site.
>>>>>>>>>>
>>>>>>>>>> As far as I know, the training data sets for the English and German
>>>>>>>>>> POS models are not freely available, are they?
>>>>>>>>> The English model was trained on the Brown corpus, which is free.
>>>>>>>>> The German model was trained on a non-free corpus.
>>>>>>>>>
>>>>>>>>>> Eventually, Jörn, I'm not sure I understand. Do you think the IP
>>>>>>>>>> clearance process is not adapted for submitting our contribution?
>>>>>>>>>> Tommaso, I will blog post the procedure I used to train the models.
>>>>>>>>>> There is nothing really special. I used some freely available
>>>>>>>>>> (under AL2) AE components. The HMM learner is already present in
>>>>>>>>>> the HMM Tagger addon. The few other UIMA components I used are also
>>>>>>>>>> available on some google forges (uima-common, uima-connectors,
>>>>>>>>>> uima-type-mapper).
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>>
>>>>>>>>>> /Nicolas
>>>>>>>>>>
>>>>>>>>>> On Thu, May 19, 2011 at 9:57 AM, Jörn Kottmann <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>> On 5/19/11 9:00 AM, Tommaso Teofili wrote:
>>>>>>>>>>>> If you also plan to donate the models I think the IP clearance
>>>>>>>>>>>> is the right way both for UIMA and for you as a researcher.
>>>>>>>>>>>>
>>>>>>>>>>> In my opinion it is very important that we have the possibility
>>>>>>>>>>> to retrain the models on the data set, otherwise it will block
>>>>>>>>>>> code changes and bug fixes.
>>>>>>>>>>>
>>>>>>>>>>> Therefore I think we need the right to train models on this
>>>>>>>>>>> data set and then distribute them under AL 2.0.
>>>>>>>>>>>
>>>>>>>>>>> Jörn
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> [email protected]
>>>>>>>> #
>>>>>>>> http://enicolashernandez.blogspot.com
>>>>>>>> http://www.univ-nantes.fr/hernandez-n
>>>>>>>> #
>>>>>>>> Laboratoire LINA-TALN CNRS UMR 6241
>>>>>>>> tel. +33 (0)2 51 12 58 55
>>>>>>>> #
>>>>>>>> Université de Nantes - Institut Universitaire de Technologie -
>>>>>>>> Département Informatique
>>>>>>>> tel. +33 (0)2 40 30 60 67
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> [email protected]
>>>>> #
>>>>> http://enicolashernandez.blogspot.com
>>>>> http://www.univ-nantes.fr/hernandez-n
>>>>> #
>>>>> Laboratoire LINA-TALN CNRS UMR 6241
>>>>> tel. +33 (0)2 51 12 58 55
>>>>> #
>>>>> Université de Nantes - Institut Universitaire de Technologie -
>>>>> Département Informatique
>>>>> tel. +33 (0)2 40 30 60 67
>>>>
>>>
>>>
>>> --
>>> [email protected]
>>> #
>>> http://enicolashernandez.blogspot.com
>>> http://www.univ-nantes.fr/hernandez-n
>>> #
>>> Laboratoire LINA-TALN CNRS UMR 6241
>>> tel. +33 (0)2 51 12 58 55
>>> #
>>> Université de Nantes - Institut Universitaire de Technologie -
>>> Département Informatique
>>> tel. +33 (0)2 40 30 60 67
>>>
>>
