Please tell me what I can do for it.

On Wed, Jun 15, 2011 at 5:58 PM, Tommaso Teofili <[email protected]> wrote:
> 2011/6/15 Tommaso Teofili <[email protected]>
>>
>> Nicolas,
>> your post on opennlp-user@ made me realize we didn't take care of helping
>> you here yet.
>> Did you get the ACK for your SGA?
>
> I see it's been recorded, so I think we can proceed.
> Tommaso
>
>>
>> Regards,
>> Tommaso
>> 2011/5/26 Nicolas Hernandez <[email protected]>
>>>
>>> Hi
>>>
>>> French data models for the Apache UIMA Sandbox HMM Tagger have been
>>> submitted via the jira issue
>>> https://issues.apache.org/jira/browse/UIMA-2146
>>>
>>> Documentation on the procedure to build the models from the French
>>> Treebank can be found here (incidentally, it is in French...)
>>>
>>> http://enicolashernandez.blogspot.com/2011/05/construire-des-modelisations-du-french.html
>>>
>>> The SGA has been sent and we are waiting to receive the ack.
>>>
>>> I have prepared an IP form but do not have the right to commit it...
>>>
>>> Finally, is there an "appropriate volunteer" for executing the IP
>>> Clearance processing?
>>>
>>> I hope I have not forgotten anything.
>>>
>>> Best regards
>>>
>>> /Nicolas
>>>
>>> On Thu, May 19, 2011 at 3:47 PM, Thilo Götz <[email protected]> wrote:
>>> > On 5/19/2011 15:04, Nicolas Hernandez wrote:
>>> >> Hello Everyone
>>> >>
>>> >> Jörn, yes it (training MaxEnt models for OpenNLP from the French
>>> >> Treebank) is actually part of our plan (building a French-Speaking
>>> >> UIMA Community). We also wanted to contribute to the OpenNLP project
>>> >> since no models were available for French processing!
>>> >>
>>> >> About the right to train models on this data set and then distribute
>>> >> them under Apache License 2: It took time for us to get the right to
>>> >> do it, but I think it was because we were the first to ask. Now
>>> >> they know about it. I know that the maltparser team
>>> >> (http://maltparser.org/) would also be interested in the grant. You
>>> >> may ask the French Treebank authors. I can also ask them to put
>>> >> an explicit mention of the right to do so on their web
>>> >> site.
>>> >>
>>> >> As far as I know, the training data sets for the English and German POS
>>> >> models are not freely available, are they?
>>> >
>>> > The English model was trained on the Brown corpus, which is free.
>>> > The German model was trained on a non-free corpus.
>>> >
>>> >>
>>> >> Finally, Jörn, I'm not sure I understand. Do you think the IP
>>> >> clearance process is not suited to submitting our contribution?
>>> >>
>>> >> Tommaso, I will blog post the procedure I used to train the models.
>>> >> There is nothing really special. I used some freely available (under
>>> >> AL2) AE components. The HMM learner is already present in the HMM
>>> >> Tagger addon. The few other UIMA components I used are also available
>>> >> on some google forges (uima-common, uima-connectors,
>>> >> uima-type-mapper).
>>> >>
>>> >> Regards
>>> >>
>>> >> /Nicolas
>>> >>
>>> >> On Thu, May 19, 2011 at 9:57 AM, Jörn Kottmann <[email protected]>
>>> >> wrote:
>>> >>> On 5/19/11 9:00 AM, Tommaso Teofili wrote:
>>> >>>>
>>> >>>> If you also plan to donate the models I think the IP clearance is
>>> >>>> the
>>> >>>> right
>>> >>>> way both for UIMA and for you as a researcher.
>>> >>>>
>>> >>>
>>> >>> In my opinion it is very important that we have the possibility
>>> >>> to retrain the models on the data set, otherwise it will block
>>> >>> code changes and bug fixes.
>>> >>>
>>> >>> Therefore I think we need the right to train models on this
>>> >>> data set and then distribute them under AL 2.0.
>>> >>>
>>> >>> Jörn
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >
>>>
>>>
>>>
>>> --
>>> [email protected]
>>> #
>>> http://enicolashernandez.blogspot.com
>>> http://www.univ-nantes.fr/hernandez-n
>>> #
>>> Laboratoire LINA-TALN CNRS UMR 6241
>>> tel. +33 (0)2 51 12 58 55
>>> #
>>> Université de Nantes - Institut Universitaire de Technologie -
>>> Département Informatique
>>> tel. +33 (0)2 40 30 60 67
>>
>
>
--
[email protected]
#
http://enicolashernandez.blogspot.com
http://www.univ-nantes.fr/hernandez-n
#
Laboratoire LINA-TALN CNRS UMR 6241
tel. +33 (0)2 51 12 58 55
#
Université de Nantes - Institut Universitaire de Technologie -
Département Informatique
tel. +33 (0)2 40 30 60 67
