Hi Robert,

We used (and still use) the French Treebank (Paris 7, Abeillé) to build machine learning models for (pre)processing French, some of them for OpenNLP. I say 'still use' because the French Treebank is not always consistent, and we are trying to correct it in some way.

About the release of the models: right now, due to an unclear corpus license, the models we build are available for research purposes only. We are trying to see whether we can release them under the Apache License; this is in progress.

About downloading them: we do not yet have a dedicated web page for the models we have built so far (although you may find some of them already on the web...). If you are interested, I can send them to you.

Best

On Thu, Jan 19, 2012 at 11:08 PM, Jason Baldridge <jasonbaldri...@gmail.com> wrote:

> Unfortunately, there is no data I'm aware of for training models for
> French. There are efforts underway to get multilingual annotations going on
> unrestricted texts, but they are still in the sandbox. Help with those
> would be welcome!
>
> On Thu, Jan 19, 2012 at 10:27 AM, Robert VISEUR <robert.vis...@cetic.be>
> wrote:
>
> > Hi,
> >
> > We are actually using OpenNLP for POS tagging tasks (with news articles).
> > Part of the articles are in French, and I see there wasn't french POS
> > tagging model in the common OpenNLP package. Do you know a French public
> > model for POS tagging in Open NLP ?
> >
> > Thanks,
> > Best regards,
> > Robert.
>
> --
> Jason Baldridge
> Associate Professor, Department of Linguistics
> The University of Texas at Austin
> http://www.jasonbaldridge.com
> http://twitter.com/jasonbaldridge

--
Dr. Nicolas Hernandez
Associate Professor (Maître de Conférences)
Université de Nantes - LINA CNRS
http://enicolashernandez.blogspot.com
http://www.univ-nantes.fr/hernandez-n
+33 (0)2 51 12 53 94
+33 (0)2 40 30 60 67