That's great to hear. I thought the French Treebank licensing was pretty
clear about how artifacts that could be trained on it could be used. Please
keep us informed about the French data situation!

FWIW, while I very much want to see the creation of unrestricted data with
unrestricted annotations, I implore anyone who does find any models that
have been trained on a restricted corpus like that to only use them in
accordance with the wishes of the copyright holders. It's not just the
legally correct thing to do, but also the morally correct thing to do.

To begin the process of getting models organized and associated with clear
permissions, I've started the following GitHub repo:

https://github.com/utcompling/OpenNLP-Models

Nothing there yet, but there will be Norwegian models fairly soon. Any help
with getting more languages in there, or help with getting things set up in
general is most welcome!

-Jason

On Thu, Jan 19, 2012 at 4:28 PM, Nicolas Hernandez <
nicolas.hernan...@gmail.com> wrote:

> Hi Robert
>
> We used (and still use) the French Treebank (Paris 7 Abeille) for building
> machine learning models for (pre)processing French and some of them for
> OpenNLP.
> I say 'still use' because the French Treebank is not always consistent and
> we are trying "to correct it" in some way.
>
> About the release of the models.
> Right now, due to an unclear corpus license, the models we build are only
> available for research purposes.
> We are trying to see if we can release them under the Apache License.
> This objective is on its way.
>
> To download them.
> We do not yet have a dedicated web page for downloading the models we
> have built so far (although you may find some of them already on the
> web...).
> If you are interested, I can send them to you.
>
> Best
>
> On Thu, Jan 19, 2012 at 11:08 PM, Jason Baldridge <
> jasonbaldri...@gmail.com> wrote:
>
>> Unfortunately, there is no data I'm aware of for training models for
>> French. There are efforts underway to get multilingual annotations going
>> on unrestricted texts, but they are still in the sandbox. Help with those
>> would be welcome!
>>
>> On Thu, Jan 19, 2012 at 10:27 AM, Robert VISEUR <robert.vis...@cetic.be>
>> wrote:
>>
>> > Hi,
>> >
>> > We are currently using OpenNLP for POS tagging tasks (with news
>> > articles). Some of the articles are in French, and I see there is no
>> > French POS tagging model in the common OpenNLP package. Do you know of
>> > a public French model for POS tagging in OpenNLP?
>> >
>> > Thanks,
>> > Best regards,
>> > Robert.
>> >
>>
>>
>>
>> --
>> Jason Baldridge
>> Associate Professor, Department of Linguistics
>> The University of Texas at Austin
>> http://www.jasonbaldridge.com
>> http://twitter.com/jasonbaldridge
>>
>
>
>
> --
> Dr. Nicolas Hernandez
> Associate Professor (Maître de Conférences)
> Université de Nantes - LINA CNRS
> http://enicolashernandez.blogspot.com
> http://www.univ-nantes.fr/hernandez-n
> +33 (0)2 51 12 53 94
> +33 (0)2 40 30 60 67
>
>


-- 
Jason Baldridge
Associate Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com
http://twitter.com/jasonbaldridge
