On Fri, Jun 10, 2011 at 10:29 AM, Olivier Grisel
<[email protected]> wrote:

>
> No idea. I think Jacob Perkins (and possibly others) who works with
> NLTK was also interested in such open corpora. See for instance this
> thread on metaoptimize.com/qa:
>
>
> http://metaoptimize.com/qa/questions/4650/what-licenses-cover-a-nltk-tagger-trained-on-treebank
>
>
Great. I think a lot of people would benefit from a standard infrastructure
for annotation and training of models for different languages.


> > BTW, there is a lot that can be done to bootstrap POS-taggers from raw data
> > and the tags in Wiktionary, so if folks are interested in that I'm happy to
> > provide pointers.
>
> As mentioned by Tommaso, I think we should start to structure the wiki
> for this effort. Do you want me to create sub-pages of [1] for
> POS-tagging and NE detection? I could write the NE detection page
> with a description of the current effort on corpus-refiner / Walter
> and let you add pointers for the POS tags case.
>
> [1] https://cwiki.apache.org/OPENNLP/opennlp-annotations.html
>
>
Yep, that sounds great. I might not be able to get to it right away, but I can
put it on my stack!

Jason

-- 
Jason Baldridge
Assistant Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com
http://twitter.com/jasonbaldridge
