On 12/3/2010 16:58, Isabel Drost wrote:
On Fri, 03 Dec 2010 Thilo Goetz <[email protected]> wrote:
Exactly.  I would really like to create some training data
that's under a permissive license.  It's surprising how
little of that there is in general.  It would be a lot of
work, but maybe we'll find help.

If you are interested in gathering training data sets under permissive
licences, you may want to have a look at the Lucene Open Relevance
project: http://lucene.apache.org/openrelevance/

From the project explanation: "The Open Relevance Project (ORP) is a
new Apache Lucene sub-project aimed at making materials for doing
relevance testing for Information Retrieval (IR), Machine Learning and
Natural Language Processing (NLP) into open source."

Isabel

Thanks, we should keep it in the back of our
minds.  Although we'd need different kinds of annotations,
we could still use the same basic data.

--Thilo
