On 6/29/11 12:37 AM, Hannes Korte wrote:
On 29.06.2011 00:13, Jörn Kottmann wrote:
I thought a bit more about work scheduling and believe it might be
nice to put this functionality directly into the Corpus Server. Then
a labeling task queue can be defined by a search query to match CASes
which should be tagged, and an annotation tool can just ask the
Corpus Server for the next work item.

+1, that's what I planned to do as well. And at a later stage we can integrate a real active learning component there. Would such a work item consist of a CAS and some sentence identifier?

A work item is just a CAS, and the annotator should enhance it with annotations where possible. The work queue defines what kind of task the CAS belongs to: we might have one work queue which contains
CASes for named entity labeling and another one for POS labeling.
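
To make the queue idea concrete, here is a minimal sketch of per-task work queues. Everything in it is an assumption for illustration: the WorkQueue class, the next_work_item method, the task names, and the CAS ids are hypothetical and not part of any existing Corpus Server API. A real implementation would hand out CASes rather than plain ids.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WorkQueue:
    """Hypothetical labeling task queue (not an existing Corpus Server class)."""
    task: str  # kind of labeling task, e.g. "named-entity" or "pos"
    pending: deque = field(default_factory=deque)  # CAS ids waiting for a human

    def add(self, cas_id: str) -> None:
        """Append a CAS id to the end of the queue."""
        self.pending.append(cas_id)

    def next_work_item(self):
        """Return the next CAS id to label, or None when the queue is empty."""
        return self.pending.popleft() if self.pending else None


# One queue per task type; the same CAS may sit in several queues.
queues = {
    "named-entity": WorkQueue("named-entity"),
    "pos": WorkQueue("pos"),
}
queues["named-entity"].add("cas-1")
queues["named-entity"].add("cas-2")
queues["pos"].add("cas-1")

# An annotation tool just asks the queue for its next item.
first = queues["named-entity"].next_work_item()
```

Because the queue itself carries the task type, the work item can stay a bare CAS and the tool learns what to label from the queue it asked.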

The search query could match CASes with a certain minimum text length, a minimum number of pre-detected entities,
and no human-labeled entities.
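
As a rough sketch, such a query could be expressed as a predicate over per-CAS statistics. The CasSummary fields, the matches_queue_query function, and the threshold values are all hypothetical, chosen only to illustrate the three criteria; the actual query language of the search index is not assumed here.

```python
from dataclasses import dataclass


@dataclass
class CasSummary:
    """Hypothetical per-CAS statistics a search index might expose."""
    cas_id: str
    text_length: int             # number of characters in the document text
    predetected_entities: int    # entities found by an automatic tagger
    human_labeled_entities: int  # entities confirmed or added by a human


def matches_queue_query(cas, min_text_length=200, min_predetected=1):
    """True if the CAS should be placed in the named entity labeling queue."""
    return (cas.text_length >= min_text_length
            and cas.predetected_entities >= min_predetected
            and cas.human_labeled_entities == 0)


corpus = [
    CasSummary("a", text_length=50, predetected_entities=3, human_labeled_entities=0),
    CasSummary("b", text_length=500, predetected_entities=2, human_labeled_entities=0),
    CasSummary("c", text_length=800, predetected_entities=4, human_labeled_entities=4),
    CasSummary("d", text_length=900, predetected_entities=0, human_labeled_entities=0),
]

# Only "b" is long enough, has pre-detected entities, and is still unlabeled.
selected = [c.cas_id for c in corpus if matches_queue_query(c)]
```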

I will try to extend the proposal tomorrow with a sample type system and a description of how things could be labeled, based
on the discussion we had here a few days ago.

Jörn
