Re: Name Finder and chunker training format

Nicolas Hernandez Wed, 12 Oct 2011 06:16:46 -0700

Ok thanks


On Wed, Oct 12, 2011 at 2:46 PM, Jörn Kottmann <kottm...@gmail.com> wrote:
> On 10/12/11 2:36 PM, Nicolas Hernandez wrote:
>>
>> Looking at the Name Finder and the chunker tool, I wonder why they
>> do not use the same training format?
>>
>> For example, this
>>
>> Mr. <START:person> Pierre Vinken <END> is chairman
>>
>> may also be represented like this
>>
>> Mr. NNP O
>> Pierre NNP B-person
>> Vinken NNP I-person
>> is VBZ O
>> chairman NN O
>>
>> I have noted that the Name Finder API offers the possibility to customize
>> the feature generation used for training, but both the Name Finder and
>> the chunker use the same implementation of the learning algorithm,
>> don't they?
>
> That has historical reasons: the name finder development was inspired by
> the MUC shared tasks, and the chunker development was inspired by the
> CoNLL-2000 shared task.
>
> The implementations are actually different, and the biggest difference is
> the way features are generated. The chunker can use POS tags, and the name
> finder cannot.
>
> We have plans to use the feature generation framework, which was created
> for the name finder, in the POS tagger and chunker as well.
>
> Anyway, the reason why we have different components for sequence tagging is
> that it makes it easier to integrate them if there is one component per
> task.
>
> Everything in OpenNLP uses maxent or perceptron, yes.
>
> Jörn
>
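
To illustrate the two formats compared in the quoted message, here is a
minimal Python sketch that turns one name finder training sentence into
token/tag pairs in the B-/I-/O style of the chunker example. It is not part
of OpenNLP: the function name name_finder_to_bio is hypothetical, and the
sketch assumes whitespace-separated tokens with the <START:type> and <END>
markers standing as separate tokens. The POS column is left out because the
name finder format does not carry POS tags.

```python
def name_finder_to_bio(line):
    """Convert one name finder sentence into a list of (token, tag) pairs."""
    pairs = []
    entity = None   # entity type of the span we are inside, or None
    first = False   # True until the first token of the span has been tagged
    for token in line.split():
        if token.startswith("<START:") and token.endswith(">"):
            # Opening marker: remember the entity type, emit nothing.
            entity = token[len("<START:"):-1]
            first = True
        elif token == "<END>":
            # Closing marker: leave the span, emit nothing.
            entity = None
        elif entity is None:
            pairs.append((token, "O"))
        else:
            prefix = "B-" if first else "I-"
            pairs.append((token, prefix + entity))
            first = False
    return pairs


sentence = "Mr. <START:person> Pierre Vinken <END> is chairman"
for token, tag in name_finder_to_bio(sentence):
    print(token, tag)
```

For the sample sentence this prints Pierre as B-person, Vinken as I-person,
and every other token as O.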
