Hi Jörn, let me be more precise.  Do you have a sense of how the area
under the precision-recall curve (AUC) changes as a function of the
number of annotations?  I'm curious how many annotations a model needs
to reach a reasonable precision-recall AUC while keeping reasonable
runtime performance (memory and speed).

Peace.  Michael

On Mon, Oct 7, 2013 at 3:29 PM, Jörn Kottmann <[email protected]> wrote:
> On 10/07/2013 11:00 PM, Michael Schmitz wrote:
>>
>> Do you know how many sentences/tokens were annotated for the OpenNLP
>> POS and CHUNK models?  Do you have an idea of the "sweet spot" for
>> number of annotations vs performance?
>
>
> If the model gets bigger, the computations get more complex, but as far as I
> know the effect of the model no longer fitting in the CPU cache is much more
> significant than that. I am using hash-based int features to reduce the memory
> footprint in the name finder.
>
> I don't have much experience with the Chunker or POS Tagger with regard to
> performance, but it should be easy to run a series of tests; the command line
> tools have built-in performance monitoring.
>
> Jörn
