Re: Next Steps for OpenNLP

Michael Schmitz Thu, 10 Oct 2013 08:21:56 -0700

Hi William--thanks for the pointer.  Do you know the size of your
training sets?  I did not see that in the chapters you pointed me to.


On Mon, Oct 7, 2013 at 3:48 PM, William Colen <[email protected]> wrote:
> Actually, I measured the model effectiveness, not the memory x performance.
>
>
>
>
> 2013/10/7 Michael Schmitz <[email protected]>
>
>> Hi Jorn, let me be more precise.  Do you have a notion of how the
>> precision-recall curve (AUC) changes as a function of the number of
>> annotations?  I'm curious how many annotations are needed for a model
>> with reasonable precision-recall AUC and reasonable performance
>> (memory and speed).
>>
>> Peace.  Michael
>>
>> On Mon, Oct 7, 2013 at 3:29 PM, Jörn Kottmann <[email protected]> wrote:
>> > On 10/07/2013 11:00 PM, Michael Schmitz wrote:
>> >>
>> >> Do you know how many sentences/tokens were annotated for the OpenNLP
>> >> POS and CHUNK models?  Do you have an idea of the "sweet spot" for
>> >> number of annotations vs performance?
>> >
>> >
>> > If the model gets bigger the computations get more complex, but as far
>> > as I know the effect of the model no longer fitting in the CPU cache is
>> > much more significant than that. I am using hash-based int features to
>> > reduce the memory footprint in the name finder.
>> >
>> > I don't have much experience with the Chunker or POS Tagger with regard
>> > to performance, but it should be easy to run a series of tests; the
>> > command line tools have built-in performance monitoring.
>> >
>> > Jörn
>>
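
For readers unfamiliar with the "hash-based int features" Jörn mentions, here is
a minimal sketch of the general feature-hashing idea. It is not the name finder's
actual code: the class name, the table size, and the choice of FNV-1a as the hash
are all assumptions made for illustration. Each feature string is mapped directly
to an int index into a fixed-size weight array, so no string-to-index dictionary
has to be kept in memory.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; names and sizes are hypothetical, not OpenNLP API.
public class HashedFeatures {
    // Assumed table size; a power of two so a bit mask can replace modulo.
    static final int NUM_BUCKETS = 1 << 18;

    // Map a feature string to a bucket index using 32-bit FNV-1a.
    static int bucket(String feature) {
        int h = 0x811C9DC5;                      // FNV offset basis
        for (byte b : feature.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xFF);
            h *= 0x01000193;                     // FNV prime
        }
        return h & (NUM_BUCKETS - 1);            // always in [0, NUM_BUCKETS)
    }

    // Sum the weights of the active features; colliding features share a weight.
    static double score(float[] weights, String[] features) {
        double sum = 0.0;
        for (String f : features) {
            sum += weights[bucket(f)];
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] weights = new float[NUM_BUCKETS];
        weights[bucket("w=London")] = 1.5f;
        weights[bucket("prev=in")] = 0.5f;
        System.out.println(score(weights, new String[] {"w=London", "prev=in"}));
    }
}
```

The trade-off is that distinct features can collide in the same bucket, so the
table size has to be balanced against the memory (and cache) budget.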
