Hi William--thanks for the pointer. Do you know the size of your training sets? I did not see that in the chapters you pointed me to.
On Mon, Oct 7, 2013 at 3:48 PM, William Colen <[email protected]> wrote:
> Actually, I measured the model effectiveness, not the memory x performance.
>
> 2013/10/7 Michael Schmitz <[email protected]>
>
>> Hi Jorn, let me be more precise. Do you have a notion of how the
>> precision-recall curve (AUC) changes as a function of the number of
>> annotations? I'm curious how many annotations are needed for a model
>> with reasonable precision-recall AUC and reasonable performance
>> (memory and speed).
>>
>> Peace. Michael
>>
>> On Mon, Oct 7, 2013 at 3:29 PM, Jörn Kottmann <[email protected]> wrote:
>> > On 10/07/2013 11:00 PM, Michael Schmitz wrote:
>> >>
>> >> Do you know how many sentences/tokens were annotated for the OpenNLP
>> >> POS and CHUNK models? Do you have an idea of the "sweet spot" for
>> >> number of annotations vs performance?
>> >
>> > If the model gets bigger the computations get more complex, but as far
>> > as I know the effect of the model not fitting anymore in the CPU cache
>> > is much more significant then that. I am using hash based int features
>> > to reduce the memory footprint in the name finder.
>> >
>> > I don't have much experience with the Chunker or Pos Tagger in regards
>> > to performance, but it should be easy to do a series of tests, the
>> > command line tools have built in performance monitoring.
>> >
>> > Jörn
