Thanks. I used MapReduce to build the training input, but I didn't realize that training itself can also be performed on Hadoop. Can I simply combine the generated models when the job completes?
On Mon, Oct 7, 2013 at 8:00 AM, Miljana Mladenovic <[email protected]> wrote:

> Learn about the map-reduce strategy for big data. For example:
> http://wiki.apache.org/hadoop/MapReduce
>
> Regards, Mixie
>
> On Mon, 07 Oct 2013 13:53:33 +0200, Jeffrey Zemerick <[email protected]> wrote:
>
>> Hi,
>>
>> I'm new to OpenNLP (and NLP in general) and I'm trying to train the
>> NameFinder on a large corpus (nearly 1 GB). After a few hours it fails
>> with a GC overhead limit exception. Do you have any suggestions on how
>> I might accomplish this? Is it possible to train the model on parts of
>> the input at a time? I tried increasing the available memory, but that
>> seemed only to delay the exception.
>>
>> Thanks for any help.
>>
>> Jeff
>
> --
> Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
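[Editor's note: the heap-size workaround Jeff mentions can be sketched as a CLI invocation. This is only an illustration: the jar version, heap size, and file names (`train.txt`, `en-ner-custom.bin`) are hypothetical and must be adapted to your setup.]

```shell
# Invoke the OpenNLP command-line trainer directly via java so the JVM
# heap can be raised with -Xmx, which delays or avoids the
# "GC overhead limit exceeded" error on large corpora.
# Jar version and file names below are placeholders.
java -Xmx8g -cp opennlp-tools-1.5.3.jar \
  opennlp.tools.cmdline.CLI TokenNameFinderTrainer \
  -lang en -encoding UTF-8 \
  -data train.txt \
  -model en-ner-custom.bin
```

Note that raising the heap only postpones the problem if the trainer must hold the whole corpus in memory; splitting the input, as Jeff asks about, is a separate question from merging the resulting models.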
