Hi, I'm new to OpenNLP (and NLP in general) and I'm trying to train the NameFinder on a large corpus (nearly 1 GB). After a few hours, training fails with a "GC overhead limit exceeded" error. Do you have any suggestions on how I might accomplish this? Is it possible to train the model on parts of the input at a time? I tried increasing the memory available to the JVM, but that only seemed to delay the error.
Thanks for any help.
Jeff
