Re: Need to speed up the model creation process of OpenNLP

Samik Raychaudhuri Tue, 18 Nov 2014 16:31:35 -0800

Hi,

This is essentially a machine learning problem and has nothing to do with OpenNLP. With such a large corpus, training models will take a substantial amount of time. You could try smaller training sets and see whether the models deteriorate substantially. Another strategy is to introduce training sets incrementally, each containing a specific class of Token Names; that would provide a quicker turnaround.
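
As a rough illustration of the smaller-training-set idea, the sketch below draws a reproducible random sample of lines from a training file. It assumes the corpus stores one training record per line and uses only the Java standard library, not the OpenNLP API. The class name, method name, and command-line arguments are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.stream.Stream;

// Hypothetical helper: not part of OpenNLP.
public class TrainingSubsample {

    /**
     * Keeps each line independently with probability `fraction`.
     * A fixed seed makes the sample reproducible, so runs with
     * different iteration counts see the same training subset.
     */
    public static List<String> subsample(Iterable<String> lines, double fraction, long seed) {
        if (fraction <= 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        Random rng = new Random(seed);
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            // nextDouble() is in [0, 1), so fraction == 1.0 keeps every line.
            if (rng.nextDouble() < fraction) {
                kept.add(line);
            }
        }
        return kept;
    }

    // Assumed usage: java TrainingSubsample <input> <output> <fraction>
    public static void main(String[] args) throws IOException {
        Path input = Paths.get(args[0]);
        Path output = Paths.get(args[1]);
        double fraction = Double.parseDouble(args[2]);

        // Stream the input so the full corpus is never held in memory;
        // only the sampled lines are collected.
        try (Stream<String> lines = Files.lines(input)) {
            List<String> kept = subsample(lines::iterator, fraction, 42L);
            Files.write(output, kept, StandardCharsets.UTF_8);
            System.out.println("Kept " + kept.size() + " lines");
        }
    }
}
```

Training on, say, a 10% sample and then a 25% sample, and comparing the evaluation results, would show how quickly the model degrades as the set shrinks.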

Hope this helps.
Best,
-Samik



On 18/11/2014 8:46 AM, nikhil jain wrote:

Hi,
I asked the question below yesterday. Did anyone get a chance to look at it?
I am new to OpenNLP and really need some help. Please provide a clue, a link,
or an example.
Thanks,
Nikhil
From: nikhil jain <[email protected]>
To: "[email protected]" <[email protected]>; Dev at Opennlp Apache <[email protected]>
Sent: Tuesday, November 18, 2014 12:02 AM
Subject: Need to speed up the model creation process of OpenNLP

Hi,

I am using the OpenNLP Token Name Finder to parse unstructured data. I have
created a corpus of about 4 million records. When I create a model from the
training set using the OpenNLP APIs in Eclipse with the default settings
(cut-off 5 and 100 iterations), the process takes around 2-3 hours.
Can someone suggest how I can reduce the training time? I want to experiment
with different iteration counts, but because model creation takes so long,
I am not able to.
Please provide some feedback.
Thanks in advance.
Nikhil Jain
