Hi Rodrigo,
No, I am not using multithreading; it's a simple Java program written with
help from the OpenNLP documentation. It is worth mentioning that, because the
corpus contains 4 million records, my Java program running in Eclipse
frequently failed with a Java heap space (out of memory) error. I investigated
a bit and found that the process needs around 10 GB of memory to build the
model, so I increased the heap to 10 GB using the -Xmx parameter. After that it
ran properly, but it took 3 hours.
Thanks,
Nikhil
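
For anyone reproducing the heap fix described above, here is a minimal sketch for confirming that the -Xmx value actually reached the JVM running the training program. The class name HeapCheck is hypothetical, and the sketch uses only the standard java.lang.Runtime API, not any OpenNLP call.

```java
public class HeapCheck {
    /** Converts a byte count to GiB. */
    static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects the -Xmx limit, or the JVM default when -Xmx is not set.
        System.out.printf("Max heap:  %.2f GiB%n", toGiB(rt.maxMemory()));
        // totalMemory() - freeMemory() approximates the heap currently in use.
        System.out.printf("Used heap: %.2f GiB%n",
                toGiB(rt.totalMemory() - rt.freeMemory()));
    }
}
```

When launching from Eclipse, the flag (for example -Xmx10g) normally belongs in the VM arguments of the run configuration; a value set only in eclipse.ini sizes the IDE itself, not the launched program.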
From: Rodrigo Agerri <[email protected]>
To: "[email protected]" <[email protected]>; nikhil jain
<[email protected]>
Cc: "[email protected]" <[email protected]>
Sent: Wednesday, November 19, 2014 2:17 AM
Subject: Re: Need to speed up the model creation process of OpenNLP
Hi,
Are you using multithreading, lots of threads, lots of RAM?
R
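
As a rough illustration of the threading question, the sketch below assembles a set of training settings sized to the machine's core count. The class name TrainingParamsSketch is hypothetical, and the key names (Algorithm, Iterations, Cutoff, Threads) are assumptions modeled on OpenNLP's training-parameters files; verify them, and whether the chosen trainer honors Threads, against the OpenNLP version in use. The code itself relies only on the Java standard library.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.util.Properties;

public class TrainingParamsSketch {
    /** Builds training settings; the key names are assumptions to verify. */
    static Properties build(int iterations, int cutoff, int threads) {
        Properties p = new Properties();
        p.setProperty("Algorithm", "MAXENT");
        p.setProperty("Iterations", Integer.toString(iterations));
        p.setProperty("Cutoff", Integer.toString(cutoff));
        p.setProperty("Threads", Integer.toString(threads));
        return p;
    }

    public static void main(String[] args) throws IOException {
        // Use one training thread per available core as a starting point.
        int cores = Runtime.getRuntime().availableProcessors();
        Properties p = build(100, 5, cores);

        // Render the settings in key=value form, as a parameters file would hold them.
        StringWriter out = new StringWriter();
        p.store(out, "Hypothetical training parameters");
        System.out.println(out);
    }
}
```

The defaults passed here (100 iterations, cut-off 5) match the settings mentioned in the original question; lowering the iteration count for exploratory runs is one way to shorten each experiment.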
On Tue, Nov 18, 2014 at 5:46 PM, nikhil jain
<[email protected]> wrote:
> Hi,
> I asked the question below yesterday. Did anyone get a chance to look at it?
> I am new to OpenNLP and really need some help. Please provide a clue, a
> link, or an example.
> Thanks,
> Nikhil
> From: nikhil jain <[email protected]>
> To: "[email protected]" <[email protected]>; Dev at Opennlp
> Apache <[email protected]>
> Sent: Tuesday, November 18, 2014 12:02 AM
> Subject: Need to speed up the model creation process of OpenNLP
>
> Hi,
> I am using the OpenNLP Token Name Finder to parse unstructured data. I
> have created a corpus of about 4 million records. When I create a model
> from the training set using the OpenNLP API in Eclipse with the default
> settings (cut-off 5 and 100 iterations), the process takes a long time,
> around 2-3 hours.
> Can someone suggest how I can reduce the training time? I want to experiment
> with different iteration counts, but because model creation takes so long,
> I am not able to do so.
> Please provide some feedback.
> Thanks in advance.
> Nikhil Jain
>
>