Hi Rodrigo,
I tried to call the train method without resources, but I got some errors.
I could not find any train method that does not take a resources argument.
I found these train methods in the NameFinderME class:
1. train(String languageCode, String type, ObjectStream<NameSample> samples,
   TrainingParameters trainParams, byte[] featureGeneratorBytes,
   Map<String,Object> resources)
2. train(String languageCode, String type, ObjectStream<NameSample> samples,
   TrainingParameters trainParams, AdaptiveFeatureGenerator generator,
   Map<String,Object> resources)
3. train(String languageCode, String type, ObjectStream<NameSample> samples,
   Map<String,Object> resources)
4. train(String languageCode, String type, ObjectStream<NameSample> samples,
   AdaptiveFeatureGenerator generator, Map<String,Object> resources,
   int iterations, int cutoff)
Am I missing something? Could you please tell me how I can do this?
Thanks
Nikhil
From: Rodrigo Agerri <[email protected]>
To: nikhil jain <[email protected]>
Sent: Friday, November 21, 2014 12:12 AM
Subject: Re: Need to speed up the model creation process of OpenNLP
Hi Nikhil,
It looks good, but you do not seem to need the resources. Why not use the
train method without the resources?
Also, do you have 50 threads?
Rodrigo
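
On the thread count: a value of 50 is unlikely to help beyond the number of
cores the machine actually has. Below is a minimal sketch, using only the JDK,
of capping the requested count at what the JVM reports. The class name
ThreadParamSketch and both helper methods are hypothetical, and the key
strings other than "Threads" are an assumption about what the
TrainingParameters constants in the quoted code resolve to.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of OpenNLP.
class ThreadParamSketch {

    // Caps a requested thread count at the number of cores, never below 1.
    static int effectiveThreads(int requested, int availableCores) {
        if (requested < 1) {
            return 1;
        }
        return Math.min(requested, Math.max(1, availableCores));
    }

    // Collects the same settings as the quoted code, with the capped count.
    // Assumption: "Algorithm", "Iterations" and "Cutoff" are the key strings
    // behind the TrainingParameters constants; check them before relying on this.
    static Map<String, String> buildParams(int requestedThreads, int availableCores) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("Algorithm", "MAXENT");
        params.put("Iterations", "100");
        params.put("Cutoff", "5");
        params.put("Threads",
                Integer.toString(effectiveThreads(requestedThreads, availableCores)));
        return params;
    }

    public static void main(String[] args) {
        // Number of logical processors visible to this JVM.
        int cores = Runtime.getRuntime().availableProcessors();
        for (Map.Entry<String, String> e : buildParams(50, cores).entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

Each entry could then be copied into the TrainingParameters object with
tp.put(key, value), as in the quoted code.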
On Thu, Nov 20, 2014 at 5:57 PM, nikhil jain <[email protected]> wrote:
> Thanks for the feedback Rodrigo.
> Yes, I am trying to create a model based on maximum entropy. Since I am
> using the API to build the model, I tried adding a thread param to the
> TrainingParameters object, but I am not sure whether I am adding the param
> correctly. I have not found any clue in the documentation either.
>
> Here is my code, developed with the help of the OpenNLP documentation. Is
> this the correct way to create a maxent model using multiple threads?
>
> TrainingParameters tp = new TrainingParameters();
> tp.put(TrainingParameters.ALGORITHM_PARAM, "MAXENT");
> tp.put(TrainingParameters.ITERATIONS_PARAM, Integer.toString(100));
> tp.put(TrainingParameters.CUTOFF_PARAM, Integer.toString(5));
> tp.put("Threads", "50");
>
> Map<String, Object> resources = new HashMap<String, Object>();
> model = NameFinderME.train( "en", "sample", sampleStream, tp, generator,
> resources);
> Thanks
> Nikhil
>
>
> ________________________________
> From: Rodrigo Agerri <[email protected]>
> To: nikhil jain <[email protected]>
> Sent: Thursday, November 20, 2014 11:35 AM
>
> Subject: Re: Need to speed up the model creation process of OpenNLP
>
> Hi Nikhil,
> The maxent trainer already allows multi-threaded training. If you are using
> the CLI, specify Threads in your training params file. Check the sample
> parameters file distributed with OpenNLP.
> If you are using it via the API, perhaps the easiest way is to create a
> TrainingParameters object with the Threads param specified.
> HTH
> R
>
>
> On 19 Nov 2014 21:19, "nikhil jain" <[email protected]> wrote:
>
> Hi Rodrigo,
>
> No, I am not using multi-threading. It is a simple Java program written
> with help from the OpenNLP documentation. It is worth mentioning that,
> because the corpus contains 4 million records, my Java program running in
> Eclipse frequently hit Java heap space (out of memory) errors. I
> investigated a bit and found that the process needed around 10 GB of memory
> to build the model, so I increased the heap to 10 GB using the -Xmx
> parameter. It then ran properly but took 3 hours.
>
> Thanks
> -Nikhil
>
> ________________________________
> From: Rodrigo Agerri <[email protected]>
> To: "[email protected]" <[email protected]>; nikhil jain
> <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Sent: Wednesday, November 19, 2014 2:17 AM
> Subject: Re: Need to speed up the model creation process of OpenNLP
>
> Hi,
>
> Are you using multithreading? Lots of threads, lots of RAM?
>
> R
>
>
>
>
> On Tue, Nov 18, 2014 at 5:46 PM, nikhil jain
> <[email protected]> wrote:
>> Hi,
>> I asked the question below yesterday. Did anyone get a chance to look at
>> it? I am new to OpenNLP and really need some help. Please provide a clue,
>> link, or example.
>> Thanks
>> Nikhil
>> From: nikhil jain <[email protected]>
>> To: "[email protected]" <[email protected]>; Dev at Opennlp
>> Apache <[email protected]>
>> Sent: Tuesday, November 18, 2014 12:02 AM
>> Subject: Need to speed up the model creation process of OpenNLP
>>
>> Hi,
>> I am using the OpenNLP Token Name Finder to parse unstructured data. I
>> have created a corpus of about 4 million records. When I create a model
>> from the training set using the OpenNLP API in Eclipse with the default
>> settings (cutoff 5 and 100 iterations), the process takes around 2-3
>> hours.
>> Can someone suggest how I can reduce the time? I want to experiment with
>> different iteration counts, but because model creation takes so long, I
>> am not able to.
>> Please provide some feedback.
>> Thanks in advance.
>> Nikhil Jain