Hi Jorn,
 
Thanks for replying. I changed the encoding of the file to ANSI, but now I get 
a different error:
-----------------------------------------------------------------------------------------------------
C:\OpenNLP\apache-opennlp-1.5.1-incubating-bin\apache-opennlp-1.5.1-incubating>java -jar lib\opennlp-tools-*.jar TokenNameFinderTrainer -encoding UTF-8 -lang en -data data1.txt -model maha.bin
Indexing events using cutoff of 5
        Computing event counts...  java.io.IOException: Found unexpected annotation <END>.
Incorporating indexed data for training...
Exception in thread "main" java.lang.NullPointerException
        at opennlp.maxent.GISTrainer.trainModel(GISTrainer.java:272)
        at opennlp.maxent.GISTrainer.trainModel(GISTrainer.java:252)
        at opennlp.maxent.GIS.trainModel(GIS.java:228)
        at opennlp.maxent.GIS.trainModel(GIS.java:179)
        at opennlp.tools.namefind.NameFinderME.train(NameFinderME.java:345)
        at opennlp.tools.namefind.NameFinderME.train(NameFinderME.java:356)
        at opennlp.tools.cmdline.namefind.TokenNameFinderTrainerTool.run(TokenNameFinderTrainerTool.java:87)
        at opennlp.tools.cmdline.CLI.main(CLI.java:183)
--------------------------------------------------------------------------------------------------------
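
The command above still passes -encoding UTF-8 although the file was re-saved as ANSI, so it may be worth confirming that the file really decodes as UTF-8. A minimal Python sketch for that check follows; the helper name first_utf8_error is hypothetical and the file names are whatever is passed on the command line.

```python
import sys
from typing import Optional


def first_utf8_error(data: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte that could not be decoded
        return err.start
    return None


if __name__ == "__main__":
    # Usage (file name is an example): python check_utf8.py data1.txt
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            offset = first_utf8_error(handle.read())
        if offset is None:
            print(path, "decodes as UTF-8")
        else:
            print(path, "fails to decode at byte", offset)
```

A file that contains only ASCII characters passes this check whether it was saved as ANSI or as UTF-8.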
 
I am sure there is no <END> annotation directly followed by a period in my 
file; there is always a space between <END> and the period.
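
One way to check the tags line by line is sketched below in Python. It rests on an assumption not confirmed in this thread: that the trainer splits each line on whitespace and recognizes <START>, <START:type> and <END> only when they stand alone as tokens. The function name check_line and the problem messages are hypothetical.

```python
import re
from typing import List

# Assumed tag shapes: <START>, <START:type> and <END> as standalone tokens
START_TAG = re.compile(r"<START(:[^:>\s]*)?>")
END_TAG = "<END>"


def check_line(line: str) -> List[str]:
    """Return a list of problems found in one line of training data."""
    problems = []
    inside_name = False
    for token in line.split():
        if START_TAG.fullmatch(token):
            if inside_name:
                problems.append("<START> inside an open name")
            inside_name = True
        elif token == END_TAG:
            if not inside_name:
                problems.append("<END> without a preceding <START>")
            inside_name = False
        elif "<START" in token or END_TAG in token:
            # The tag is attached to a neighbouring word, e.g. Dr<START>
            problems.append(f"tag glued to other text: {token!r}")
    if inside_name:
        problems.append("<START> never closed by <END>")
    return problems
```

Under that assumption, a line such as "Dr <START> Richard <END> ( p / t )" yields no problems, while the lines in the sample quoted below, where the tags touch the neighbouring words (Professor<START>, Michael<END>), would be flagged.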
 
 

> Date: Tue, 21 Jun 2011 19:02:14 +0200
> From: kottm...@gmail.com
> To: opennlp-users@incubator.apache.org
> Subject: Re: What is the problem with the training filr
> 
> Hi,
> 
> there is an issue with the encoding of your trainingFile.txt, for some 
> reason it cannot be decoded
> using UTF-8. Try to open it in a text editor with UTF-8 and you will get 
> an error too.
> 
> Hope that helps,
> Jörn
> 
> On 6/21/11 6:59 PM, Amal Elmah wrote:
> > When I used command line training tool on my data (training.txt) it gives 
> > error as follows:
> > ------------------------------------------------------------------------------------------------------------------------
> > C:\OpenNLP\apache-opennlp-1.5.1-incubating-bin\apache-opennlp-1.5.1-incubating>java -jar lib\opennlp-tools-*.jar TokenNameFinderTrainer -encoding UTF-8 -lang en -data trainingFile.txt -model mymodel.bin
> > Indexing events using cutoff of 5
> > Computing event counts... java.nio.charset.MalformedInputException: Input length = 1
> > Incorporating indexed data for training...
> > Exception in thread "main" java.lang.NullPointerException
> > at opennlp.maxent.GISTrainer.trainModel(GISTrainer.java:272)
> > at opennlp.maxent.GISTrainer.trainModel(GISTrainer.java:252)
> > at opennlp.maxent.GIS.trainModel(GIS.java:228)
> > at opennlp.maxent.GIS.trainModel(GIS.java:179)
> > at opennlp.tools.namefind.NameFinderME.train(NameFinderME.java:345)
> > at opennlp.tools.namefind.NameFinderME.train(NameFinderME.java:356)
> > at opennlp.tools.cmdline.namefind.TokenNameFinderTrainerTool.run(TokenNameFinderTrainerTool.java:87)
> > at opennlp.tools.cmdline.CLI.main(CLI.java:183)
> > ---------------------------------------------------------------------------
> > I do not know what is the problem and this is part of my data in the text 
> > file
> >
> > Professor<START> Michael<END>
> > Professor<START> Naci<END>
> > Dr<START> Richard<END> ( p / t )
> > Dr<START> David<END>
> > Professor<START> Vic<END>
> > Dr<START> Adrian<END>
> > Dr<START> Martin<END>
> > Dr<START> Timothy<END>
> > Dr<START> Ian<END>
> > Dr<START> Ali<END>
> > -----------------------------------------------------------------------------------------------------------------------
> > 
> 
                                          
