Sorry for this bug. We have a Jira issue for it, but no one ever took the
time to fix it.
Well, instead of the stack trace you should see an error message which
tells you that you don't have enough training data.
You should try with a few hundred examples at least, otherwise
the model you produce will not really work.
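
To illustrate why two examples are not enough: the log reports "cutoff of 5"
and then "Dropped event" for both samples. Below is a minimal sketch of how a
feature cutoff can empty out a tiny data set. It is a simplified illustration,
not the actual OpenNLP indexer code; the class and method names are
hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified illustration of a feature cutoff; NOT OpenNLP's actual indexer.
class CutoffSketch {

    /** Returns the events that still have at least one feature after the cutoff. */
    static List<List<String>> applyCutoff(List<List<String>> events, int cutoff) {
        // Count how often each feature occurs across all events.
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> event : events) {
            for (String feature : event) {
                counts.merge(feature, 1, Integer::sum);
            }
        }
        // Keep only features seen at least `cutoff` times; drop empty events.
        List<List<String>> kept = new ArrayList<>();
        for (List<String> event : events) {
            List<String> surviving = new ArrayList<>();
            for (String feature : event) {
                if (counts.get(feature) >= cutoff) {
                    surviving.add(feature);
                }
            }
            if (!surviving.isEmpty()) {
                kept.add(surviving);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Two events, abbreviated from the features shown in the log.
        List<List<String>> events = Arrays.asList(
            Arrays.asList("bow=Major", "bow=acquisitions", "bow=gross", "bow=margin"),
            Arrays.asList("bow=The", "bow=upward", "bow=gross", "bow=margin"));
        System.out.println(applyCutoff(events, 5).size()); // prints 0
        System.out.println(applyCutoff(events, 1).size()); // prints 2
    }
}
```

With only two events no feature can occur five times, so nothing is left to
train on. With a few hundred examples, common features pass the cutoff.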
Jörn
On 03/30/2012 03:36 PM, Adriano Santos wrote:
Hi Jörn, thanks for helping me.
I changed the classpath and the OpenNLP version. I ran the sample again and
it returned this error:
C:\apache-opennlp-1.5.2\bin>opennlp DoccatTrainer -encoding UTF-8 -lang en -data en-doccat.train -model en-doccat.bin
Indexing events using cutoff of 5
Computing event counts... done. 2 events
Indexing... Dropped event GMDecrease:[bow=Major, bow=acquisitions, bow=that, bow=have, bow=a, bow=lower, bow=gross, bow=margin, bow=than, bow=the, bow=existing, bow=network, bow=also]
Dropped event GMIncrease:[bow=The, bow=upward, bow=movement, bow=of, bow=gross, bow=margin, bow=resulted, bow=from, bow=amounts, bow=pursuant, bow=to, bow=adjustments]
done.
Sorting and merging events... Done indexing.
Incorporating indexed data for training...
Exception in thread "main" java.lang.NullPointerException
at opennlp.maxent.GISTrainer.trainModel(GISTrainer.java:263)
at opennlp.maxent.GIS.trainModel(GIS.java:256)
at opennlp.model.TrainUtil.train(TrainUtil.java:182)
at opennlp.tools.doccat.DocumentCategorizerME.train(DocumentCategorizerME.java:154)
at opennlp.tools.doccat.DocumentCategorizerME.train(DocumentCategorizerME.java:176)
at opennlp.tools.doccat.DocumentCategorizerME.train(DocumentCategorizerME.java:192)
at opennlp.tools.cmdline.doccat.DoccatTrainerTool.run(DoccatTrainerTool.java:91)
at opennlp.tools.cmdline.CLI.main(CLI.java:191)
On Fri, Mar 30, 2012 at 10:18 AM, Jörn Kottmann<[email protected]> wrote:
Looks like you do not have the maxent jar on the classpath.
Maybe it is just an issue with our script (does that work with head?).
Anyway, try to go to this dir:
C:\Program Files\Apache Software Foundation\opennlp-tools-1.5.0
and type: bin/opennlp
Or does it not work because of the whitespace in Program Files?
I suggest that you try 1.5.2, if I remember it correctly we spent some
time on this script to fix it.
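
To check whether the missing class is visible at all, a small stand-alone
probe can help. This is only a sketch: the class name ClasspathCheck is made
up for illustration, and opennlp.model.EventStream is simply the class named
in the error message. Run it with the same classpath that the opennlp script
uses.

```java
// Hypothetical helper, not part of OpenNLP: probes the classpath for a class.
class ClasspathCheck {

    /** Returns true if the named class can be located by the current class loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the NoClassDefFoundError in the report.
        String name = "opennlp.model.EventStream";
        if (isOnClasspath(name)) {
            System.out.println(name + " found");
        } else {
            System.out.println(name + " NOT found; check that the maxent jar is on the classpath");
        }
    }
}
```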
Jörn
On 03/30/2012 03:14 PM, Adriano Santos wrote:
Hi, people.
So... I ran the example and it returned this error:
C:\Program Files\Apache Software Foundation\opennlp-tools-1.5.0\bin>opennlp DoccatTrainer -encoding UTF-8 -lang en -data en-doccat.train -model en-doccat.bin
Exception in thread "main" java.lang.NoClassDefFoundError: opennlp/model/EventStream
at opennlp.tools.cmdline.CLI.<clinit>(CLI.java:107)
Caused by: java.lang.ClassNotFoundException: opennlp.model.EventStream
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
... 1 more
I'm using opennlp-tools version 1.5.0.
Thanks for everything.
On Tue, Mar 27, 2012 at 8:40 PM, [email protected]<
[email protected]> wrote:
Hi, Adriano,
We don't have any ready-to-use model for the Document Categorizer yet. You
should try training your own using the instructions.
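
Before training, it can help to count how many documents you have per
category. The sketch below assumes the doccat training format of one document
per line, with the category first, then whitespace, then the text. The class
name DoccatDataCheck is hypothetical, and the two sample lines are
reconstructed from the bow= features in Adriano's log, so their exact original
wording is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical pre-flight check; not part of OpenNLP.
class DoccatDataCheck {

    /** Counts documents per category, assuming "category<whitespace>text" per line. */
    static Map<String, Integer> countPerCategory(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            // The first whitespace-separated token is taken as the category.
            String category = trimmed.split("\\s+", 2)[0];
            counts.merge(category, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "GMDecrease Major acquisitions that have a lower gross margin than the existing network also",
            "GMIncrease The upward movement of gross margin resulted from amounts pursuant to adjustments");
        System.out.println(countPerCategory(lines)); // prints {GMDecrease=1, GMIncrease=1}
    }
}
```

One document per category, as in this sample, is far too little data to train
a usable model.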
Regards,
William
On Tue, Mar 27, 2012 at 5:31 PM, Adriano Santos<[email protected]>
wrote:
To perform classification I need a maxent model, but I don't have an
example of one. The other tasks (Name Finder, Tokenizer, Sentence
Detector...) have examples. I'm a beginner with OpenNLP and I'd like to run
all the existing examples.
Can you help me?
On Tue, Mar 27, 2012 at 5:17 PM, Jörn Kottmann<[email protected]>
wrote:
On 03/27/2012 10:04 PM, Adriano Santos wrote:
I'm trying to use Document Categorizer - Classifying, but I could not run
the example.
What problem do you have? Do you get an exception?
Jörn
--
Adriano Araújo Santos
Professor at Escola Superior de Aviação Civil - ESAC
Professor in the Information Systems program - FACISA
Professor in the Department of Computing, UEPB
PMI Membership
Master's student in Computer Science, UFCG
Postgraduate student in Business Project Management - MBA
MSP Lead - Microsoft Student Partner
Leader of the .NUG user group
Twitter: @Adriano_Santos
Site: https://sites.google.com/site/adrianosantospb