Has anyone else had java.lang.AssertionErrors or any other kind of
stability problem with BerkeleyAligner?
Thanks,
Tom
-------- Original Message --------
SUBJECT: BerkeleyAligner AssertionError
DATE: Sun, 19 Feb 2012 13:01:06 +0700
FROM: Tom Hoar
TO: Moses support
I posted this on the Google Code issue tracker for BerkeleyAligner, but
it's pretty inactive, so I thought I'd try here, too.

I'm using the unsupervised version 2.1 available from the repository to
create word alignments. This is a small corpus of 40,000 phrase pairs for
testing and development. The machine has a 6-core AMD Opteron, 16 GB RAM
and 1 TB of available hard drive space. Java/OS version as follows:
user@moses0:~$ java -version
java version "1.6.0_20"
OpenJDK Runtime Environment (IcedTea6 1.9.10) (6b20-1.9.10-0ubuntu1~10.04.3)
OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)
I run the same command on the same corpus multiple times. Most times, the
command completes training successfully. Sometimes it fails with an
AssertionError, at a different location each time, normally in the first
or second iteration of Model 1. I list the command line followed by the
errors. I have also tried reducing numThreads to 5, but it still throws
an error.

I suspect environment problems more than corpus problems. Any suggestions?
Thanks,
Tom
/usr/bin/java -server
-Xms1024m
-Xmx2048m
-Xss768k
-ea
-jar /usr/local/bin/berkeleyaligner.jar
-EMWordAligner.numThreads 6
-Data.trainSources /opt/library/BUILDS/tm/demo_tm/bitext.list
-Data.foreignSuffix nl
-Data.englishSuffix en
-Data.testSources
-exec.execDir /opt/library/TRAININGS/alignments/align-demo_tm-en-nl/berk.classes
-exec.create True
-Evaluator.writeGIZA True
-Main.SaveParams True
-Main.alignTraining True
-Main.forwardModels MODEL1 HMM
-Main.reverseModels MODEL1 HMM
-Main.iters 5 5
-Main.mode JOINT JOINT
The Error:
main() {
Execution directory: /opt/library/TRAININGS/alignments/align-demo_tm-en-nl/berk.classes
Preparing Training Data [2.3s, cum. 2.4s]
41410 training, 0 test
Training models: 2 stages {
Training stage 1: MODEL1 and MODEL1 jointly for 5 iterations {
Initializing forward model [9.1s, cum. 9.1s]
Initializing reverse model [7.9s, cum. 17s]
Joint Train: 41410 sentences, jointly {
Iteration 1/5 {
Sentence 2/41410
Sentence 1/41410
Sentence 5/41410
Sentence 13/41410
WARNING: Translation model update concurrency error
Sentence 54/41410
WARNING: Translation model update concurrency error
Sentence 207/41410
WARNING: Translation model update concurrency error
WARNING: Translation model update concurrency error
ERROR: java.lang.AssertionError:
fig.basic.StringDoubleMap.find(StringDoubleMap.java:397)
fig.basic.StringDoubleMap.incr(StringDoubleMap.java:78)
fig.basic.String2DoubleMap.incr(String2DoubleMap.java:51)
edu.berkeley.nlp.wordAlignment.SentencePairState.updateTransProbs(SentencePairState.java:79)
edu.berkeley.nlp.wordAlignment.distortion.Model1or2SentencePairState.updateNewParams(Model1or2SentencePairState.java:91)
edu.berkeley.nlp.wordAlignment.EMWordAligner.run(EMWordAligner.java:231)
edu.berkeley.nlp.concurrent.WorkQueue.run(WorkQueue.java:70)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
java.lang.Thread.run(Thread.java:636)
1 errors, 4 warnings
... 585 lines omitted ...
}
Here's another error for the same corpus:
main() {
Execution directory: /opt/library/TRAININGS/alignments/align-demo_tm-en-nl/berk.classes
Preparing Training Data [2.3s, cum. 2.3s]
41410 training, 0 test
Training models: 2 stages {
Training stage 1: MODEL1 and MODEL1 jointly for 5 iterations {
Initializing forward model [9.2s, cum. 9.2s]
Initializing reverse model [8.0s, cum. 17s]
Joint Train: 41410 sentences, jointly {
Iteration 1/5 {
Sentence 2/41410
Sentence 1/41410
Sentence 4/41410
Sentence 15/41410
Sentence 67/41410
WARNING: Translation model update concurrency error
Sentence 279/41410
Sentence 911/41410
WARNING: Translation model update concurrency error
Sentence 2218/41410
Sentence 3908/41410
Sentence 5776/41410
Sentence 7744/41410
Sentence 9737/41410
Sentence 11746/41410
Sentence 13767/41410
Sentence 15780/41410
Sentence 17802/41410
Sentence 19841/41410
Sentence 21912/41410
Sentence 24000/41410
Sentence 26120/41410
Sentence 28239/41410
Sentence 30359/41410
Sentence 32490/41410
Sentence 34634/41410
Sentence 36776/41410
Sentence 38928/41410
... 40883 lines omitted ...
} [19s, cum. 19s]
Iteration 2/5 {
Sentence 1/41410
Sentence 5/41410
Sentence 4/41410
ERROR: java.lang.AssertionError:
fig.basic.StringDoubleMap.put(StringDoubleMap.java:72)
fig.basic.StringDoubleMap.switchMapType(StringDoubleMap.java:309)
fig.basic.StringDoubleMap.find(StringDoubleMap.java:386)
fig.basic.StringDoubleMap.incr(StringDoubleMap.java:78)
fig.basic.String2DoubleMap.incr(String2DoubleMap.java:51)
edu.berkeley.nlp.wordAlignment.SentencePairState.updateTransProbs(SentencePairState.java:79)
edu.berkeley.nlp.wordAlignment.distortion.Model1or2SentencePairState.updateNewParams(Model1or2SentencePairState.java:91)
edu.berkeley.nlp.wordAlignment.EMWordAligner$1.run(EMWordAligner.java:232)
edu.berkeley.nlp.concurrent.WorkQueue$1.run(WorkQueue.java:70)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
java.lang.Thread.run(Thread.java:636)
... 25 lines omitted ...
}
Sentence 28/41410
Sentence 29/41410
Sentence 30/41410
Sentence 31/41410
Sentence 32/41410
Sentence 33/41410
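The "Translation model update concurrency error" warnings and the
worker-thread frames in both traces are consistent with several threads
updating one shared count map without coordination. That is an assumption;
it has not been checked against the BerkeleyAligner source. Below is a
minimal Java sketch of that kind of update made safe with a lock. It is not
BerkeleyAligner code: the class SharedCountsSketch, its methods and the key
"huis|house" are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a shared count table; NOT BerkeleyAligner code.
public class SharedCountsSketch {
    private final Map<String, Double> counts = new HashMap<String, Double>();
    private final Object lock = new Object();

    // Add delta to the count for key. The lock makes the
    // read-modify-write sequence atomic across threads.
    public void incr(String key, double delta) {
        synchronized (lock) {
            Double old = counts.get(key);
            counts.put(key, old == null ? delta : old + delta);
        }
    }

    // Return the current count for key, or 0.0 if the key is absent.
    public double get(String key) {
        synchronized (lock) {
            Double v = counts.get(key);
            return v == null ? 0.0 : v;
        }
    }

    // Start numThreads workers; each adds 1.0 to the same key
    // updatesPerThread times. Returns the final count for that key.
    public static double runWorkers(int numThreads, final int updatesPerThread) {
        final SharedCountsSketch table = new SharedCountsSketch();
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        for (int t = 0; t < numThreads; t++) {
            pool.execute(new Runnable() {
                public void run() {
                    for (int i = 0; i < updatesPerThread; i++) {
                        table.incr("huis|house", 1.0);
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return table.get("huis|house");
    }

    public static void main(String[] args) {
        // With the lock in place no increment is lost: 6 * 100000 updates.
        System.out.println("total = " + runWorkers(6, 100000));
    }
}
```

If the cause really is a race of this kind, rerunning the aligner with
-EMWordAligner.numThreads 1 should make the failures go away, which would
be a cheap way to test the assumption.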