Hi Tom,
I also encountered problems using BerkeleyAligner and reported them on
their website. As you can see, it seems to have been abandoned for a few
years now. I personally wouldn't use it again until someone starts
picking up their phone.
What's the motivation for using BerkeleyAligner, as opposed to, say,
GIZA++/MGIZA?
On 26/02/2012 12:07, Tom Hoar wrote:
Has anyone else had java.lang.AssertionErrors or any other kind of
stability problems with BerkeleyAligner?
Thanks,
Tom
-------- Original Message --------
Subject: BerkeleyAligner AssertionError
Date: Sun, 19 Feb 2012 13:01:06 +0700
From: Tom Hoar <[email protected]>
To: Moses support <[email protected]>
I posted this on GoogleCode Issues for BerkeleyAligner, but it's
pretty inactive. So, I thought I'd try here, too.
I'm using the unsupervised version 2.1 available from the repository to create
word alignments. This is a small 40,000 phrase pair corpus for testing and
development. The machine has a 6-core AMD Opteron, 16 GB of RAM and 1 TB of
available hard drive space. The Java/OS versions are as follows:
user@moses0:~$ java -version
java version "1.6.0_20"
OpenJDK Runtime Environment (IcedTea6 1.9.10) (6b20-1.9.10-0ubuntu1~10.04.3)
OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)
I run the same command on the same corpus multiple times. Most times, this
command completes training successfully. Sometimes it fails with an
AssertionError whose location varies, normally in the first or second
iteration of model 1. The command line and the errors follow. I have
also tried reducing numThreads to 5, but it still throws an error.
I suspect environment problems more than corpus problems. Any suggestions?
Thanks, Tom
/usr/bin/java -server \
-Xms1024m \
-Xmx2048m \
-Xss768k \
-ea \
-jar /usr/local/bin/berkeleyaligner.jar \
-EMWordAligner.numThreads 6 \
-Data.trainSources /opt/library/BUILDS/tm/demo_tm/bitext.list \
-Data.foreignSuffix nl \
-Data.englishSuffix en \
-Data.testSources \
-exec.execDir /opt/library/TRAININGS/alignments/align-demo_tm-en-nl/berk.classes \
-exec.create True \
-Evaluator.writeGIZA True \
-Main.SaveParams True \
-Main.alignTraining True \
-Main.forwardModels MODEL1 HMM \
-Main.reverseModels MODEL1 HMM \
-Main.iters 5 5 \
-Main.mode JOINT JOINT
The Error:
main() {
Execution directory:
/opt/library/TRAININGS/alignments/align-demo_tm-en-nl/berk.classes
Preparing Training Data [2.3s, cum. 2.4s]
41410 training, 0 test
Training models: 2 stages {
Training stage 1: MODEL1 and MODEL1 jointly for 5 iterations {
Initializing forward model [9.1s, cum. 9.1s]
Initializing reverse model [7.9s, cum. 17s]
Joint Train: 41410 sentences, jointly {
Iteration 1/5 {
Sentence 2/41410
Sentence 1/41410
Sentence 5/41410
Sentence 13/41410
WARNING: Translation model update concurrency error
Sentence 54/41410
WARNING: Translation model update concurrency error
Sentence 207/41410
WARNING: Translation model update concurrency error
WARNING: Translation model update concurrency error
ERROR: java.lang.AssertionError:
fig.basic.StringDoubleMap.find(StringDoubleMap.java:397)
fig.basic.StringDoubleMap.incr(StringDoubleMap.java:78)
fig.basic.String2DoubleMap.incr(String2DoubleMap.java:51)
edu.berkeley.nlp.wordAlignment.SentencePairState.updateTransProbs(SentencePairState.java:79)
edu.berkeley.nlp.wordAlignment.distortion.Model1or2SentencePairState.updateNewParams(Model1or2SentencePairState.java:91)
edu.berkeley.nlp.wordAlignment.EMWordAligner$1.run(EMWordAligner.java:231)
edu.berkeley.nlp.concurrent.WorkQueue$1.run(WorkQueue.java:70)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
java.lang.Thread.run(Thread.java:636)
1 errors, 4 warnings
... 585 lines omitted ...
}
Here's another error for the same corpus:
main() {
Execution directory:
/opt/library/TRAININGS/alignments/align-demo_tm-en-nl/berk.classes
Preparing Training Data [2.3s, cum. 2.3s]
41410 training, 0 test
Training models: 2 stages {
Training stage 1: MODEL1 and MODEL1 jointly for 5 iterations {
Initializing forward model [9.2s, cum. 9.2s]
Initializing reverse model [8.0s, cum. 17s]
Joint Train: 41410 sentences, jointly {
Iteration 1/5 {
Sentence 2/41410
Sentence 1/41410
Sentence 4/41410
Sentence 15/41410
Sentence 67/41410
WARNING: Translation model update concurrency error
Sentence 279/41410
Sentence 911/41410
WARNING: Translation model update concurrency error
Sentence 2218/41410
Sentence 3908/41410
Sentence 5776/41410
Sentence 7744/41410
Sentence 9737/41410
Sentence 11746/41410
Sentence 13767/41410
Sentence 15780/41410
Sentence 17802/41410
Sentence 19841/41410
Sentence 21912/41410
Sentence 24000/41410
Sentence 26120/41410
Sentence 28239/41410
Sentence 30359/41410
Sentence 32490/41410
Sentence 34634/41410
Sentence 36776/41410
Sentence 38928/41410
... 40883 lines omitted ...
} [19s, cum. 19s]
Iteration 2/5 {
Sentence 1/41410
Sentence 5/41410
Sentence 4/41410
ERROR: java.lang.AssertionError:
fig.basic.StringDoubleMap.put(StringDoubleMap.java:72)
fig.basic.StringDoubleMap.switchMapType(StringDoubleMap.java:309)
fig.basic.StringDoubleMap.find(StringDoubleMap.java:386)
fig.basic.StringDoubleMap.incr(StringDoubleMap.java:78)
fig.basic.String2DoubleMap.incr(String2DoubleMap.java:51)
edu.berkeley.nlp.wordAlignment.SentencePairState.updateTransProbs(SentencePairState.java:79)
edu.berkeley.nlp.wordAlignment.distortion.Model1or2SentencePairState.updateNewParams(Model1or2SentencePairState.java:91)
edu.berkeley.nlp.wordAlignment.EMWordAligner$1.run(EMWordAligner.java:232)
edu.berkeley.nlp.concurrent.WorkQueue$1.run(WorkQueue.java:70)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
java.lang.Thread.run(Thread.java:636)
... 25 lines omitted ...
}
Sentence 28/41410
Sentence 29/41410
Sentence 30/41410
Sentence 31/41410
Sentence 32/41410
Sentence 33/41410
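Both traces above pass through fig.basic.StringDoubleMap.incr, reached from
ThreadPoolExecutor worker threads, and both runs log "Translation model update
concurrency error" warnings first. That pattern would be consistent with
several threads updating one shared map without locking, though nothing in the
log confirms it. As an illustration only, here is a minimal Java sketch of
guarding a shared count table with synchronized. SharedCounts, SyncIncrSketch
and the key "huis|house" are made-up names for this example, not
BerkeleyAligner classes or data. It uses anonymous Runnables so that it also
compiles on Java 6.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a shared count table; NOT BerkeleyAligner code.
class SharedCounts {
    private final Map<String, Double> counts = new HashMap<String, Double>();

    // synchronized: only one thread at a time may read-modify-write the map.
    synchronized void incr(String key, double delta) {
        Double old = counts.get(key);
        counts.put(key, (old == null ? 0.0 : old) + delta);
    }

    synchronized double get(String key) {
        Double value = counts.get(key);
        return value == null ? 0.0 : value;
    }
}

class SyncIncrSketch {
    // Starts `threads` workers; each adds 1.0 to the same key `perThread` times.
    // Returns the final count stored under that key.
    static double run(int threads, final int perThread) {
        final SharedCounts table = new SharedCounts();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(new Runnable() {
                public void run() {
                    for (int i = 0; i < perThread; i++) {
                        table.incr("huis|house", 1.0);
                    }
                }
            });
        }
        pool.shutdown();
        try {
            // Wait for all workers to finish before reading the result.
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return table.get("huis|house");
    }

    public static void main(String[] args) {
        // 6 threads, matching the numThreads value in the command above.
        System.out.println(run(6, 10000));
    }
}
```

With the synchronized keyword in place, every increment is applied and the
final count equals threads * perThread. Without it, a plain HashMap makes no
such guarantee under concurrent writes. If a shared structure like this is the
cause here, rerunning the command above with -EMWordAligner.numThreads 1 would
be one way to test the hypothesis.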
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support