Okay, I've been able to set everything up, compute the LM, and start
the training.
To "clean the pipes", I'm doing a run with little data (I started from
only 100K sentences, which were trimmed to about 74K once the long ones
were removed).
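
For reference, the trimming step amounts to roughly the following. This is only a minimal sketch of the idea, not the actual cleaning script: the function name `filter_long_pairs` and the 40-token limit are assumptions made for illustration.

```python
def filter_long_pairs(src_lines, tgt_lines, max_len=40):
    """Keep only sentence pairs where both sides have 1..max_len tokens.

    max_len=40 is an assumed limit, chosen for illustration only.
    """
    kept = []
    for src, tgt in zip(src_lines, tgt_lines):
        n_src = len(src.split())  # whitespace-tokenized length, source side
        n_tgt = len(tgt.split())  # whitespace-tokenized length, target side
        # Drop the pair if either side is empty or too long.
        if 0 < n_src <= max_len and 0 < n_tgt <= max_len:
            kept.append((src.strip(), tgt.strip()))
    return kept


if __name__ == "__main__":
    fr = ["le chat dort .", "un " * 50 + "fin ."]
    en = ["the cat sleeps .", "a short sentence ."]
    pairs = filter_long_pairs(fr, en, max_len=40)
    print(len(pairs), "of", len(fr), "pairs kept")
```

A pair is dropped as a whole as soon as one side is too long or empty, so the two sides of the corpus stay line-aligned.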
The training fails in step 2.1b, GIZA:
(2.1b) running giza fr-en @ Fri Feb 22 17:02:11 2008
/cygdrive/c/STATMT/bin//GIZA++ -CoocurrenceFile
working-dir/giza.fr-en/fr-en.cooc -c
working-dir/corpus/fr-en-int-train.snt -m1 5 -m2 0 -m3 3 -m4 3
-model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 1 -nsmooth 4 -o
working-dir/giza.fr-en/fr-en -onlyaldumps 1 -p0 0.999 -s
working-dir/corpus/en.vcb -t
working-dir/corpus/fr.vcb
/cygdrive/c/STATMT/bin//GIZA++
-CoocurrenceFile working-dir/giza.fr-en/fr-en.cooc -c
working-dir/corpus/fr-en-int-train.snt -m1 5 -m2 0 -m3 3 -m4 3
-model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 1 -nsmooth 4 -o
working-dir/giza.fr-en/fr-en -onlyaldumps 1 -p0 0.999 -s
working-dir/corpus/en.vcb -t working-dir/corpus/fr.vcb
Executing: /cygdrive/c/STATMT/bin//GIZA++
-CoocurrenceFile working-dir/giza.fr-en/fr-en.cooc -c
working-dir/corpus/fr-en-int-train.snt -m1 5 -m2 0 -m3 3 -m4 3
-model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 1 -nsmooth 4 -o
working-dir/giza.fr-en/fr-en -onlyaldumps 1 -p0 0.999 -s
working-dir/corpus/en.vcb -t working-dir/corpus/fr.vcb
13772 [main] GIZA++ 1604 _cygtls::handle_exceptions:
Error while dumping state
(probably corrupted stack)
Execution of: /cygdrive/c/STATMT/bin//GIZA++
-CoocurrenceFile working-dir/giza.fr-en/fr-en.cooc -c
working-dir/corpus/fr-en-int-train.snt -m1 5 -m2 0 -m3 3 -m4 3
-model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 1 -nsmooth 4 -o
working-dir/giza.fr-en/fr-en -onlyaldumps 1 -p0 0.999 -s
working-dir/corpus/en.vcb -t working-dir/corpus/fr.vcb died with signal
11, with coredump
Er, what now? :-/
--
Hubert Crépy