Hi,
I am trying to produce word alignments for individual sentences. For this
purpose I am using the "force align" functionality of mgiza++.
Unfortunately, when I load a large N table (fertility table), mgiza
crashes with a segmentation fault.
Specifically, I first ran mgiza on the full parallel training corpus
using the default settings of the Moses script:
/project/qtleap/software/moses-2.1.1/bin/training-tools/mgiza
-CoocurrenceFile
/local/tmp/elav01/selection-mechanism/systems/de-en/training/giza.1/en-de.cooc
-c
/local/tmp/elav01/selection-mechanism/systems/de-en/training/prepared.1/en-de-int-train.snt
-m1 5 -m2 0 -m3 3 -m4 3 -model1dumpfrequency 1 -model4smoothfactor 0.4 -ncpus
24 -nodumps 0 -nsmooth 4 -o
/local/tmp/elav01/selection-mechanism/systems/de-en/training/giza.1/en-de
-onlyaldumps 0 -p0 0.999 -s
/local/tmp/elav01/selection-mechanism/systems/de-en/training/prepared.1/de.vcb
-t
/local/tmp/elav01/selection-mechanism/systems/de-en/training/prepared.1/en.vcb
Afterwards, I executed the mgiza force-align script, which runs the
following command:
/project/qtleap/software/moses-2.1.1/mgizapp-code/mgizapp//bin/mgiza
giza.en-de/en-de.gizacfg -c
/local/tmp/elav01/selection-mechanism/systems/de-en/falign/qtmp_SOVBrE/prepared./en-de.snt
-o
/local/tmp/elav01/selection-mechanism/systems/de-en/falign/qtmp_SOVBrE/giza./en-de
-s
/local/tmp/elav01/selection-mechanism/systems/de-en/falign/qtmp_SOVBrE/prepared./de.vcb
-t
/local/tmp/elav01/selection-mechanism/systems/de-en/falign/qtmp_SOVBrE/prepared./en.vcb
-m1 0 -m2 0 -mh 0 -coocurrence
/local/tmp/elav01/selection-mechanism/systems/de-en/falign/qtmp_SOVBrE/giza./en-de.cooc
-restart 11 -previoust giza.en-de/en-de.t3.final -previousa
giza.en-de/en-de.a3.final -previousd giza.en-de/en-de.d3.final -previousn
giza.en-de/en-de.n3.final -previousd4 giza.en-de/en-de.d4.final -previousd42
giza.en-de/en-de.D4.final -m3 0 -m4 1
This runs fine until it reaches the following point and crashes:
We are going to load previous N model from giza.en-de/en-de.n3.final
Reading fertility table from giza.en-de/en-de.n3.final
Segmentation fault (core dumped)
The n-table that fails to load has about 300k entries, so I wanted to
check whether its size is the problem. I truncated the table to 60k
entries, and then it loads without crashing. However, the resulting
alignments are poor.
I am struggling to fix this, so any help would be appreciated. I am
running a freshly installed mgiza on Ubuntu 12.04.
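
One check that might help narrow this down is whether every word ID in
the fertility table also appears in the vocabulary file that mgiza loads
for the force-align run. The sketch below is plain Python and not part of
mgiza. It assumes the usual GIZA++-style layouts ("<id> <token> <count>"
per line in a .vcb file, "<source_id> <p0> <p1> ..." per line in
n3.final), and the function names are hypothetical. A mismatch would only
be a hint, not a confirmed cause of the segmentation fault.

```python
import sys


def read_vocab_ids(lines):
    """Collect the word IDs from .vcb-style lines.

    Assumed layout (to be verified): "<id> <token> <count>".
    """
    ids = set()
    for line in lines:
        fields = line.split()
        if fields:
            ids.add(int(fields[0]))
    return ids


def unknown_table_ids(table_lines, vocab_ids):
    """Return (line_number, word_id) pairs for fertility-table rows whose
    first field is not a known vocabulary ID.

    Assumed layout (to be verified): "<source_id> <p0> <p1> ...".
    IDs that the aligner reserves internally may legitimately be absent
    from the .vcb file, so a few hits are not necessarily a problem.
    """
    unknown = []
    for lineno, line in enumerate(table_lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        word_id = int(fields[0])
        if word_id not in vocab_ids:
            unknown.append((lineno, word_id))
    return unknown


if __name__ == "__main__" and len(sys.argv) == 3:
    vcb_path, table_path = sys.argv[1], sys.argv[2]
    with open(vcb_path, encoding="utf-8", errors="replace") as vcb:
        vocab = read_vocab_ids(vcb)
    with open(table_path, encoding="utf-8", errors="replace") as table:
        missing = unknown_table_ids(table, vocab)
    print(f"{len(vocab)} vocabulary IDs, {len(missing)} table rows with unknown IDs")
    for lineno, word_id in missing[:10]:
        print(f"  line {lineno}: id {word_id}")
```

It would be called with a vocabulary file and the fertility table as the
two arguments, for example the .vcb file from the force-align "prepared."
directory and giza.en-de/en-de.n3.final. Which of the two vocabularies
the table is indexed by is something I have not verified, so it may be
worth trying both.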
cheers,
Lefteris
--
MSc. Inf. Eleftherios Avramidis
DFKI GmbH, Alt-Moabit 91c, 10559 Berlin
Tel. +49-30 238 95-1806
Fax. +49-30 238 95-1810