Hi,

I am trying to produce word alignments for individual sentences. For this
purpose I am using the "force align" functionality of mgiza++.
Unfortunately, when loading a large N table (fertility table), mgiza
crashes with a segmentation fault.

In particular, I first ran mgiza on the full parallel training corpus
using the default settings of the Moses script:

    /project/qtleap/software/moses-2.1.1/bin/training-tools/mgiza \
      -CoocurrenceFile /local/tmp/elav01/selection-mechanism/systems/de-en/training/giza.1/en-de.cooc \
      -c /local/tmp/elav01/selection-mechanism/systems/de-en/training/prepared.1/en-de-int-train.snt \
      -m1 5 -m2 0 -m3 3 -m4 3 -model1dumpfrequency 1 -model4smoothfactor 0.4 \
      -ncpus 24 -nodumps 0 -nsmooth 4 \
      -o /local/tmp/elav01/selection-mechanism/systems/de-en/training/giza.1/en-de \
      -onlyaldumps 0 -p0 0.999 \
      -s /local/tmp/elav01/selection-mechanism/systems/de-en/training/prepared.1/de.vcb \
      -t /local/tmp/elav01/selection-mechanism/systems/de-en/training/prepared.1/en.vcb

Afterwards, using the mgiza force-align script, I ran the following
command:

    /project/qtleap/software/moses-2.1.1/mgizapp-code/mgizapp//bin/mgiza \
      giza.en-de/en-de.gizacfg \
      -c /local/tmp/elav01/selection-mechanism/systems/de-en/falign/qtmp_SOVBrE/prepared./en-de.snt \
      -o /local/tmp/elav01/selection-mechanism/systems/de-en/falign/qtmp_SOVBrE/giza./en-de \
      -s /local/tmp/elav01/selection-mechanism/systems/de-en/falign/qtmp_SOVBrE/prepared./de.vcb \
      -t /local/tmp/elav01/selection-mechanism/systems/de-en/falign/qtmp_SOVBrE/prepared./en.vcb \
      -m1 0 -m2 0 -mh 0 \
      -coocurrence /local/tmp/elav01/selection-mechanism/systems/de-en/falign/qtmp_SOVBrE/giza./en-de.cooc \
      -restart 11 \
      -previoust giza.en-de/en-de.t3.final \
      -previousa giza.en-de/en-de.a3.final \
      -previousd giza.en-de/en-de.d3.final \
      -previousn giza.en-de/en-de.n3.final \
      -previousd4 giza.en-de/en-de.d4.final \
      -previousd42 giza.en-de/en-de.D4.final \
      -m3 0 -m4 1

This runs fine, until I get the following error:

    We are going to load previous N model from giza.en-de/en-de.n3.final

    Reading fertility table from giza.en-de/en-de.n3.final

    Segmentation fault (core dumped)

The n-table that fails to load has about 300k entries, so I suspected
that its size might be the problem. I truncated the table to 60k entries,
and then it works! But the alignments are not good.
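
Before blaming the size alone, it may help to check whether the table is
well formed and whether its token ids fit the vocabulary used in the
force-align run. The following is only a minimal Python sketch and is not
part of mgiza. It assumes the usual GIZA++ layout, in which every n-table
line starts with an integer token id followed by fertility probabilities
and every .vcb line starts with the token id. The function names
summarize_table and table_exceeds_vocab are hypothetical helpers, and a
mismatch found this way is a hint, not a confirmed cause of the crash.

```python
import sys


def summarize_table(path):
    """Summarize a whitespace-separated file whose first column is a token id.

    Works for both an n-table and a .vcb file under the layout assumed above.
    """
    rows = 0
    max_id = -1
    widths = set()      # distinct numbers of columns seen
    bad_lines = []      # line numbers whose first field is not an integer
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line_no, line in enumerate(handle, start=1):
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            try:
                token_id = int(fields[0])
            except ValueError:
                bad_lines.append(line_no)
                continue
            rows += 1
            max_id = max(max_id, token_id)
            widths.add(len(fields))
    return {"rows": rows, "max_id": max_id, "widths": widths, "bad_lines": bad_lines}


def table_exceeds_vocab(table_path, vcb_path):
    """Return True if the table refers to a token id larger than any id in the vocabulary."""
    return summarize_table(table_path)["max_id"] > summarize_table(vcb_path)["max_id"]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python check_ntable.py <n-table> <vocabulary file>
    table_path, vcb_path = sys.argv[1], sys.argv[2]
    print(summarize_table(table_path))
    print("ids beyond vocabulary:", table_exceeds_vocab(table_path, vcb_path))
```

More than one value in "widths" or any entry in "bad_lines" would point
to a malformed table. A table id larger than every id in the vocabulary
would suggest that the vocabulary files of the force-align run do not
match the ones used in training.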

I am struggling to fix this, so any help would be appreciated. I am
running a freshly installed mgiza on Ubuntu 12.04.

cheers,
Lefteris

-- 
MSc. Inf. Eleftherios Avramidis
DFKI GmbH, Alt-Moabit 91c, 10559 Berlin
