On Cygwin, there is no way to solve the problem: 2GB is the maximum
memory any process can use under Cygwin.
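If you want to see the ceiling for yourself, here is a minimal sketch
(run under Cygwin's 32-bit Python; the 64 MB chunk size is arbitrary)
that allocates until the address space runs out:

    # Allocate 64 MB chunks until the 32-bit address space is exhausted.
    # Under 32-bit Cygwin this typically fails somewhere below ~2 GB.
    chunks = []
    allocated_mb = 0
    try:
        while True:
            chunks.append(bytearray(64 * 1024 * 1024))  # 64 MB per chunk
            allocated_mb += 64
    except MemoryError:
        print("MemoryError after allocating ~%d MB" % allocated_mb)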
If you have large model files, you should move to 64-bit Linux or
Mac OS X, with plenty of memory.
Or try compiling mgiza and Moses without Cygwin, for example with
MinGW or Visual Studio.
This will require some work. Other people may be trying to do the same
thing, so maybe team up with them.
For mgiza, you can minimize memory usage by following this:
http://www.statmt.org/moses/?n=Moses.Optimize#ntoc8
However, you may encounter more memory problems further down the
pipeline anyway.
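One common way to cut the aligner's memory footprint is to drop very
long sentence pairs before alignment; Moses ships clean-corpus-n.perl
for exactly this. A minimal Python sketch of the same idea (the corpus
filenames are hypothetical; 60 matches the MaxLen-60 setting visible
in the log paths below):

    # Keep only sentence pairs where both sides are at most MAX_LEN tokens.
    MAX_LEN = 60

    with open("corpus.id") as f1, open("corpus.en") as f2, \
         open("corpus.clean.id", "w") as o1, open("corpus.clean.en", "w") as o2:
        for src, tgt in zip(f1, f2):
            if len(src.split()) <= MAX_LEN and len(tgt.split()) <= MAX_LEN:
                o1.write(src)
                o2.write(tgt)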
I would personally advise against using the Berkeley aligner. IMO,
it's buggy and questions to its developers go unanswered.
On 12/11/2012 12:56, Jelita Asian wrote:
Hi Barry,
Actually how do we solve the more than 2 GB memory problem? Thanks.
Best regards,
Jelita
On Fri, Nov 9, 2012 at 10:51 AM, Jelita Asian
<[email protected]> wrote:
Hi Barry,
Thanks. I will look into it now.
Cheers,
Jelita
On Thu, Nov 8, 2012 at 10:09 PM, Barry Haddow
<[email protected]> wrote:
Hi Jelita
It could be running out of memory. Under cygwin, mgiza will be
limited to 2GB
http://www.statmt.org/moses/?n=Moses.FAQ#ntoc9
cheers - Barry
On 06/11/12 07:33, Jelita Asian wrote:
Hi,
I run Moses training using the moses-for-mere-mortals scripts.
The run used to be OK. However, since I increased the number of
words (mostly numbers written out in words, where the words act
as parallel sentences in the corpus for Indonesian and English),
I keep getting an mgiza stack dump, and hence the training fails.
Here is an extract from the log file of the run:
-----------
Model1: Iteration 5
Reading more sentence pairs into memory ...
[sent:100000]
Reading more sentence pairs into memory ...
Reading more sentence pairs into memory ...
Reading more sentence pairs into memory ...
Model1: (5) TRAIN CROSS-ENTROPY 5.82453 PERPLEXITY 56.6706
Model1: (5) VITERBI TRAIN CROSS-ENTROPY 6.59753 PERPLEXITY
96.8401
Model 1 Iteration: 5 took: 87 seconds
Entire Model1 Training took: 444 seconds
NOTE: I am doing iterations with the HMM model!
Read classes: #words: 48562 #classes: 51
Actual number of read words: 48561 stored words: 48561
Read classes: #words: 45484 #classes: 51
Actual number of read words: 45483 stored words: 45483
==========================================================
Hmm Training Started at: Tue Nov 6 12:46:41 2012
./train-AllCorpusIndo.sh: line 1184: 3936 Aborted
(core dumped) $toolsdir/mgiza/bin/mgiza -ncpus
$mgizanumprocessors -c
$modeldir/$lang2-$lang1-int-train.snt -o
$modeldir/$lang2-$lang1 -s $modeldir/$lang1.vcb -t
$modeldir/$lang2.vcb -coocurrencefile
$modeldir/$lang1-$lang2.cooc -ml $ml -countincreasecutoff
$countincreasecutoff -countincreasecutoffal
$countincreasecutoffal -mincountincrease $mincountincrease
-peggedcutoff $peggedcutoff -probcutoff $probcutoff
-probsmooth $probsmooth -m1 $model1iterations -m2
$model2iterations -mh $hmmiterations -m3 $model3iterations
-m4 $model4iterations -m5 $model5iterations -m6
$model6iterations -t1 $model1dumpfrequency -t2
$model2dumpfrequency -t2to3 $transferdumpfrequency -t345
$model345dumpfrequency -th $hmmdumpfrequency -onlyaldumps
$onlyaldumps -nodumps $nodumps -compactadtable
$compactadtable -model4smoothfactor $model4smoothfactor
-compactalignmentformat $compactalignmentformat -verbose
$verbose -verbosesentence $verbosesentence -emalsmooth
$emalsmooth -model23smoothfactor $model23smoothfactor
-model4smoothfactor $model4smoothfactor
-model5smoothfactor $model5smoothfactor -nsmooth $nsmooth
-nsmoothgeneral $nsmoothgeneral
-deficientdistortionforemptyword
$deficientdistortionforemptyword -depm4 $depm4 -depm5
$depm5 -emalignmentdependencies $emalignmentdependencies
-emprobforempty $emprobforempty -m5p0 $m5p0 -manlexfactor1
$manlexfactor1 -manlexfactor2 $manlexfactor2
-manlexmaxmultiplicity $manlexmaxmultiplicity
-maxfertility $maxfertility -p0 $p0 -pegging $pegging
Starting MGIZA
Initializing Global Paras
DEBUG: EnterDEBUG: PrefixDEBUG: LogParsing Arguments
Parameter 'ncpus' changed from '2' to '8'
Parameter 'c' changed from '' to
'/home/Jelita/moses/corpora_trained/model/id-en-CleanAllCorpus15Oct2012.for_train.LM-CleanAllCorpus15Oct2012.for_train-IRSTLM-4-1-improved-kneser-ney-0-1/T-1-1-9-MKCLS-2-50-MGIZA-8-GIZA-101-5-0-5-3-3-0-0-1e-06-1e-05-1e-07-0.03-1e-07-1e-07-0-0-0-0-0-0-0-1-1-0--10-0.2-0-0.4-0.1-4-0-1-0-76-68-2-0.4--1-0-0-20-10-0.999-0-MOSES-6-1-1-60-7-4-1-1-1-0-0-200-1.0-0-20-0-0-0-1000-100-20-0-6/id-en-int-train.snt'
Parameter 'o' changed from '112-11-06.124815.Jelita' to
'/home/Jelita/moses/corpora_trained/model/id-en-CleanAllCorpus15Oct2012.for_train.LM-CleanAllCorpus15Oct2012.for_train-IRSTLM-4-1-improved-kneser-ney-0-1/T-1-1-9-MKCLS-2-50-MGIZA-8-GIZA-101-5-0-5-3-3-0-0-1e-06-1e-05-1e-07-0.03-1e-07-1e-07-0-0-0-0-0-0-0-1-1-0--10-0.2-0-0.4-0.1-4-0-1-0-76-68-2-0.4--1-0-0-20-10-0.999-0-MOSES-6-1-1-60-7-4-1-1-1-0-0-200-1.0-0-20-0-0-0-1000-100-20-0-6/id-en'
Parameter 's' changed from '' to
'/home/Jelita/moses/corpora_trained/model/id-en-CleanAllCorpus15Oct2012.for_train.LM-CleanAllCorpus15Oct2012.for_train-IRSTLM-4-1-improved-kneser-ney-0-1/T-1-1-9-MKCLS-2-50-MGIZA-8-GIZA-101-5-0-5-3-3-0-0-1e-06-1e-05-1e-07-0.03-1e-07-1e-07-0-0-0-0-0-0-0-1-1-0--10-0.2-0-0.4-0.1-4-0-1-0-76-68-2-0.4--1-0-0-20-10-0.999-0-MOSES-6-1-1-60-7-4-1-1-1-0-0-200-1.0-0-20-0-0-0-1000-100-20-0-6/en.vcb'
Parameter 't' changed from '' to
'/home/Jelita/moses/corpora_trained/model/id-en-CleanAllCorpus15Oct2012.for_train.LM-CleanAllCorpus15Oct2012.for_train-IRSTLM-4-1-improved-kneser-ney-0-1/T-1-1-9-MKCLS-2-50-MGIZA-8-GIZA-101-5-0-5-3-3-0-0-1e-06-1e-05-1e-07-0.03-1e-07-1e-07-0-0-0-0-0-0-0-1-1-0--10-0.2-0-0.4-0.1-4-0-1-0-76-68-2-0.4--1-0-0-20-10-0.999-0-MOSES-6-1-1-60-7-4-1-1-1-0-0-200-1.0-0-20-0-0-0-1000-100-20-0-6/id.vcb'
Parameter 'coocurrencefile' changed from '' to
'/home/Jelita/moses/corpora_trained/model/id-en-CleanAllCorpus15Oct2012.for_train.LM-CleanAllCorpus15Oct2012.for_train-IRSTLM-4-1-improved-kneser-ney-0-1/T-1-1-9-MKCLS-2-50-MGIZA-8-GIZA-101-5-0-5-3-3-0-0-1e-06-1e-05-1e-07-0.03-1e-07-1e-07-0-0-0-0-0-0-0-1-1-0--10-0.2-0-0.4-0.1-4-0-1-0-76-68-2-0.4--1-0-0-20-10-0.999-0-MOSES-6-1-1-60-7-4-1-1-1-0-0-200-1.0-0-20-0-0-0-1000-100-20-0-6/en-id.cooc'
Parameter 'm3' changed from '5' to '3'
Parameter 'm4' changed from '5' to '3'
Parameter 'onlyaldumps' changed from '0' to '1'
Parameter 'nodumps' changed from '0' to '1'
Parameter 'model4smoothfactor' changed from '0.2' to '0.4'
Parameter 'nsmooth' changed from '64' to '4'
Parameter 'p0' changed from '-1' to '0.999'
general parameters:
-------------------
ml = 101 (maximum sentence length)
Here is another extract:
./train-AllCorpusIndo.sh: line 1184: 2756 Aborted
(core dumped) $toolsdir/mgiza/bin/mgiza -ncpus
$mgizanumprocessors -c
$modeldir/$lang1-$lang2-int-train.snt -o
$modeldir/$lang1-$lang2 -s $modeldir/$lang2.vcb -t
$modeldir/$lang1.vcb -coocurrencefile
$modeldir/$lang2-$lang1.cooc -ml $ml -countincreasecutoff
$countincreasecutoff -countincreasecutoffal
$countincreasecutoffal -mincountincrease $mincountincrease
-peggedcutoff $peggedcutoff -probcutoff $probcutoff
-probsmooth $probsmooth -m1 $model1iterations -m2
$model2iterations -mh $hmmiterations -m3 $model3iterations
-m4 $model4iterations -m5 $model5iterations -m6
$model6iterations -t1 $model1dumpfrequency -t2
$model2dumpfrequency -t2to3 $transferdumpfrequency -t345
$model345dumpfrequency -th $hmmdumpfrequency -onlyaldumps
$onlyaldumps -nodumps $nodumps -compactadtable
$compactadtable -model4smoothfactor $model4smoothfactor
-compactalignmentformat $compactalignmentformat -verbose
$verbose -verbosesentence $verbosesentence -emalsmooth
$emalsmooth -model23smoothfactor $model23smoothfactor
-model4smoothfactor $model4smoothfactor
-model5smoothfactor $model5smoothfactor -nsmooth $nsmooth
-nsmoothgeneral $nsmoothgeneral
-deficientdistortionforemptyword
$deficientdistortionforemptyword -depm4 $depm4 -depm5
$depm5 -emalignmentdependencies $emalignmentdependencies
-emprobforempty $emprobforempty -m5p0 $m5p0 -manlexfactor1
$manlexfactor1 -manlexfactor2 $manlexfactor2
-manlexmaxmultiplicity $manlexmaxmultiplicity
-maxfertility $maxfertility -p0 $p0 -pegging $pegging
****** phase 2.1 of training (merge alignments)
Traceback (most recent call last):
File
"/home/Jelita/moses/tools/mgiza/scripts/merge_alignment.py",
line 24, in <module>
files.append(open(sys.argv[i],"r"));
IOError: [Errno 2] No such file or directory:
'/home/Jelita/moses/corpora_trained/model/id-en-CleanAllCorpus15Oct2012.for_train.LM-CleanAllCorpus15Oct2012.for_train-IRSTLM-4-1-improved-kneser-ney-0-1/T-1-1-9-MKCLS-2-50-MGIZA-8-GIZA-101-5-0-5-3-3-0-0-1e-06-1e-05-1e-07-0.03-1e-07-1e-07-0-0-0-0-0-0-0-1-1-0--10-0.2-0-0.4-0.1-4-0-1-0-76-68-2-0.4--1-0-0-20-10-0.999-0-MOSES-6-1-1-60-7-4-1-1-1-0-0-200-1.0-0-20-0-0-0-1000-100-20-0-6/id-en.A3.final.part*'
Traceback (most recent call last):
File
"/home/Jelita/moses/tools/mgiza/scripts/merge_alignment.py",
line 24, in <module>
files.append(open(sys.argv[i],"r"));
IOError: [Errno 2] No such file or directory:
'/home/Jelita/moses/corpora_trained/model/id-en-CleanAllCorpus15Oct2012.for_train.LM-CleanAllCorpus15Oct2012.for_train-IRSTLM-4-1-improved-kneser-ney-0-1/T-1-1-9-MKCLS-2-50-MGIZA-8-GIZA-101-5-0-5-3-3-0-0-1e-06-1e-05-1e-07-0.03-1e-07-1e-07-0-0-0-0-0-0-0-1-1-0--10-0.2-0-0.4-0.1-4-0-1-0-76-68-2-0.4--1-0-0-20-10-0.999-0-MOSES-6-1-1-60-7-4-1-1-1-0-0-200-1.0-0-20-0-0-0-1000-100-20-0-6/en-id.A3.final.part*'
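The IOError above is a knock-on failure: mgiza aborted before
writing any A3.final.part files, so the shell passed the
unexpanded 'part*' pattern to merge_alignment.py verbatim, and
the open() call fails on the literal glob. A minimal defensive
sketch that expands the pattern itself (an illustration, not the
shipped script):

    import glob
    import sys

    # Expand the pattern ourselves instead of relying on the shell;
    # fail with a clear message when mgiza produced no part files.
    pattern = sys.argv[1]  # e.g. '.../id-en.A3.final.part*'
    parts = sorted(glob.glob(pattern))
    if not parts:
        sys.exit("no alignment parts match %r -- did mgiza crash?" % pattern)
    files = [open(p, "r") for p in parts]
    print("merging %d part files" % len(files))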
****** Rest of parallel training
Using SCRIPTS_ROOTDIR: /home/Jelita/moses/tools/moses/scripts
Using single-thread GIZA
(3) generate word alignment @ Tue Nov 6 13:07:31 SEAST 2012
Combining forward and inverted alignment from files:
/home/Jelita/moses/corpora_trained/model/id-en-CleanAllCorpus15Oct2012.for_train.LM-CleanAllCorpus15Oct2012.for_train-IRSTLM-4-1-improved-kneser-ney-0-1/T-1-1-9-MKCLS-2-50-MGIZA-8-GIZA-101-5-0-5-3-3-0-0-1e-06-1e-05-1e-07-0.03-1e-07-1e-07-0-0-0-0-0-0-0-1-1-0--10-0.2-0-0.4-0.1-4-0-1-0-76-68-2-0.4--1-0-0-20-10-0.999-0-MOSES-6-1-1-60-7-4-1-1-1-0-0-200-1.0-0-20-0-0-0-1000-100-20-0-6/id-en.A3.final.{bz2,gz}
/home/Jelita/moses/corpora_trained/model/id-en-CleanAllCorpus15Oct2012.for_train.LM-CleanAllCorpus15Oct2012.for_train-IRSTLM-4-1-improved-kneser-ney-0-1/T-1-1-9-MKCLS-2-50-MGIZA-8-GIZA-101-5-0-5-3-3-0-0-1e-06-1e-05-1e-07-0.03-1e-07-1e-07-0-0-0-0-0-0-0-1-1-0--10-0.2-0-0.4-0.1-4-0-1-0-76-68-2-0.4--1-0-0-20-10-0.999-0-MOSES-6-1-1-60-7-4-1-1-1-0-0-200-1.0-0-20-0-0-0-1000-100-20-0-6/en-id.A3.final.{bz2,gz}
Executing: mkdir -p
/home/Jelita/moses/corpora_trained/model/id-en-CleanAllCorpus15Oct2012.for_train.LM-CleanAllCorpus15Oct2012.for_train-IRSTLM-4-1-improved-kneser-ney-0-1/T-1-1-9-MKCLS-2-50-MGIZA-8-GIZA-101-5-0-5-3-3-0-0-1e-06-1e-05-1e-07-0.03-1e-07-1e-07-0-0-0-0-0-0-0-1-1-0--10-0.2-0-0.4-0.1-4-0-1-0-76-68-2-0.4--1-0-0-20-10-0.999-0-MOSES-6-1-1-60-7-4-1-1-1-0-0-200-1.0-0-20-0-0-0-1000-100-20-0-6
Executing:
/home/Jelita/moses/tools/moses/scripts/training/symal/giza2bal.pl
-d "gzip -cd
/home/Jelita/moses/corpora_trained/model/id-en-CleanAllCorpus15Oct2012.for_train.LM-CleanAllCorpus15Oct2012.for_train-IRSTLM-4-1-improved-kneser-ney-0-1/T-1-1-9-MKCLS-2-50-MGIZA-8-GIZA-101-5-0-5-3-3-0-0-1e-06-1e-05-1e-07-0.03-1e-07-1e-07-0-0-0-0-0-0-0-1-1-0--10-0.2-0-0.4-0.1-4-0-1-0-76-68-2-0.4--1-0-0-20-10-0.999-0-MOSES-6-1-1-60-7-4-1-1-1-0-0-200-1.0-0-20-0-0-0-1000-100-20-0-6/en-id.A3.final.gz"
-i "gzip -cd
/home/Jelita/moses/corpora_trained/model/id-en-CleanAllCorpus15Oct2012.for_train.LM-CleanAllCorpus15Oct2012.for_train-IRSTLM-4-1-improved-kneser-ney-0-1/T-1-1-9-MKCLS-2-50-MGIZA-8-GIZA-101-5-0-5-3-3-0-0-1e-06-1e-05-1e-07-0.03-1e-07-1e-07-0-0-0-0-0-0-0-1-1-0--10-0.2-0-0.4-0.1-4-0-1-0-76-68-2-0.4--1-0-0-20-10-0.999-0-MOSES-6-1-1-60-7-4-1-1-1-0-0-200-1.0-0-20-0-0-0-1000-100-20-0-6/id-en.A3.final.gz"
|/home/Jelita/moses/tools/moses/scripts/training/symal/symal
-alignment="grow" -diagonal="yes" -final="yes" -both="yes"
>
/home/Jelita/moses/corpora_trained/model/id-en-CleanAllCorpus15Oct2012.for_train.LM-CleanAllCorpus15Oct2012.for_train-IRSTLM-4-1-improved-kneser-ney-0-1/T-1-1-9-MKCLS-2-50-MGIZA-8-GIZA-101-5-0-5-3-3-0-0-1e-06-1e-05-1e-07-0.03-1e-07-1e-07-0-0-0-0-0-0-0-1-1-0--10-0.2-0-0.4-0.1-4-0-1-0-76-68-2-0.4--1-0-0-20-10-0.999-0-MOSES-6-1-1-60-7-4-1-1-1-0-0-200-1.0-0-20-0-0-0-1000-100-20-0-6/aligned.grow-diag-final-and
symal: computing grow alignment: diagonal (1) final
(1)both-uncovered (1)
skip=<0> counts=<0>
(4) generate lexical translation table 0-0 @ Tue Nov 6
13:07:31 SEAST 2012
(/home/Jelita/moses/corpora_trained/lc_clean/MinLen-1.MaxLen-60/CleanAllCorpus15Oct2012.for_train.lowercase.id,/home/Jelita/moses/corpora_trained/lc_clean/MinLen-1.MaxLen-60/CleanAllCorpus15Oct2012.for_train.lowercase.en,/home/Jelita/moses/corpora_trained/model/id-en-CleanAllCorpus15Oct2012.for_train.LM-CleanAllCorpus15Oct2012.for_train-IRSTLM-4-1-improved-kneser-ney-0-1/T-1-1-9-MKCLS-2-50-MGIZA-8-GIZA-101-5-0-5-3-3-0-0-1e-06-1e-05-1e-07-0.03-1e-07-1e-07-0-0-0-0-0-0-0-1-1-0--10-0.2-0-0.4-0.1-4-0-1-0-76-68-2-0.4--1-0-0-20-10-0.999-0-MOSES-6-1-1-60-7-4-1-1-1-0-0-200-1.0-0-20-0-0-0-1000-100-20-0-6/lex)
Use of uninitialized value $a in scalar chomp at
/home/Jelita/moses/tools/moses/scripts/training/train-model.perl
line 1079.
Use of uninitialized value $a in split at
/home/Jelita/moses/tools/moses/scripts/training/train-model.perl
line 1082.
What is the cause? I use Cygwin on Windows 7 on a 64-bit
machine. I ran it a few times and it can't get past the
Model 1 training. Thanks.
Best regards,
Jelita
_______________________________________________
Moses-support mailing list
[email protected] <mailto:[email protected]>
http://mailman.mit.edu/mailman/listinfo/moses-support
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.