Hi Iain,

That looks about right. There are large intermediate files
during training, and the final phrase table will also be
pretty big, several gigabytes.
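
As a rough sanity check, the sizes in the listing quoted below can be
added up. This is only a sketch and not part of the Moses scripts: the
helper names are made up for illustration, and it assumes the K/M/G
suffixes printed by `ls -lh` are powers of 1024. Since `ls -h` rounds each
size, the total is approximate. It comes to roughly 38G, in line with the
reported directory size, and about 35G of that is the two half phrase
tables plus the part files.

```python
import re

# Assumption: `ls -lh` suffixes are binary units (powers of 1024).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text):
    """Convert a human-readable size such as '92M' or '7.0G' to bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMG])", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return float(number) * UNITS[unit]


def total_gib(sizes):
    """Sum a list of human-readable sizes and return the total in GiB."""
    return sum(parse_size(s) for s in sizes) / UNITS["G"]


# Sizes copied from the quoted listing of the model directory.
small_files = ["92M", "99M", "57M", "703M", "689M", "705M",
               "532M", "696M", "90M", "90M"]
half_tables = ["14G", "7.0G"]
part_files = ["809M", "986M", "979M", "996M", "989M", "958M", "962M",
              "972M", "979M", "999M", "995M", "1020M", "965M", "934M",
              "381M"]

if __name__ == "__main__":
    everything = small_files + half_tables + part_files
    print(f"total:             {total_gib(everything):.1f} GiB")
    print(f"half tables+parts: {total_gib(half_tables + part_files):.1f} GiB")
```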

-phi

On Mon, Apr 28, 2008 at 8:07 PM, Iain Adams <[EMAIL PROTECTED]> wrote:
> Dear Mailing List,
>
>  I am running the commands below from a file:
>
>  export SCRIPTS_ROOTDIR=/data/aca04iba/en-es/bin/moses-scripts/scripts-20080411-1824
>
>  $SCRIPTS_ROOTDIR/training/train-factored-phrase-model.perl -scripts-root-dir
>  $SCRIPTS_ROOTDIR -root-dir /data/aca04iba/en-es/ -corpus
>  /data/aca04iba/en-es/training/corpus.lowercased -f es -e en -alignment
>  grow-diag-final-and -reordering msd-bidirectional-fe -lm
>  0:5:/data/aca04iba/en-es/lm/corpus.lm:0
>
>  This is producing a massive model directory with contents:
>
>  ls -lh
>
>  92M Apr 22 22:49 aligned.0.en
>  99M Apr 22 22:48 aligned.0.es
>  57M Apr 23 11:27 aligned.grow-diag-final-and
>  703M Apr 23 12:06 extract.0-0.gz
>  689M Apr 23 12:12 extract.0-0.inv.gz
>  705M Apr 23 12:54 extract.0-0.inv.sorted.gz
>  532M Apr 23 11:57 extract.0-0.o.gz
>  696M Apr 23 12:29 extract.0-0.sorted.gz
>  90M Apr 22 22:51 lex.0-0.f2n
>  90M Apr 22 22:51 lex.0-0.n2f
>  14G Apr 23 14:39 phrase-table.0-0.half.f2n
>  7.0G Apr 23 16:16 phrase-table.0-0.half.n2f
>  809M Apr 23 14:52 phrase-table.0-0.half.n2f.part0000
>  986M Apr 23 14:58 phrase-table.0-0.half.n2f.part0001
>  979M Apr 23 15:04 phrase-table.0-0.half.n2f.part0002
>  996M Apr 23 15:10 phrase-table.0-0.half.n2f.part0003
>  989M Apr 23 15:16 phrase-table.0-0.half.n2f.part0004
>  958M Apr 23 15:21 phrase-table.0-0.half.n2f.part0005
>  962M Apr 23 15:27 phrase-table.0-0.half.n2f.part0006
>  972M Apr 23 15:33 phrase-table.0-0.half.n2f.part0007
>  979M Apr 23 15:39 phrase-table.0-0.half.n2f.part0008
>  999M Apr 23 15:45 phrase-table.0-0.half.n2f.part0009
>  995M Apr 23 15:51 phrase-table.0-0.half.n2f.part0010
>  1020M Apr 23 15:56 phrase-table.0-0.half.n2f.part0011
>  965M Apr 23 16:02 phrase-table.0-0.half.n2f.part0012
>  934M Apr 23 16:08 phrase-table.0-0.half.n2f.part0013
>  381M Apr 23 16:10 phrase-table.0-0.half.n2f.part0014
>
>  The whole operation fails as I hit my quota allowance. Should this be
>  producing such large files? This model directory is 38G. I didn't realise it
>  would be quite this large.
>
>  Can anyone advise as to why this is happening?
>
>  I am training on Europarl corpus.
>
>  92M Apr 13 14:14 corpus.lowercased.en
>  99M Apr 13 14:13 corpus.lowercased.es
>
>  and my language model is:
>
>  155M Apr 13 15:02 corpus.lm
>
>
>  Iain
>  --
>  Iain Adams
>  4th Year Undergraduate MCOMP
>  Marketing Team
>  Genesys Solutions
>
>
>
>  _______________________________________________
>  Moses-support mailing list
>  [email protected]
>  http://mailman.mit.edu/mailman/listinfo/moses-support
>
>