Hello,

Building the phrase table really used to take me a long, long time. 

I have a 4-processor computer with 8 GB of RAM. With a 12-million-segment 
corpus (about 0.5 billion words, EN+PT), the whole training took about 7 days, 
of which 2 days went to building the phrase table (with swap in use too).

However, now I have an 80 GB solid-state drive installed for the swap and temp 
files and the training of a larger corpus (14 million segments) took about the 
same time. The main difference was in the building of the phrase table: it took 
only 7 hours. Beautiful!

I hope this information may be useful to you ... although the corpus you want 
to train is not as large.

Maria José 

-----Original Message-----
From: [email protected] [mailto:[email protected]] On 
Behalf Of Tom Hoar
Sent: Monday, April 18, 2011 4:05 PM
To: David Wilkinson
Cc: [email protected]
Subject: Re: [Moses-support] How much Ram for Europarl?


 Your report of 100% physical memory usage, growing swap usage and low CPU 
 load is normal on machines with limited RAM. With only 4 GB of RAM and 
 the new (larger) EuroParl v6 corpus, you could train for 3 or 4 days, 
 depending on how you set up your swap partition. Even then, it's possible 
 you will run out of RAM before it's finished. Upgrading to 8 GB of RAM is 
 a move in the right direction.
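
For readers who want to check for the symptoms described above (physical memory nearly full while swap usage grows), here is a minimal Python sketch. It is not part of MMM or Moses. It assumes a Linux system that exposes /proc/meminfo, and the names parse_meminfo and memory_pressure are hypothetical helpers chosen for illustration. Older kernels have no MemAvailable field, so the sketch falls back to MemFree + Buffers + Cached as a rough estimate.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def memory_pressure(info):
    """Return (fraction of RAM in use, swap in use in kB).

    Hypothetical helper: 'available' memory is MemAvailable when the
    kernel reports it, otherwise MemFree + Buffers + Cached (an estimate).
    """
    available = info.get(
        "MemAvailable",
        info["MemFree"] + info.get("Buffers", 0) + info.get("Cached", 0),
    )
    ram_used_fraction = 1.0 - available / info["MemTotal"]
    swap_used_kb = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return ram_used_fraction, swap_used_kb


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        ram, swap = memory_pressure(parse_meminfo(f.read()))
    print("RAM in use: %.0f%%, swap in use: %.1f GB" % (ram * 100, swap / 1024.0 / 1024.0))
```

Running it every few minutes during training shows whether swap usage is still climbing; combined with low CPU load, that suggests the job is waiting on disk rather than computing.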

 Once it's finished training, you'll want to use the binarized 
 tables and language model, which MMM's train-1.11 script creates.

 Tom


 On Mon, 18 Apr 2011 14:52:10 +0100, Philipp Koehn <[email protected]> 
 wrote:
> Hi,
>
> I am not familiar with the MMM setup, but one of the causes
> of memory use may be the translation table. You should use
> the on-disk translation table.
>
> -phi
>
> On Mon, Apr 18, 2011 at 2:47 PM, David Wilkinson
> <[email protected]> wrote:
>> I have set up an Ubuntu 10.04 system with the moses-for-mere-mortals
>> scripts. The default corpus trained in about 6-7 hours on my system
>> (Athlon x3 3.2 GHz, 4 GB RAM). I am now trying to train the system with
>> the Europarl German-English parallel corpus (about 45m words in each
>> language), again using the default moses-for-mere-mortals settings. The
>> system has been running for 24 hrs and is currently using all the
>> physical memory and about 1.2 GB of swap. None of the cores are being
>> used more than 10%, so like this it will take a very long time to
>> finish. If I double the RAM to 8 GB, will this be sufficient?
>> Many Thanks
>> David
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>>
