Dave, 

You have an interesting hardware setup. We have several
users who use DoMY inside VMware guests on Windows host machines with
much less capable hardware. One frequently trains/tunes/binarizes and
evaluates models with 2+ million segments on as little as 4 GB real RAM
& 4 cores and the total time is about 2 days. This configuration uses
all cores for MGIZA++ and all other multi-threaded components. 

Re MGIZA++, it compiles on Ubuntu 10.04 through 12.10, and I can't
imagine why Mint would be any different. You might try installing DoMY
CE to see any differences:
http://www.precisiontranslationtools.com/quick-start/ [4].
Alternately, I posted our MGIZA++ setup script on this list last week;
you might try it.

The other thing that catches my attention is how you
"map the working (train/model) directory to the host computer harddisk."
I don't think any of my users have used this configuration, but rather
set the virtual disk to grow automatically. Hieu, is it possible this
hardware mapping configuration is incompatible with the virtualization
techniques used in the binarized tables? 
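
To confirm which kind of filesystem the working directory really sits
on, something like the sketch below may help. It is only an
illustration: the function name fs_type_for is my own, and the
/mnt/hgfs path and vmhgfs type in the usage note are assumptions about
how VMware shared folders usually appear in a Linux guest, not
something taken from Dave's setup.

```python
import os


def fs_type_for(path, mounts_text):
    """Return (mount_point, fs_type) for the mount that contains `path`.

    `mounts_text` is text in the /proc/mounts layout:
    device, mount point, filesystem type, options, ...
    Note: mount points containing spaces are escaped in /proc/mounts
    (as \\040); this sketch does not decode them.
    """
    path = os.path.abspath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        # Match whole path components only, so /mnt/hgfs does not
        # match /mnt/hgfsx.
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or (path + "/").startswith(prefix):
            # Prefer the longest (most specific) mount point.
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fs_type)
    return best


if __name__ == "__main__":
    # Hypothetical working directory; replace with the real one.
    with open("/proc/mounts") as handle:
        print(fs_type_for("/mnt/hgfs/train/model", handle.read()))
```

If the reported type is a shared-folder filesystem (for example vmhgfs)
rather than ext3/ext4, the train/model directory is going through the
host mapping and not the virtual disk.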

Finally, the train-model.perl script uses different temp folders in
different steps. Sometimes it honors the standard /tmp folder for
Linux. In step 6, it honors the user-defined --temp-dir option. In
step 5, the extract binary only uses a subfolder it creates under its
output folder. The language model building tools store their temp
files just as unpredictably. Therefore, the easiest way to ensure you
have enough hard drive space for all the various temp outputs is to
have one root partition that is big enough for all of your temp files.
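
A quick way to see how much room each of those locations has is a
small Python check like the one below. It is a sketch, not part of
Moses or DoMY: the helper names and the example paths in the usage
block are hypothetical, and you would substitute your own --temp-dir
value and training output folder.

```python
import os
import shutil


def free_space_report(paths):
    """Map each path to its free bytes, or None if it is not a directory."""
    report = {}
    for path in paths:
        if os.path.isdir(path):
            report[path] = shutil.disk_usage(path).free
        else:
            report[path] = None
    return report


def same_device(path_a, path_b):
    """True if both paths live on the same filesystem (device)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Hypothetical locations: /tmp, a user-defined --temp-dir value,
    # and the training output folder.
    candidates = ["/tmp", "/home/dave/moses-tmp", "/home/dave/train/model"]
    for path, free in free_space_report(candidates).items():
        if free is None:
            print(f"{path}: missing")
        else:
            print(f"{path}: {free / 2**30:.1f} GiB free")
```

If same_device() returns True for every pair of folders, they all draw
on one partition, which is the single-root-partition layout described
above.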

On 2012-12-28 18:49, Hieu Hoang wrote:

> Hi Dave
>
> NB. Please subscribe to the mailing list before posting to it. You can
> subscribe here:
> http://mailman.mit.edu/mailman/listinfo/moses-support [1]
>
> I've noticed recently that virtual machine disks tend to be REALLY
> slow. Since binarization is all IO bound, that may severely affect the
> speed. Try binarizing on a real PC and see how it goes.
>
> Also, check that you haven't been bitten by this VMware bug:
> http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=51306 [2]
>
> For comparison, the de-en in the example models I created
> http://www.statmt.org/moses/RELEASE-0.91/de-en/model/phrase-table.1.bin/ [3]
> took 10 hours, and that was while running about 20 binarizations
> simultaneously.
> 
> On 28/12/2012 02:02, David Wilson-Parr wrote:
> 
>> Hi all,
>>
>> I am a new member going through the 'build baseline system' section
>> of the website using the Europarl Swedish-English V6 set. Training
>> did not take that long, maybe 2 days, although I swapped laptops
>> halfway through so it's hard to tell. I am running Moses on Mint
>> (Ubuntu type) Linux on VMware under Windows 7. I map the working
>> (train/model) directory to the host computer harddisk because I want
>> to keep the VM image smaller.
>>
>> Anyway, cutting to the chase: I was training the Swedish/English
>> pair of Europarl. Training took a while, I would estimate 2 days,
>> but I wasn't using mgiza++, just giza++; incidentally, I can't get
>> mgiza++ to compile. I then tried to run the decoder and it took a
>> while to start up, but when it started, it would immediately say
>>
>> **Killed
>>
>> even though I didn't kill it. So I decided to binarise the phrase
>> table and re-ordering models, but it has taken far longer than I
>> expected. The 'build a baseline system' tutorial generally indicates
>> when something is a time-consuming process, but this was taking
>> longer than the initial training.
>>
>> processPhraseTable - took 2 days+
>> processLexicalTable - 3 days and still running
>>
>> Machine has 32 GB of RAM, Intel i7-3630QM 2.40 GHz CPU (4/8 cores),
>> SATA III SSD drive. VMware image is set to use 4 cores and 29 GB
>> memory.
>>
>> I would really appreciate some help,
>>
>> Dave
> 
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support [1]



Links:
------
[1] http://mailman.mit.edu/mailman/listinfo/moses-support
[2] http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=51306
[3] http://www.statmt.org/moses/RELEASE-0.91/de-en/model/phrase-table.1.bin/
[4] http://www.precisiontranslationtools.com/quick-start/