Dave,
You have an interesting hardware setup. We have several users who run DoMY inside VMware guests on Windows host machines with much less capable hardware. One of them frequently trains, tunes, binarizes and evaluates models with 2+ million segments on as little as 4 GB of real RAM and 4 cores, and the total time is about 2 days. This configuration uses all cores for MGIZA++ and all other multi-threaded components.

Re MGIZA++, it compiles on Ubuntu 10.04 through 12.10. I can't imagine why Mint would be any different. You might try installing DoMY CE to see any differences: http://www.precisiontranslationtools.com/quick-start/ [4]. Alternately, I posted our MGIZA++ setup script on this list last week. You might try it.

The other thing that catches my attention is how you "map the working (train/model) directory to the host computer harddisk." I don't think any of my users have used this configuration; they set the virtual disk to grow automatically instead. Hieu, is it possible this hardware mapping configuration is incompatible with the virtualization techniques used in the binarized tables?

Finally, the train-model.perl script uses different temp folders in different steps. Sometimes it honors the standard /tmp folder for Linux. In step 6, it honors the user-defined --temp-dir option. In step 5, the extract binary only uses a subfolder it creates under its output folder. This unpredictable temp file storage is also true of the language model building tools. Therefore, the easiest way to ensure you have enough hard drive space for all the various temp outputs is to have one root partition that is big enough for all of your temp files.

On 2012-12-28 18:49, Hieu Hoang wrote:

> Hi Dave
>
> NB. Please subscribe to the mailing list before posting to it. You can
> subscribe here:
> http://mailman.mit.edu/mailman/listinfo/moses-support [1]
>
> I've noticed recently that virtual machine disks tend to be REALLY
> slow. Since binarization is all IO bound, that may severely affect the
> speed. Try binarizing on a real pc and see how it goes.
>
> Also, check that you haven't been beaten by this VMWare bug:
> http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=51306 [2]
>
> For comparison, the de-en in the example models I created
> http://www.statmt.org/moses/RELEASE-0.91/de-en/model/phrase-table.1.bin/ [3]
> took 10 hours, and that while running about 20 binarizations simultaneously.
>
> On 28/12/2012 02:02, David Wilson-Parr wrote:
>
>> Hi all,
>>
>> I am a new member going through the 'build baseline system' section of the website using the Europarl Swedish-English V6 set. Training took not so long, maybe 2 days, although I swapped laptops halfway through so it's hard to tell. I am running moses on Mint (Ubuntu type) linux on VMWare under Windows 7. I map the working (train/model) directory to the host computer harddisk because I want to keep the VM image smaller.
>>
>> Anyway, cutting to the chase. I was training the Swedish/English pair of Europarl. Training took a while, I would estimate 2 days, but I wasn't using mgiza++, just giza++. Incidentally, I can't get mgiza++ to compile. I then tried to run the decoder and it took a while to start up, but when it started, it would immediately say
>>
>> **Killed
>>
>> even though I didn't kill it. So I decided to binarise the phrase table and re-ordering models, but it has taken far longer than I expected. The 'build a baseline system tutorial' generally indicates when something is a time-consuming process, but this was taking longer than the initial training.
>>
>> processPhraseTable - took 2 days+
>> processLexicalTable - 3 days and still running
>>
>> Machine has 32gb of Ram, Intel I7 3630-QM 2.40 Ghz cpu (4/8 cores). SSD drive Sata III. VMware Image is set to use 4 cores and 29Gb memory.
>>
>> I really appreciate some help,
>>
>> Dave
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support [1]

Links:
------
[1] http://mailman.mit.edu/mailman/listinfo/moses-support
[2] http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=51306
[3] http://www.statmt.org/moses/RELEASE-0.91/de-en/model/phrase-table.1.bin/
[4] http://www.precisiontranslationtools.com/quick-start/
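
P.S. On the temp-folder point in my reply: one rough way to see whether the various temp locations share a partition, and how much room each partition has, is a small Python check like the one below. This is only a sketch and not part of Moses or DoMY. The function names are made up for illustration, and the folder list is an assumption; substitute whatever you pass as --temp-dir and your actual training output folder.

```python
import os
import shutil
import tempfile


def free_space_by_device(paths):
    """Group existing folders by the device they live on and record free bytes."""
    report = {}
    for path in paths:
        if not os.path.isdir(path):
            continue  # skip folders that do not exist (yet)
        device = os.stat(path).st_dev            # identifies the partition/mount
        free = shutil.disk_usage(path).free      # bytes available on that partition
        entry = report.setdefault(device, {"free_bytes": free, "paths": []})
        entry["paths"].append(path)
    return report


def all_on_one_partition(paths):
    """True when every existing folder sits on the same device."""
    return len(free_space_by_device(paths)) <= 1


if __name__ == "__main__":
    # Hypothetical candidates: the system temp folder, the current working
    # folder and the home folder. Replace with your own temp and output paths.
    candidates = [tempfile.gettempdir(), os.getcwd(), os.path.expanduser("~")]
    for device, entry in free_space_by_device(candidates).items():
        gib = entry["free_bytes"] / 2**30
        print(f"device {device}: {gib:.1f} GiB free, used by {entry['paths']}")
```

If the folders turn out to be spread over several devices, each of them needs enough free space on its own, which is why a single large root partition is the simpler arrangement.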
