Assuming your configuration already follows the others' suggestions, your 
 machine's disk subsystem is the bottleneck that ultimately limits 
 Moses' speed when it uses on-disk binarized files.

 See the recent moses-support thread with the subject "MT training on a 
 laptop" for some discussion of slow versus fast hard disk performance.

 Barry mentioned slow performance with NFS mounts. Virtual machines also 
 tend to have slow disk performance.

 Here are some suggestions to relieve the hardware bottleneck.

   * Use 7,200 RPM (or faster) SATA3 disks.
   * Combine 2 or more disks in a RAID0 (MDADM software RAIDs work well, 
 but keep backups, since RAID0 has no redundancy).
   * Use solid state disks (SSDs). They are much faster than spinning 
 disks, especially for random reads.
   * Combine 2 or more SSDs in a RAID0 for still higher throughput 
 (though RAM remains faster).

 You can add new disks to your system and mount them into the file 
 system. Save the binarized files to the fast disk's mount point and edit 
 the paths in your Moses config file so that it finds them there.

 KEN: what does "cat binaryfile >/dev/null" do exactly? In the process 
 of reading the file from disk, does the OS cache all or part of the file 
 in RAM?
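
 For reference, here is a minimal sketch of the warm-up step as I read 
 it from Ken's message. It rests on one assumption about the mechanism: 
 on Linux, ordinary file reads go through the kernel page cache, so a 
 file that has just been read can stay in RAM until memory pressure 
 evicts it. The warm_cache function name is hypothetical and not part of 
 Moses; the paths in the usage comment are the ones from Shweta's log.

```shell
#!/bin/sh
# warm_cache FILE...
# Hypothetical helper (not part of Moses): read each file once and discard
# the bytes, the same way "cat binaryfile >/dev/null" does. The read itself
# is what may leave the file's pages in the kernel page cache (assumption:
# Linux buffered I/O, and enough free RAM to hold them).
# Prints the total size in bytes of the files it read.
warm_cache() {
    total=0
    for f in "$@"; do
        if [ ! -f "$f" ] || [ ! -r "$f" ]; then
            echo "warm_cache: skipping $f (not a readable file)" >&2
            continue
        fi
        # Read the whole file and throw the data away.
        cat "$f" > /dev/null || return 1
        # Record the file size so the caller can compare it with free RAM.
        size=$(wc -c < "$f")
        total=$((total + size))
    done
    echo "$total"
}

# Example call, using the file names from the log in this thread:
#   warm_cache /home/ssp/smt/etof/model/phrase-table.binphr* \
#              /home/ssp/smt/etof/lm/europarl-v6.blm.mm
```

 If the number it prints is larger than the RAM you have free, the files 
 cannot all stay cached at once, which is the "you're in trouble" case 
 Ken describes.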

 Tom


 On Fri, 02 Mar 2012 14:08:25 -0500, Kenneth Heafield wrote:
> It's using lazy mmap, as evidenced by the .mm at the end.  So "loading"
> time will be fast, but this may still explain the slow speed.
>
> Regardless of using IRSTLM or KenLM, cat the binary file >/dev/null
> first, though not being lazy (remove .mm for IRSTLM or use 8 for KenLM)
> should accomplish the same thing.
>
> The "on disk" binary phrase tables need to be in RAM for reasonable
> performance.  cat >/dev/null them as well.  Effectively, "on disk" means
> that it's memory accessed as a file, so the kernel doesn't count it as
> part of the process virtual size (a neat accounting trick if you ask
> me).  If the virtual size of Moses plus the binary phrase tables exceeds
> physical memory, you're in trouble.
>
> Kenneth
>
> On 03/02/2012 01:20 PM, Nicola Bertoldi wrote:
>> I would only point out that IRSTLM is not responsible for the very
>> long loading time.
>> In fact, it starts and ends in less than 1 second (it starts at 49
>> seconds and ends at the same time).
>>
>> So probably, as Barry mentioned, you have to use binarized versions of
>> the phrase and reordering tables.
>>
>> In the configuration file where you specify the translation table,
>> the first field is set to 1, meaning that your phrase table is
>> binarized.
>>
>> Are you sure that you actually created it?
>> In other words, do you have a set of files like
>> /home/ssp/smt/etof/model/phrase-table.binphr*
>>
>> cheers,
>> Nicola
>>
>> On Mar 2, 2012, at 6:29 PM, Barry Haddow wrote:
>>
>>> Hi Shweta
>>>
>>> Here's some suggestions:
>>> - Make sure you binarise the reordering model too
>>> - Try with kenlm instead of irstlm
>>> - make sure your binarised files are on a local disk (not nfs!)
>>> - Remember that you can translate a batch of sentences in one moses run
>>>
>>> cheers - Barry
>>>
>>> On Friday 02 March 2012 17:15:35 shweta porwal wrote:
>>>> Hi, I'm building an SMT web translation system using the guidelines
>>>> on the Moses home page.
>>>>
>>>> The translations are good, but even after following the memory
>>>> management steps for speeding up translation given in the
>>>> step-by-step guide, I am still facing delays in translation output
>>>> due to disk access times.
>>>> According to the guide it should not load the entire phrase table
>>>> every time, but I guess I have missed something.
>>>>
>>>>
>>>> The output of the echo command is:
>>>>
>>>> echo "mardi" | /opt/tools/moses/dist/bin/moses -f
>>>> /home/ssp/smt/etof/model/moses-bin.ini
>>>>
>>>> Defined parameters (per moses.ini or switch):
>>>>       config: /home/ssp/smt/etof/model/moses-bin.ini
>>>>       distortion-file: 0-0 wbe-msd-bidirectional-fe-allff 6
>>>> /home/ssp/smt/etof/model/reordering-table.wbe-msd-bidirectional-fe
>>>>       distortion-limit: 6
>>>>       input-factors: 0
>>>>       lmodel-file: 1 0 3 /home/ssp/smt/etof/lm/europarl-v6.blm.mm
>>>>       mapping: 0 T 0
>>>>       ttable-file: 1 0 0 5 /home/ssp/smt/etof/model/phrase-table
>>>>       ttable-limit: 20
>>>>       weight-d: 0.3 0.3 0.3 0.3 0.3 0.3 0.3
>>>>       weight-l: 0.5000
>>>>       weight-t: 0.20 0.20 0.20 0.20 0.20
>>>>       weight-w: -1
>>>> Loading lexical distortion models...have 1 models
>>>> Creating lexical reordering...
>>>> weights: 0.300 0.300 0.300 0.300 0.300 0.300
>>>> Loading table into memory...done.
>>>> Start loading LanguageModel 
>>>> /home/ssp/smt/etof/lm/europarl-v6.blm.mm :
>>>> [49.000] seconds
>>>> In LanguageModelIRST::Load: nGramOrder = 3
>>>> Language Model Type of /home/ssp/smt/etof/lm/europarl-v6.blm.mm is 
>>>> 1
>>>> blmt
>>>> loadbin()
>>>> lmtable::loadbin_dict()
>>>> dict->size(): 41308
>>>> loadbin_level (level 1)
>>>> mapping 41308 1-grams
>>>> tableOffs 494937 tableGaps3417-grams
>>>> done (level1)
>>>> loadbin_level (level 2)
>>>> mapping 484826 2-grams
>>>> tableOffs 1114557 tableGaps445-grams
>>>> done (level2)
>>>> loadbin_level (level 3)
>>>> mapping 297921 3-grams
>>>> tableOffs 8386947 tableGaps2435-grams
>>>> done (level3)
>>>> done
>>>> OOV code is 1499
>>>> IRST: m_unknownId=1499
>>>> Finished loading LanguageModels : [49.000] seconds
>>>> Start loading PhraseTable /home/ssp/smt/etof/model/phrase-table : 
>>>> [49.000]
>>>> seconds
>>>> filePath: /home/ssp/smt/etof/model/phrase-table
>>>> Finished loading phrase tables : [49.000] seconds
>>>> IO from STDOUT/STDIN
>>>> Created input-output object : [49.000] seconds
>>>> Translating line 0  in thread id 3039353712
>>>> Translating: mardi
>>>>
>>>> reading bin ttable
>>>> size of OFF_T 8
>>>> binary phrasefile loaded, default OFF_T: -1
>>>> Collecting options took 0.120 seconds
>>>> Search took 0.120 seconds
>>>> tuesday
>>>> BEST TRANSLATION: tuesday [1]  [total=-13.136]<<0.000, -1.000, 
>>>> 0.000,
>>>> -0.547, 0.000, 0.000, 0.000, 0.000, 0.000, -27.202, -0.916, 
>>>> -0.496, -0.865,
>>>> -0.580, 1.000>>
>>>> reset caches
>>>> Translation took 0.120 seconds
>>>> Finished translating
>>>> reset mmap
>>>> len  = 623037
>>>> sync = 0
>>>> running msync...
>>>> done. Running munmap...
>>>> done
>>>> len  = 7272835
>>>> sync = 0
>>>> running msync...
>>>> done. Running munmap...
>>>> done
>>>> len  = 2087882
>>>> sync = 0
>>>> running msync...
>>>> done. Running munmap...
>>>> done
>>>> len  = 623037
>>>> sync = 0
>>>> running msync...
>>>> done. Running munmap...
>>>> done
>>>> len  = 7272835
>>>> sync = 0
>>>> running msync...
>>>> done. Running munmap...
>>>> done
>>>> len  = 2087882
>>>> sync = 0
>>>> running msync...
>>>> done. Running munmap...
>>>> done
>>>>
>>>> It takes about 2-3 minutes to translate a single word.
>>>>
>>>> Any suggestions on how I could reduce the total translation time? I
>>>> would really appreciate the help.
>>>>
>>>> Thanks.
>>>>
>>>
>>> --
>>> Barry Haddow
>>> University of Edinburgh
>>> +44 (0) 131 651 3173
>>>
>>> --
>>> The University of Edinburgh is a charitable body, registered in
>>> Scotland, with registration number SC005336.
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> [email protected]
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
