Hi Hieu,

If I understand the new parallelization scheme of phrase table *scoring* correctly, then:

a) --parallel switches on concurrent processing of the direct and inverse phrase table halves;
b) --cores N switches on (further) parallelization within each phrase table half, splitting it into N chunks, scoring them in parallel, and merging the output.

So using both --parallel and --cores N leads to 2*N scorers running in parallel?
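
To make the question concrete, here is a minimal Python sketch of the scheme as I picture it. It is not the actual train-model.perl code: score_chunk, split, score_half, score_table and max_concurrent_scorers are names made up for illustration, the "score" is a dummy value, and threads stand in for what are presumably separate scorer processes.

```python
from concurrent.futures import ThreadPoolExecutor


def score_chunk(chunk):
    # Dummy stand-in for the real scorer: pairs each entry with its length.
    return [(pair, len(pair)) for pair in chunk]


def split(items, n):
    # Split items into n contiguous chunks of near-equal size.
    k, r = divmod(len(items), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + k + (1 if i < r else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def score_half(pairs, cores):
    # "--cores N" as I understand it: N chunks scored concurrently,
    # then merged back in the original order.
    with ThreadPoolExecutor(max_workers=cores) as pool:
        parts = list(pool.map(score_chunk, split(pairs, cores)))
    return [scored for part in parts for scored in part]


def score_table(direct, inverse, cores=1, parallel=False):
    # "--parallel" as I understand it: both halves are processed at once.
    if parallel:
        with ThreadPoolExecutor(max_workers=2) as pool:
            f_direct = pool.submit(score_half, direct, cores)
            f_inverse = pool.submit(score_half, inverse, cores)
            return f_direct.result(), f_inverse.result()
    return score_half(direct, cores), score_half(inverse, cores)


def max_concurrent_scorers(cores, parallel):
    # Upper bound on scorers running at the same time under this reading.
    return (2 if parallel else 1) * cores
```

Under this reading, max_concurrent_scorers(4, True) gives 8, which is the 2*N I am asking about.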

Did I get it right?

Cheers,
Ceslav

on 27/05/12 23:48 Hieu Hoang said the following:
good to know it's working.

you might want to try out the new feature that caused you trouble:

        
        
                                   Time taken  Peak disk usage (GB)  argument to train-model.perl
Baseline                           02:56:05    61.2
New                                02:51:00    36.5
+4 cores                           02:30:09    37.1                  -cores 4
+compress intermediate sort files  02:10:11    18.7                  -cores 4 -sort-compress gzip
+optimized                         01:37:00    18.7                  -cores 4 -sort-compress gzip -sort-buffer-size 200M -sort-batch-size 253 -sort-parallel 4
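
For a rough reading of these numbers, the short Python sketch below converts the times to seconds and compares each row against the baseline. The figures are copied from the table; the helper names (to_seconds, compare) are illustrative only.

```python
# (label, time taken as HH:MM:SS, peak disk usage in GB), copied from the table.
RUNS = [
    ("Baseline", "02:56:05", 61.2),
    ("New", "02:51:00", 36.5),
    ("+4 cores", "02:30:09", 37.1),
    ("+compress intermediate sort files", "02:10:11", 18.7),
    ("+optimized", "01:37:00", 18.7),
]


def to_seconds(hms):
    # Convert "HH:MM:SS" to a number of seconds.
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def compare(runs):
    # Return (label, speedup vs. baseline, fraction of baseline disk) per row.
    _, base_time, base_disk = runs[0]
    base_seconds = to_seconds(base_time)
    return [
        (label, base_seconds / to_seconds(time), disk / base_disk)
        for label, time, disk in runs
    ]


if __name__ == "__main__":
    for label, speedup, disk_fraction in compare(RUNS):
        print(f"{label:35s} {speedup:5.2f}x faster, {disk_fraction:6.1%} of baseline disk")
```

The last row comes out at roughly 1.8 times faster than the baseline while using about 31% of its peak disk space.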


_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
