Hi Vineet,

Things are much simpler than you might imagine.

1, 2. The parallel minimum error rate training and parallel test scripts
basically split the test set into batches that are translated on different
machines with the same run-time code. Hence, there is no parallelism at
the level of single sentences.
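As a rough illustration of the batching idea only (this is not the actual
code of the Moses scripts, and the function names split_into_batches and
merge_outputs are hypothetical), a minimal sketch could look like this.
Each batch would then be handed to a separate decoder job, and the outputs
joined back in the original order:

```python
from typing import List


def split_into_batches(sentences: List[str], n_jobs: int) -> List[List[str]]:
    """Split sentences into at most n_jobs contiguous, near-equal batches.

    Hypothetical helper: each sentence stays whole, so parallelism is only
    across sentences, never inside one.
    """
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    base, extra = divmod(len(sentences), n_jobs)
    batches: List[List[str]] = []
    start = 0
    for i in range(n_jobs):
        # The first `extra` batches receive one additional sentence.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # fewer sentences than jobs: no empty batches
        batches.append(sentences[start:start + size])
        start += size
    return batches


def merge_outputs(batch_outputs: List[List[str]]) -> List[str]:
    """Concatenate per-batch outputs, preserving the original order."""
    return [line for batch in batch_outputs for line in batch]
```

Because the batches are contiguous, concatenating the translated batches
in job order restores the order of the input test set.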

7. Parallel word-alignment training uses two distinct processes to train
the source-to-target and target-to-source alignments.

3, 4. It would be interesting to study a parallel implementation of the
search algorithm, but I would suggest starting with a simpler model
to check how much you can gain.

5, 6. Yes, you just need a sample of human-made translations; the larger,
the better. The amount of data actually depends on the difficulty of the
task, namely the distance between the languages and the vocabulary size.
Limited-domain tasks, like travel expressions (see the IWSLT tasks,
http://www.slc.atr.jp/IWSLT2008), can be approached with parallel corpora
of 40K sentence pairs. Translating Europarl or Chinese news requires
working with several million sentence pairs.


Best,
Marcello








________________________________________
From: [EMAIL PROTECTED] [EMAIL PROTECTED] on behalf of Vineet Kashyap [EMAIL PROTECTED]
Sent: Saturday, 19 April 2008, 16:29
To: [email protected]
Subject: [Moses-support] Running Moses In Parallel

Hello users

I am new to Moses and will be using it for my research.

Out of curiosity, I would like answers to the following questions:

1. Which parts are made parallel when Moses is run on 'n' processors?
2. What happens to the input sentence, and what does each processor do
   in terms of computation? Searching, assigning probability weights?
3. Can we modify moses-parallel.pl to improve parallelization?
4. Can we use MPICH for parallel implementation?
5. How big should the parallel corpora be to get accurate results? How many
   sentences/words?
6. A parallel corpus consists of the text in the source language along with
   its translation in the target language. Is that all you need?
7. Also, when training on large data, the --parallel option can be used.
   Again, can we use MPICH, and which parts are made parallel?

I know these are a lot of questions, but it would be highly appreciated
if someone took the time to answer them.

Thanks in advance.

Regards

Vineet

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
