As you say, the numbers in the input sentence would be unknown. However,
the reason to use placeholders is to make them known to the LM and
phrase-table, so that both can assign more accurate scores to them.

Therefore you need to replace numbers in your input sentence with
placeholders.

You can then use the word alignment from the decoder to put the original
numbers back in the output.
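
To illustrate the idea, here is a rough Python sketch of that round trip. It
is not part of Moses, only an outline under some assumptions: the placeholder
token "@num@" is made up (use whatever token your training data and LM
contain), the helper names are hypothetical, and the word alignment is
assumed to be available as "source-target" index pairs such as "0-0 1-2".

```python
import re

# Hypothetical placeholder token; it must match the one used in training.
PLACEHOLDER = "@num@"
# Assumption: a "number" is digits with optional . or , separated groups.
NUMBER_RE = re.compile(r"^\d+(?:[.,]\d+)*$")


def mask_numbers(sentence):
    """Replace number tokens with the placeholder.

    Returns the masked sentence and a dict mapping each masked
    source position to the original number string.
    """
    tokens = sentence.split()
    originals = {}
    for i, tok in enumerate(tokens):
        if NUMBER_RE.match(tok):
            originals[i] = tok
            tokens[i] = PLACEHOLDER
    return " ".join(tokens), originals


def parse_alignment(text):
    """Parse 'src-tgt' pairs such as '0-0 1-2' into (int, int) tuples."""
    return [tuple(int(x) for x in pair.split("-")) for pair in text.split()]


def restore_numbers(translation, alignment, originals):
    """Put each original number back where its placeholder was aligned."""
    tokens = translation.split()
    for src, tgt in alignment:
        if src in originals and tgt < len(tokens) and tokens[tgt] == PLACEHOLDER:
            tokens[tgt] = originals[src]
    return " ".join(tokens)


masked, originals = mask_numbers("the price is 42 euros")
# 'masked' is what would be sent to the decoder; the translation and the
# alignment string below are invented for the example.
restored = restore_numbers(
    "le prix est de @num@ euros",
    parse_alignment("0-0 1-1 2-2 3-4 4-5"),
    originals,
)
print(masked)
print(restored)
```

Placeholders that end up with no alignment point are left untouched by this
sketch, so you would still need a fallback for those cases.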

This is something I want to implement better in the decoder. So if you, or
anyone, is willing to help me and contribute some time & code, I can help out.



On 6 June 2013 15:25, Arezki Sadoune <[email protected]> wrote:

> Dear Hieu Hoang,
>
> Thank you for the answer,
>
> Yes, I'm replacing the numbers with a placeholder in the training data and
> in the LM as well. I thought this might address the issue with numbers without
> losing too much translation quality.
>
> Regarding the input sentence, I'm not interfering with the process, assuming
> that the number will stay the same since it is unknown. Another option would
> be to write a script which, as you said, puts back the original number for
> 100% accuracy.
>
> Do you think I could use both at the same time?
>
> Regards
>
>
> A.S
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>


-- 
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu