Hello all,

I am preparing the phrase table for French-to-English translation. While feeding in the French corpus, I ran into the following questions:

1. How can we make the training system recognize accented characters (such as é) in French words?

2. How should special characters be handled? For example, when I build the phrase table from a corpus that contains ' (apostrophe), - (hyphen), or . (dot), the tokenizer fails. If I put a space before and after these special characters, the tokenizer works and the phrase table is built properly. However, we then also have to insert the same spaces into any French input that contains these characters before passing it to the decoder.

3. We would like to develop a web translation tool that translates French files into English using Moses internally. As a first step, we built an EXE that takes French text files (not web pages) and converts them into English; it takes about 2 minutes for a 6 KB file. When we eventually move to the web, what is the best approach to make the translation tool faster, like your Moses Online Demo?

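
To make the workaround in question 2 concrete, here is a minimal Python sketch of the spacing step we apply by hand, assuming the corpus is UTF-8 text. The function name `space_out_punctuation` is hypothetical and is not part of Moses; it only pads the three characters mentioned above with spaces and leaves accented letters such as é untouched.

```python
import re

# Hypothetical helper, not a Moses API: it covers only the three
# characters mentioned in question 2 (apostrophe, dot, hyphen).
_SPECIAL = re.compile(r"(['.\-])")


def space_out_punctuation(line: str) -> str:
    """Put a single space on each side of ', . and - in one line of text."""
    # Surround every matched character with spaces.
    spaced = _SPECIAL.sub(r" \1 ", line)
    # Collapse runs of whitespace and trim the ends.
    return " ".join(spaced.split())


if __name__ == "__main__":
    print(space_out_punctuation("l'été est-il fini."))
```

The same function would have to be applied both to the training corpus and to every sentence sent to the decoder, which is exactly the duplication we would like to avoid.
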
Thanks & Regards, Abhinandan
