Hi, I'm working on a transliteration project. The input is a word in one language and the output is the same word written in English letters (not translated). My language model will be created from the Google 1-gram file, with each letter of a word treated as a separate word (token). This is the original file:

    </S> 95119665584
    <S> 95119665584
    , 30578667846
    . 22077031422
    <UNK> 21594821357
    the 19401194714
    - 16337125274
    of 12765289150
    and 12522922536

This is the file after inserting spaces between the letters of each word:

    t h e 19401194714
    - 16337125274
    o f 12765289150
    a n d 12522922536

Now I have a "1gram" file that no longer contains just 1-grams (one word per line), but in effect also 2-grams, 3-grams, etc., because each letter counts as a word.

How can I run the SRILM "ngram-count" script to create a language model from this file? When I run the script normally, the integers are counted as words too, instead of being treated as the probability or number of occurrences.

Can anyone help me please?

Thank you,
Guy.
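
For reference, the letter-splitting step described above could be sketched roughly as follows. This is only an illustration under assumptions: the function name split_letters is hypothetical, markers such as <S>, </S> and <UNK> are assumed to stay as single tokens, and the tab placed between the letters and the count is a guess at a layout that keeps the count distinguishable from the tokens, not something confirmed for SRILM.

```python
def split_letters(line):
    """Turn 'the 19401194714' into 't h e<TAB>19401194714'.

    Hypothetical helper: the tab-separated output is an assumption,
    chosen so the count cannot be confused with a letter token.
    """
    # The last whitespace-separated field is taken to be the count.
    token, count = line.rsplit(None, 1)
    int(count)  # raises ValueError if the last field is not an integer

    # Assumption: markers like <S>, </S>, <UNK> are kept whole.
    if token.startswith("<") and token.endswith(">"):
        spaced = token
    else:
        # Insert a space between every character of the word.
        spaced = " ".join(token)
    return f"{spaced}\t{count}"


if __name__ == "__main__":
    sample = [
        "<UNK> 21594821357",
        "the 19401194714",
        "- 16337125274",
        "of 12765289150",
        "and 12522922536",
    ]
    for entry in sample:
        print(split_letters(entry))
```

Keeping the count in its own tab-separated field is just one way to mark it as a count rather than a word; whether ngram-count accepts exactly this layout would need to be checked against the SRILM documentation.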
_______________________________________________ Moses-support mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/moses-support
