Dear Kenneth and Moses community,

@Kenneth, thank you for the tip!
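
For anyone who finds this thread later, here is a minimal sketch of the preprocessing described in the reply quoted below: replace each space with a placeholder token, then put a space between every character so KenLM sees characters as "words". The function name `to_char_tokens` and the script name used afterwards are made up for illustration; `<spc>` is just the token suggested in the thread, and any token that does not occur in the corpus should work equally well.

```python
import sys

# Placeholder for the original spaces; "<spc>" is the token suggested in the
# thread. Assumption: it does not already occur in the corpus.
SPACE_TOKEN = "<spc>"


def to_char_tokens(line, space_token=SPACE_TOKEN):
    """Return the line as space-separated characters.

    Original ASCII spaces are replaced by space_token so they survive as
    tokens. Other whitespace (tabs, etc.) is not treated specially here.
    """
    line = line.rstrip("\n")
    return " ".join(space_token if ch == " " else ch for ch in line)


if __name__ == "__main__":
    # Read the corpus from stdin and write the character-level version to
    # stdout, one sentence per line, leaving the input file untouched.
    for raw_line in sys.stdin:
        print(to_char_tokens(raw_line))
```

Saved as, say, char_tokens.py (hypothetical name), the output can be piped straight into KenLM so the input file itself is never altered, for example: python char_tokens.py < corpus.txt | lmplz -o 5 > chars.arpa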
Regards,
Nat

On Wed, Nov 9, 2016 at 4:46 PM, Kenneth Heafield <[email protected]> wrote:

> No. Tokenizer and LM are separate tools. You can of course replace space
> with a token like <spc> or something.
>
> On November 9, 2016 6:04:07 AM GMT+00:00, Nat Gillin <[email protected]>
> wrote:
>
>> Dear Moses community,
>>
>> Other than manually replacing each space with an unused character and
>> adding spaces between characters before training a language model with
>> KenLM, is it possible for KenLM to generate character n-grams and output
>> them in ARPA format without altering the input file?
>>
>> Regards,
>> Nat
>>
>> ------------------------------
>>
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>>
