On 18/05/15 14:02, Hieu Hoang wrote:
> I prefer FilePiece outputs a faithful representation of the file. If
> you need to clean your data, I think it should go into the cleaning or
> normalization scripts

That could end up in a lot more places and be more brittle, though. Would it help if I made the default "do not strip carriage returns" and made lexical-reordering-score request the conversion explicitly?

Bear in mind that every time we fopen() a file without the "b" mode flag, we're really saying we want the same conversion if the runtime feels the need, as it would on Windows. When we call ReadLine(), at least it knows we really want the file interpreted as text.

Jeroen
_______________________________________________
Moses-support mailing list
http://mailman.mit.edu/mailman/listinfo/moses-support
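
For illustration only, here is a minimal sketch of what "do not strip carriage returns by default, convert only on request" could look like. It is not the actual FilePiece code: ReadLineSketch, StripTrailingCR and the strip_cr parameter are hypothetical names invented for this example. It assumes the stream was opened in binary mode, so the runtime performs no newline translation of its own.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of FilePiece): drop one trailing '\r', if any.
inline void StripTrailingCR(std::string &line) {
  if (!line.empty() && line.back() == '\r') line.pop_back();
}

// Hypothetical line reader over a stream opened in binary mode.
// Reads up to (and consumes) the next '\n'. The default is a faithful
// copy of the bytes; callers that want text semantics pass strip_cr = true.
// Returns false only when end of file is hit before any byte is read.
inline bool ReadLineSketch(std::FILE *in, std::string &line,
                           bool strip_cr = false) {
  line.clear();
  bool got_any = false;
  int c;
  while ((c = std::fgetc(in)) != EOF) {
    got_any = true;
    if (c == '\n') break;
    line.push_back(static_cast<char>(c));
  }
  if (strip_cr) StripTrailingCR(line);
  return got_any;
}
```

Under this sketch, a caller such as lexical-reordering-score would pass strip_cr = true explicitly, while every other caller keeps seeing the bytes exactly as they are in the file.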
