On 18/05/15 14:02, Hieu Hoang wrote:
> I prefer FilePiece outputs a faithful representation of the file. If
> you need to clean your data, I think it should go into the cleaning or
> normalization scripts.

That could go into a lot more places and end up being more brittle though.

Would it help if I made the default "do not strip carriage returns", and
made lexical-reordering-score request the conversion explicitly?

Bear in mind here that every time we fopen() a file without the "b" mode
flag, we're really saying we want the same conversion if the runtime
feels the need — as it would on Windows.  When we call ReadLine(), at
least it knows we really want the file interpreted as text.
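
To make the proposal concrete, here is a rough sketch of what an opt-in
conversion could look like. It is not the actual FilePiece interface:
ReadLineSketch and its strip_cr parameter are hypothetical names, and it
works on a std::istream purely for illustration. The default returns the
line faithfully, and a caller such as lexical-reordering-score would ask
for the carriage return to be dropped.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper, NOT the FilePiece API: reads one '\n'-terminated
// line from `in` into `line`. Returns false once no more lines remain.
//
// strip_cr == false (default): the line is returned as found in the
//   stream, so a Windows-style "\r\n" ending leaves a trailing '\r'.
// strip_cr == true: a single trailing '\r' is removed, which is roughly
//   what a text-mode fopen() would do for us on Windows.
inline bool ReadLineSketch(std::istream &in, std::string &line,
                           bool strip_cr = false) {
  if (!std::getline(in, line)) return false;
  if (strip_cr && !line.empty() && line.back() == '\r') {
    line.pop_back();
  }
  return true;
}
```

With this shape, existing callers see no change in behaviour, and only
the tools that really want text semantics pay for the conversion.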


Jeroen

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
