There are non-traditional uses like ReadLine('\0') to read
null-delimited tokens.

But I'd support Jeroen here: the default ReadLine() with no argument
should swallow \r.

In any case, if you're going to change code there, can you do it upstream
in github.com/kpu/kenlm? I just gave you commit access.

Also, how would you feel if I changed it to be FakeIFStream with
operator>> extraction, at least for integer/float types?

Kenneth

On 05/18/2015 03:41 AM, Jeroen Vermeulen wrote:
> On 18/05/15 14:02, Hieu Hoang wrote:
>> I prefer that FilePiece output a faithful representation of the file. If
>> you need to clean your data, I think it should go into the cleaning or
>> normalization scripts.
> 
> That could go into a lot more places and end up being more brittle though.
> 
> Would it help if I made the default "do not strip carriage returns", and
> made lexical-reordering-score request the conversion explicitly?
> 
> Bear in mind here that every time we fopen() a file without the "b" mode
> flag, we're really saying we want the same conversion if the runtime
> feels the need — as it would on Windows.  When we call ReadLine(), at
> least it knows we really want the file interpreted as text.
> 
> 
> Jeroen
> 
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
> 