This is a separate issue from the parallel "Tokenization problem" thread...

The tokenizer.perl script has long had one line that transforms the grave 
accent (`) to an apostrophe and another that transforms a double 
apostrophe ('') to a single quote. I suspect these have been in the script 
since the beginning. However, they "bit" me on a recent project. It was 
easy enough to work around.
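
To make the behavior concrete, here is a minimal sketch of the two rules as I described them above. It is written in Python for illustration only and is my paraphrase, not the actual lines from tokenizer.perl; the function name apply_quote_rules is hypothetical, and the rule order (grave accent first) is an assumption.

```python
def apply_quote_rules(text: str) -> str:
    """Apply the two quote rewrites as described in this message.

    Assumption: this mirrors the description above, not the real
    tokenizer.perl source, and the grave-accent rule runs first.
    """
    # Rule 1: grave accent (`) becomes an apostrophe (').
    text = text.replace("`", "'")
    # Rule 2: double apostrophe ('') becomes a single quote (').
    text = text.replace("''", "'")
    return text


if __name__ == "__main__":
    # LaTeX-style quoting loses its distinct open/close markers.
    print(apply_quote_rules("``quoted''"))
```

Under that assumed order the rules interact: a pair of grave accents first becomes a double apostrophe and is then collapsed by the second rule, so text that relies on those characters is silently changed during tokenization.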

Still, I'm wondering: do they still belong in the tokenizer.perl script? 
Or should they be moved into one of the other scripts? The 
normalize-punctuation.perl script seems to be a good candidate.
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
