Other recent changes to the scripts have enhanced tokenizer.perl to handle non-printing ASCII and other control characters. A few weeks ago, the "#escape special chars" section (around line 151) was updated to include ' and ".
At that time, the square brackets [] were escaped to &bra; and &ket;. Now I see they're escaped to &#91; and &#93;. This is great because it eliminates the two custom named entities "bra" and "ket". I'd like to suggest the same for the vertical bar, which is currently escaped to &bar;, and change it to &#124;. This change would eliminate the remaining custom named entity. detokenizer.perl would need updating to handle the new escaping and to add &bar; to the legacy section.

(Resent because my browser mail client converted the escaping.)

Tom

On Fri, 22 Jun 2012 09:26:17 +0100, Hieu Hoang <[email protected]> wrote:
> ah, that's where the mgizapp name comes from. I'll ask qin to make
> the automake & cmake exec names consistent (ie. mgiza).
>
> On 22/06/2012 05:53, Tom Hoar wrote:
>> Hieu,
>>
>> Another observation about the changes. Compiling the new MGIZA++
>> with cmake on my system yesterday created the binary
>> $prefix/bin/mgizapp. Line 228 of the most recent train-model.perl
>> reads:
>> $GIZA = "$_EXTERNAL_BINDIR/mgiza"
>>
>> The current/valid binary name of MGIZA++ has been a topic of
>> discussion here before. However, since the file name changes
>> unexpectedly between builds and this topic is "changes to the
>> training scripts", would it be better if train-model.perl tests
>> for both possible filenames for MGIZA++ binary?
>>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
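
To make the quoted suggestion about the MGIZA++ binary concrete, here is a rough sketch of a "test for both filenames" check. It is written in Python purely for illustration; train-model.perl is Perl, and this is not a patch against it. The function name find_mgiza and the candidate order are my own assumptions. Only the two file names, mgiza and mgizapp, come from the thread.

```python
import os


def find_mgiza(bindir, candidates=("mgiza", "mgizapp")):
    """Return the first executable candidate found in bindir, or None.

    Hypothetical helper: the names "mgiza" and "mgizapp" are the two
    binary names mentioned in the thread; everything else is assumed.
    """
    for name in candidates:
        path = os.path.join(bindir, name)
        # Accept only regular files that the current user may execute.
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None
```

A caller would presumably stop with an error message when the result is None, rather than fall back silently to a default name.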

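
The escaping proposal above can be sketched the same way, again in Python for illustration only and not as the code in tokenizer.perl or detokenizer.perl. The table and function names are hypothetical. The mappings are the ones named in this message: the numeric entities for [ ] and |, plus the old named entities kept as legacy input on the detokenizer side. Other characters the real scripts handle are left out.

```python
# Mappings taken from the message: numeric entities for | [ ].
CURRENT = {"|": "&#124;", "[": "&#91;", "]": "&#93;"}

# Older custom named entities, accepted on input only (legacy section).
LEGACY = {"&bar;": "|", "&bra;": "[", "&ket;": "]"}


def escape(text):
    """Tokenizer side: replace each special character with its numeric entity."""
    for char, entity in CURRENT.items():
        text = text.replace(char, entity)
    return text


def unescape(text):
    """Detokenizer side: accept both the legacy and the numeric entities."""
    for entity, char in LEGACY.items():
        text = text.replace(entity, char)
    for char, entity in CURRENT.items():
        text = text.replace(entity, char)
    return text
```

With this arrangement, text tokenized before the change (containing &bar;) and text tokenized after it (containing &#124;) would both detokenize to the same plain vertical bar.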