Uli,

--mark-unknown will apply the specified prefix or suffix to unknowns in the 
final output, but it won’t output these markers in the nbest lists. Doing the 
same for nbest lists shouldn’t be hard; I just need to find the right place so 
I don’t have to replicate this code for the different decoding algorithms.
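
In the meantime, post-processing the nbest file is one possible workaround. 
The sketch below is only an illustration, not Moses code: mark_unknowns and 
mark_nbest_line are hypothetical helper names, and it assumes the usual 
" ||| "-separated nbest layout (id, hypothesis, feature scores, total score), 
that unknown source words are passed through to the output unchanged, and 
that the set of unknown words has been collected separately.

```python
def mark_unknowns(hypothesis, unknown_words, prefix="<unk>", suffix="</unk>"):
    """Wrap every token found in unknown_words with prefix and suffix."""
    return " ".join(
        f"{prefix}{tok}{suffix}" if tok in unknown_words else tok
        for tok in hypothesis.split()
    )


def mark_nbest_line(line, unknown_words, prefix="<unk>", suffix="</unk>"):
    """Mark unknown words in the hypothesis field of one nbest line.

    Assumes fields are separated by " ||| " and that the hypothesis is the
    second field. Lines that do not match this layout are returned unchanged
    apart from the stripped newline.
    """
    fields = line.rstrip("\n").split(" ||| ")
    if len(fields) < 2:
        return line.rstrip("\n")
    fields[1] = mark_unknowns(fields[1], unknown_words, prefix, suffix)
    return " ||| ".join(fields)


if __name__ == "__main__":
    # Made-up example line; "das" stands in for a word the model did not know.
    example = "0 ||| das house is small ||| lm: -27.0 ||| -4.2"
    print(mark_nbest_line(example, {"das"}))
```

This only touches the hypothesis field, so the scores stay intact, but a 
token that happens to match an unknown source word would be marked too. 
That is one reason doing it inside the decoder is the better fix.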

Thanks!
-Jeremy


> On Dec 2, 2015, at 12:39 PM, Ulrich Germann <[email protected]> wrote:
> 
> Have you tried specifying 
> 
> --mark-unknown 
> 
> on the command line? This will (i.e. should ;-)) prefix unknown words in the 
> output with UNK.
> 
> You can set the begin and end labels with --unknown-word-prefix and 
> --unknown-word-suffix.
> 
> For example 
> 
> --unknown-word-prefix '<unk>'
> --unknown-word-suffix '</unk>' 
> 
> would give you XML-style markup.
> 
> - Uli
> 
> On Wed, Dec 2, 2015 at 4:36 PM, Jeremy Gwinnup <[email protected]> wrote:
> Hi,
> 
> I’d like to be able to mark unknown words in nbest lists - where is a good 
> place to dig into the code so that it works with both phrase-based and chart 
> decoding?
> 
> Thanks!
> -Jeremy
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
> 
> 
> 
> -- 
> Ulrich Germann
> Senior Researcher
> School of Informatics
> University of Edinburgh
