Hi,

Sounds reasonable to me, but it would be good to have this as an option, as
Miles suggested.

-phi

On 25 Oct 2010 17:40, "Ben Gottesman" <[email protected]> wrote:
> Hi,
>
> Are truecase models still widely in use?
>
> I have a proposal for a tweak to the train-truecaser.perl script.
>
> Currently, we don't take the first token of a sentence as evidence for the
> true casing of that type, on the basis that the first word of a sentence is
> always capitalized. The first token of a segment is always assumed to be
> the first word of a sentence, and thus is never taken as casing evidence.
>
> However, if a given segment is only one token long, then the segment is
> probably not a sentence, and the token is quite possibly in its natural
> case. So my proposal is to take the sole token of one-token segments as
> evidence for true casing.
>
> I attach the code change.
>
> Any objections? If not, I'll check it in.
>
> Ben
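
For readers of the archive, the proposal above can be sketched as follows. This is a minimal illustration in Python, not the code from train-truecaser.perl and not the attached change: the function names and the `count_single_token_segments` flag are hypothetical, and the flag only illustrates how the behaviour could be made optional, as suggested in the reply. Tokenization is assumed to be plain whitespace splitting.

```python
from collections import Counter, defaultdict


def collect_casing_counts(segments, count_single_token_segments=True):
    """Count observed surface forms for each lowercased word type.

    Hypothetical sketch: the first token of a segment is skipped as
    casing evidence, unless the segment is a single token and the
    (assumed) option is switched on.
    """
    counts = defaultdict(Counter)
    for segment in segments:
        tokens = segment.split()
        if not tokens:
            continue
        # Default behaviour: ignore the first token, since a
        # sentence-initial capital says nothing about natural case.
        start = 1
        # Proposed tweak: a one-token segment is probably not a
        # sentence, so its sole token is counted as evidence.
        if count_single_token_segments and len(tokens) == 1:
            start = 0
        for token in tokens[start:]:
            counts[token.lower()][token] += 1
    return counts


def best_casing(counts):
    """Pick the most frequently observed surface form for each type."""
    return {word: forms.most_common(1)[0][0] for word, forms in counts.items()}


if __name__ == "__main__":
    corpus = ["Paris", "We love Paris", "the end"]
    print(best_casing(collect_casing_counts(corpus)))
```

With the flag off, the one-token segment "Paris" contributes nothing and only the sentence-internal occurrence is counted; with it on, both occurrences count toward the cased form.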
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
