Maybe your data contains control characters or Moses-reserved tokens?
The following regex substitutions (Perl style) may help:
s/[\x00-\x1f\x7f\n]//gs;
s/\<(s|unk|\/s|\s*and\s*|)\>//gs;
s/\[\s*and\s*\]//gs;
s/\|/_/gs;
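
If Perl is not at hand, the same four substitutions can be sketched in Python. This is only a rough translation under a few assumptions: the function name clean_line is made up for illustration, each line is cleaned on its own, and the caller writes the newline back afterwards. It does not collapse the extra spaces left behind by a removed token, just like the Perl lines above.

```python
import re

# Control characters (includes tab and newline), as in s/[\x00-\x1f\x7f\n]//gs
CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")
# <s>, <unk>, </s>, < and >, and the empty tag <>
RESERVED_TAGS = re.compile(r"<(s|unk|/s|\s*and\s*|)>")
# [and], [ and ], ...
BRACKETED_AND = re.compile(r"\[\s*and\s*\]")


def clean_line(line: str) -> str:
    """Apply the four substitutions to one line (hypothetical helper).

    The returned string has no trailing newline; the caller adds it back.
    """
    line = CONTROL_CHARS.sub("", line)
    line = RESERVED_TAGS.sub("", line)
    line = BRACKETED_AND.sub("", line)
    # The pipe is the factor separator in Moses, so replace it.
    return line.replace("|", "_")


if __name__ == "__main__":
    import sys

    for raw in sys.stdin:
        sys.stdout.write(clean_line(raw) + "\n")
```

Run it on both sides of the corpus so the files stay line-aligned, and check afterwards that no line has become empty, since a line consisting only of removed tokens would end up blank.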
Jörg
On Wed, Mar 7, 2012 at 7:23 PM, Feifan Liu <[email protected]> wrote:
> I checked the data once again but still can't figure out the reason. Any
> help will be appreciated.
>
>
> On Tue, Mar 6, 2012 at 11:55 PM, Feifan Liu <[email protected]> wrote:
>>
>> Hi All,
>> I ran into an "out of bounds" problem when training a model using
>> train.perl.
>>
>> The warning is:
>> WARNING: sentence 2176 has alignment point (3, 3) out of bounds (6, 3)
>> T: z z z
>> S: s l iy p ih nx
>>
>> Every sentence pair triggers this warning. For this sentence, for example,
>> T has three tokens, but the alignment point refers to a 4th one (index 3).
>> I checked the mailing list archive but didn't find a solution. The data
>> contains no empty lines and no "|" symbols.
>> Any suggestions would be much appreciated.
>> -philley
>>
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
--
**********************************************************************************
Jörg Tiedemann [email protected]
Dep. of Linguistics and Philology http://stp.lingfil.uu.se/~joerg/
Uppsala University tel: +46 (0)18 - 471 1412
Box 635, SE-751 26 Uppsala/SWEDEN fax: +46 (0)18 - 471 1094