Maybe your data contains control characters or tokens that are reserved in Moses?
The following Perl-style regex substitutions may help:

    s/[\x00-\x1f\x7f\n]//gs;
    s/\<(s|unk|\/s|\s*and\s*|)\>//gs;
    s/\[\s*and\s*\]//gs;
    s/\|/_/gs;

Jörg


On Wed, Mar 7, 2012 at 7:23 PM, Feifan Liu <[email protected]> wrote:
> I checked the data once again but still can't figure out the reason. Any help
> would be appreciated.
>
>
> On Tue, Mar 6, 2012 at 11:55 PM, Feifan Liu <[email protected]> wrote:
>>
>> Hi All,
>> I ran into an "out of bounds" problem when training a model using
>> train.perl.
>>
>> The warning information is:
>> WARNING: sentence 2176 has alignment point (3, 3) out of bounds (6, 3)
>> T: z z z
>> S: s l iy p ih nx
>>
>> Every sentence pair produces this warning. In this sentence, for example,
>> there are three letters in T, but the alignment points to a 4th one (index 3).
>> I checked the mailing list archive but didn't find a solution. The data
>> contains no empty lines and no "|" symbols.
>> Any suggestions would be much appreciated.
>> -philley
>>
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>



-- 
**********************************************************************************
 Jörg Tiedemann                                   [email protected]
 Dep. of Linguistics and Philology           http://stp.lingfil.uu.se/~joerg/
 Uppsala University                                  tel:  +46 (0)18 - 471 1412
 Box 635, SE-751 26 Uppsala/SWEDEN    fax: +46 (0)18 - 471 1094
