Hi all,
I have installed the latest version of Moses from sourceforge.net. I would just like to clarify: do we need to provide the corpus for both languages (source and target) as input to clean-corpus-n.perl? I ran the script for both languages and got the following messages.

For the source:

    ./clean-corpus-n.perl 200EnglishSens en hi 200EnglishSens.clean 1 50
    clean-corpus.perl: processing 200EnglishSens.en & .hi to 200EnglishSens.clean, cutoff 1-50
    Input sentences: 203 Output sentences: 187

For the target:

    ./clean-corpus-n.perl 200HindiSens hi en 200HindiSens.clean 1 50
    clean-corpus.perl: processing 200HindiSens.hi & .en to 200HindiSens.clean, cutoff 1-50
    Use of uninitialized value $opn in open at ./clean-corpus-n.perl line 46.
    Use of uninitialized value $opn in concatenation (.) or string at ./clean-corpus-n.perl line 46.
    Can't open '' at ./clean-corpus-n.perl line 46

So the problem again seems to be with the target language. How can I solve this problem of broken UTF that Tom pointed out?

--
Thanks & Regards,
nakul.
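For anyone who wants to reproduce the check, here is a minimal sketch of a pre-flight test on the input files. It assumes, based only on the "processing 200EnglishSens.en & .hi" message above, that the script looks for two files named after one shared stem plus each language extension. The function name check_parallel_corpus is hypothetical and not part of Moses. It reports missing files, lines that are not valid UTF-8, and mismatched line counts.

```python
import os


def check_parallel_corpus(stem, src, tgt):
    """Return a list of problems found in stem.src and stem.tgt.

    Hypothetical helper, not part of Moses. Assumes both sides of the
    corpus share one stem and differ only in the language extension.
    """
    problems = []
    counts = {}
    for lang in (src, tgt):
        path = f"{stem}.{lang}"
        if not os.path.isfile(path):
            problems.append(f"missing file: {path}")
            continue
        n = 0
        # Read raw bytes so that invalid UTF-8 is reported per line
        # instead of aborting the whole read.
        with open(path, "rb") as fh:
            for n, raw in enumerate(fh, start=1):
                try:
                    raw.decode("utf-8")
                except UnicodeDecodeError as err:
                    problems.append(f"{path}:{n}: invalid UTF-8 ({err.reason})")
        counts[lang] = n
    # A parallel corpus needs the same number of lines on both sides.
    if len(counts) == 2 and counts[src] != counts[tgt]:
        problems.append(f"line counts differ: {counts[src]} vs {counts[tgt]}")
    return problems
```

Calling check_parallel_corpus("200HindiSens", "hi", "en") would list a "missing file" entry if 200HindiSens.hi or 200HindiSens.en does not exist, which would be one possible reason for the "Can't open ''" message; an empty list means none of the three checks failed.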
