Hi,

Simply concatenate all the files for each language into one file (run this from the aligned/ directory):
% cat de-en/de/* > corpus.de-en.de
% cat de-en/en/* > corpus.de-en.en
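
If you want a sanity check as well, the two commands above can be wrapped in a small shell function that also compares line counts, since the later scripts expect both files to be aligned line by line. This is only a sketch: merge_corpus is a made-up helper name, not part of the Moses scripts, and it assumes both language directories contain the same set of file names so that the shell's sorted expansion of * yields the same order on each side.

```shell
# Sketch only: merge_corpus is a hypothetical helper, not a Moses script.
# Usage: merge_corpus <pair-dir> <source-lang> <target-lang>
merge_corpus() {
    dir=$1   # language-pair directory, e.g. de-en
    src=$2   # source language subdirectory, e.g. de
    tgt=$3   # target language subdirectory, e.g. en

    # The shell expands * in sorted order. Assuming both directories hold
    # the same file names, the two outputs stay aligned line by line.
    cat "$dir/$src"/* > "corpus.$dir.$src"
    cat "$dir/$tgt"/* > "corpus.$dir.$tgt"

    # Compare line counts; tr strips the padding some wc versions print.
    src_lines=$(wc -l < "corpus.$dir.$src" | tr -d ' ')
    tgt_lines=$(wc -l < "corpus.$dir.$tgt" | tr -d ' ')
    if [ "$src_lines" -ne "$tgt_lines" ]; then
        echo "line count mismatch: $src_lines vs $tgt_lines" >&2
        return 1
    fi
}
```

Called as "merge_corpus de-en de en", it writes corpus.de-en.de and corpus.de-en.en into the current directory and returns non-zero if the two sides differ in length.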

-phi

On 12/27/07, Pradeep Muthukrishnan <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I got the sentence-align-corpus to work properly, but every other script
> that needs to be run like clean-corpus-n.perl needs just two files, the
> source language file and the target language file. But after
> sentence-align-corpus I have a lot of German files in
> /data0/tools/mosesdecoder/scripts/training/europarl/aligned/de-en/de.
> The files are named like
> ep-00-09-07.txt, etc.
>
> Similarly all the English files are in
> /data0/tools/mosesdecoder/scripts/training/europarl/aligned/de-en/en.
> The files are named like
> ep-00-09-07.txt, etc.
>
> How do I merge all these files into corpus.de and corpus.en?
>
> Someone please help me with this. I have been working on this for quite some
> time now. Thanks for your time!
>
> regards,
> Pradeep
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>