Hi, you should be able to do that with a simple script.
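As a rough illustration of what such a script can look like, here is a minimal sketch. It assumes the corpus is TMX-like XML, with `<tu>` translation units that each hold one `<tuv>` per language and a `<seg>` inside. Your XML may well use different element names, so treat the tag names and the function names (`extract_pairs`, `write_parallel`) as hypothetical placeholders. It keeps the units in document order and writes one segment per line, which is the line-aligned plain-text layout Moses expects.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def segment_text(tuv):
    """Return the text of the <seg> child as a single line."""
    seg = tuv.find("seg")
    if seg is None:
        return ""
    # itertext() also picks up text nested inside inline markup;
    # split/join collapses newlines and repeated spaces so that each
    # segment occupies exactly one output line.
    return " ".join("".join(seg.itertext()).split())


def extract_pairs(xml_text, src_lang, tgt_lang):
    """Collect (source, target) pairs in document order.

    Assumes TMX-like input: <tu> units holding <tuv xml:lang="..."> children.
    """
    root = ET.fromstring(xml_text)
    pairs = []
    for tu in root.iter("tu"):
        by_lang = {}
        for tuv in tu.findall("tuv"):
            # TMX 1.4 uses xml:lang; some older files use a plain lang attribute.
            lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
            by_lang[lang.lower()] = segment_text(tuv)
        src = by_lang.get(src_lang.lower())
        tgt = by_lang.get(tgt_lang.lower())
        # Skip units missing either side so the two files stay aligned.
        if src and tgt:
            pairs.append((src, tgt))
    return pairs


def write_parallel(pairs, src_path, tgt_path):
    """Write two files in which line N of each is a translation pair."""
    with open(src_path, "w", encoding="utf-8") as src_file, \
         open(tgt_path, "w", encoding="utf-8") as tgt_file:
        for src, tgt in pairs:
            src_file.write(src + "\n")
            tgt_file.write(tgt + "\n")
```

To use it, read the XML file into a string, call `extract_pairs(text, "en", "de")` with your own language codes, and pass the result to `write_parallel` with two output paths such as `corpus.en` and `corpus.de`. For a very large corpus, a streaming parser would be a better fit than loading the whole file at once.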
I wrote one such script here:
https://github.com/phikoehn/casmacat-install/blob/master/admin/scripts/process-xliff.perl

It is closely tied to the larger application it belongs to (CASMACAT), but it should give you an idea.

The following tools should also be helpful:
https://code.google.com/p/m4loc/wiki/TMXTools

-phi

On Thu, Sep 18, 2014 at 10:39 AM, Asad A.Malik <[email protected]> wrote:
> Hi All,
>
> I am currently trying to develop an SMT system, and for that I have a
> parallel corpus. The problem is that the corpus is in XML format and,
> to my knowledge, Moses supports TXT. Is there any way to convert the
> XML to TXT without also changing the format?
>
> PS. I've tried multiple online XML converters, but each of them changes
> the order of the corpus. The corpus is also converted into paragraphs.
>
> Kind Regards,
>
> Mr. Asad Abdul Malik
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
