Hello Saeed,

I think you could also use XenC, a tool I developed and released last year. I believe it can help, since it was designed for needs similar to yours.
You can read about it in this paper:
https://ufal.mff.cuni.cz/pbml/100/art-rousseau.pdf

The source code of the tool can be found here:
https://github.com/rousseau-lium/XenC

Best regards,

--
Anthony Rousseau, Ph.D.
LIUM, University of Le Mans
[email protected]

On 16 Jan 2014, at 16:43, Saeed Farzi <[email protected]> wrote:

> Dear all,
>
> I am working on a translation task with a very large parallel corpus.
> Because of the computational cost of training on such a large corpus, I am
> going to filter it with respect to the test set (of course, after the
> filtering, the evaluation must still be fair).
>
> I am looking for a solution or a tool for filtering parallel corpus
> sentences.
>
> Note that I do not need to filter the phrase table. I know that the
> filter_ moses tool reduces the phrase table size.
>
> Cheers
> --
> S.Farzi, Ph.D. Student
> Natural Language Processing Lab,
> School of Electrical and Computer Eng.,
> Tehran University
> Tel: +9821-6111-9719
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
