the original question was about speed of decoding, not potential quality improvements due to filtering
clearly, if you can identify phrases to prune then you will get a
speed boost. but this is not true in the general case, and my advice
was for the general case.

Miles

2009/5/4 Marcin Miłkowski <[email protected]>:
> Miles Osborne wrote:
>>
>> filtering etc might give you a speed-up (eg a constant one -- less
>> stuff to load) but if filtering is safe w.r.t. the source data, then
>> you shouldn't see much here.
>>
>> (pruning the table should make it faster since there will be fewer
>> options to consider, but this is not safe)
>
> Actually, this is contrary to what Johnson et al. say in their paper,
> and my subjective (not measured) experience was definitely in their
> favour. As long as you have really clean data, you don't want to lose
> any of it, but if the alignments are lousy, the translations
> ambiguous, etc., you want to cut them off, and Jan wants to do that
> (see his post).
>
> I filtered even more and got better results by heuristically
> discarding improbable phrases from the phrase table (based on Fran's
> idea about discarding improbable alignments). Again, this is
> subjective, anecdotal, etc., but before that I was getting complete
> garbage.
>
> Note: my pair was English-Polish and Polish-English.
>
>> i guess you might also see fewer page faults and the like with a
>> smaller model, and that will help matters.
>
> By the way, quantising and binarising language models help as well.
>
> Marcin
>
>> but in general, the beam size is the most direct way to make it
>> faster.
>
>> Miles
>>
>> 2009/5/4 Francis Tyers <[email protected]>:
>>>
>>> On Mon, 2009-05-04 at 14:08 +0100, Miles Osborne wrote:
>>>>
>>>> actually, i think Jan wants a speedup, not a space saving.
>>>
>>> Does filtering the phrase table before translation not decrease the
>>> total time to make a translation (including the time taken to load
>>> the phrase table, etc.)? That was my experience, and it appears to
>>> be something he hasn't done, but perhaps my set-up is unusual...
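The kind of heuristic pruning Marcin describes could be sketched roughly as follows. This assumes the common `src ||| tgt ||| scores` phrase-table layout; the score field index and the cutoff threshold are illustrative choices for this sketch, not Moses defaults, and the sample entries are made up:

```python
# Rough sketch of heuristic phrase-table pruning: drop phrase pairs
# whose chosen translation score falls below a threshold.
# Assumes lines of the form "src ||| tgt ||| s1 s2 s3 s4"; which score
# holds p(t|s) varies, so score_index here is just an assumption.

def prune_phrase_table(lines, score_index=2, threshold=0.01):
    """Yield only the lines whose chosen score meets the threshold."""
    for line in lines:
        fields = [f.strip() for f in line.split("|||")]
        if len(fields) < 3:
            continue  # skip malformed lines rather than crash
        scores = fields[2].split()
        if float(scores[score_index]) >= threshold:
            yield line

table = [
    "w domu ||| at home ||| 0.6 0.4 0.5 0.7",
    "w domu ||| in house ||| 0.1 0.02 0.001 0.05",
]
kept = list(prune_phrase_table(table, score_index=2, threshold=0.01))
# only the "at home" entry survives the cutoff
```

A smaller table means fewer translation options per span, which is where the speed-up (and the risk to quality) comes from.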
>>>
>>> Fran
>>>
>>>> your best bet is to reduce the size of the beam:
>>>>
>>>> http://www.statmt.org/moses/?n=Moses.Tutorial#ntoc6
>>>>
>>>> Miles
>>>>
>>>> 2009/5/4 Francis Tyers <[email protected]>:
>>>>>
>>>>> On Mon, 2009-05-04 at 14:54 +0200, Jan Helak wrote:
>>>>>>
>>>>>> Hello everyone :)
>>>>>>
>>>>>> I am trying to build a two-way translator between Polish and
>>>>>> English as a project for one of my courses. So far I have
>>>>>> created a one-way translator (Polish->English) as a beta
>>>>>> version, but several problems have come up:
>>>>>>
>>>>>> (1) The translator must work in both directions. How can I
>>>>>> achieve this?
>>>>>
>>>>> Make another directory and train two models.
>>>>>
>>>>>> (2) The time taken to translate phrases is too long (4 min. for
>>>>>> one sentence). How can I speed this up (decreasing the quality
>>>>>> of the translation is acceptable)?
>>>>>
>>>>> You can try filtering the phrase table before translating (see
>>>>> PART V - Filtering Test Data), or using a binarised phrase table
>>>>> (see Memory-Map LM and Phrase Table).
>>>>>
>>>>> http://ufallab2.ms.mff.cuni.cz/~bojar/teaching/NPFL087/export/HEAD/lectures/02-phrase-based-Moses-installation-tutorial.html
>>>>>
>>>>> Regards,
>>>>>
>>>>> Fran

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
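The test-set filtering Fran suggests (the "safe" kind Miles contrasts with pruning) can be approximated in a few lines. Moses ships a script for this (filter-model-given-input.pl); the toy version below just keeps entries whose source phrase occurs verbatim in some test sentence, and all the table lines and sentences are invented examples:

```python
# Toy illustration of filtering a phrase table against the test data:
# keep only entries whose source side can actually occur in the input.
# This is "safe" in Miles's sense: no entry the decoder could use on
# this test set is removed, so output is unchanged but loading is faster.

def filter_for_input(table_lines, test_sentences):
    # pad with spaces so we match whole tokens, not substrings of words
    padded = [" " + s + " " for s in test_sentences]
    kept = []
    for line in table_lines:
        src = line.split("|||")[0].strip()
        if any((" " + src + " ") in s for s in padded):
            kept.append(line)
    return kept

table = [
    "w domu ||| at home ||| 0.5",
    "na stole ||| on the table ||| 0.4",
]
filtered = filter_for_input(table, ["mieszkam w domu"])
# only "w domu" can match the test sentence, so one entry remains
```

Unlike the probability-based pruning discussed above, this changes nothing about which translations are considered for the given input, which is why it trades only load time, not quality.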
