Hi James,

A vast literature on adaptation techniques for SMT has appeared in
recent years.

Some reading suggestions:

http://www.statmt.org/wmt07/pdf/WMT17.pdf
http://www.statmt.org/wmt09/pdf/WMT-0932.pdf
http://dl.acm.org/citation.cfm?id=1870702
http://www.mt-archive.info/IWSLT-2011-Bisazza.pdf
http://www.aclweb.org/anthology/P12-1099
http://amta2012.amtaweb.org/AMTA2012Files/papers/115.pdf
http://amta2012.amtaweb.org/AMTA2012Files/papers/152.pdf
http://www.hltpr.rwth-aachen.de/publications/download/832/Mansour-IWSLT-2012.pdf
http://www.aclweb.org/anthology/P/P13/P13-1141.pdf

See also http://www.statmt.org/survey/Topic/DomainAdaptation .

You'll be able to find many more interesting papers on that topic.
I'm not sure to what extent your idea differs from what has been suggested
in previous work.

Regarding your question about distortion, you may also want to consult
the literature first. Philipp Koehn wrote a nice textbook about all the
basics of SMT: http://www.statmt.org/book/

Cheers,
Matthias


On Wed, 2013-11-06 at 09:35 -0800, Kenneth Heafield wrote:
> Hi,
> 
>       Multiple columns won't really work because the set of phrase pairs will
> be different.  You could of course take the union of phrase pairs and
> just have null values for inapplicable phrases, but it's not clear how
> much compression you'd get.
> 
> Kenneth
> 
> On 11/06/13 06:21, Read, James C wrote:
> > So here's a random crazy idea I had lately. A phrase table could have 
> > multiple columns giving different scores for different probabilities from 
> > different alignments, different corpora, different domains etc. Recent work 
> > at Edinburgh, Cambridge and Sheffield has had some emphasis on adaptation 
> > of models for speech recognition purposes. I guess a similar principle 
> > could be applied to SMT. Given a text from some unknown domain the engine 
> > could perform some automated recognition test to guess which translation 
> > model best fits the text to be translated. A primitive form of automatic 
> > domain recognition and adaptation if you like.
> > 
> > I guess even making available multiple forms of a phrase table or a single 
> > compact version with multiple columns for scoring could even have some 
> > demand in the future.
> > 
> > James
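
To make Kenneth's point about the union of phrase pairs concrete, here is a
toy sketch in Python. It is not Moses code and does not use the Moses phrase
table format: the names merge_phrase_tables and null_fraction, the dictionary
layout, and the example scores are all assumptions made up for illustration.
It builds one score column per source table, fills in a null (None) wherever
a table lacks a phrase pair, and reports how many cells end up null.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory layout: (source phrase, target phrase) -> one score.
PhrasePair = Tuple[str, str]
Table = Dict[PhrasePair, float]


def merge_phrase_tables(tables: List[Table]) -> Dict[PhrasePair, List[Optional[float]]]:
    """Take the union of phrase pairs; one column per table, None where absent."""
    all_pairs = set()
    for table in tables:
        all_pairs.update(table)
    return {pair: [table.get(pair) for table in tables] for pair in sorted(all_pairs)}


def null_fraction(merged: Dict[PhrasePair, List[Optional[float]]]) -> float:
    """Fraction of score cells that are null in the merged table."""
    cells = [score for scores in merged.values() for score in scores]
    return sum(score is None for score in cells) / len(cells) if cells else 0.0


# Made-up scores for two made-up domains.
news = {("haus", "house"): 0.7, ("bank", "bank"): 0.6}
medical = {("haus", "house"): 0.5, ("bank", "bench"): 0.2}

merged = merge_phrase_tables([news, medical])
for pair, scores in merged.items():
    print(pair, scores)
print("null fraction:", null_fraction(merged))
```

The more the phrase-pair sets of the individual tables differ, the larger the
null fraction becomes, which is why it is unclear how much a single
multi-column table would save over separate tables.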



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
