Hello all,

I would like to know which method Moses implements for on-demand loading 
of the rule table when hierarchical phrase-based models are used. Is 
it the same method used for the phrase table in phrase-based SMT, i.e. 
a prefix tree (trie) as described by Zens & Ney (2007)?

    Zens & Ney (2007): "Efficient Phrase Table Representation for
    Machine Translation with Applications to Online MT and Speech
    Translation"
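
To make the question concrete, here is a minimal sketch of the prefix-tree idea, assuming a toy table held entirely in memory. The names `TrieNode`, `PhraseTrie` and `matches_from` are hypothetical and are not taken from Moses; an on-demand variant would presumably keep the nodes on disk and load only the children actually visited during lookup.

```python
class TrieNode:
    def __init__(self):
        self.children = {}      # source word -> TrieNode
        self.translations = []  # (target phrase, score) pairs ending at this node


class PhraseTrie:
    """Toy prefix tree over source phrases (illustrative, not Moses code)."""

    def __init__(self):
        self.root = TrieNode()

    def add(self, source, target, score):
        # Walk or create one node per source word, then attach the translation.
        node = self.root
        for word in source.split():
            node = node.children.setdefault(word, TrieNode())
        node.translations.append((target, score))

    def matches_from(self, words, start):
        """Yield (end, translations) for every stored phrase starting at words[start]."""
        node = self.root
        for end in range(start, len(words)):
            node = node.children.get(words[end])
            if node is None:
                return  # no stored phrase continues with this word
            if node.translations:
                yield end + 1, node.translations


trie = PhraseTrie()
trie.add("das haus", "the house", 0.8)
trie.add("das", "the", 0.6)
sentence = "das haus ist".split()
for end, options in trie.matches_from(sentence, 0):
    print(sentence[0:end], options)
```

A single walk from one start position finds every phrase sharing that prefix, which is why only the parts of the table touched by the input sentence need to be read.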

In the literature I have found papers describing the use of suffix 
arrays both for phrase-based SMT (Callison-Burch, Bannard & Schroeder, 
2005; Zhang & Vogel, 2005) and for hierarchical phrase-based SMT (Lopez, 
2008; Schwartz & Callison-Burch, 2010), but all these methods store the 
parallel corpus and compute the required probabilities on the fly.

    Callison-Burch, Bannard & Schroeder (2005): "Scaling Phrase-Based
    Statistical Machine Translation to Large Corpora and Longer
    Phrases"

    Zhang & Vogel (2005): "An Efficient Phrase-to-Phrase Alignment Model
    for Arbitrarily Long Phrases and Large Corpora"

    Lopez (2008): "Tera-Scale Translation Models via Pattern Matching"

    Schwartz & Callison-Burch (2010): "Hierarchical Phrase-Based Grammar
    Extraction in Joshua"
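
For comparison, the suffix-array lookup these papers build on can be sketched as below. This is only a simplified illustration under my own assumptions: it indexes the source side of a toy corpus with a naive sort and returns the positions of a phrase. The function names are hypothetical, and the step that actually derives translation probabilities from word alignments and the target side is left out.

```python
def build_suffix_array(tokens):
    # Sort suffix start positions by the token sequence they begin.
    # Naive construction, fine for a toy corpus only.
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def find_occurrences(tokens, suffix_array, phrase):
    """Return the sorted corpus positions where `phrase` (a list of tokens) occurs."""
    n = len(phrase)

    def prefix(i):
        return tokens[i:i + n]

    # Lower bound: first suffix whose n-token prefix is >= phrase.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(suffix_array[mid]) < phrase:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose n-token prefix is > phrase.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(suffix_array[mid]) <= phrase:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[first:lo])


corpus = "a b a b c a b".split()
sa = build_suffix_array(corpus)
print(find_occurrences(corpus, sa, ["a", "b"]))
```

All occurrences of a phrase form one contiguous block of the suffix array, so two binary searches are enough to locate them without storing any precomputed phrase table.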


In addition, I would like to know whether Moses implements any 
compression technique to save memory or disk space, or whether it simply 
identifies each word by a 32-bit integer, and which data structure 
Moses uses to store the phrase table in memory.

If I have overlooked any relevant work, please let me know. I 
appreciate your help.

Thank you very much in advance.
Kind regards.
-- 
Felipe
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support