Hi,

it would be better to include the word vector (obtained with word2vec or by other means) as a single factor, and to produce it with a generation step so that the phrase table is not filled with redundant information. Unfortunately, there is no source-side generation step; that might be a useful addition to the factored model.
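To make the "single factor" idea concrete, here is a rough sketch of how a whole vector could be packed into one factor of the factored input format, where `|` separates the factors of a token. The comma-joined encoding, the `UNK` placeholder, and the helper names `vector_to_factor` and `annotate` are all assumptions made for illustration; none of them is an existing Moses convention. Tokens are assumed to be already escaped, so they contain no `|`.

```python
from typing import Dict, List, Sequence

UNK_FACTOR = "UNK"  # hypothetical placeholder for words that have no vector


def vector_to_factor(vec: Sequence[float], precision: int = 3) -> str:
    """Serialize a vector into one token without spaces or '|' characters."""
    return ",".join(f"{x:.{precision}f}" for x in vec)


def annotate(tokens: List[str], vectors: Dict[str, Sequence[float]]) -> str:
    """Return the sentence as 'surface|vector' tokens (two factors per word)."""
    out = []
    for tok in tokens:
        vec = vectors.get(tok)
        factor = vector_to_factor(vec) if vec is not None else UNK_FACTOR
        out.append(f"{tok}|{factor}")
    return " ".join(out)


# Toy 3-dimensional vectors; real word2vec vectors have 200 or more dimensions.
toy_vectors = {"house": [0.12, -0.3, 0.05], "small": [0.4, 0.22, -0.1]}
line = annotate(["a", "small", "house"], toy_vectors)
print(line)
```

With this encoding each word carries exactly one extra factor regardless of the vector's dimensionality. A feature function would still have to parse the factor string back into numbers before it could use them.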
Of course, the question is what to do with these vectors. I assume that you have a new feature function in mind.

-phi

On Wed, Jul 2, 2014 at 5:04 AM, Hubert Soyer <[email protected]> wrote:
> Hello,
>
> I have checked the mailing list archive for this question but couldn't
> find anything.
> I'd be surprised if this question has not been asked yet, if it has,
> I'd be happy if you could point me to the corresponding mails.
>
> Recently, word representations induced by neural networks have gained
> a lot of momentum.
> Particularly often cited in this context is:
> http://code.google.com/p/word2vec/
>
> Those vector word representations are vectors that carry some semantic
> meaning in them, i.e. semantically similar words have similar vectors
> (small distances to each other).
>
> I have been wondering about the best way to incorporate them in Moses.
>
> One solution would be to incorporate them as factors in a factored model:
>
> http://www.statmt.org/moses/?n=Moses.FactoredTutorial
>
> It seems to me that I would have to treat each dimension of each word
> vector as a separate factor which would lead to a lot of factors.
> Usual dimensionalities of those word vectors are 200 or more.
>
> Is treating each dimension as a factor the best way to incorporate
> those vectors or is there anything better I can do?
> I don't have to stick to factors, if there is another way.
>
> Thank you in advance!
>
> Best,
>
> Hubert
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
