Hi, you can also have your feature function read in the word vector mapping table.
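
As a rough sketch of what reading such a mapping table could involve, the snippet below parses the plain-text format written by the word2vec tool (a header line with vocabulary size and dimensionality, then one line per word). It is written in Python for illustration only and is not Moses code: the names `load_word_vectors` and `cosine` are hypothetical, the sample data is made up, and an actual Moses feature function would do the equivalent loading in C++. The cosine similarity is just one assumed example of what a feature might compute from the vectors.

```python
import io
import math


def load_word_vectors(handle):
    """Parse word2vec text format: 'vocab_size dim' header, then 'word v1 ... vd' lines."""
    header = handle.readline().split()
    dim = int(header[1])
    table = {}
    for line in handle:
        parts = line.split()
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        table[parts[0]] = [float(x) for x in parts[1:]]
    return dim, table


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Made-up toy table standing in for a real word2vec output file.
sample = io.StringIO("3 2\nhouse 1.0 0.0\nhome 0.8 0.6\ncar 0.0 1.0\n")
dim, vectors = load_word_vectors(sample)
similarity = cosine(vectors["house"], vectors["home"])
```

Loading the table once at start-up and keeping it in memory keeps the vectors out of the phrase table entirely, which is the point of having the feature function read the mapping itself.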

-phi

On Wed, Jul 2, 2014 at 11:23 AM, Hubert Soyer <[email protected]> wrote:
> Yes, I am thinking of a new feature function based on word vectors.
> Thank you for your suggestion about the generation step, I'll look into it,
> maybe I'll find a way.
>
> I will also try to create a feature function directly.
>
> Thanks again!
>
> Best,
>
> Hubert
>
> On Jul 2, 2014 11:02 PM, "Philipp Koehn" <[email protected]> wrote:
>>
>> Hi,
>>
>> it would be better to include a word vector obtained by word2vec or other
>> means as a single factor, and generate them with a generation step to
>> avoid filling up the phrase table with redundant information.
>> Unfortunately, there is no source side generation step, which may be a
>> useful addition to the factored model.
>>
>> Of course, the question is what to do with these vectors. I assume that
>> you have a new feature function in mind.
>>
>> -phi
>>
>> On Wed, Jul 2, 2014 at 5:04 AM, Hubert Soyer <[email protected]> wrote:
>> > Hello,
>> >
>> > I have checked the mailing list archive for this question but couldn't
>> > find anything.
>> > I'd be surprised if this question has not been asked yet, if it has,
>> > I'd be happy if you could point me to the corresponding mails.
>> >
>> > Recently, word representations induced by neural networks have gained
>> > a lot of momentum.
>> > Particularly often cited in this context is:
>> > http://code.google.com/p/word2vec/
>> >
>> > Those vector word representations are vectors that carry some semantic
>> > meaning in them, i.e. semantically similar words have similar vectors
>> > (small distances to each other).
>> >
>> > I have been wondering about the best way to incorporate them in Moses.
>> >
>> > One solution would be to incorporate them as factors in a factored
>> > model:
>> >
>> > http://www.statmt.org/moses/?n=Moses.FactoredTutorial
>> >
>> > It seems to me that I would have to treat each dimension of each word
>> > vector as a separate factor which would lead to a lot of factors.
>> > Usual dimensionalities of those word vectors are 200 or more.
>> >
>> > Is treating each dimension as a factor the best way to incorporate
>> > those vectors or is there anything better I can do?
>> > I don't have to stick to factors, if there is another way.
>> >
>> > Thank you in advance!
>> >
>> > Best,
>> >
>> > Hubert
>> > _______________________________________________
>> > Moses-support mailing list
>> > [email protected]
>> > http://mailman.mit.edu/mailman/listinfo/moses-support
