Re: [Moses-support] Using word embeddings in Moses

Philipp Koehn Wed, 02 Jul 2014 10:49:50 -0700

Hi,

you can also have your feature function read in the word vector mapping table.
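
As a rough sketch of what reading such a table could look like: the Python below assumes the table is in word2vec's plain-text output format (an optional "vocab_size dim" header line, then one "word v1 ... vd" line per word). The names load_word_vectors and cosine and the toy three-word table are made up for illustration. An actual Moses feature function would be written in C++ inside the decoder, so this only shows the loading and lookup logic, not the Moses API.

```python
import io
import math


def load_word_vectors(lines):
    """Parse word2vec-style text lines into a dict: word -> list of floats."""
    table = {}
    for line_no, line in enumerate(lines):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # Assumption: a first line made of exactly two integers is the
        # "vocab_size dim" header, not a word with a 1-dimensional vector.
        if line_no == 0 and len(fields) == 2 and all(f.isdigit() for f in fields):
            continue
        table[fields[0]] = [float(x) for x in fields[1:]]
    return table


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical toy table; real word2vec vectors have 200 or more dimensions.
toy = io.StringIO("3 2\ncat 1.0 0.0\ndog 1.0 0.0\ncar 0.0 1.0\n")
vectors = load_word_vectors(toy)
print(cosine(vectors["cat"], vectors["dog"]))
print(cosine(vectors["cat"], vectors["car"]))
```

Loading the table once at start-up and looking words up by surface form keeps the vectors out of the phrase table entirely.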


-phi

On Wed, Jul 2, 2014 at 11:23 AM, Hubert Soyer
<[email protected]> wrote:
> Yes, I am thinking of a new feature function based on word vectors.
> Thank you for your suggestion about the generation step. I'll look into it;
> maybe I'll find a way.
>
> I will also try to create a feature function directly.
>
> Thanks again!
>
> Best,
>
> Hubert
>
> On Jul 2, 2014 11:02 PM, "Philipp Koehn" <[email protected]> wrote:
>>
>> Hi,
>>
>> it would be better to include a word vector obtained by word2vec or other
>> means as a single factor, and generate it with a generation step to avoid
>> filling up the phrase table with redundant information. Unfortunately,
>> there is no source-side generation step, which may be a useful addition
>> to the factored model.
>>
>> Of course, the question is what to do with these vectors. I assume that
>> you have a new feature function in mind.
>>
>> -phi
>>
>> On Wed, Jul 2, 2014 at 5:04 AM, Hubert Soyer
>> <[email protected]> wrote:
>> > Hello,
>> >
>> > I have checked the mailing list archive for this question but couldn't
>> > find anything.
>> > I'd be surprised if this question has not been asked yet; if it has,
>> > I'd be happy if you could point me to the corresponding mails.
>> >
>> > Recently, word representations induced by neural networks have gained
>> > a lot of momentum.
>> > Particularly often cited in this context is:
>> > http://code.google.com/p/word2vec/
>> >
>> > These word representations are vectors that carry semantic meaning,
>> > i.e. semantically similar words have similar vectors (small distances
>> > between them).
>> >
>> > I have been wondering about the best way to incorporate them in Moses.
>> >
>> > One solution would be to incorporate them as factors in a factored
>> > model:
>> >
>> > http://www.statmt.org/moses/?n=Moses.FactoredTutorial
>> >
>> > It seems to me that I would have to treat each dimension of each word
>> > vector as a separate factor, which would lead to a lot of factors.
>> > These word vectors typically have 200 or more dimensions.
>> >
>> > Is treating each dimension as a factor the best way to incorporate
>> > those vectors or is there anything better I can do?
>> > I don't have to stick to factors, if there is another way.
>> >
>> > Thank you in advance!
>> >
>> > Best,
>> >
>> > Hubert
>> > _______________________________________________
>> > Moses-support mailing list
>> > [email protected]
>> > http://mailman.mit.edu/mailman/listinfo/moses-support
