Hi Ben, I'm confused by this email.
On Thu, May 11, 2017 at 4:40 AM, Ben Goertzel <[email protected]> wrote:
>
> I was thinking to explore addressing this with (fairly shallow) neural
> networks ...
>
> This paper
>
> https://nlp.stanford.edu/pubs/HuangACL12.pdf
>
> which I've pointed out before, does unsupervised construction of
> word2vec type vectors for word senses (thus, doing sense
> disambiguation sorta mixed up with the dimension-reduction process)

I'm skimming that paper, but it makes my eyes glaze over. We are
already getting better results than they get, so WTF?

> 1) A first step would be to use the OpenCog pattern miner to mine the
> surprising patterns from the set of parse trees produced by MST
> parsing.

But that is exactly what the disjuncts are. Do you not like the
metric? Do you want a different one?

> 2) Then, one could associate with each word-instance W a set of
> instance-pattern-vectors.

Well, but I've already got at least 3 different types of sparse
vectors per word instance, and all of them give OK results. I think
the disjunct-based one gives the best results, but I haven't proved
that yet. We can add yet another vector to the mix, but honestly (see
other email) baby-sitting the CPU while it crunches data takes about
half my time, and writing code to do data analysis takes about another
half. In between, I get some scattered hours to actually do some data
analysis, and read some email. So I need to be very protective of
where I spend my time... I still find that this work is 1% inspiration
and 99% mindless, thoughtless perspiration ...

> 3) Their algorithm involves an embedding matrix L that maps: a binary
> vector with a 1 in position i representing the i'th word in the
> dictionary, into a much smaller dense vector.

Yes, this is called "clustering". This is the next step.

> I would suggest
> instead having an embedding matrix L that maps the pattern-vectors
> representing words or senses (constructed in step 2) into a much
> smaller dense vector.
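To be concrete about what that matrix in point 3 actually does (this is my
own numpy doodle, nobody's real code; the vocabulary size and dimensions
are made up): applied to a one-hot vector it is nothing more than a row
lookup, and applied to one of our sparse count vectors the very same
linear map gives a count-weighted mixture of rows.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim = 1000, 50
# Hypothetical embedding matrix L, one row per dictionary word.
L = rng.standard_normal((vocab_size, embed_dim))

# One-hot for word i: L.T @ onehot is exactly row i of L -- a table lookup.
i = 42
onehot = np.zeros(vocab_size)
onehot[i] = 1.0
assert np.allclose(L.T @ onehot, L[i])

# Feed it a sparse count vector instead of a one-hot: the same linear
# map just yields a count-weighted sum of embedding rows.
counts = np.zeros(vocab_size)
counts[[3, 42, 17]] = [2.0, 1.0, 5.0]
mixed = L.T @ counts
assert np.allclose(mixed, 2.0 * L[3] + 1.0 * L[42] + 5.0 * L[17])
```

So the only thing the linear map can do with our pattern-vectors is blur
rows together; which is why I keep asking why linear.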
Why do you think that some kind of linear transform is the best way to
do clustering? Clustering usually works better when you allow it to do
whatever it wants, instead of forcing it to be linear (e.g. PCA, LSA).
Recall that we already know that we want to have hundreds of clusters.
It's not obvious to me that PCA is effective at this size. I've been
mentally envisioning some sort of agglomerative clustering for the
dimensional-reduction step, rather than a linear transform of some
kind ...

> 4) Their algorithm involves, in the local score function, using a
> sequence [x1, ..., xm], where xi is the embedding vector assigned to
> word i in the sequence being looked at.

Ehh? We've got scoring functions out the wazoo. So far cosine
similarity seems to be the best, from my poking around; I'm still
planning on exploring some others.

> This context-matrix is a way of capturing "the embedding vectors of
> the words constituting the context of w in parsed sentence S" as a
> linear vector... Stopping at "two links away" is arbitrary, probably
> we want to go 4-5 links away (yielding a vector of length 8-10); this
> would have to be experimented with...

WTF? Link-distances are all about what MST is doing. We already know,
from psychology studies, from link-grammar, from published MST
results, what the appropriate link lengths are. Viz., yes, most links
are 1-2 words long, and some are much, much longer. We even know these
for various languages: e.g. link lengths for English have been
decreasing for over 400 years -- link lengths for Old English are
almost twice as long as those for modern English. This all seems like
a red herring ... we've got the technology for dealing with this.

Anyway, I don't see anything in that paper that is worth saving. It's
old crap; we've been doing better for years -- Rohit demonstrated
that. The missing next step is the dimensional reduction, and you
suggest using linear matrix algos, but I don't see why these would be
better than agglomerative clustering.
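Here's a toy of the kind of thing I've been envisioning (again, my own
illustration, not OpenCog code, and the 2-d vectors are made up): greedy
average-linkage agglomeration under cosine similarity, stopping when you
hit the target cluster count -- no linearity assumption anywhere.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def agglomerate(vectors, n_clusters):
    """Greedy average-linkage: repeatedly merge the two clusters whose
    centroids are most cosine-similar, until n_clusters remain.
    Returns a list of clusters, each a list of vector indices."""
    clusters = [[i] for i in range(len(vectors))]
    centroids = [np.array(v, dtype=float) for v in vectors]
    while len(clusters) > n_clusters:
        best, pair = -2.0, (0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                s = cosine(centroids[a], centroids[b])
                if s > best:
                    best, pair = s, (a, b)
        a, b = pair
        merged = clusters[a] + clusters[b]
        centroid = np.mean([vectors[i] for i in merged], axis=0)
        for idx in (b, a):  # delete higher index first
            del clusters[idx]
            del centroids[idx]
        clusters.append(merged)
        centroids.append(centroid)
    return clusters

# Four vectors pointing in two obvious directions: expect {0,1} and {2,3}.
vecs = [np.array([1.0, 0.0]), np.array([0.9, 0.1]),
        np.array([0.0, 1.0]), np.array([0.1, 0.9])]
groups = [sorted(c) for c in agglomerate(vecs, 2)]
```

The quadratic merge loop is obviously too naive for real vocabulary
sizes; the point is only that the merge criterion can be anything at
all, which a linear projection can't give you.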
They seem to be harder to control, and gut instinct says they won't
give good results.

--linas

> --
> Ben Goertzel, PhD
> http://goertzel.org
>
> "I am God! I am nothing, I'm play, I am freedom, I am life. I am the
> boundary, I am the peak." -- Alexander Scriabin
>
> --
> You received this message because you are subscribed to the Google Groups
> "link-grammar" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
