Hi Linas,

Nice stuff!

A quick comment regarding this passage:

***

More precisely, the factorization of the language model appears to require
three important ingredients:

   - A way of decomposing word-vectors into sums of word-sense vectors,
   - A way of performing biclustering, so as to split the bipartite graph
     p(w, d) into left, central and right components, holding the left and
     right parts to be sparse,
   - Using an information-theoretic similarity metric, to preserve the
     probabilistic interpretation of the contingency table p(w, d).

***

The first of these is, of course, what AdaGram attempts to do ... and
Andres has experimented with a variant of AdaGram that replaces standard
SkipGram with "SkipGram-in-a-parse" (run after a round of, e.g., MST
parsing) ....  But improvements on AdaGram that incorporate broader context
will be valuable...
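As a cartoon of what that decomposition buys you (this is not AdaGram
itself; the sense vectors, priors and context below are made up purely for
illustration): a single word vector collapses a prior-weighted sum of sense
vectors, and a context embedding can be used to recover a posterior over
the senses:

```python
import numpy as np

# Hypothetical sense vectors and sense priors for one ambiguous word;
# the numbers are invented for illustration only.
senses = np.array([[1.0, 0.0],   # sense 0, e.g. "river bank"
                   [0.0, 1.0]])  # sense 1, e.g. "savings bank"
prior = np.array([0.6, 0.4])

# The single collapsed word vector is the prior-weighted sum of senses.
word_vec = prior @ senses        # [0.6, 0.4]

# Given a context embedding, score each sense and normalize:
#   posterior(sense) proportional to prior(sense) * exp(sense_vec . context)
context = np.array([0.1, 0.9])   # a context leaning toward sense 1
scores = prior * np.exp(senses @ context)
posterior = scores / scores.sum()
print(posterior.argmax())        # picks sense 1
```

The point of the toy is just that the sum-of-senses representation is
invertible enough to disambiguate, once the sense vectors are known.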

The third of these makes total sense and is fortunately not a huge deal, as
most clustering algorithms can work with whatever similarity metric you
throw at them...
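For what it's worth, here is a sketch of that point, using the
Jensen-Shannon distance (a symmetrized, bounded cousin of KL divergence) as
a drop-in metric for off-the-shelf hierarchical clustering; the
contingency-table rows are made up:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import jensenshannon, pdist

# Made-up rows of a contingency table p(w, d): each row is one word's
# probability distribution over disjuncts.
p = np.array([
    [0.7, 0.2, 0.1, 0.0],
    [0.6, 0.3, 0.1, 0.0],
    [0.0, 0.1, 0.4, 0.5],
    [0.1, 0.0, 0.4, 0.5],
])

# Any callable works as the pdist metric, so an information-theoretic
# distance plugs in without touching the clustering algorithm itself.
dists = pdist(p, metric=jensenshannon)
labels = fcluster(linkage(dists, method="average"), t=2,
                  criterion="maxclust")
print(labels)  # first two rows land in one cluster, last two in the other
```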

The second of these is the most interesting to me... basically it seems you
want to cluster (word, disjunct) pairs in a way that has high "clustering
quality" in both the word dimension and the disjunct dimension [i.e. so
that words are divided into meaningful clusters, and disjuncts are divided
into meaningful clusters, even though words and disjuncts are distinct
objects that get stuck onto each other in various combinations]

This is interesting and could be attempted via many possible algorithms
including of course k-means-like iterative algorithms or EM-like estimation
algorithms....   (Or, as you note, evolutionary learning methods) ... Oleg
may have some views on this...
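One concrete off-the-shelf instance (Dhillon's 2001 bipartite spectral
co-clustering, as shipped in scikit-learn; not necessarily the algorithm
Linas has in mind, just an illustration) partitions the rows (words) and
columns (disjuncts) of a count matrix simultaneously. The planted-block
data here is synthetic:

```python
import numpy as np
from sklearn.cluster import SpectralCoclustering

# Synthetic word-by-disjunct count matrix with two planted blocks:
# words 0-2 mostly pair with disjuncts 0-2, words 3-5 with disjuncts 3-5.
rng = np.random.default_rng(0)
counts = rng.poisson(0.2, size=(6, 6)).astype(float) + 0.1
counts[:3, :3] += rng.poisson(5.0, size=(3, 3))
counts[3:, 3:] += rng.poisson(5.0, size=(3, 3))

# Rows and columns are partitioned at the same time, so both the word
# clusters and the disjunct clusters come out "meaningful" together.
model = SpectralCoclustering(n_clusters=2, random_state=0)
model.fit(counts)
print(model.row_labels_)     # word clusters
print(model.column_labels_)  # disjunct clusters
```

With blocks this strong the two planted word groups and the two planted
disjunct groups are recovered exactly.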

-- Ben

On Sat, Aug 4, 2018 at 12:05 PM, Linas Vepstas <[email protected]>
wrote:

> Hi Anton,
>
> Attached please find this week's new-and-improved version of the
> neural-nets-vs-symbolic-parsing document.  Improvements over the last
> version include:
>
> More words spent explaining why:
>
>    - k-means clustering is identical to matrix factorization -- this is
>    not my result, it's an old result; I'm recapping it because you need to
>    understand it to understand the next step, which is bicategorization.
>    - Why bicategorization is the right thing to do -- because that is
>    what the link-grammar dicts already do!  Bicategorization is also an old
>    algo, from 2003 -- but if you look at it carefully, you can see that the
>    Link Grammar dicts are ***exactly*** bicategorized contingency matrices.
>    That is to say: ordinary old-school linguists who manually write dependency
>    grammars do so in a format that is naturally the same format as the output
>    of a k-means bicategorization.
>    - Why an information-theoretic divergence is much better than a cosine
>    distance.  This is a lot more subtle, I suppose, because it requires you to
>    see a vector dot-product as something that is invariant not under
>    rotations (well, it is, but that misses the point), but rather as something
>    that is invariant under Markov transformations, which preserve
>    probabilities.  This is because all of the vectors are rows and columns in
>    a probability distribution.  Thus, cosine distance is "wrong", and
>    Kullback-Leibler divergence is correct. Again -- this is an old result,
>    from 2003, but all of the people doing ordinary off-the-shelf
>    k-means are oblivious to it, because their data is not a joint
>    probability.  I try to spell this out in great detail, and to provide all
>    of the explicit formulas you need to do this.
>
> This paper is still not yet done, but I think it lays out the groundwork
> much more nicely than before.  I am hoping that it is not hard to read --
> again, I tried to mostly simplify everything. I hope it's not oversimplified.
>
> Anyway, I think it's a lot more promising, a much better direction to go in
> than triadic k-means. It's probably simpler too.
>
> --linas
>
> On Sun, Jul 22, 2018 at 12:01 AM, Anton Kolonin @ Gmail <
> [email protected]> wrote:
>
>> Hi Linas, thanks, I will look into that.
>>
>> In the meantime, below, the guys are getting close with "triadic K-means":
>>
>> http://aclweb.org/anthology/P18-2010
>>
>> They use "FrameNet 1.7" and the "dataset of polysemous verb classes by
>> Korhonen" for evaluation.
>>
>> If we get these, we may compare to what extent we are doing better.
>> Cheers,
>>
>> -Anton
>>
>> On 20.07.2018 1:46, Linas Vepstas wrote:
>>
>> Due to the obvious confusion that the sheaves paper caused for everyone,
>> I have started work on a different way of explaining it.  This one takes, as
>> its starting point, a description of the word2vec algorithms, and explains
>> how word2vec can be viewed as a sheaf.  So, if you are more comfortable with
>> that viewpoint, this might be a better way of grokking the concept.
>>
>> The paper is very much an early draft; I've already re-written the final 2-3
>> pages since last night.  The title is likely to change. The introduction
>> will change. But maybe the middle bits will help clarify these issues.
>>
>> Here:
>> https://github.com/opencog/opencog/raw/master/opencog/nlp/learn/learn-lang-diary/skippy.pdf
>>
>> -- Linas
>>
>> ---------- Forwarded message ----------
>> From: Anton Kolonin (Google Docs) <d+MTEyNjExODQyMzA2NTk3MDYxMzE
>> [email protected]>
>> Date: Thu, Mar 1, 2018 at 4:16 AM
>> Subject: Unsupervised Lang... - This comes from works of +linasvepsta...
>> To: [email protected]
>>
>>
>> Anton Kolonin mentioned you in a comment on Unsupervised Language
>> Learning (ULL) Design Draft
>> <https://docs.google.com/document/d/14MpKLH5_5eVI39PRZuWLZHa1aUS73pJZNZzgigCWwWg/edit?disco=AAAABqvKTUk&ts=5a97d2e3&usp=comment_email_document>
>> *Anton Kolonin*
>> Section - a collection of adjacent Seeds from a single sentence, a series
>> of adjacent sentences, or an entire single text
>> Sheaf - an unclearly defined combination of Sections and Lexical Entries
>> representing a particular corpus
>>
>> This comes from works of [email protected] - a clearer definition
>> may be required, and potential use should be explored further
>>
>>
>>
>>
>> --
>> cassette tapes - analog TV - film cameras - you
>>
>>
>> --
>> -Anton Kolonin
>> skype: akolonin
>> cell: 
>> [email protected]
>> https://aigents.com
>> https://www.youtube.com/aigents
>> https://www.facebook.com/aigents
>> https://plus.google.com/+Aigents
>> https://medium.com/@aigents
>> https://steemit.com/@aigents
>> https://golos.blog/@aigents
>> https://vk.com/aigents
>>
>>
>
>
> --
> cassette tapes - analog TV - film cameras - you
>



-- 
Ben Goertzel, PhD
http://goertzel.org

"The dewdrop world / Is the dewdrop world / And yet, and yet …" --
Kobayashi Issa

-- 
You received this message because you are subscribed to the Google Groups 
"opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/opencog.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/opencog/CACYTDBetkfDno126_YqtfOHmb_qAAtSDWbUQ1nTrYc8bkPe%2BuQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
