--- Mike Tintner <[EMAIL PROTECTED]> wrote:
> On the one hand, we can perhaps agree that one of the brain's glories is
> that it can very rapidly draw analogies - that I can quickly produce a
> string of associations like, say, "snake", "rope," "chain", "spaghetti
> strand," - and you may quickly be able to continue that string with further
> associations (like "string"). I believe that power is mainly based on
> "look-up" - literally finding matching shapes at speed. But I don't see the
> brain as checking through huge numbers of such shapes. (It would be
> enormously demanding on resources, given that these are complex pictures,
> no?).
Semantic models learn associations from proximity in the training text. The degree to which you associate "snake" and "rope" depends on how often those words appear near each other. You can collect these statistics in an association matrix A, e.g. A[snake][rope] is the degree of association between those two words.

Among the most successful of these models is latent semantic analysis (LSA), in which A is factored by singular value decomposition (SVD) as A = USV^T, where the columns of U and V are orthonormal and S is diagonal, and then all but the largest singular values in S are discarded. In a typical LSA model, A is 20K by 20K and S is reduced to about 200 values, so A is approximated by the product of two 20K by 200 matrices, using about 2% as much space.

One effect of this lossy compression is to derive associations by the transitive property of semantics. For example, if "snake" is associated with "rope" and "rope" with "chain", then the LSA approximation will infer an association between "snake" and "chain" even if that pair was never seen in the training data.

SVD has an efficient parallel implementation. It is most easily visualized as a 20K by 200 by 20K 3-layer linear neural network [1]. This efficiency really should not be surprising, because natural language evolved to be processed efficiently on a slow but highly parallel computer.

1. Gorrell, Genevieve (2006), "Generalized Hebbian Algorithm for Incremental Singular Value Decomposition in Natural Language Processing", Proceedings of EACL 2006, Trento, Italy. http://www.aclweb.org/anthology-new/E/E06/E06-1013.pdf

-- Matt Mahoney, [EMAIL PROTECTED]

-----
This list is sponsored by AGIRI: http://www.agiri.org/email
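A minimal sketch of the truncated-SVD compression described above, assuming numpy and a toy 4-word vocabulary with invented co-occurrence counts standing in for the 20K by 20K matrix. "snake" co-occurs only with "rope", yet the rank-reduced approximation infers a "snake"-"chain" association transitively:

```python
import numpy as np

# Toy symmetric association matrix (invented counts): "rope" co-occurs
# with "snake" (3), "chain" (2), and "spaghetti" (1); "snake" and "chain"
# never co-occur directly.
words = ["snake", "rope", "chain", "spaghetti"]
A = np.array([
    [0.0, 3.0, 0.0, 0.0],
    [3.0, 1.0, 2.0, 1.0],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
])

# Factor A = U S V^T; numpy returns the singular values s in decreasing order.
U, s, Vt = np.linalg.svd(A)

# Keep only the k largest singular values (k is about 200 in the 20K-word
# model; k = 1 in this toy) and reconstruct the low-rank approximation.
k = 1
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

i, j = words.index("snake"), words.index("chain")
print(A[i, j])    # 0.0 -- this pair never co-occurred in "training"
print(A_k[i, j])  # positive -- association inferred via "rope"
```

The storage claim follows the same arithmetic: keeping U[:, :k] and S V^T as two 20K by 200 matrices costs 2 x 20K x 200 = 8M numbers, versus 400M for the full matrix, i.e. about 2%.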
