On Fri, Jun 19, 2015 at 9:55 PM, YKY (Yan King Yin, 甄景贤) <[email protected]> wrote:
> Frame is a standard term in maths:
> https://en.wikipedia.org/wiki/Frame_of_a_vector_space
> Frames use redundant numbers of dimensions (to increase accuracy etc),
> whereas my current problem is that the dimension is already too high and I
> want to reduce the dimension without crushing together distinct concepts /
> objects / propositions.
The normal way to do this in a semantic vector space is latent semantic analysis (LSA). It is basically a three-layer neural network that maps words to documents, where the semantic vector space in the hidden layer is learned. (You might also see it described as the dominant eigenvectors of the word-document matrix, but a neural network is the easiest way to calculate it. See http://sifter.org/~brandyn/papers/gorrell_webb.pdf ).

You might have 20K words, 20K documents, and 200 dimensions in the semantic space. But I think those numbers are too low for AGI. A neural network needs enough connections to represent the information content of its training data, which is about 10^9 bits.

-- Matt Mahoney, [email protected]
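For concreteness, here is a minimal numpy sketch of the eigenvector view of LSA described above: take a small word-document count matrix, compute a truncated SVD, and keep the top-k singular directions as the semantic space. The word list and counts are made-up toy data for illustration, not from any real corpus.

```python
import numpy as np

# Toy word-document count matrix (hypothetical data).
# Rows = words, columns = documents.
words = ["cat", "dog", "pet", "stock", "market"]
X = np.array([
    [2, 1, 0, 0],   # cat
    [1, 2, 0, 0],   # dog
    [1, 1, 0, 0],   # pet
    [0, 0, 2, 1],   # stock
    [0, 0, 1, 2],   # market
], dtype=float)

k = 2  # dimensions of the semantic space (200 in the email's example)

# Truncated SVD: keep the k dominant singular directions.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
word_vecs = U[:, :k] * s[:k]   # each row is a word's semantic vector

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Words that co-occur in the same documents end up close together;
# words from unrelated documents end up near-orthogonal.
i, j, m = words.index("cat"), words.index("dog"), words.index("stock")
print(cosine(word_vecs[i], word_vecs[j]))  # cat vs dog: high
print(cosine(word_vecs[i], word_vecs[m]))  # cat vs stock: near zero
```

The three-layer network in the Gorrell & Webb paper converges to this same dominant subspace; the SVD just computes it directly, which is practical at toy scale but not at the 20K x 20K scale mentioned above.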
