--- Ed Porter <[EMAIL PROTECTED]> wrote:
> And (2) with regard to the order of NL learning, I think a child actually
> learns semantics first

Actually, Jusczyk showed that babies learn the rules for segmenting continuous speech at 7-10 months. In 1999 I ran some experiments, following the work of Hutchens and Alder, showing that it is possible to learn the rules for segmenting text without spaces using only the simple character n-gram statistics of the input. The word boundaries occur where the mutual information across the boundary is lowest.
http://cs.fit.edu/~mmahoney/dissertation/lex1.html

Children begin learning the meanings of words around 12 months, and start forming simple sentences around age 2-3.

> For example, I think it would be interesting to see what sort of AGI's could
> be built on current PCs with up to 4G of RAM.

I did something like that with language models, up to 2 GB. So far, my research suggests you need a LOT more memory.
http://cs.fit.edu/~mmahoney/compression/text.html

With regard to distributed AI, I believe the protocol should be natural language at the top level (perhaps on top of HTTP), because I think it is essential that live humans can participate. The idea is that each node in the P2P network might be relatively stupid, but would be an expert on some narrow topic, and know how to find other experts on related topics. A node would scan queries for keywords and ignore the messages it doesn't understand (which would be most of them). Overall the network would appear intelligent because *somebody* would know.

When a user asks a question or posts information, the message would be broadcast to many nodes, each of which could choose to ignore it or relay it to other nodes that it believes would find the message more relevant. Eventually the message gets to a number of experts, who then reply. The source and destination nodes would then update their links to each other, replacing the least recently used links. The system would be essentially a file sharing or message posting service with a distributed search engine.
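
To make the routing idea above concrete, here is a toy Python sketch. It is not the implementation from the thesis. The names Node, route, max_links and ttl are made up for illustration. It assumes every node relays each message to all of its links, which is cruder than relaying only to peers believed to find it relevant, and it treats "understanding" a message as a plain keyword overlap.

```python
from collections import OrderedDict


class Node:
    """Toy peer: an expert on a few keywords with a bounded table of links."""

    def __init__(self, name, topics, max_links=3):
        self.name = name
        self.topics = set(topics)
        self.max_links = max_links      # hypothetical cap on the link table
        self.links = OrderedDict()      # peer name -> Node, oldest first

    def add_link(self, peer):
        """Insert or refresh a link, evicting the least recently refreshed one."""
        if peer is self:
            return
        self.links.pop(peer.name, None)
        self.links[peer.name] = peer
        while len(self.links) > self.max_links:
            self.links.popitem(last=False)

    def understands(self, words):
        """A node 'understands' a message if it shares at least one keyword."""
        return bool(self.topics & words)


def route(origin, message, ttl=3):
    """Flood a message outward for ttl hops; return names of nodes that reply."""
    words = set(message.lower().split())
    seen = {origin.name}
    frontier = [origin]
    experts = []
    for _ in range(ttl):
        next_frontier = []
        for node in frontier:
            for peer in list(node.links.values()):
                if peer.name in seen:
                    continue
                seen.add(peer.name)
                if peer.understands(words):
                    experts.append(peer)    # this peer would reply
                next_frontier.append(peer)  # assumption: everyone relays
        frontier = next_frontier
    # Source and destination update their links to each other.
    for expert in experts:
        origin.add_link(expert)
        expert.add_link(origin)
    return [e.name for e in experts]


a = Node("a", ["weather"], max_links=2)
b = Node("b", ["music"])
c = Node("c", ["compression", "text"])
d = Node("d", ["chess"])
a.add_link(b)
a.add_link(d)
b.add_link(c)

print(route(a, "best text compression program"))
print(list(a.links))
```

In this run the query reaches c by way of b, c replies, and a drops its oldest link (b) to make room for a direct link to c. The next query on a related topic would then reach c in one hop.
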
It would make no distinction between queries and updates, because asking a question about a topic indicates knowledge of related topics. Every message you post becomes a permanent part of this gigantic distributed database, tagged with your name (or anonymous ID) and a time stamp.

I wrote my thesis on the question of whether such a system would scale to a large, unreliable network. (Short answer: yes.)
http://cs.fit.edu/~mmahoney/thesis.html

Implementation detail: how to make a P2P client useful enough that people will want to install it?

-- Matt Mahoney, [EMAIL PROTECTED]

-----
This list is sponsored by AGIRI: http://www.agiri.org/email
