--- [EMAIL PROTECTED] wrote:
> Relating to the idea that text compression (as demonstrated by general
> compression algorithms) is a measure of intelligence,
> Claims:
> (1) To understand natural language requires knowledge (CONTEXT) of the
> social world(s) it refers to.
> (2) Communication includes (at most) a shadow of the context necessary
> to understand it.
>
> Given (1), no context-free analysis can understand natural language.
> Given (2), no adaptive agent can learn (proper) understanding of natural
> language given only texts.
>
> For human-like understanding, an AGI would need to participate in
> (human) social society.

The ideal test set for text compression as a test for AI would be 1 GB
of chat sessions, such as the transcripts between judges and human
confederates in the Loebner contests. Since I did not have this much
data available, I used Wikipedia. It lacks a discourse model, but the
problem is otherwise similar in that good compression requires vast
real-world knowledge. For example, compressing or predicting:

  Q. What color are roses?
  A. ___

is almost the same kind of problem as compressing or predicting:

  Roses are ___

Of course, the compressor would be learning an ungrounded language
model. That should be sufficient for passing a Turing test. A model
need not have actually seen a rose to know the answer to the question.
I don't think it is possible to find any knowledge that could be tested
through a text-only channel that could not also be learned through a
text-only channel. Whether sufficient testable knowledge is actually
available in a training corpus is another question.

I don't claim that lossless compression could be used to test for AGI,
just AI. A lossless image or video compression test would be almost
useless because the small amount of perceptible information would be
overwhelmed by incompressible pixel noise. A lossy test would be
appropriate, but it would require subjective human evaluation of the
quality of the reproduced output. For text, a strictly objective
lossless test is possible because the perceptible content of text is a
large fraction of the total content.

-- Matt Mahoney, [EMAIL PROTECTED]
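
The link between predicting "Roses are ___" and compressing it can be
sketched in a few lines. This is only an illustration under stated
assumptions, not the compressor used for the Wikipedia test: the
function name ideal_code_length_bits, the order-2 character context,
the add-one smoothing, and the 256-symbol alphabet are all hypothetical
choices made for the example. It relies on the standard fact that an
ideal arithmetic coder spends about -log2(p) bits on a symbol the model
predicted with probability p, so better prediction means fewer bits.

```python
import math
from collections import defaultdict


def ideal_code_length_bits(text: str, order: int = 2,
                           alphabet_size: int = 256) -> float:
    """Bits an ideal arithmetic coder would spend on `text` under a toy
    adaptive character model (hypothetical, for illustration only).

    Assumes `text` uses at most `alphabet_size` distinct characters, so
    the add-one smoothed probabilities never sum to more than 1.
    """
    counts = defaultdict(lambda: defaultdict(int))  # context -> symbol -> count
    totals = defaultdict(int)                       # context -> symbols seen
    bits = 0.0
    for i, ch in enumerate(text):
        ctx = text[max(0, i - order):i]
        # Predict first, then update: a decoder only knows the past.
        p = (counts[ctx][ch] + 1) / (totals[ctx] + alphabet_size)
        bits += -math.log2(p)
        counts[ctx][ch] += 1
        totals[ctx] += 1
    return bits


# Same training text, two different completions of "Roses are ___".
seen = "Roses are red. " * 20
expected = ideal_code_length_bits(seen + "Roses are red.")
surprising = ideal_code_length_bits(seen + "Roses are zqx.")
print(f"expected completion:   {expected:.1f} bits")
print(f"surprising completion: {surprising:.1f} bits")
```

The completion the model has learned to predict costs fewer bits than
the one it has never seen, which is the sense in which compressed size
measures how much the model knows about the text.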