Humans can do lossless compression, but badly. Since human memory is inherently lossy, we must add error-correcting information, which increases mental effort (storage cost, and thus learning time). Recalling text verbatim is harder than paraphrasing it; it requires the mental equivalent of storing several identical copies.
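The "several identical copies" remark is literally how the simplest error-correcting code works. A minimal sketch (illustrative only, not from the post): a repetition code stores each bit several times and decodes by majority vote, trading storage for reliability over a lossy channel or memory.

```python
# Repetition code: the simplest error-correcting code.  Each bit is
# stored `copies` times; decoding takes a majority vote per group, so
# any single flipped copy is corrected at the cost of 3x storage.

def encode(bits, copies=3):
    """Repeat each bit `copies` times."""
    return [b for b in bits for _ in range(copies)]

def decode(coded, copies=3):
    """Majority-vote each group of `copies` stored bits."""
    return [int(sum(coded[i:i + copies]) > copies // 2)
            for i in range(0, len(coded), copies)]

message = [1, 0, 1, 1]
coded = encode(message)
coded[1] = 1 - coded[1]          # flip one stored bit: a "memory error"
assert decode(coded) == message  # the single error is corrected
```

The storage overhead is exactly the extra "mental effort" the paragraph describes: reliability is bought with redundancy.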
Humans can also execute arbitrary algorithms, but not efficiently. So it is possible to do things like send Morse code, which compresses text by using shorter codes for the most common letters. But this does not make use of our built-in language model. The sender and receiver have to agree on a learned, predefined code (although the code is based on a crude frequency model). Learning such codes requires extra effort (storage) so that the signal can be decoded without errors.

Ironically, the receiver still uses his language model for error correction outside the scope of the Morse code decompression algorithm. If a signal is ambiguous as to whether a beep is a dot or a dash, the receiver can usually guess correctly by considering context. Machines can't do this; decoding telegraph signals sent by humans is a hard problem for machines.

One might interpret this as an argument that lossless compression is unrelated to AI and that we should use lossy compression as a test instead. No, I am not arguing that. Humans make very good use of their imprecise language models for text prediction and error correction. Those are the qualities we want to emulate in AI. A machine can make a model precise at no extra cost, enabling us to use text compression to measure these qualities objectively. Researchers in speech recognition have been using this approach for the last 15 years.

-- Matt Mahoney, [EMAIL PROTECTED]

----- Original Message ----
From: J. Andrew Rogers <[EMAIL PROTECTED]>
To: [email protected]
Sent: Tuesday, August 22, 2006 12:58:04 AM
Subject: Re: [agi] Lossy *&* lossless compression

On Aug 20, 2006, at 11:15 AM, Matt Mahoney wrote:
> The argument for lossy vs. lossless compression as a test for AI
> seems to be motivated by the fact that humans use lossy compression
> to store memory, and cannot do lossless compression at all. The
> reason is that lossless compression requires the ability to do
> deterministic computation. Lossy compression does not.
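The "learned, predefined code" point above can be sketched concretely. In this illustrative example (table excerpt only, not a full Morse implementation), frequent letters get short codewords and rare ones get long codewords, and both sides must already hold the same table for decoding to work:

```python
# Morse-style variable-length code: short codewords for frequent
# letters ('e' = '.'), long ones for rare letters ('q' = '--.-').
# Sender and receiver must share this table in advance -- the
# "learned, predefined code" of the post.  Excerpt of the real table.
MORSE = {'e': '.', 't': '-', 'a': '.-', 'n': '-.',
         's': '...', 'o': '---', 'q': '--.-', 'z': '--..'}

def transmit(text):
    # '/' separates letters here; real Morse uses timing gaps instead
    return '/'.join(MORSE[c] for c in text)

def receive(signal):
    reverse = {code: letter for letter, code in MORSE.items()}
    return ''.join(reverse[code] for code in signal.split('/'))

assert receive(transmit('eat')) == 'eat'
# Frequent letters cost fewer channel symbols than rare ones:
assert len(transmit('ee')) < len(transmit('qq'))
```

Note what the code cannot do: if one symbol of `'--.-'` is garbled, `receive` simply fails, whereas a human operator would guess the letter from surrounding context. That asymmetry is exactly the post's point.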
I think this needs to be qualified a bit more strictly in real (read: finite) cases. There is no evidence that humans are incapable of lossless compression, only that lossless compression is far from efficient, and humans have resource bounds that generally encourage efficiency. A distinction with a difference. Being able to recite a text verbatim is a different process from reciting a summary of its semantic content, and humans can do both. Even a probabilistic (e.g. Bayesian) computational model can reinforce some patterns to the point where all references to that pattern will be reproduced perfectly in all contexts over some finite interval. I expect it would be trivial to prove that a decent probabilistic model has just such a property over any arbitrary finite interval, for any given pattern with proper reinforcement.

I do not disagree that the measurement of lossy models is a significant practical issue for the purposes of a contest. On the other hand, lossless models demand a certain level of inefficiency that a usefully intelligent system would not exhibit, and this constrains the solution space given how poorly these types of algorithms scale in general. If we knew an excellent lossless algorithm could fit within the resource constraints common today, such that a lossy algorithm was irrelevant to the contest, I would expect the contest to be unnecessary. Which is not to say that I think the rules should be changed, just that this is quite relevant to the bigger question.

Cheers,

J. Andrew Rogers

-------
To unsubscribe, change your address, or temporarily deactivate your subscription, please go to http://v2.listbox.com/member/[EMAIL PROTECTED]
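Rogers' claim that a probabilistic model can reinforce a pattern until recall is effectively perfect is easy to demonstrate. A minimal sketch (hypothetical `Predictor` class, not from the thread): a Laplace-smoothed count model whose predicted probability for a reinforced continuation approaches 1, making recall of that pattern effectively verbatim over a finite interval.

```python
from collections import Counter

# A Laplace-smoothed next-symbol predictor.  Repeatedly reinforcing one
# continuation drives its predicted probability toward 1, so the model
# reproduces the pattern essentially losslessly -- while remaining a
# probabilistic model (the probability never reaches exactly 1).
class Predictor:
    def __init__(self, alphabet):
        self.alphabet = alphabet
        self.counts = {}

    def reinforce(self, context, symbol):
        self.counts.setdefault(context, Counter())[symbol] += 1

    def prob(self, context, symbol):
        c = self.counts.get(context, Counter())
        # add-one (Laplace) smoothing keeps every symbol possible
        return (c[symbol] + 1) / (sum(c.values()) + len(self.alphabet))

model = Predictor(alphabet='abc')
for _ in range(1000):
    model.reinforce('ab', 'c')        # reinforce the pattern 'ab' -> 'c'
assert model.prob('ab', 'c') > 0.99   # near-certain, never exactly 1
```

After 1000 reinforcements the probability is 1001/1003; over any finite text the model's output for this pattern is indistinguishable from deterministic recall, which is the property Rogers conjectures is trivial to prove.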
