Humans can do lossless compression, but do it badly.  Since human memory is 
inherently lossy, we must add error-correcting information, which increases 
mental effort (storage cost, and thus learning time).  Recalling text verbatim 
is harder than paraphrasing; it requires the mental equivalent of storing 
several identical copies.

Humans can also execute arbitrary algorithms, but not efficiently.  So it is 
possible to do things like send Morse code, which compresses text by assigning 
shorter codes to the most common letters.  But this does not make use of our 
built-in language model.  The sender and receiver have to agree on a learned, 
predefined code (although the code is based on a crude frequency model).  
Learning such codes requires extra effort (storage) so that the signal can be 
decoded without errors.  Ironically, the receiver still uses his language model 
for error correction outside the scope of the Morse code decompression 
algorithm.  If a signal is ambiguous as to whether a beep is a dot or a dash, 
the receiver can usually guess correctly from context.  Machines can't do 
this.  Decoding telegraph signals sent by humans is a hard problem for machines.
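The compression gain from such a code is easy to see numerically.  A minimal 
sketch (the letter frequencies are rough illustrative values, not exact 
English statistics, and only a handful of letters are shown):

```python
# Sketch: why shorter codes for common letters compress text.
# Morse code lengths (dots/dashes per letter) for a few letters,
# paired with rough English letter frequencies (illustrative values).
morse_len = {'e': 1, 't': 1, 'a': 2, 'i': 2, 'n': 2,
             'o': 3, 's': 3, 'q': 4, 'z': 4, 'x': 4}
freq = {'e': 0.127, 't': 0.091, 'a': 0.082, 'i': 0.070, 'n': 0.067,
        'o': 0.075, 's': 0.063, 'q': 0.001, 'z': 0.001, 'x': 0.002}

total = sum(freq.values())
# Expected symbols per letter under the variable-length code
var_cost = sum(freq[c] / total * morse_len[c] for c in freq)
# A fixed-length code spends the worst-case length on every letter
fixed_cost = max(morse_len.values())

print(var_cost < fixed_cost)  # True: the variable-length code is shorter
```

The weighted average lands well under the fixed-length cost precisely because 
the short codes sit on the high-frequency letters.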

Now, one may interpret this as an argument that lossless compression is 
unrelated to AI and we should use lossy compression as a test instead.  No, I 
am not arguing that.  Humans make very good use of their imprecise language 
models for text prediction and error correction.  Those are the qualities that 
we want to emulate in AI.  A machine can make a model precise at no extra cost, 
enabling us to use text compression to objectively measure these qualities.  
Researchers in speech recognition have been using this approach for the last 15 
years.
 
-- Matt Mahoney, [EMAIL PROTECTED]

----- Original Message ----
From: J. Andrew Rogers <[EMAIL PROTECTED]>
To: [email protected]
Sent: Tuesday, August 22, 2006 12:58:04 AM
Subject: Re: [agi] Lossy *&* lossless compression


On Aug 20, 2006, at 11:15 AM, Matt Mahoney wrote:
> The argument for lossy vs. lossless compression as a test for AI  
> seems to be motivated by the fact that humans use lossy compression  
> to store memory, and cannot do lossless compression at all.  The  
> reason is that lossless compression requires the ability to do  
> deterministic computation.  Lossy compression does not.


I think this needs to be qualified a bit more strictly in real (read:  
finite) cases.  There is no evidence that humans are incapable of  
lossless compression, only that lossless compression is far from  
efficient and humans have resource bounds that generally encourage  
efficiency.  A distinction with a difference.  Being able to recite a  
text verbatim is a different process from reciting a summary of its  
semantic content, and humans can do both.

Even a probabilistic (e.g. Bayesian) computational model can  
reinforce some patterns to the point where all references to that  
pattern will be perfect in all contexts over some finite interval.  I  
expect it would be trivial to prove that a decent probabilistic model  
has just such a property over any finite interval for any given  
pattern, given proper reinforcement.
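A minimal sketch of that property, using a Beta-Bernoulli model (a toy 
example, not a claim about any particular cognitive architecture):

```python
# Sketch: repeated reinforcement in a simple Bayesian (Beta-Bernoulli)
# model drives the predicted probability of a pattern arbitrarily close
# to 1, so recall is effectively perfect over a finite interval.

alpha, beta = 1.0, 1.0  # uniform prior: the pattern is a coin flip

for _ in range(1000):   # reinforce the pattern 1000 times
    alpha += 1          # each observed success updates the posterior

p_recall = alpha / (alpha + beta)  # posterior mean, ~0.999
print(p_recall)

# Probability of 100 consecutive perfect recalls under this model:
p_perfect_run = p_recall ** 100
print(p_perfect_run > 0.9)  # True
```

With enough reinforcement the error probability over any fixed finite run can 
be pushed below any threshold, which is the property claimed above.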


I do not disagree that measuring lossy models is a significant  
practical issue for the purposes of a contest.  But on the other  
hand, lossless models demand a certain level of inefficiency that a  
useful intelligent system would not exhibit, and this constrains the  
solution space given how poorly these types of algorithms generally  
scale.  If we knew that an excellent lossless algorithm could fit  
within the resource constraints common today, such that a lossy  
algorithm was irrelevant to the contest, I would expect a contest  
would be unnecessary.   Which is not to say that I think the rules  
should be changed, just that this is quite relevant to the bigger  
question.

Cheers,

J. Andrew Rogers

-------
To unsubscribe, change your address, or temporarily deactivate your 
subscription, 
please go to http://v2.listbox.com/member/[EMAIL PROTECTED]
