--- On Sun, 12/28/08, Philip Hunt <[email protected]> wrote:
> > Please remember that I am not proposing compression as
> > a solution to the AGI problem. I am proposing it as a
> > measure of progress in an important component (prediction).
>
> Then why not cut out the middleman and measure prediction
> directly?
Because a compressor proves the correctness of the measurement software at no additional cost in space, time, or software complexity. The hard part of compression is modeling; arithmetic coding is essentially a solved problem. A decompressor uses exactly the same model as the compressor. In high-end compressors like PAQ, the arithmetic coder takes up about 1% of the software, 1% of the CPU time, and less than 1% of the memory.

In speech recognition research it is common to use word perplexity as a measure of the quality of a language model. Experimentally, it correlates well with word error rate. Perplexity is defined as 2^H, where H is the average number of bits needed to encode a word. Unfortunately this is sometimes measured in nonstandard ways, such as with restricted vocabularies and differing treatments of out-of-vocabulary words, parsing, stemming, capitalization, punctuation, spacing, and numbers. Unless these details are accounted for, published results are difficult to compare. Compression removes the possibility of such ambiguities.

-- Matt Mahoney, [email protected]
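To make the 2^H definition concrete, here is a minimal sketch of the calculation. The probabilities are toy values assumed for illustration; a real language model would assign each a conditional probability given the preceding context. The ideal code length for a word with probability p is -log2(p) bits, which an arithmetic coder achieves to within a fraction of a bit over the whole message, so average bits per word and perplexity fall directly out of the model's predictions:

```python
import math

# Toy model output: probability assigned to each actual word of a test text.
# (Hypothetical values; a real model conditions on context.)
probs = [0.25, 0.5, 0.125, 0.0625, 0.0625]

# Ideal code length per word in bits; an arithmetic coder attains this
# total to within about 2 bits over the entire message.
bits = [-math.log2(p) for p in probs]

H = sum(bits) / len(bits)   # average bits per word
perplexity = 2 ** H         # 2^H, as defined above

print(H, perplexity)        # 2.8 bits/word, perplexity about 6.96
```

Note that the total of the `bits` list is also the size the compressed text would be, which is why measuring compressed size and measuring perplexity are the same experiment, except that compression forces every one of the ambiguities listed above to be resolved inside the file itself.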
