On Thu, Sep 21, 2023, 12:21 AM <[email protected]> wrote:
> https://twitter.com/DimitrisPapail/status/1704516092293452097

0.177 on enwik9 puts it at about 47th place on my benchmark, just ahead of 7zip. Of course, that is by training on the first 10% (enwik8). I don't know why they haven't figured out online training. The top-ranked program, nncp, was released 2 years ago. It uses an online transformer with 199M parameters running on 10K CUDA cores for about 3 days on an RTX 3090 GPU with 24 GB, achieving a compression ratio of 0.1085.

Online training isn't hard. Our brains do it. I wrote the first practical online neural network compressor in 2000, leading to the PAQ series. http://mattmahoney.net/dc/mmahoney00.pdf

> What if we had 1 big model in the cloud that everyone accesses? So say it
> compresses your file to like really small, your computer now has like so
> much more room, and doesn't need to store the big LLM.

That's not the point of compressing. The point is to measure prediction accuracy. If you had a distributed language model and used it to compress your files, it might have been updated and no longer make exactly the same sequence of predictions when you decompress.

But yeah, I would like to see a distributed language model. In my 2008 AGI proposal, I described a peer-to-peer network of narrow experts competing for attention in a hostile environment where information has negative value. Messages are most likely to be retained where they maximize mutual information, leading to a distributed compression algorithm. http://mattmahoney.net/agi2.html

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T414bc941cdd95c2d-Mf067c91f0fa131064f9c4bda
Delivery options: https://agi.topicbox.com/groups/agi/subscription
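
To make the online-training point above concrete, here is a minimal sketch, assuming a toy context model (the previous byte plus the bits of the current byte seen so far) and an arbitrary learning rate; it is not PAQ, nncp, or the 2000 compressor. The model starts with no training, predicts each bit, and is updated immediately afterward; the ideal compressed size is the sum of -log2 P(actual bit), which is what an arithmetic coder driven by the same predictions would produce to within a few bytes.

```python
import math

def online_code_length(data: bytes, rate: float = 0.02) -> float:
    """Ideal compressed size of `data` in bits, using a predictor that starts
    untrained and is updated online after every bit it predicts."""
    p = {}       # p[ctx] = current estimate of P(next bit = 1) in context ctx
    total = 0.0
    prev = 0     # previous byte, part of the context
    for byte in data:
        node = 1              # 1..255 code for the bits of the current byte seen so far
        for i in range(7, -1, -1):
            bit = (byte >> i) & 1
            ctx = (prev << 8) | node
            pr = p.get(ctx, 0.5)                  # prediction made before seeing the bit
            pr = min(max(pr, 1e-4), 1.0 - 1e-4)   # keep probabilities away from 0 and 1
            total += -math.log2(pr if bit else 1.0 - pr)
            p[ctx] = pr + rate * (bit - pr)       # online update, no separate training pass
            node = (node << 1) | bit
        prev = byte
    return total

if __name__ == "__main__":
    data = b"the quick brown fox jumps over the lazy dog. " * 2000
    bits = online_code_length(data)
    print(f"{len(data) * 8} bits in, about {bits:.0f} bits out "
          f"(ratio {bits / (len(data) * 8):.3f})")
```

A decompressor built on the same idea recovers the data only because it makes the identical sequence of predictions and updates, which is why a shared cloud model that changes between compression and decompression would not round-trip.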

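The mutual-information routing idea from the 2008 proposal can also be sketched, though only with a crude stand-in: below, the information a message shares with a peer's cache is estimated by how many bytes zlib saves when the two are compressed together rather than separately, and the message is retained by the peer where that saving is largest. The peer names and data are invented for illustration; the proposal itself defines its own economics.

```python
import zlib

def shared_info(store: bytes, msg: bytes) -> int:
    """Crude estimate of the information (in bytes) that `msg` shares with
    `store`: how much smaller the two compress together than separately."""
    return (len(zlib.compress(store)) + len(zlib.compress(msg))
            - len(zlib.compress(store + msg)))

def best_peer(peers, msg: bytes):
    """Route `msg` to the peer whose cache shares the most information with it."""
    return max(peers, key=lambda name: shared_info(peers[name], msg))

if __name__ == "__main__":
    # Hypothetical peer caches; a message about chess should land on the
    # peer already holding chess data, where it adds the fewest new bits.
    peers = {
        "weather": b"rain tomorrow, high of 31C, humid, chance of storms. " * 50,
        "chess":   b"e4 e5 Nf3 Nc6 Bb5 a6 Ba4 Nf6 O-O Be7 Re1 b5 Bb3 d6 " * 50,
    }
    print(best_peer(peers, b"Nf3 d5 c4 c6 Nc3 Nf6 e3 e6 Bd3 dxc4 Bxc4 b5"))
```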