Re: [agi] Lexical model learning for LLMs

2024-02-21 Thread Matt Mahoney
To answer your question on Stack Exchange: the way a compressor would guess that a binary string is made up of 2-bit tokens with different frequencies is to train on a context that includes both the previous bit and the bit position mod 2. PAQ has models like this at the byte level for lengths 2, 3,
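
A minimal sketch of that idea (not PAQ code, which is a much more elaborate C++ mixing model): one adaptive bit counter per context, where the context is the previous bit plus the bit position mod 2. On a stream of 2-bit tokens with skewed frequencies, this context codes the data in noticeably fewer bits than an order-0 model, which is how a compressor "discovers" the token length. The token weights below are made up for illustration.

    import math, random

    def code_length(bits, context_fn, num_contexts):
        # One adaptive 0/1 counter per context; KT estimator p(1) = (n1 + 0.5) / (n0 + n1 + 1).
        counts = [[0, 0] for _ in range(num_contexts)]
        total = 0.0
        prev = 0
        for i, b in enumerate(bits):
            ctx = context_fn(prev, i)
            n0, n1 = counts[ctx]
            p1 = (n1 + 0.5) / (n0 + n1 + 1.0)
            p = p1 if b else 1.0 - p1
            total += -math.log2(p)      # ideal arithmetic-coded cost of this bit
            counts[ctx][b] += 1
            prev = b
        return total

    # A stream of 2-bit tokens with unequal frequencies; boundaries are not marked.
    random.seed(0)
    tokens = random.choices([0b00, 0b01, 0b10, 0b11], weights=[50, 5, 10, 35], k=50000)
    bits = []
    for t in tokens:
        bits += [(t >> 1) & 1, t & 1]

    order0 = code_length(bits, lambda prev, i: 0, 1)
    mixed  = code_length(bits, lambda prev, i: prev * 2 + i % 2, 4)
    print(f"order-0 model:        {order0 / len(bits):.3f} bits per bit")
    print(f"prev bit + pos mod 2: {mixed / len(bits):.3f} bits per bit")

Running the same comparison with a pos mod 3 context instead of mod 2 would not help on this stream, which is what lets the model pick out the token length.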

Re: [agi] Lexical model learning for LLMs

2024-02-20 Thread James Bowery
https://twitter.com/jabowery/status/1760015755792294174
https://youtu.be/zduSFxRajkE

On Tue, Nov 21, 2023 at 7:20 PM Matt Mahoney wrote:
> I started the large text benchmark in 2006
> (https://mattmahoney.net/dc/text.html) with the claim that all you
> need to pass the Turing test is text

[agi] Lexical model learning for LLMs

2023-11-21 Thread Matt Mahoney
I started the large text benchmark in 2006 (https://mattmahoney.net/dc/text.html) with the claim that all you need to pass the Turing test is text prediction, which you can measure with compression. Both the benchmark and the Hutter Prize use the same 1 GB text file (enwik9), with the goal of
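
The prediction-compression link being measured here, in a toy sketch: a model that assigns probability p to each next character can be arithmetic-coded in about -log2(p) bits, so the ideal compressed size is just the model's cross-entropy on the text, and a better predictor gives a smaller file. The repeated sample string below is a stand-in for enwik9, not the benchmark data.

    import math
    from collections import defaultdict

    def ideal_compressed_bits(text, order):
        # Adaptive order-N character model with add-1/2 smoothing over a 256-symbol alphabet.
        counts = defaultdict(lambda: defaultdict(int))
        totals = defaultdict(int)
        bits = 0.0
        for i, c in enumerate(text):
            ctx = text[max(0, i - order):i]
            p = (counts[ctx][c] + 0.5) / (totals[ctx] + 0.5 * 256)
            bits += -math.log2(p)    # cost of arithmetic-coding c under the model
            counts[ctx][c] += 1
            totals[ctx] += 1
        return bits

    sample = "the quick brown fox jumps over the lazy dog. " * 200  # stand-in for real text
    for order in (0, 1, 3):
        bpc = ideal_compressed_bits(sample, order) / len(sample)
        print(f"order-{order} predictor: {bpc:.3f} bits per character")

The higher-order (better) predictor comes out with fewer bits per character, which is the sense in which compressed size ranks text predictors on the benchmark.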