Here are my results with paq8px_v209fix1 -8:

54,804 2024pre-processed enwik5.txt
20,151 2024pre-processed enwik5.txt.paq8px209fix1

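For comparison with other compressors, the two sizes above can be turned into a ratio and a bits-per-byte figure. This is only a small sketch: it assumes both numbers are file sizes in bytes, and the function name `compression_stats` is made up for illustration.

```python
def compression_stats(original_bytes: int, compressed_bytes: int) -> dict:
    """Return basic compression figures for one file.

    Assumes both arguments are sizes in bytes (an assumption; the
    listing above does not state units).
    """
    ratio = compressed_bytes / original_bytes
    return {
        "ratio": ratio,                      # compressed / original
        "bits_per_byte": 8 * ratio,          # output bits per input byte
        "percent_saved": 100 * (1 - ratio),  # space saved, in percent
    }


# Sizes taken from the paq8px_v209fix1 -8 run reported above.
stats = compression_stats(54_804, 20_151)
print(f"ratio:         {stats['ratio']:.4f}")
print(f"bits per byte: {stats['bits_per_byte']:.3f}")
print(f"percent saved: {stats['percent_saved']:.2f}%")
```

Under that assumption the run comes out at roughly 2.94 bits per input byte.
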
I read your algorithm description on the encode forum at
https://encode.su/threads/3595-Star-Engine-AI-data-compressor?p=86599&viewfull=1#post86599
but it's not clear to me. Is this PPM with a fixed context tree?

On Mon, Dec 8, 2025 at 2:14 AM <[email protected]> wrote:
>
> Can you try the newest 2024pre-processed enwik5.txt that I posted a few days
> ago? I put "2024" at the start of the filename.
>
> But I already know how to learn related words and use them (and have done
> so) without word2vec, GloVe, etc., using the ideas of PPM etc. instead.
>
> And I posted on my project page that gap matches are working in my 52 lines
> of code. The list at the top that runs these searches can add more searches,
> or time-delay matches too (like "the big big big cat" matching "the cat"),
> or both done together as one search. The 52 lines of code also do priming
> and evaluation.
>
> I posted that the delay matches don't work at all, and that they should, but
> probably only once I add the building up of the sentence (parsing, using
> probabilities to guide it) like Transformers do. This will allow me to
> find only the searches that I "should" be making, both for delays and for
> gaps, and order-ns too.

-- 
-- Matt Mahoney, [email protected]

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/Tf0bedfcd44454678-M8b4d6650bb4ef0367a468501
Delivery options: https://agi.topicbox.com/groups/agi/subscription
