On Wed, Dec 10, 2025 at 4:59 AM <[email protected]> wrote:
>
> Oh. My algorithm does something similar to SSE, but I think it's a bit
> different. The priming I use considers how far back each character is in
> the last ~500 characters, and it also uses what you called MATCH, but for
> contexts longer than order 0 (and I do that for the first few orders too).
>
> I have at least one more question for you. How does your algorithm (if it
> does?) handle time-delay matching? Here's an example of the problem: you've
> seen in the past "the large large large um um um cat chewed" and you now
> see "the cat ?_?" and could in theory use the old memory's prediction -
> but you need to match that old memory first. How does your algorithm
> handle this, or does it not use these?
zpaq has some advanced options for contexts with fixed length gaps. It also has models that map word sequences to lower case and skip spaces and punctuation. Anything beyond that would require writing code in ZPAQL in a config file and compressing with zpaqd, but I haven't done this. Here is a simple example using the advanced options described in
https://mattmahoney.net/dc/zpaqdoc.html

C:\tmp>zpaq a "" "2024pre-processed enwik5.txt" -ms4.0ci1.1.1.1am
zpaq v7.15 journaling archiver, compiled Aug 17 2016
Creating  at offset 0 + 0
Adding 0.054804 MB in 1 files -method s4.0ci1.1.1.1am -threads 2 at 2025-12-12 00:04:19.
+ 2024pre-processed enwik5.txt 54804
0 + (54804 -> 22917) = 22917
0.046 seconds (all OK)

On the first line, the archive name is "" which means to just report the compressed size and discard the output. The option -ms4.0ci1.1.1.1am means the following:

-m selects the compression method.

s selects streaming mode, which means simple compression with no dedupe or rollback.

4 selects a block size of 2^4 MB. This also controls how much memory is used by the other components.

.0 means no preprocessing. You can also select LZ77, BWT, and E8E9, but these make compression worse in this case.

c selects an indirect context model (ICM) with an order 0 context. It maps the previous bits of the current byte to an 8 bit state representing counts of 0 and 1 bits and the last bit. That state is then mapped to a probability by a second table. After coding, the first table is updated to the new state and the second to a new probability that reduces the prediction error by 0.1%.

i1.1.1.1 selects 4 ISSE components, each one 1 order higher than the previous, i.e. orders 1, 2, 3, and 4. These are hashed contexts that include the previous bits of the current byte as well. An ISSE takes the output of the previous component and mixes it with a constant 1, where the two mixer weights are selected by the bit history state as in an ICM. Mixing is by weighted summation in the logistic domain, i.e. stretch(p) = ln(p/(1-p)), which can be negative. Then the state is updated and the two weights are updated to reduce the prediction error by 0.1%.

a selects a match model. It uses a hash table to find the last occurrence of an order 8 context match and predicts whatever bit came next, weighted by the actual length of the match.

m selects a mixer that combines the predictions of all of the previous components using a simple neural network with no hidden layer. The mixer weights are selected by the order 0 context, i.e. the previous bits of the current byte. The weighted averaging is in the logistic domain as before. The output is converted back to a probability by squash(x) = 1/(1 + e^-x), the inverse of stretch.

The last 2 lines of the output show that the file was compressed from 54804 bytes to 22917 in 46 milliseconds.

You can get better compression by adding an SSE to the final output.

-ms4.0ci1.1.1.1ams  -> 22895 (47 ms)
-ms4.0ci1.1.1.1amst -> 22874 (62 ms)

s is the SSE (or APM in PAQ). It selects a new prediction from a 2-D table indexed by the order 0 context and by the old prediction stretched, quantized to 32 levels, and interpolated. It is updated by reducing the prediction error of the closest entry by 0.1%. t is a 2 input mixer for the last 2 components, which are the SSE input and output.

This is just a simple context model with no gaps and no word modeling. I didn't try word modeling because the input was already dictionary preprocessed.
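To make the logistic-domain mixing concrete, here is a minimal C++ sketch of the arithmetic an ISSE or MIX performs. This is my own illustration, not zpaq source: the two input predictions, the initial weights, and the 0.001 update rate are stand-ins, and a real component selects its weights by a context (the bit history for an ISSE, the order 0 context for a MIX) rather than using a single weight set.

// Minimal sketch (not zpaq source) of logistic-domain mixing as used by
// ISSE and MIX components: predictions are stretched, combined by a
// weighted sum, squashed back to a probability, and the weights are
// nudged after each bit to reduce the prediction error.

#include <cmath>
#include <cstdio>
#include <vector>

double stretch(double p) { return std::log(p / (1.0 - p)); }    // logit
double squash(double x)  { return 1.0 / (1.0 + std::exp(-x)); } // inverse

int main() {
    // Two input predictions for the next bit (e.g. from an ICM and a
    // match model) and their mixer weights.  In zpaq the weights would
    // be selected by context; here there is just one weight set.
    std::vector<double> w = {0.3, 0.3};
    const double rate = 0.001;          // ~0.1% error feedback, as in the text

    // A toy bit stream the mixer is trying to predict.
    int bits[] = {1, 1, 0, 1, 1, 1, 0, 1, 1, 1};

    for (int y : bits) {
        double p1 = 0.8, p2 = 0.6;                // stand-in model outputs
        double t1 = stretch(p1), t2 = stretch(p2);
        double p = squash(w[0] * t1 + w[1] * t2); // mixed prediction

        // After coding bit y, move each weight in the direction that
        // reduces the prediction error (gradient of the coding loss).
        double err = y - p;
        w[0] += rate * err * t1;
        w[1] += rate * err * t2;

        std::printf("bit=%d p=%.4f w=(%.4f, %.4f)\n", y, p, w[0], w[1]);
    }
    return 0;
}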
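Here is a similar toy sketch of the match model idea (again my own code, byte-level for readability, where zpaq actually predicts one bit at a time): a hash table remembers where each order 8 context last occurred, and while the current input keeps agreeing with that old occurrence, the model predicts whatever came next there, weighted by the match length. This is also roughly the mechanism that would find the "old memory" in the question above.

// Rough sketch (not zpaq source) of a match model: remember where each
// order 8 context last occurred, and while the current text keeps agreeing
// with that old occurrence, predict whatever came next there, with
// confidence growing with the length of the match so far.

#include <cstdint>
#include <cstdio>
#include <string>
#include <unordered_map>

int main() {
    std::string data = "the cat sat on the mat. the cat sat on the ";
    std::unordered_map<uint64_t, size_t> table; // context hash -> last position
    size_t match_ptr = 0, match_len = 0;        // where the old occurrence continues

    for (size_t i = 8; i < data.size(); ++i) {
        // Extend or drop the current match: did the old continuation agree
        // with the byte we just saw?
        if (match_len && data[match_ptr] == data[i - 1]) { ++match_len; ++match_ptr; }
        else match_len = 0;

        // Hash the order 8 context ending just before position i.
        uint64_t h = 0;
        for (size_t j = i - 8; j < i; ++j) h = h * 131 + (unsigned char)data[j];

        // If there is no current match, try to start one from the table.
        if (!match_len) {
            auto it = table.find(h);
            if (it != table.end()) { match_ptr = it->second; match_len = 8; }
        }
        table[h] = i;                           // this context now ends here

        if (match_len)
            std::printf("pos %2zu: predict '%c' with weight ~len %zu (actual '%c')\n",
                        i, data[match_ptr], match_len, data[i]);
    }
    return 0;
}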
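The SSE/APM step can be sketched the same way. This toy version (mine, with made-up table sizes and update rate) refines a prediction by indexing a table with the context and the stretched input prediction, interpolating between the two nearest table knots, and nudging the nearest knot toward the bit that actually occurred:

// Toy sketch (not zpaq source) of an SSE/APM stage: refine a prediction by
// looking it up in a table indexed by (context, stretched input prediction),
// interpolating between the two nearest of 33 knots (32 intervals), and
// adjusting the nearest knot toward the observed bit afterwards.

#include <cmath>
#include <cstdio>

double stretch(double p) { return std::log(p / (1.0 - p)); }
double squash(double x)  { return 1.0 / (1.0 + std::exp(-x)); }

const int CTX = 256, LEVELS = 33;   // order 0 context x 33 interpolation knots
double table[CTX][LEVELS];

void init() {
    // Start as the identity map: output prediction == input prediction.
    for (int c = 0; c < CTX; ++c)
        for (int k = 0; k < LEVELS; ++k)
            table[c][k] = squash((k - 16) * 0.5);  // knots span stretch range [-8, 8]
}

// Refine prediction p for bit y under context ctx; returns the new prediction.
double apm(int ctx, double p, int y) {
    double x = (stretch(p) + 8) * 2;             // map stretch(p) to [0, 32]
    if (x < 0) x = 0;
    if (x > 31.999) x = 31.999;
    int k = (int)x;                              // lower knot
    double f = x - k;                            // interpolation fraction
    double out = table[ctx][k] * (1 - f) + table[ctx][k + 1] * f;

    // Update the closest knot toward the observed bit by a small rate.
    int nearest = (f < 0.5) ? k : k + 1;
    table[ctx][nearest] += 0.001 * (y - out);
    return out;
}

int main() {
    init();
    // Repeatedly refine the same (context, prediction) pair; the table learns
    // that this situation usually produces a 1 bit.
    for (int i = 0; i < 5000; ++i) apm('e', 0.5, 1);
    std::printf("refined p = %.4f (was 0.5)\n", apm('e', 0.5, 1));
    return 0;
}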
But it does help on the original text:

zpaq a "" "enwik5" -ms4.0ci1.1.1.1amst   -> 28216 (94 ms)
zpaq a "" "enwik5" -ms4.0ci1.1.1.1awmst  -> 27316 (94 ms)
zpaq a "" "enwik5" -ms4.0ci1.1.1.1aw2mst -> 27288 (109 ms)

w selects an ICM where the context is the current word (A-Z, case insensitive). w2 is an order 0-1 ICM-ISSE chain, but where the contexts are words instead of bytes. (w3 and higher make compression worse.)

You can find a more detailed description of the compression algorithm on the main page:
https://mattmahoney.net/dc/zpaq.html

--
Matt Mahoney, [email protected]
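To make the word context behind the w option concrete, here is one more toy sketch (mine, not zpaq code): hash the letters of the current word folded to lower case and reset the hash at any non-letter, so repeated words select the same context regardless of case or surrounding punctuation.

// Toy sketch (not zpaq code) of a case insensitive word context: extend a
// hash with each letter A-Z folded to lower case and reset it at every
// non-letter, so "The", "the" and "THE," all reach the same context value.

#include <cctype>
#include <cstdint>
#include <cstdio>
#include <string>

int main() {
    std::string text = "The cat chewed... THE CAT chewed!";
    uint64_t word_ctx = 0;                    // hash of the current word so far

    for (unsigned char ch : text) {
        if (std::isalpha(ch))
            word_ctx = word_ctx * 263 + std::tolower(ch);  // extend the word hash
        else
            word_ctx = 0;                                  // word boundary: reset
        std::printf("'%c' -> context %016llx\n", ch,
                    (unsigned long long)word_ctx);
    }
    return 0;
}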
