In PAQ/ZPAQ, the contexts mostly start on byte or word boundaries and end at the current bit. As an optimization, I use a hash table to index contexts that end on every 3-bit (PAQ) or 4-bit (ZPAQ) boundary; each hash entry maps to an array of 7 or 15 bit histories (1 byte each). The bits coded since the nibble boundary select which history applies, and the history in turn maps to a bit probability. After each bit is coded, both the history and the prediction are updated. I use a data structure that minimizes 64-byte cache line misses, because random memory access dominates compression time.
On Tue, May 11, 2021, 12:26 PM <[email protected]> wrote:

> @Matt oh, maybe I got it. Is this a good idea?
>
> Say I search for these contexts: "walking dow[n[ [t[h[e]]]]] ?" When I get
> matches, there are byte-sized predictions; I simply gather all their first
> bits. Then, upon predicting the next bit, I remove the bytes (predictions)
> that don't start with that bit and tally up the 2nd bits again.
>
> What do you think?
>
> I couldn't imagine storing contexts like the one below, so that I could get
> refined/dedicated predictions after outputting a new bit:
> 01110100 11[1001[01 [0000[110[0]]]]]
> (where the windows are over bits, instead of being limited to bytes only)

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/Tf856e4082d9ea09a-Mf79a6c17630883d5b060fff3
Delivery options: https://agi.topicbox.com/groups/agi/subscription
