On Wednesday, September 30, 2020, at 4:58 PM, Matt Mahoney wrote:
> Yes. Context models in data compressors express confidence in their
> predictions by giving probabilities close to 0 or 1 after many correct
> predictions. These get higher weights by averaging with other models after a
> stretching transform, ln(p/(1-p)). Also, mixers (neural networks) learn
> which models are most accurate and weight them more heavily.

Thanks so much.
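A minimal Python sketch of the stretch/squash transforms and weighted logistic mixing Matt describes. The function names and the learning rate are illustrative, not taken from any particular compressor; real mixers (e.g. in PAQ/zpaq) work in fixed point with clamping:

```python
import math

def stretch(p):
    """Map a probability in (0,1) to log-odds: ln(p/(1-p)).
    Confident predictions (p near 0 or 1) get large magnitudes."""
    return math.log(p / (1 - p))

def squash(x):
    """Inverse of stretch: the logistic function, mapping log-odds
    back to a probability in (0,1)."""
    return 1 / (1 + math.exp(-x))

def mix(probs, weights):
    """Weighted sum of model predictions in the stretched (log-odds)
    domain, squashed back to a probability. A model predicting 0.5
    stretches to 0 and so contributes nothing, regardless of weight."""
    return squash(sum(w * stretch(p) for p, w in zip(probs, weights)))

def update_weights(probs, weights, bit, lr=0.01):
    """One online gradient step after seeing the actual bit:
    weights of models whose stretched prediction pointed toward
    the outcome are increased."""
    err = bit - mix(probs, weights)
    return [w + lr * err * stretch(p) for w, p in zip(weights, probs)]
```

Note how the stretch makes confident models dominate the average: a model at p = 0.5 is effectively ignored, while one at p = 0.99 contributes a large log-odds term, which is the "higher weight after many correct predictions" effect even before the mixer's learned weights adapt.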
On Wednesday, September 30, 2020, at 8:12 PM, Matt Mahoney wrote:
> Another mixing technique is the ISSE (indirect secondary symbol estimator)
> chain. Each element in the chain maps an order-n bit history (in increasing
> n) to a 2-input mixer where one input is the previous ISSE prediction and the
> other is the constant 1. The final output can be used directly or mixed
> further.

I'm trying to understand this intuitively, but I don't get it yet. Is ISSE looking at, say, the last 30 times the word "cat" appeared over the last 1,000 words, and concluding it is more likely to appear again near those other appearances? Or is ISSE a grouping/clumping of words or topics, e.g. enwik8 has a piece on science and then shifts to toys for infants? Explain ISSE intuitively. Perhaps it is taking the prediction as input? But that has not made sense to me... what is the pattern?

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T9a31de2189d7ab2a-M2f1d28b4fb330bb48d388b94
Delivery options: https://agi.topicbox.com/groups/agi/subscription
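A rough Python sketch of one link in the chain Matt describes, to make the "previous prediction plus constant 1" concrete. Each link owns a table of 2-weight mixers indexed by the bit-history state for its context order: one input is the stretched previous prediction, the other is the constant 1, which acts as a learned bias. The class name, table size, and learning rate here are assumptions for illustration, not zpaq's actual implementation:

```python
import math

def stretch(p):
    """Probability to log-odds: ln(p/(1-p))."""
    return math.log(p / (1 - p))

def squash(x):
    """Log-odds back to probability (logistic function)."""
    return 1 / (1 + math.exp(-x))

class ISSELink:
    """One element of an ISSE chain (illustrative sketch).
    The bit-history state selects a (w0, w1) pair; the link's output
    is squash(w0 * stretch(p_prev) + w1 * 1)."""
    def __init__(self, num_states=256, lr=0.02):
        # Start as a pass-through: w0=1 forwards the previous
        # prediction unchanged, w1=0 adds no bias yet.
        self.w = [[1.0, 0.0] for _ in range(num_states)]
        self.lr = lr

    def predict(self, state, p_prev):
        w0, w1 = self.w[state]
        return squash(w0 * stretch(p_prev) + w1 * 1.0)

    def update(self, state, p_prev, p_out, bit):
        """After the bit is known, adjust this state's two weights
        by an online gradient step."""
        err = (bit - p_out) * self.lr
        w = self.w[state]
        w[0] += err * stretch(p_prev)
        w[1] += err * 1.0
```

Intuitively, each link learns per bit-history state how much to trust the prediction coming up the chain from the lower orders (w0) and how to correct it (w1, the constant input). A chain calls links of increasing order, feeding each link's output in as the next link's p_prev.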
