On Tue, May 25, 2021, 9:20 AM stefan.reich.maker.of.eye via AGI < [email protected]> wrote:
> On Monday, May 24, 2021, at 10:00 PM, Matt Mahoney wrote:
> > To code a word prediction, you have to calculate the probability
> > distribution over the entire vocabulary.
>
> Wait. You're saying I have to scan my entire vocabulary when I predict a
> word someone says? I don't think that's what anyone does.

In a neural network, the n probabilities can be calculated in parallel in one step. To arithmetic code the result, you need to assign a fraction of the range space to each word in proportion to its probability. This requires, at a minimum, adding the probabilities up, which can be parallelized down to log n steps using an adder tree. That's the same as predicting log n bits serially, which is what my code does.

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/Tf856e4082d9ea09a-M74aeddab10e25bb35bd6e1a9
Delivery options: https://agi.topicbox.com/groups/agi/subscription
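To make the equivalence in the reply above concrete, here is a rough Python sketch. It is not the code referred to in the message; the function names (cumulative_ranges, build_sum_tree, bit_probabilities) are made up for illustration. It assumes a vocabulary whose size is a power of two, strictly positive probabilities, and exact fractions instead of the fixed-point arithmetic a real coder would use. The first function assigns each word its slice of the range by summing probabilities. The other two build an adder tree of partial sums and read off the log n conditional bit probabilities for one word, whose product should equal that word's probability.

```python
import math
from fractions import Fraction


def cumulative_ranges(probs):
    """Give each word the half-open slice [low, high) of [0, 1)
    whose width equals its probability (a serial running sum)."""
    ranges, low = [], Fraction(0)
    for p in probs:
        ranges.append((low, low + p))
        low += p
    return ranges


def build_sum_tree(probs):
    """Adder tree of partial sums. levels[0] holds the n leaves and
    levels[-1] holds the single total. Each level could be computed
    in parallel, so there are log2(n) addition steps.
    Assumption: len(probs) is a power of two."""
    levels = [list(probs)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([prev[i] + prev[i + 1] for i in range(0, len(prev), 2)])
    return levels


def bit_probabilities(levels, index):
    """Walk from the root to leaf `index`, most significant bit first.
    Returns a list of (bit, P(bit | earlier bits)) pairs."""
    depth = len(levels) - 1
    node, steps = 0, []
    for d in range(depth, 0, -1):
        bit = (index >> (d - 1)) & 1
        parent = levels[d][node]
        child = levels[d - 1][2 * node + bit]
        steps.append((bit, child / parent))
        node = 2 * node + bit
    return steps


if __name__ == "__main__":
    # Hypothetical 4-word vocabulary, purely for illustration.
    probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
    tree = build_sum_tree(probs)
    for word, (low, high) in enumerate(cumulative_ranges(probs)):
        steps = bit_probabilities(tree, word)
        product = math.prod(q for _, q in steps)
        print(word, (low, high), steps, product == probs[word])
```

Coding a word as log n binary decisions with these conditional probabilities narrows the range to the same slice that the direct cumulative sum assigns, which is the sense in which the two approaches do the same work.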
