Enwik9 (the Hutter Prize / Large Text Compression Benchmark corpus) has a
vocabulary of 1.4M words, so your word pair matrix would have 1.4M^2, about
2 trillion, parameters.
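For scale, the parameter count is just the square of the vocabulary size (one entry per ordered word pair); a quick check:

```python
# One parameter per ordered word pair: |V|^2 entries.
vocab_size = 1_400_000            # ~1.4M distinct words in enwik9
pairs = vocab_size ** 2
print(pairs)                      # 1,960,000,000,000 ~= 2 trillion
```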

The usual way to implement a word pair semantic model is LSA (latent
semantic analysis) over the most common 20K words. It is a neural network
with a 20K-word input layer, a hidden layer of about 200 units, and a
20K-word output layer, where you add neurons to the hidden layer
continuously while training. I didn't implement it in PAQ because I
couldn't figure out how to make it fast enough.
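The growing-hidden-layer idea can be sketched as a small linear network in NumPy. This is a minimal illustration only: the class name, the initialization scale, and the plain-SGD training rule are my assumptions, not the actual PAQ design.

```python
import numpy as np

class GrowingLSA:
    """Toy 20K -> ~200 -> 20K style linear network whose hidden layer
    grows during training. Names, init scale, and the SGD rule are
    illustrative assumptions, not the actual PAQ design."""

    def __init__(self, vocab_size, hidden=1, lr=0.05, seed=0):
        self.rng = np.random.default_rng(seed)
        self.lr = lr
        self.W1 = self.rng.normal(0, 0.1, (vocab_size, hidden))  # input -> hidden
        self.W2 = self.rng.normal(0, 0.1, (hidden, vocab_size))  # hidden -> output

    def grow(self, n=1):
        """Add n hidden neurons with small random weights."""
        v = self.W1.shape[0]
        self.W1 = np.hstack([self.W1, self.rng.normal(0, 0.1, (v, n))])
        self.W2 = np.vstack([self.W2, self.rng.normal(0, 0.1, (n, v))])

    def step(self, x, y):
        """One SGD step predicting context word y from input word x
        (both one-hot vectors). Returns the squared error."""
        h = x @ self.W1                      # hidden activations
        out = h @ self.W2                    # linear prediction of context
        err = out - y
        g2 = np.outer(h, err)                # dLoss/dW2 for 0.5*||err||^2
        g1 = np.outer(x, err @ self.W2.T)    # dLoss/dW1 (before W2 update)
        self.W2 -= self.lr * g2
        self.W1 -= self.lr * g1
        return float(err @ err)
```

Growing the layer instead of fixing its size lets the model start cheap and add capacity only when training demands it; the speed problem is that every step still touches a full 20K-wide output row.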

On Sun, Aug 22, 2021, 5:39 PM <[email protected]> wrote:

> Keep in mind my program is combining thousands of contexts, and it does so
> by combining percentages!
>
> This means every relation in its brain (dog = cat, dog = man, book = store,
> pump = lock, pump = push, push = throw, push = twist, ...) needs a
> percentage. Dog doesn't equal man; it is only "similar", e.g. dog = man 78%.
>
> I cannot do this faithfully; only my brain knows how similar pony is to
> woman...
>
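The graded "percentage" similarities described above are exactly what vector models give you: represent each word by a vector of context counts (or an LSA hidden vector) and compare with cosine similarity. A toy illustration with made-up count vectors:

```python
import math

def cosine(a, b):
    """Cosine similarity between two count vectors (0..1 for counts)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical co-occurrence counts over three context types
# (pet contexts, person contexts, other); numbers are invented.
vec = {
    "dog": [9, 3, 1],
    "cat": [8, 2, 1],
    "man": [2, 9, 1],
}
print(cosine(vec["dog"], vec["cat"]))  # high: dog is very cat-like
print(cosine(vec["dog"], vec["man"]))  # lower: dog is only somewhat man-like
```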

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T192296c5c5a27230-Mc95e416c20e9bfd91039f6d7
Delivery options: https://agi.topicbox.com/groups/agi/subscription
