Re: [agi] Why do Transformers have layers of Attention Heads?

immortal . discoveries Wed, 16 Dec 2020 02:51:47 -0800

See the functions of Attention Heads: 
https://lena-voita.github.io/posts/acl19_heads.html


Attention Heads in Transformers use position, syntax, semantic relations, and 
rareness. For example, the all-CAPS words I highlight below are either similar 
or matched parts of memories: "IF blah blah blah blah, THEN blah blah blah, X", 
or "I was WALKING FAST and the GIRL stopped X".
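
To make "matching similar parts" concrete, here is a minimal sketch of a single 
attention head, assuming plain scaled dot-product attention. The toy 2-d 
vectors and the names `attention_head`, `tokens`, and `vectors` are made up for 
illustration. A real Transformer also passes the inputs through learned 
projections before this step, which I leave out here.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_head(queries, keys, values):
    """Scaled dot-product attention with no learned projections (an assumption)."""
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dimension).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        # Output is the weighted average of the value vectors.
        out = [sum(wj * v[i] for wj, v in zip(w, values)) for i in range(len(values[0]))]
        weights.append(w)
        outputs.append(out)
    return outputs, weights

# Hypothetical toy embeddings: "IF", "THEN" and "X" point in a similar direction.
tokens = ["IF", "blah", "THEN", "X"]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1], [0.9, 0.0]]

outputs, weights = attention_head(vectors, vectors, vectors)

# How strongly the last position ("X") attends to each earlier token.
for tok, w in zip(tokens, weights[3]):
    print(tok, round(w, 3))
```

In this toy setup the "X" position puts more weight on "IF" and "THEN" than on 
"blah", simply because their vectors are more similar to its own.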

But so does mine. We really need to stop using all these words and structures 
that don't relate to our breadth of knowledge. No one knows what a Head is, 
not even you, or what Backprop does. We need to use words like Follows, and 
Hebbian Learning of Follows. All this mumbo-jumbo terminology is only 
obscuring the real AGI technology. Once you admit what the real words are, you 
can see the true nature of the machine.
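
As a rough sketch of what "Hebbian Learning of Follows" could mean in code: 
every time one word follows another, strengthen the link between them, then 
predict the strongest follower. This is only one possible reading, not a 
definitive implementation. The names `learn_follows` and `predict_next`, the 
`rate` parameter, and the example sentence are all hypothetical.

```python
from collections import defaultdict

def learn_follows(words, strength=None, rate=1.0):
    """Strengthen the link prev -> next each time `next` follows `prev`."""
    if strength is None:
        strength = defaultdict(lambda: defaultdict(float))
    for prev, nxt in zip(words, words[1:]):
        # Hebbian-style update: items that occur together get a stronger link.
        strength[prev][nxt] += rate
    return strength

def predict_next(strength, word):
    """Return the most strongly linked follower of `word`, or None if unseen."""
    followers = strength.get(word)
    if not followers:
        return None
    return max(followers, key=followers.get)

# Made-up training text for illustration.
text = "the girl stopped and the girl walked and the girl stopped".split()
strength = learn_follows(text)

print(predict_next(strength, "girl"))
print(predict_next(strength, "zebra"))
```

Here "stopped" follows "girl" twice and "walked" only once, so the stronger 
link wins. A word never seen before has no followers at all.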

Doesn't anyone here have a quick yet complete cheat sheet, in clear English, of 
how AGI works as far as they have learned it? It doesn't take 5 years or even a 
month to learn what you know about AGI, nor does it require lots of painful 
searching, if you provide it in full, all in one place.
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/Ta21b3b47e26f50e7-Mfb5efc5a7dc69fba9e620cc7
Delivery options: https://agi.topicbox.com/groups/agi/subscription
