To clarify the above:
In transformers and graph NNs, the embeddings in attention heads and in edges, respectively, represent relative lateral positions of connected items. But the main question is where those embeddings come from. In a fully unsupervised scheme they must be learned, not hand-coded. If that learning is to be distinct from generic backprop, the only alternative I see is connectivity clustering; otherwise transformers become indistinguishable from an MLP.
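One way to make "connectivity clustering" concrete — purely as an illustrative sketch of my own, not a fixed definition — is to derive embeddings from co-occurrence (connectivity) statistics alone, with no gradient descent. In the toy below, the corpus, window size, and two-sublanguage setup are all assumptions; the point is only that tokens with similar connectivity profiles cluster together without any backprop:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, seq_len, n_seq, window = 12, 10, 300, 2

# Toy corpus: two disjoint "sublanguages" (tokens 0..5 vs 6..11), so the two
# groups have distinct connectivity patterns. This setup is an assumption.
seqs = []
for _ in range(n_seq):
    base = 0 if rng.random() < 0.5 else 6
    seqs.append(list(base + rng.integers(0, 6, size=seq_len)))

# Connectivity profile: C[a, b] counts how often token b occurs within
# `window` positions of token a. Pure counting — no gradients involved.
C = np.zeros((vocab, vocab))
for s in seqs:
    for i, a in enumerate(s):
        for j in range(max(0, i - window), min(seq_len, i + window + 1)):
            if j != i:
                C[a, s[j]] += 1

# Normalize rows into unit-length embeddings; cosine similarity then groups
# tokens by how similar their connectivity is.
E = C / (np.linalg.norm(C, axis=1, keepdims=True) + 1e-9)
sim = E @ E.T
print(sim[0, 1], sim[0, 7])  # within-group vs across-group similarity
```

Here tokens 0 and 1 (same sublanguage) get a high similarity while 0 and 7 (never co-occurring) get zero, so a trivial clustering of rows of `E` recovers the two groups from connectivity alone. Extending this to relative-position (edge) embeddings rather than token embeddings would need more machinery, but the learning signal is the same: who connects to whom.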
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T8366cc740ec68376-Mf44fda55738a32dc456822bb