"You / they / all statistical crap need this "reward for prediction" because predictive value is not quantified bottom-up. In a comparison-first paradigm, IĀ quantify it as match, see "AtomicĀ comparison" section. So I can select for it incrementally, instead of waiting for ridiculously coarse feedback. The whole "self-supervised" mindset is a crutch, they use RL because their core unsupervised method (perceptron) is a cripple."
:0 ... Most of this reply makes no sense to me... it would be better if you used more common words and examples to explain what you're seeing visually. But I'm not using supervision or labeling, just unsupervised learning. The reward for text prediction is not what you'd think of when you read "RL"; it is merely a way to make it AGI-like by making it talk about certain features more often than others. It is like proximity, but permanent and unchanging no matter how much data it sees. I don't think it improves the prediction score for Perplexity / Lossless Compression.
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T5b614d3e3bb8e0da-Mb5fb1e1f9014da743a937a31
Delivery options: https://agi.topicbox.com/groups/agi/subscription
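A minimal sketch of one possible reading of that "permanent reward" idea (all names and numbers here are illustrative assumptions, not from the thread): model it as a fixed, data-independent bias added to a few tokens' logits at generation time. The bias makes those tokens come up more often, never changes no matter how much data is seen, and leaves the unbiased scoring distribution (hence perplexity / compression) untouched.

```python
import math

# Toy vocabulary with raw model logits (illustrative values).
logits = {"cat": 2.0, "dog": 1.0, "fish": 0.5}

# Hypothetical "permanent reward": a constant logit bias that is
# never updated, regardless of the data the model later sees.
reward = {"cat": 0.0, "dog": 1.5, "fish": 0.0}

def softmax(scores):
    # Numerically stable softmax over a {token: score} dict.
    m = max(scores.values())
    exps = {t: math.exp(s - m) for t, s in scores.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

# Distribution used for scoring (perplexity): no bias applied.
base = softmax(logits)

# Distribution used for generation: fixed bias added to the logits.
biased = softmax({t: logits[t] + reward[t] for t in logits})

# The bias makes the model "talk about" the rewarded token more often...
assert biased["dog"] > base["dog"]
# ...while the scoring distribution is exactly the unbiased one,
# so the measured prediction score is unaffected.
assert base == softmax(logits)
```

Under this reading, the mechanism is closer to a sampling-time preference than to reinforcement learning: nothing feeds back into the weights, so the model's prediction quality is measured on `base` while only `biased` shapes what gets said.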
