https://www.skynettoday.com/overviews/neural-net-history
Right off the bat this is wrong. This is not where AI started, and even if it were, it still isn't the basis of how all AI works! The answer is Markov chains / Prediction by Partial Matching (PPM!!). The foundation is not fitting a line to dots to map inputs to outputs, generalize, and recover the function. That's going over it all without touching the red cherry in the center!! Looking at the plot is visual to you, but it's really numbers, not even text. IT'S NOT LINEAR REGRESSION. This, [here], is you predicting the next pixel when you look at all those dots. And I told you how that works just yesterday. The same letter or position recurs nearly the same in a text or image(s), hence you can learn what usually comes next! Count them!!! cat>eat, cat>eat, cat>sleep: cats usually eat! Add a time delay and a delay pattern, and 2 dimensions unlike text, and you can do it too.

Now, overlooking you *LOOKING* at the plot: if we take his plot of dots as numbers, HERE is the foundation. Notice the higher a dot is, the more to the right it is? 4,5.....88,121........33,30......556,856......they are translates. Cat cat dog dog table table you you eat eat vine vine gum gum horse horse teeth ? What comes next? And if you simply use the frequency counting I explained, it can solve 4, _?_....usually it's 4, sometimes 3 or 5, less commonly 2 or 6...and so on. This is pattern matching, NOT mapping linear planes of lines..............;( :( xD( cry Just because they teach it in machine learning school doesn't mean you have to be this dumb too. :( Give it tons of thought. I GET that backprop may also be a way to do my way, seemingly just faster; it's an optimization, like using a jeep over a truck, still a vehicle, different gas type...............BUT that's all it is, an optimization! Not actually how pattern prediction works / what it is (pattern matching!!)
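A minimal sketch of the counting idea above, as an order-1 Markov chain (the class and method names here are my own illustration, not code from the article or this post): count what follows each token, then predict by frequency.

```python
from collections import defaultdict, Counter

class CountingPredictor:
    """Order-1 Markov chain: count what follows each token, predict by frequency."""
    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, tokens):
        # Add counts onto connections: for each pair, record "context > next".
        for ctx, nxt in zip(tokens, tokens[1:]):
            self.counts[ctx][nxt] += 1

    def predict(self, ctx):
        # Return each seen continuation with its relative frequency.
        seen = self.counts[ctx]
        total = sum(seen.values())
        return {word: n / total for word, n in seen.items()}

p = CountingPredictor()
p.train(["cat", "eat", "cat", "eat", "cat", "sleep"])
print(p.predict("cat"))  # "eat" is twice as frequent as "sleep"
```

No prediction error or weight tweaking is involved: more data just adds more counts, exactly the "count them" step described above.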
It's like saying planes are red because rockets are used most and are fastest and look mostly red from fire, when really planes are just bodies with motion; backprop is an overcast shadow covering it.

----------------------------------------------------------

"This generalization principle is so important that there is almost always a *test set* of data (more examples of inputs and outputs) that is not part of the training set. The separate set can be used to evaluate the effectiveness of the machine learning technique by seeing how many of the examples the method correctly computes outputs for given the inputs. The nemesis of generalization is *overfitting* - learning a function that works really well for the training set but badly on the test set. Since machine learning researchers needed means to compare the effectiveness of their methods, over time there appeared standard *datasets* of training and testing sets that could be used to evaluate machine learning algorithms."

....Don't you onlyyyy test predictions on the test set??? Not the training set. The training set only builds the model. Oh, if you use backprop, sure, but really these are not predictions being used; the real thing is that you are building a model. You can tell this is true by the fact that you don't need prediction error to train: you need more data, and you add counts onto connections like Markov chains or PPM do; training on "error" is only an optimization of this idea. You wouldn't call it overfitting then, simply a bad score. In my AI I tweak parameters in the main algorithm; this is not really like neural weights though, and this can be automated BTW. As for tweaking neural weights to lower prediction error: the code can't tweak them on its own so that you change one connection weight at some cheap cost while gaining accuracy on many other tasks.... this can be done only by the data, and can be done cleanly. For example, if you see the context "walking down the ?",
why would you lower one weight that predicts frog and raise another weight that predicts street? You don't need to! You never see frog in this context; you say only what you see, then combine predictions, of course, to say unseen true answers. Another way to change predictions to get more bang for your buck, which is what his way assumes (!): look at 'these' windows on the prompt of text, like 'this' and not like 'this', ex. the last 2 words and a hole, then the next word, and other windows like so, and combine the predictions from each context match in memory. I.e. decide which windows to use, lose some gain here to gain more diversely there. Hmm, so you'd do holed matches along the whole prompt instead of mostly holed matches on just the last 8 letters, which I already know about. But anyway, changing weights using backprop to achieve this is absurd; it is a code thing. .................My point is that automating this as a brute force, to decide where to do holed text matches or what to predict, is WRONG, and costly. We don't need to causelessly predict frog; we let it store the entailing contexts it saw. And we don't need to let it brute-force changes to code to decide where to do holed matches on prompts; we tell it where to, and in the advanced stages it tells itself where to look. Really, where to look on the prompt is an entailment thing, because why do I say look here, here, and here? Because I see a context > and say what comes next. And that's the advanced stage.

------------------------------------------------------

"The reason why this does not work for multiple layers should be intuitively clear: the example only specifies the correct output for the final output layer, so how in the world should we know how to adjust the weights of Perceptrons in layers before that? The answer, despite taking some time to derive, proved to be once again based on age-old calculus: the chain rule."
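One way to read the "holed matches along the prompt" idea, sketched as a counter that combines votes from several context windows (this is my guess at the scheme described above; the class and names are hypothetical, not from the post): an exact 2-word window and a "holed" window that ignores the middle word each vote, and the counts are simply merged, with no weight tuned by backprop.

```python
from collections import defaultdict, Counter

class HoledMatcher:
    """Combine next-word votes from two context windows over (w1, w2):
    an exact match on both words, and a 'holed' match that skips w2."""
    def __init__(self):
        self.exact = defaultdict(Counter)   # (w1, w2) -> next-word counts
        self.holed = defaultdict(Counter)   # w1 with a hole -> next-word counts

    def train(self, tokens):
        for w1, w2, nxt in zip(tokens, tokens[1:], tokens[2:]):
            self.exact[(w1, w2)][nxt] += 1
            self.holed[w1][nxt] += 1        # holed window: the middle word is a hole

    def predict(self, w1, w2):
        votes = Counter()
        votes.update(self.exact[(w1, w2)])  # exact 2-word context votes
        votes.update(self.holed[w1])        # holed context still votes if w2 is unseen
        total = sum(votes.values())
        return {w: n / total for w, n in votes.items()} if total else {}

m = HoledMatcher()
m.train("walking down the street walking down the road walking up the hill".split())
print(m.predict("down", "the"))  # street and road split the vote; frog is never predicted
```

Note that "frog" can never be predicted here because it was never seen entailing this context, which is the point made above: the stored contexts decide, not a brute-force weight search.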
Haha, no, PPM and Markov chains are how, and much more easily, and they can work in a full tall hierarchy, not just a trie tree. Backprop is ONLY AN OPTIMIZATION (one that may be wrong, and that overly confuses AI development, both for laypeople and in telling a net how to gather predictions).

---------------------------------------------------------

He mentions doing backprop to find exponential thresholds, and that a NN must be able to do AND/OR/NOT. No: the exponential is based on some criteria, codable for all cases at the start of the code; no need to change it, it adapts. For example cat>eats is predicted 80%, so predict it 85%, but this changes if there are more predictions, or if the word is rare or too common, see? Something like that, probably. And the AND/OR/NOT: how many times do I do that in a day!!?? PPM doesn't really do that, really! Only IF>THEN prediction based on contexts looked at from the prompt text. It will predict cat eat or ate; if both are just as probable, each gets predicted half the time. And holed matching considers only the half that is needed, doing something like an OR, careless of the other half, ex. "walkddging downz the wide strZet and saw a ?" ignores those flaws and predicts anyway. And actually doing AND or OR is built on the things PPM does....matches....for example translation, ex. if a and b are predicted right, as I said back there, predict "go".

------------------------------------------

Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T44c4079317aac9d1-Md537de721cc3c2c3da635455
