On Tue, Mar 21, 2023, 5:39 AM Telmo Menezes <te...@telmomenezes.net> wrote:

> Over-fitting is less of an issue here because it's trivial to write a
> sentence that's never before been written by any human in history.
> That is not enough. A small variation on a standard IQ test is still the
> same IQ test for a super powerful pattern detector such as GPT-4.
> I have no doubt that GPT-4 can generalize in its domain. It was rigorously
> designed and tested for that by people who know what they are doing. My
> doubt is that you can give it an IQ test and claim OMG GPT-4 IQ > 140. This
> is just silly and it is junk science.
> It's true that once one learns a way to solve problems it becomes easier
> to reapply that method when you next encounter a related problem.
> But isn't that partly what intelligence is? If a system has read the whole
> Internet and seen every type of problem we know how to solve, and it can
> generalize to know what method to use in any situation, that's an
> incredible level of intelligence which, until now, we haven't had in
> machine form.
> I would say that the important methodological distinction here is between
> learning intelligent behavior and demonstrating intelligent behavior.
> Obviously it is possible to learn and generalize from a dataset, otherwise
> there would be no point in wasting time with ML. But if you want to
> convince other people that you have indeed achieved generalization, then
> the scientific gold standard is to demonstrate this on data that was not
> used in training, because beyond generalization there can also be (and
> often is) overfitting. This is not a controversial statement. Take any
> published ML result and apply it to the training data, and 99.9999999% of
> the time it will perform better / much better on the training data, because
> it also learned the little details (over-fitting) that guide it towards the
> correct answer.
> An extreme case of this is stock trading. I am not kidding, and I suspect
> you know it: I can easily produce an ML model that achieves >1000% profit
> per month on the derivatives market, as long as we only test on in-corpus
> data. But I will raise the stakes! Are you ready?
> I promise I will train my algorithm only on ONE crypto coin from 2020 to
> 2022. Then we will apply it to OTHER crypto coins. I still promise >1000%
> profit per month. Do you want it now?
> I understand that GPT-4 is trained on most available text in natural
> language. That is amazing, I love it. But this comes with additional
> methodological challenges. I am pretty sure that the GPT-4 team knows
> about them, and they probably have a rigorously reserved test set to
> guide their own research. Also, I fully believe that they are serious
> researchers and would never embark on this IQ test bullshit.
> I am really just insisting on sticking to the scientific attitude. I do
> not understand what I could be saying that is so controversial...

I see your point about testing. Someone on the entropy list chose to write
their own word problem puzzle for it to solve. Perhaps this is the way, to
design new intelligence tests from scratch. But I don't see a way to ensure
we have developed entirely new classes of problem of a type not seen before
in the corpus of the Internet. Perhaps the opportunity will only exist when
some mathematician proves something new.
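
As a toy illustration of the train-versus-held-out point quoted above, here is a minimal sketch. It assumes nothing about GPT-4 itself: the "model" is a hypothetical lookup table that memorizes its training examples, and the labels are coin flips, so there is no pattern to generalize from. All names (`table`, `predict`, `accuracy`) are made up for this example.

```python
import random
from collections import Counter

# Assumption for illustration: 400 distinct inputs with coin-flip labels,
# so nothing can be learned except by memorizing individual examples.
rng = random.Random(0)
inputs = list(range(400))
labels = [rng.random() < 0.5 for _ in inputs]
examples = list(zip(inputs, labels))
train, held_out = examples[:200], examples[200:]

# The "model": a pure memorizer (the extreme case of over-fitting).
table = dict(train)
# For inputs it has never seen, fall back to the most common training label.
fallback = Counter(label for _, label in train).most_common(1)[0][0]


def predict(x: int) -> bool:
    """Return the memorized label if x was seen in training, else the fallback."""
    return table.get(x, fallback)


def accuracy(data) -> float:
    """Fraction of (input, label) pairs that predict() gets right."""
    return sum(predict(x) == y for x, y in data) / len(data)


train_acc = accuracy(train)
held_out_acc = accuracy(held_out)
print(f"training accuracy: {train_acc:.2f}")
print(f"held-out accuracy: {held_out_acc:.2f}")
```

Scored on its own training data the memorizer looks perfect, while on the held-out half it can do no better than guessing. That gap is why in-corpus results alone cannot show generalization.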

> You can tweak the parameters of the problem to guarantee it's a problem
> that has never been seen before, and it can still solve it.
> Some yes, some no. Almost one century of computer science still applies.
> You can choose to wait for the academic write ups to come out a few months
> down the line but by then things will have advanced another few levels from
> where we are today.
> I am not wanting to wait for anything, I am asking questions that can be
> addressed right now:
> - Are there IQ tests in the training data of GPT-4? Yes or no?
> - Can we conceive of human-level intelligence without recurrent
> connections or some form of ongoing recursivity / Turing completeness? Yes
> or no?
> I've been thinking about this a lot.
> My friend with access to GPT-4 asked it: "Does your neural network contain
> such reflexive loops, or is it strictly feed forward?", below is its reply:
> --------
> "As an AI language model, my underlying architecture is based on the
> Transformer model, which is primarily feedforward in nature but includes
> some elements of recurrence or reflexivity, specifically in the
> self-attention mechanism.
> The Transformer model consists of a multi-layer feedforward neural network
> with self-attention mechanisms that allow the model to learn complex
> relationships between input and output sequences. The self-attention
> mechanism weighs the importance of different input elements relative to
> each other, effectively capturing long-range dependencies and relationships
> within the input data. This mechanism introduces a form of reflexivity or
> recurrence, as it allows the model to consider the context of the entire
> input sequence when generating outputs.
> However, it is important to note that the self-attention mechanism in the
> Transformer model is not equivalent to traditional recurrent neural
> networks (RNNs), which maintain a hidden state that is updated at each time
> step. The Transformer model processes input sequences in parallel rather
> than sequentially, which makes it fundamentally different from RNNs.
> In summary, while my neural network architecture is primarily feedforward,
> it includes some elements of reflexivity in the form of self-attention
> mechanisms that allow the model to capture complex relationships within
> input sequences."
> --------
> Is this enough to meet Hofstadter's requirements of recursion? I do not
> have the expertise to say. But I do see recursion existing in a way no one
> seems to ever mention:
> The output of the LLM is fed back in, as input to the LLM that produced
> it. So all the high level processing and operation of the network at the
> highest level, used to produce a few characters of output, then reaches
> back down to the lowest level to affect the lowest level of the input
> layers of the network.
> If you asked the network, where did that input that it sees come from, it
> would have no other choice but to refer back to itself, as "I". "I
> generated that text."
> Loops are needed to maintain and modify a persistent state or memory, to
> create a strange loop of self-reference, and to achieve Turing
> completeness. But a loop may not exist entirely in the "brain" of an
> entity, it might offload part of the loop into the environment in which it
> is operating. I think that is the case for things like thermostats, guided
> missiles, AlphaGo, and perhaps even ourselves.
> We observe our own actions, they become part of our sensory awareness and
> input. We cannot say exactly where they came from or how they were done,
> aside from modeling an "I" who seems to intercede in physics itself, but
> this is a consequence of being a strange loop. In a sense, our actions do
> come in from "on high", a higher level of abstraction in the hierarchy of
> processing, and this seems as if it is a dualistic interaction by a soul in
> heaven as Descartes described.
> In the case of GPT-4, its own output buffer can act as a scratch pad
> memory buffer, to which it continuously appends its thoughts. Is this
> not a form of memory and recursion?
> For one of the problems in John's video, it looked like it solved a
> Chinese remainder theorem problem in a series of discrete steps. Each step
> is written to and saved in its output buffer, which becomes readable as
> its input buffer.
> Given this, I am not sure we can say that GPT-4, in its current
> architecture and implementation, is entirely devoid of a memory, or a
> loop/recursion.
> I am anxious to hear your opinion though.
> This is a great answer by GPT-4 and a good point.

Just to clarify, as I am not sure it was clear, GPT-4's answer is only what
was in quotes and between the series of dashes.
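
To make the "output fed back in as input" idea from the quoted text concrete, here is a minimal sketch. It is not how GPT-4 is implemented. `euclid_step` is a hypothetical stand-in for a single stateless, feed-forward pass: it keeps no memory between calls and only reads the text buffer it is handed. `run_with_scratchpad` is likewise an illustrative name for the outer loop that appends each output to the buffer.

```python
from typing import Callable, Optional


def euclid_step(buffer: str) -> Optional[str]:
    """One stateless pass: read the last line "a b" and emit the next
    line of Euclid's algorithm, or None when there is nothing left to do."""
    last_line = buffer.strip().splitlines()[-1]
    a, b = (int(token) for token in last_line.split())
    if b == 0:
        return None  # finished: a is the greatest common divisor
    return f"{b} {a % b}"


def run_with_scratchpad(
    step: Callable[[str], Optional[str]],
    prompt: str,
    max_steps: int = 100,
) -> str:
    """Repeatedly call a stateless step, appending its output to the buffer
    so that each call sees everything written so far."""
    buffer = prompt
    for _ in range(max_steps):
        next_line = step(buffer)
        if next_line is None:
            break
        buffer += "\n" + next_line
    return buffer


trace = run_with_scratchpad(euclid_step, "48 18")
print(trace)
# 48 18
# 18 12
# 12 6
# 6 0
```

The step function contains no loop and no state of its own, yet the combined system computes a greatest common divisor over several iterations. The loop and the memory both live in the buffer outside the function.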

> I agree that the ability to re-feed the output buffer back to the language
> model constitutes a form of computational recurrence and is indeed a
> memory mechanism. One could even imagine more sophisticated "tricks", where
> one explains to GPT-4 how to read/write from some form of database.
> I can imagine several ways forward here:
> (1) The amount of input/context that LLMs can receive keeps increasing,
> and eventually it is so large that RLHF can teach LLMs to make use of an
> input/output buffer as a working memory;
> (2) Some neuro-symbolic scheme is devised such that the LLM can use APIs
> to extend itself;
> (3) True recurrence inside the model is achieved (this requires some new
> learning algorithm that does not suffer from vanishing gradient).
> I think that (3) is by far the scientifically most exciting, but it is one
> of those things where it seems hard to estimate when the breakthrough will
> come. Maybe tomorrow, maybe in three decades... So another question is, can
> we ride (1) or (2) all the way to AGI? I don't know...
> I suspect that truly integrating all the modalities in a human-being kind
> of way (language, vision, memory formation and access, meta-cognition, etc)
> will require (3). But I do not have a strong argument. I love coding, so in
> that sense (2) is a bit more exciting :)
> For me only two things are clear at this point:
> - GPT-* is a spectacular, qualitative jump in AI. It can do things that we
> couldn't dream of a couple of years ago. It will almost certainly be a
> piece of the puzzle towards AGI.
> - There is still a huge chasm between Human Intelligence (HI) and GPT-4.
> How long will it take to cross that chasm? Who knows...

I think even after we cross the chasm it may not be immediately clear that
we have done so.

I would say GPT-4 has superhuman intelligence in some domains and subhuman
intelligence in others.

> One thing I wonder is if the main difference between HI and LLMs lies in
> the utility function more than anything else. We humans have this highly
> evolved, emergent utility function that allows us to be guided by feelings
> (boredom, curiosity, lust, fear, etc) into highly complex behaviors and
> meta-behaviors. We decide to learn things in a certain way for a
> complicated set of reasons towards a long term goal. In classical AI
> parlance, we are autonomous agents.

True. Though I think we could trivially add a goal generating mechanism of
our choosing and pair it with GPT-4 to generate and select possible courses
of action to achieve the goals at hand.

> One final point about recursion: what I was trying to get at with the
> chess example is that HI can solve problems whose time complexity is
> provably worse than constant / linear. We can solve polynomial type stuff,
> and even approximate solutions for NP-hard stuff.

I tend to think of the LLM as capable of doing anything a human can do
given only 10 seconds of thought. (Or some other finite time period).

> Playing a game like chess requires expensive navigation of a very large
> tree of possible states. This is true both for computers and humans,
> although they might implement this capability in different ways.
> Grandmasters sometimes commit blunders when trying to explore the tree
> further than their cognitive capabilities permit, and they will discuss
> such things (meta-cognition).

What was interesting to me about Google's AlphaZero is that it hugely
decreased the number of moves it considered, down to a few thousand, rather
than the many millions considered by traditional chess engines. This is due
in part to its superior capacity to recognize good moves in a single
evaluation of its network. I think the way AlphaZero plays chess is much
closer to how humans play than the way traditional chess engines do.

> GPT-4 as a pure computational environment lacks the ability to perform
> polynomial time computations. It "fools" us spectacularly by wielding its
> immense domain knowledge of... everything. But this only goes so far. It
> can never defeat a competent chess player with such an architecture. Of
> course, we can integrate GPT-4 with some API and let it call some
> explore_deep_tree() function, but this is not the sort of deep integration
> that one imagines in sophisticated AI. True recurrence would allow for true
> computational power within the model.
> These are the sorts of things I have been thinking about. I may be missing
> something obvious. Would also love to read your opinion!

I think we could write a simple iterative or recursive script around GPT-4
which asks it to break problems down into smaller and smaller pieces until
it can reliably solve them. Then, on the way back up the recursion tree, the
script asks it how to combine the intermediate results, so that by the time
it bubbles back up to the top it has the solution in hand. I think this
would greatly expand the class of problems that GPT-4 could solve.
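
A rough sketch of what such a wrapper script might look like follows. Everything in it is an assumption for illustration. In a real version each of the four callables (`is_simple`, `solve_directly`, `split`, `combine`) would be a prompt sent to the model, but here they are plain Python stand-ins so the example runs on its own. No actual GPT-4 API is called or implied.

```python
from typing import Callable, List, TypeVar

P = TypeVar("P")  # type of a problem description
R = TypeVar("R")  # type of a result


def solve_recursively(
    problem: P,
    is_simple: Callable[[P], bool],
    solve_directly: Callable[[P], R],
    split: Callable[[P], List[P]],
    combine: Callable[[P, List[R]], R],
    max_depth: int = 20,
) -> R:
    """Break a problem down until the pieces are simple, solve the pieces,
    then combine the intermediate results on the way back up."""
    # Stop splitting when the piece is simple enough (or we are too deep).
    if max_depth == 0 or is_simple(problem):
        return solve_directly(problem)
    partial_results = [
        solve_recursively(
            sub, is_simple, solve_directly, split, combine, max_depth - 1
        )
        for sub in split(problem)
    ]
    return combine(problem, partial_results)


# Stand-in task: add up a long list by only ever adding short lists directly.
numbers = list(range(1, 101))
total = solve_recursively(
    numbers,
    is_simple=lambda xs: len(xs) <= 2,
    solve_directly=sum,
    split=lambda xs: [xs[: len(xs) // 2], xs[len(xs) // 2:]],
    combine=lambda _problem, parts: sum(parts),
)
print(total)  # 5050
```

The `max_depth` guard is there because a model asked to split a problem might not always make progress. Without it, a bad decomposition could recurse forever.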


> Telmo
> Jason
> --
> You received this message because you are subscribed to the Google Groups
> "Everything List" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to everything-list+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/everything-list/CA%2BBCJUjFhjj5bzZx6x4iq_NjXOy%2BAmadTnvzF464J87xvBc_Ag%40mail.gmail.com
> <https://groups.google.com/d/msgid/everything-list/CA%2BBCJUjFhjj5bzZx6x4iq_NjXOy%2BAmadTnvzF464J87xvBc_Ag%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .