Re: [agi] Computing's coming Theory of Everything

Abram Demski Tue, 22 Jul 2008 14:18:08 -0700

On Tue, Jul 22, 2008 at 4:29 PM, Steve Richfield
<[EMAIL PROTECTED]> wrote:
> Abram,
>
> On 7/22/08, Abram Demski <[EMAIL PROTECTED]> wrote:
>>
>> From the paper you posted, and from wikipedia articles, the current
>> meaning of PCA is very different from your generalized version. I
>> doubt the current algorithms would even metaphorically apply...
>
>
> Just more input points that are time-displaced from the present points, or
> alternatively in simple cases, compute with the derivative of the inputs
> rather than with their static value.


Such systems might produce some good results, but the formalism cannot
represent complex relational ideas. It is not even capable of
representing context-free patterns (for example, pictures of
fractals). Of course, I'm referring to PCA "as it is", not "as it
could be".
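
To make the time-displaced-input idea quoted above concrete, here is a minimal sketch of a delay embedding: each row stacks the present sample next to its lagged copies, so that a PCA-like method would see time as extra dimensions. The names `delay_embed`, `first_difference`, and `lags` are illustrative assumptions, not anything taken from the paper or from Steve's proposal.

```python
import numpy as np


def _as_2d(x):
    """Return x as a float array of shape (T, d); a 1-D signal becomes (T, 1)."""
    x = np.asarray(x, dtype=float)
    return x[:, None] if x.ndim == 1 else x


def delay_embed(x, lags):
    """Stack time-displaced copies of a signal side by side.

    Row i of the result holds x[i + lags], x[i + lags - 1], ..., x[i],
    so the output has shape (T - lags, d * (lags + 1)).
    """
    x = _as_2d(x)
    T = x.shape[0]
    if lags < 0 or lags >= T:
        raise ValueError("lags must satisfy 0 <= lags < number of samples")
    # Slice k is the signal delayed by k steps, trimmed to a common length.
    cols = [x[lags - k : T - k] for k in range(lags + 1)]
    return np.hstack(cols)


def first_difference(x):
    """Discrete stand-in for 'the derivative of the inputs': x[t+1] - x[t]."""
    return np.diff(_as_2d(x), axis=0)


if __name__ == "__main__":
    signal = np.arange(5.0)          # hypothetical toy signal: 0, 1, 2, 3, 4
    print(delay_embed(signal, 2))    # each row: present, 1 step back, 2 steps back
    print(first_difference(signal))  # constant differences for a linear ramp
```

Feeding the output of `delay_embed` to an ordinary PCA routine is one way to treat time as "just another dimension"; it does not change what the formalism can represent, only what it is shown.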

>>
>> Also, what would "multiple layers" mean in the generalized version?
>
>
> Performing the PC-like analysis on the principal components derived in a
> preceding PC-like analysis.

If this worked, it would be another way of trying to break up the task
into subtasks. It might help, I admit. It has an intuitive feel; it
fits the idea of there being levels of processing in the brain. But if
it helps, why? What clean subtask-division is it relying on? The idea
of iteratively compressing data by repeatedly looking for the
highest-information variable makes sense to me; it is a clear subgoal.
But what is the subgoal here?

Hmm... the algorithm for a single level would need to "subtract" the
information encoded in the new variable each time, so that the next
iteration is working with only the still-unexplained properties of the
data. The variables then should be independent (or, for standard linear
PCA, at least uncorrelated), right? Yet, if we take the multilevel
approach, the 2nd level will be trying to take advantage of
dependencies in those variables...
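
The tension can be checked on toy data. The sketch below uses ordinary linear PCA as a stand-in for the generalized method (an assumption; the generalized version is not specified in this thread). It extracts one variable at a time, subtracts what that variable explains, and then runs the same procedure again on the resulting variables. The function name `greedy_components` and the synthetic data are hypothetical.

```python
import numpy as np


def greedy_components(X, k):
    """Extract k variables one at a time, deflating the data after each.

    Each step takes the direction of largest remaining variance, records
    the projection onto it, and subtracts that projection from the residual.
    """
    R = X - X.mean(axis=0)                      # residual: still-unexplained data
    variables = []
    for _ in range(k):
        cov = R.T @ R / (R.shape[0] - 1)
        # eigh returns eigenvalues in ascending order, so the last column
        # is the direction of largest remaining variance.
        w = np.linalg.eigh(cov)[1][:, -1]
        z = R @ w                               # the new variable
        variables.append(z)
        R = R - np.outer(z, w)                  # "subtract" what z explains
    return np.column_stack(variables)


# Hypothetical toy data: three correlated inputs mixed from independent sources.
rng = np.random.default_rng(0)
sources = rng.normal(size=(500, 3)) * np.array([3.0, 1.0, 0.3])
X = sources @ rng.normal(size=(3, 3))

level1 = greedy_components(X, 3)
cov1 = np.cov(level1, rowvar=False)
off_diag = cov1 - np.diag(np.diag(cov1))        # correlations left between variables

level2 = greedy_components(level1, 3)           # same analysis on level-1 outputs

if __name__ == "__main__":
    print("largest level-1 cross-covariance:", np.abs(off_diag).max())
    print("level-1 variances:", level1.var(axis=0))
    print("level-2 variances:", level2.var(axis=0))
```

In the purely linear case the level-1 variables come out uncorrelated, so the second pass returns them essentially unchanged (up to sign). A second level could only help if it looks for dependencies the first level cannot see, such as nonlinear or higher-order ones, or if the first level is approximate.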

Perhaps this will work due to inaccuracies in the algorithm, caused by
approximate methods. The task of the higher levels, then, is to
correct for the approximations. But if this is their usefulness, then
it needs to be shown that they are capable of it. After all, they will
be running the same sort of approximation. It is possible that they
will therefore miss the same sorts of things. So, we need to be
careful in defining multilevel systems.

>
> Steve Richfield
> ================
>>
>> On Tue, Jul 22, 2008 at 2:58 PM, Steve Richfield
>> <[EMAIL PROTECTED]> wrote:
>> > Abram,
>> >
>> > On 7/22/08, Abram Demski <[EMAIL PROTECTED]> wrote:
>> >>
>> >> "Problem Statement: What are the optimal functions, derived from
>> >> real-world observations of past events, the timings of their comings
>> >> and goings, and perhaps their physical association, to extract each
>> >> successive parameter containing the maximum amount of information (in
>> >> a Shannon sense) usable in reconstructing the observed inputs."
>> >>
>> >> I see it now! It is typically very useful to decompose a problem into
>> >> sub-problems that can be solved either independently or with simple
>> >> well-defined interaction. What you are proposing is such a
>> >> decomposition, for the very general problem of compression. "Find an
>> >> encoding scheme for the data in dataset X that minimizes the number of
>> >> bits we need" can be split into subproblems of the form "find a
>> >> meaning for the next N bits of an encoding that maximizes the
>> >> information they carry". The general problem can be solved by applying
>> >> a solution to the simpler problem until the data is completely
>> >> compressed.
>> >
>> >
>> > Yes, we do appear to be on the same page here. The challenge is that
>> > there seems to be a prevailing opinion that these don't "stack" into
>> > multi-level structures. The reason that this hasn't been tested seems
>> > obvious from the literature - computers are now just too damn slow, but
>> > people here seem to think that there is another more basic reason, like
>> > it doesn't work. I don't understand this argument either.
>> >
>> > Richard, perhaps you could explain?
>> >>
>> >> "However, it still fails to consider temporal clues, unless of course
>> >> you just consider these to be another dimension."
>> >>
>> >> Why does this not count as a working solution?
>> >
>> >
>> > It might be. Note that delays from axonal transit times could quite
>> > easily and effectively present inputs "flat" with time presented as
>> > just another dimension. Now, the challenge of testing a theory with an
>> > additional dimension, that already clogs computers without the
>> > additional dimension. Ugh. Any thoughts?
>> >
>> > Perhaps I should write this up and send it to the various people
>> > working in this area. Perhaps people with the present test beds could
>> > find a way to test this, and the retired math professor would have a
>> > better idea as to exactly what needed to be optimized.
>> >
>> > Steve Richfield
>> > =================
>> >>
>> >> On Tue, Jul 22, 2008 at 1:48 PM, Steve Richfield
>> >> <[EMAIL PROTECTED]> wrote:
>> >> > Ben,
>> >> > On 7/22/08, Benjamin Johnston <[EMAIL PROTECTED]> wrote:
>> >> >>>
>> >> >>> You are confusing what PCA now is, and what it might become. I am
>> >> >>> more interested in the dream than in the present reality.
>> >> >>
>> >> >> That is like claiming that multiplication of two numbers is the
>> >> >> answer to AGI, and then telling any critics that they're confusing
>> >> >> what multiplication is now with what multiplication may become.
>> >> >
>> >> >
>> >> > Restating (not copying) my original posting, the challenge of
>> >> > effective unstructured learning is to utilize every clue and NOT
>> >> > just go with static clusters, etc. This includes temporal as well as
>> >> > positional clues, information content, etc. PCA does some but
>> >> > certainly not all of this, but considering that we were talking
>> >> > about clustering here just a couple of weeks ago, ratcheting up to
>> >> > PCA seems to be at least a step out of the basement.
>> >> >
>> >> > I think that perhaps I mis-stated or was misunderstood in my
>> >> > "position". No one has "the answer" yet, but given recent work, I
>> >> > think that perhaps the problem can now be stated. Given a problem
>> >> > statement, it (hopefully) should be "just some math" to zero in on
>> >> > the solution. OK...
>> >> >
>> >> > Problem Statement: What are the optimal functions, derived from
>> >> > real-world observations of past events, the timings of their comings
>> >> > and goings, and perhaps their physical association, to extract each
>> >> > successive parameter containing the maximum amount of information
>> >> > (in a Shannon sense) usable in reconstructing the observed inputs.
>> >> > IMHO these same functions will be exactly what you need to recognize
>> >> > what is happening in the world, what you need to act upon, which
>> >> > actions will have the most effect on the world, etc. PCA is clearly
>> >> > NOT there (e.g. it lacks temporal consideration), but seems to be a
>> >> > step closer than anything else on the horizon. Hopefully, given the
>> >> > "hint" of PCA, we can follow the path.
>> >> >
>> >> > You should find an explanation of PCA in any elementary linear
>> >> > algebra or statistics textbook. It has a range of applications (like
>> >> > any transform), but it might be best regarded as an/the elementary
>> >> > algorithm for unsupervised dimension reduction.
>> >> >
>> >> > Bingo! However, it still fails to consider temporal clues, unless of
>> >> > course you just consider these to be another dimension.
>> >> >
>> >> > When PCA works, it is more likely to be interpreted as a comment on
>> >> > the underlying simplicity of the original dataset, rather than the
>> >> > power of PCA itself.
>> >> >
>> >> > Agreed, but so far, I haven't seen any solid evidence that the world
>> >> > is NOT simple, though it appears pretty complex until you understand
>> >> > it.
>> >> >
>> >> > Thanks for making me clarify my thoughts.
>> >> >
>> >> > Steve Richfield


-------------------------------------------
agi
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=8660244&id_secret=108809214-a0d121
Powered by Listbox: http://www.listbox.com
