Well, if you're willing to take the step of asking questions about the
world that are framed in terms of probabilities and probability
distributions ... then modern probability and statistics tell you a
lot about overfitting and how to avoid it...

OTOH if, like Pei Wang, you think it's misguided to ask questions
posed in a probabilistic framework, then that theory will not be
directly relevant to you...

To me the big weaknesses of modern probability theory lie in
**hypothesis generation** and **inference**.  Testing a hypothesis
against data, to see if it's overfit to that data, is handled well by
cross-validation and related methods.
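
For concreteness, here is a minimal sketch of the kind of cross-validation
check meant here (the k-fold split and the polynomial stand-in model are
purely illustrative, not any particular toolkit's API):

import numpy as np

def cv_error(x, y, degree, k=5):
    # k-fold cross-validation: fit on k-1 folds, measure error on the
    # held-out fold, and average over the k splits.
    idx = np.random.permutation(len(x))
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coeffs = np.polyfit(x[train], y[train], degree)  # fit on training folds
        pred = np.polyval(coeffs, x[test])               # predict held-out fold
        errors.append(np.mean((pred - y[test]) ** 2))    # held-out squared error
    return np.mean(errors)

# A high-degree polynomial fits its training data almost perfectly, but its
# cross-validated error exposes the overfit.
x = np.linspace(0, 1, 40)
y = np.sin(2 * np.pi * x) + 0.3 * np.random.randn(40)
print(cv_error(x, y, degree=3))   # modest held-out error
print(cv_error(x, y, degree=15))  # typically much larger: the model memorized noise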

But the problem of: given a number of hypotheses with support from a
dataset, generating other interesting hypotheses that will also have
support from the dataset ... that is where traditional probabilistic
methods (though not IMO the foundational ideas of probability) fall
short, providing only unscalable or oversimplified solutions...
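
One concrete way to see the "unscalable" half of that complaint (purely an
illustration, not anyone's proposed method): enumerate candidate hypotheses
by brute force and score each one's support against the data.  Even for
simple conjunctive hypotheses over boolean features the candidate space
grows as 3^n, so the enumeration step, not the probabilistic scoring,
is what breaks down:

import random
from itertools import product

n_features = 8
data = [[random.random() < 0.5 for _ in range(n_features)] for _ in range(200)]

def support(conjunction, rows):
    # Fraction of rows satisfying every (feature index, required value) literal.
    hits = sum(all(row[i] == v for i, v in conjunction) for row in rows)
    return hits / len(rows)

candidates = 0
supported = 0
# Each feature is required True, required False, or ignored: 3**n candidates.
for pattern in product((True, False, None), repeat=n_features):
    candidates += 1
    conjunction = [(i, v) for i, v in enumerate(pattern) if v is not None]
    if support(conjunction, data) > 0.2:
        supported += 1

print(candidates, supported)  # 3**8 = 6561 candidates already; 3**30 is hopeless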

-- Ben G

On Sat, Nov 29, 2008 at 1:08 PM, Jim Bromer <[EMAIL PROTECTED]> wrote:
> Hi.  I will just make a quick response to this message and then I want
> to think about the other messages before I reply.
>
> A few weeks ago I decided that I would write a criticism of
> ai-probability to post to this group.  I wasn't able to remember all of
> my criticisms so I decided to post a few preliminary sketches to
> another group.  I wasn't too concerned about how they responded, and
> in fact I thought they would just ignore me.  The first response I got
> was from an irate guy who was quite unpleasant and then finished by
> declaring that I slandered the entire ai-probability community!  He
> had some reasonable criticisms about this but I considered the issue
> tangential to the central issue I wanted to discuss. I would have
> responded to his more reasonable criticisms if they hadn't been
> embedded in his enraged rant.  I wondered why anyone would deface the
> expression of his own thoughts with an emotional and hostile message,
> so I wanted to try the same message on this group to see if anyone who
> was more mature would focus on this same issue.
>
> Abram made a measured response but his focus was on the
> over-generalization.  As I said, this was just a preliminary sketch of
> a message that I intended to post to this group after I had worked on
> it.
>
> Your point is taken.  Norvig seems to say that overfitting is a
> general problem.  The method given to study the problem is
> probabilistic, but it is based on the premise that the original data is
> substantially intact.  But Norvig goes on to mention that with pruning,
> noise can be tolerated.  If you read my message again you may see that
> my central issue was not really centered on the issue of whether
> anyone in the ai-probability community was aware of the nature of the
> science of statistics, but whether or not probability can be used as
> the fundamental basis to create AGI given the complexities of the
> problem.  So while your example of overfitting certainly does deflate
> my statements that no one in the ai-probability community gets this
> stuff, it does not actually address the central issue that I was
> thinking of.
>
> I am not sure if Norvig's application of a probabilistic method to
> detect overfitting is truly directed toward the AGI community.  In
> other words: has anyone in this group tested the utility and clarity
> of the decision making of a fully automated system to detect
> overfitting in a range of complex IO data fields that one might expect
> to encounter in AGI?
>
> Jim Bromer
>
>
>
> On Sat, Nov 29, 2008 at 11:32 AM, Abram Demski <[EMAIL PROTECTED]> wrote:
>> Jim,
>>
>> There is a large body of literature on avoiding overfitting, i.e.,
>> finding patterns that work for more than just the data at hand. Of
>> course, the ultimate conclusion is that you can never be 100% sure;
>> but some interesting safeguards have been cooked up anyway, which help
>> in practice.
>>
>> My point is, the following paragraph is unfounded:
>>
>>> This is a problem any AI method has to deal with; it is not just a
>>> probability thing.  What is wrong with the AI-probability group
>>> mind-set is that very few of its proponents ever consider the problem
>>> of statistical ambiguity and its obvious consequences.
>>
>> The "AI-probability group" definitely considers such problems.
>>
>> --Abram
>>
>> On Sat, Nov 29, 2008 at 10:48 AM, Jim Bromer <[EMAIL PROTECTED]> wrote:
>>> One of the problems that comes with the casual use of analytical
>>> methods is that the user becomes inured to their habitual misuse. When
>>> a casual familiarity is combined with a habitual ignorance of the
>>> consequences of a misuse, the user can become over-confident or
>>> unwisely dismissive of criticism regardless of how on the mark it
>>> might be.
>>>
>>> The most proper use of statistical and probabilistic methods is to
>>> base results on a strong association with the data that they were
>>> derived from.  The problem is that the AI community cannot afford this
>>> strong a connection to the original source because they are trying to
>>> emulate the mind in some way and it is not reasonable to assume that
>>> the mind is capable of storing all data that it has used to derive
>>> insight.
>>>
>>> This is a problem any AI method has to deal with; it is not just a
>>> probability thing.  What is wrong with the AI-probability group
>>> mind-set is that very few of its proponents ever consider the problem
>>> of statistical ambiguity and its obvious consequences.
>>>
>>> All AI programmers have to consider the problem.  Most theories about
>>> the mind posit the use of similar experiences to build up theories
>>> about the world (or to derive methods to deal effectively with the
>>> world).  So even though the methods to deal with the data environment
>>> are detached from the original sources of those methods, they can
>>> still be reconnected by the examination of similar experiences that
>>> may subsequently occur.
>>>
>>> But still it is important to be able to recognize the significance and
>>> necessity of doing this from time to time.  It is important to be able
>>> to reevaluate parts of your theories about things.  We are not just
>>> making little modifications from our internal theories about things
>>> when we react to ongoing events, we must be making some sort of
>>> reevaluation of our insights about the kind of thing that we are
>>> dealing with as well.
>>>
>>> I realize now that most people in these groups probably do not
>>> understand where I am coming from because their idea of AI programming
>>> is based on a model of programming that is flat.  You have the program
>>> at one level, and the possible reactions to the data that is input as
>>> the values of the program variables are carefully constrained by that
>>> level.  You can imagine a more complex model of programming by
>>> appreciating the possibility that the program can react to IO data by
>>> rearranging subprograms to make new kinds of programs.  Although a
>>> subtle argument can be made that any program that conditionally reacts
>>> to input data is rearranging the execution of its subprograms, the
>>> explicit recognition by the programmer that this is a useful tool in
>>> advanced programming is probably highly correlated with its more
>>> effective use.  (I mean of course it is highly correlated with its
>>> effective use!)  I believe that casually constructed learning methods
>>> (and decision processes) can lead to even more uncontrollable results
>>> when used with this self-programming aspect of advanced AI programs.
>>>
>>> The consequence, then, of failing to recognize this is that mushed-up
>>> decision processes that are never compared against the data (or kinds
>>> of situations) they were derived from will inevitably become
>>> inherently illogical, and they will mush up an AI system long before
>>> it gets any traction.
>>>
>>> Jim Bromer



-- 
Ben Goertzel, PhD
CEO, Novamente LLC and Biomind LLC
Director of Research, SIAI
[EMAIL PROTECTED]

"I intend to live forever, or die trying."
-- Groucho Marx

