Thorsten

> -----Original Message-----
> From: thorsten.i.r...@jyu.fi [mailto:thorsten.i.r...@jyu.fi]
> Sent: 02 December 2010 10:58
> To: vivian.mea...@lineone.net; FlightGear developers discussions
> Subject: Re: [Flightgear-devel] Aircraft model/cockpit rating
> 
> > My point is your rating was based on an assumption that was totally
> > incorrect: that the developer had made a reasonable effort to put the
> > right
> > gauges and levers in the right place. Do you make a similar assumption
> > about
> > the FDM? That it is approximately right? Is there much value in such a
> > rating?
> 
> Vivian, I am sorry if I'm now taking a little more of a lecturing attitude
> - I do not know how much you know about mathematical statistics, but I
> have the impression you are completely missing the issue here.
> 
> What the rating represents is a screening procedure. A screening procedure
> is used to quickly assess a large number of something, to single out a
> subset with given properties. For instance, you might screen a population
> for breast cancer.
> 
> Screening procedures are designed to process large numbers, i.e. they do
> not make use of all available diagnostic tools; they replace detailed
> knowledge with plausibility, because applying detailed knowledge and
> detailed testing usually requires time and resources which are not
> available. A detailed cancer test might require 1-2 days in hospital;
> say that (optimistically) costs $200. Doing it for 100 million people
> once per year comes to $20 billion per year - so you would probably
> rather test less accurately for $5 per person. Screenings therefore
> often test proxies rather than the real property you're interested in.
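> 
> To put numbers on the cost argument (a back-of-the-envelope sketch in
> Python; the $200, $5 and 100 million figures are just the ones above):
> 
>     # Rough cost comparison: detailed testing vs. cheap screening,
>     # using the (optimistic) figures quoted above.
>     population = 100 * 10**6    # people screened per year
>     detailed_cost = 200         # $ per person, hospital test
>     screening_cost = 5          # $ per person, quick proxy test
> 
>     print(population * detailed_cost)   # 20000000000 -> $20 billion/year
>     print(population * screening_cost)  # 500000000   -> $0.5 billion/year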
> 
> For any given instance of the something, it is always true that a detailed
> test has more accurate results. It is also true that a screening produces
> both false positives (i.e. assigns a property to something which does in
> fact not have that property) and false negatives (i.e. does not assign the
> property to something which does in fact have it).
> 
> It is not required (nor reasonable to require) that a screening procedure
> is always correct or that the plausibility assumptions underlying it are
> always fulfilled. What is required is that the screening procedure is
> right most of the time (depending on the problem, you want to minimize the
> rate of false positives, of false negatives, or both - in the cancer
> example, it is better to send a few more people to detailed testing than
> to miss too many real cancer cases, so you try to minimize the false
> negatives).
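> 
> In code, the two error rates look something like this (the counts here
> are made up purely for illustration):
> 
>     # FP: screening assigns the property, the detailed test says no.
>     # FN: screening misses the property, the detailed test says yes.
>     true_pos, false_neg = 95, 5     # hypothetical counts
>     true_neg, false_pos = 860, 40
> 
>     fn_rate = false_neg / (false_neg + true_pos)   # miss rate: 0.05
>     fp_rate = false_pos / (false_pos + true_neg)   # 40/900, about 0.044
>     # For cancer screening you tune the test to push fn_rate down,
>     # accepting a somewhat higher fp_rate in exchange.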
> 
> So, what you have shown with the KC-135 is a case in which a default
> assumption was wrong, but in which the scheme still (for whatever reason)
> gave a good answer. That's not very problematic (one wouldn't consider it
> problematic that a screening test picks up a cancer for the wrong reason, as
> long as there in fact is a cancer). Right now you have shown me one example in
> which the default assumption does not work. If there are no more, it means
> it has an accuracy of 99.75%. If you can find as many as 40 planes with a
> similar history in which the designer did not care about cockpit layout,
> the default assumption would still have an accuracy of 90%. That seems
> pretty good to me - and since the result can be reasonable even where the
> default assumption fails, the effective accuracy is better than that!
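> 
> The arithmetic, assuming the fleet is about 400 rated aircraft (the
> sample size implied by the 99.75% figure):
> 
>     n_aircraft = 400
>     print(1 - 1.0 / n_aircraft)    # 0.9975 -> 99.75%, one exception
>     print(1 - 40.0 / n_aircraft)   # 0.9    -> 90%, even with 40 exceptions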
> 
> The Concorde is in some sense way more problematic, because it is actually
> a 'wrong' result - a false negative (i.e. a high-quality plane gets a low
> rating). But here precisely the same question arises - what is the rate of
> false negatives? What is the actual probability that this happens to a
> second plane in the sample?
> 
> Of course I don't factually know that (because I have no detailed test
> data for all aircraft), but I can give an estimate based on the sub-sample
> of planes I know better - this is where statistics comes in (I could even
> compute error margins for that estimate, although I have not done that
> yet). And that estimate suggests that the rate of false positives and
> negatives is low (about 2.5% for a deviation of 5 points between quality
> and visuals - which means the rating does better than that 97.5% of the
> time). Again, this is a number which I consider entirely reasonable.
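> 
> For what it's worth, the error margin would go roughly like this - a
> Wilson interval for the 2.5% estimate, where the sub-sample size of 80
> planes is only a placeholder:
> 
>     from math import sqrt
> 
>     p, n, z = 0.025, 80, 1.96     # estimate, sub-sample size, 95% level
>     denom = 1 + z**2 / n
>     centre = (p + z**2 / (2 * n)) / denom
>     half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
>     print(centre - half, centre + half)   # roughly 0.7% .. 8.7%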
> 
> It doesn't matter if the rating works in every instance perfectly, or if
> the assumptions capture every instance correctly. On average, the results
> are reasonable and they give you an overview.
> 
> Having an overview picture of something with a 10% error margin is better
> than having no overview at all with 1% error margin (screening 90% of a
> population for cancer with a 10% rate of false positives and negatives is
> way more effective than testing 1% of the population in detail with a 1%
> failure rate).
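> 
> Concretely, the fraction of all real cases actually caught is coverage
> times detection rate:
> 
>     screened = 0.90 * (1 - 0.10)   # 0.81   -> 81% of all cases found
>     detailed = 0.01 * (1 - 0.01)   # 0.0099 -> under 1% of all cases found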
> 
> *shrugs*
> 
> Codify any testing scheme you like, and I bet I can construct a case which
> is somehow not adequately treated in it. It doesn't matter that I can do
> that - it's the rate with which it actually happens that matters, and the
> amount of resources it takes to run the scheme.
> 

Your explanation is comprehensive, rational and plausible. I'd better rescue
my statistics notes from the loft. Yes - I did study stats a long time ago.

Nevertheless, I am not persuaded. Your rating is based on: "Four legs good,
two legs bad!". While that may be generally true, it will throw up many
anomalies, and the problem is that you know neither which ones they are nor
how many, because you haven't properly tested your hypothesis, and can't. We have
to set all this against the ethic of FG which is: "realism above all else".
Thus, your rating system will penalize a realistic model, and unfairly
elevate an attractive one. Let me illustrate further.

Compare the F14 and the Lightning. Without a doubt the F14 is the more
attractive cockpit, and probably accurate (I haven't checked). The Lightning
is very accurate (I have checked). However, the F14 has YASim as its FDM.
YASim does not model the transonic or supersonic regimes, so the performance
is only an approximation - how good an approximation I don't know. The
Lightning uses JSBSim and actual wind tunnel data, so the outcome should be
very realistic, but in fact it isn't all that nice to fly. Your rating will, I
think, pick the F14, but in fact the Lightning is a better model of the real
aircraft. The F14 developer has chosen to make a wing fall off under high G
loading: is this realistic? I don't know, but I know of no such real life
incident.

Let us take the C172. As Martin said, it's never going to rate in a beauty
contest, but it is probably one of the most realistic models we have, based
on the personal knowledge of owners and pilots.

Or the Sopwith Camel. It has every detail modeled: every flying, landing and
control wire, every wing frame, spark plug, screw head, switch, lever, and
even those items that only God can see. It has a realistic FDM based on
contemporary reports. It's a pig to fly and land, but then so was the real
one. It is our most accurate and detailed model. Does it win the ratings
war? 

Before I witter on too long, I will assert that 40-60% of our aircraft have
generic/no instruments and a generic FDM. Of the remainder, some are
superficially attractive but have deficient FDM configs, or vice versa. Are
any both attractive and accurate in their FDMs? Probably, but I don't know
for sure which they are. There are several which are less attractive but
realistic, with good FDM configs. Even some with no instruments have
well-developed FDMs.

If I recall my stats correctly, your assumption that there is a causal
relationship between the attractiveness of the cockpit and high realism is
unproven. In our statistically small sample, I think it will throw up as
many wrong results as correct ones. Concorde is but one example. I think
your rating system is just a comparison of the subjective attractiveness of
cockpits. You may care to _prove_ me wrong. 
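
If you want a concrete test, something like a rank correlation between your
visual rating and an independent FDM-accuracy rating over a hand-checked
sub-sample would settle it. A sketch only - both rating lists here are
hypothetical placeholders:

    from scipy.stats import spearmanr

    visual_rating = [9, 7, 3, 8, 2, 6]   # screening (visual) scores
    fdm_rating    = [4, 8, 9, 5, 7, 6]   # independent detailed checks

    rho, p_value = spearmanr(visual_rating, fdm_rating)
    # A rho near zero (or a large p_value) would mean attractiveness
    # carries little information about FDM realism.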

Vivian




