> Nevertheless, I am not persuaded. Your rating is based on: "Four legs
> good, two legs bad!". While that may be generally true, it will throw up
> many anomalies, and the problem is you neither know which these are,
> nor how many, because you haven't and can't properly test your hypothesis.
First of all, I'm not making (and haven't made) any strong statements about the accuracy of FDMs, because the number of planes for which I have an idea what that rating should be is small, and I think there is a general consensus that judging an FDM adequately is a lot of work. My statements about 'quality' and its correlation with 'beauty' are chiefly based on the modelling of systems, instrumentation and implemented procedures - these I can judge better.

It is simply not true that I can't test my hypothesis with regard to instrumentation: I have flown about 40 aircraft with some regularity since installing Flightgear, I have looked at real cockpit photographs for some of them, I have read their documentation and know what the different buttons do, so I have a fair idea of how detailed their instrumentation modelling is. My hypothesis for fairly detailed planes is tested on that subsample of 10% of the available aircraft. In addition, there are about 40+ aircraft for which the lack of instrumentation and systems is fairly obvious (i.e. I see no gauges in the cockpit...) even without spending a long time in the aircraft. For these I likewise claim knowledge of the quality of the implemented systems. So I do know about 20% of the total number of aircraft in sufficient detail to estimate a correlation.

I think a fair statement is: 'A rating for the detail of instrumentation and systems has an 80% chance of being no more than 2 points different from a rating of visual detail', i.e. there is an 80% chance that the visual rating and the final rating (averaged over visuals and instrumentation detail) do not differ by more than 1 point.

Let's look at a few examples (not brought up by myself):

*** Stuart's rating of the c172p

4/5 rescaled to a 10-point scale is 8/10, where I have 7/10 - check.
*** The KC-135

I'm not sure what your quality rating from 0-10 would be - probably not really zero, so I assume it's 1 or 2; averaged with the visuals that's about 2 or 2.5, where I have rated 3 - yes.

*** Sopwith Camel

> Does it win the ratings war?

Indeed it does - it received 10/10.

*** Lightning

Assuming you'd rate the FDM and systems 10, the average with beauty would be something like 9.7 or 9.5, depending on whether I take the FDM into consideration. My rating is 9.

*** p51d-jsbsim

Hal self-rated 7.5/10; I rated the p51d 6 - that would pretty much fit already, except that the p51d-jsbsim is a bit more detailed than the p51d, so I would rate that 7 (well, sure, I can say this after the fact...). Yes, fits as well.

*** It seems you are bothered more by the fact that the F-14b is rated above the Lightning, but here you are asking too much of the test. The test can pick out both planes as 'has a high probability of having very detailed systems and an above-average FDM', but if you want to know which of them is better in detail, the accuracy is not sufficient. The correlation between 'beauty' and 'quality' is there, but it is not that strong - correlation isn't equality. Which is why I am very much in favour of bringing in additional information (like the developer self-rating Stuart suggested). My point is not that rating based on visual detail is perfect and we should leave it at that - my point is that in practice it works quite a bit better than a mere beauty contest.

> If I recall my stats correctly, your assumption that there is a causal
> relationship between attractiveness of the cockpit and a high realism is
> unproven. In our statistically small sample, I think it will throw up as
> many wrong results as correct ones. Concorde is but one example.

I have not assumed a causal relationship (nor do I need to). I observe a correlation in practice and I utilize it; I don't need to understand it to do so (I have given some speculation about where it comes from, though...).
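For what it's worth, the arithmetic behind the examples above can be sketched in a few lines of Python. The numbers are the illustrative ones from this thread (the exact weighting - a plain average over the rated aspects - is my reading of how the final rating is formed), and the last lines show why the two phrasings of the 80% claim are the same statement:

```python
def final_rating(visual, instruments, fdm=None):
    """Plain average of the visual rating, the instrumentation/systems
    rating, and optionally the FDM rating (assumed weighting)."""
    parts = [visual, instruments] + ([fdm] if fdm is not None else [])
    return sum(parts) / len(parts)

# c172p: Stuart's 4/5 rescaled to a 10-point scale
print(4 / 5 * 10)                         # 8.0, where I have 7/10

# KC-135: assumed quality rating 1 or 2, visuals about 3
print(final_rating(3, 1))                 # 2.0
print(final_rating(3, 2))                 # 2.5, where I have rated 3

# Lightning: beauty 9, systems 10, FDM 10
print(final_rating(9, 10))                # 9.5 without the FDM
print(round(final_rating(9, 10, 10), 1))  # 9.7 with the FDM

# The two phrasings of the 80% claim coincide: if the visual and
# instrumentation ratings differ by at most 2, the visual rating and
# the averaged rating differ by at most 1, because
#   visual - (visual + instruments)/2 = (visual - instruments)/2.
visual, instruments = 7, 5
assert abs(visual - final_rating(visual, instruments)) \
       == abs(visual - instruments) / 2
```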
Assuming that Stuart and Hal did the self-rating 'fairly', and assuming you did not know my numbers when you picked your examples, the system has (within its accuracy) in fact not thrown up as many wrong results as correct ones. From the above, it has managed well with 5/5 examples, 5/6 if you add the Concorde - but that one was cherry-picked by myself as a counterexample (!) and therefore doesn't really count for a statistical test of the hypothesis.

So, under the reasonable assumption that you didn't pick planes randomly but rather picked planes you assumed would be likely counterexamples to my rating, you have to grant me that the system has dealt with them rather well and that your supposed counterexamples have in fact not turned out to be as many wrong results as correct ones.

* Thorsten

_______________________________________________
Flightgear-devel mailing list
Flightgear-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/flightgear-devel