Re: [mySociety:public] Do stuff with the survey data!

Francis Davey Sat, 08 May 2010 11:49:59 -0700

On 8 May 2010 12:40, Timothy Green <[email protected]> wrote:
> I think that's error in the mean, and since the Lib Dems had the most
> responses it's the lowest (error = sigma/sqrt(n), right?). The variance in
> the answers is stddev.
>
> Someone who has done stats more recently than myself can probably correct
> me.
>


I'd bet that I did stats much longer ago, but...

Yes, "(standard) error" is a name for the sample standard deviation
divided by the square root of the sample size. It can be used as an
estimate of the standard deviation of the mean of samples of the same
size. Whether it's at all useful or helpful to give it is another
matter - it's really a religious discussion.
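
As a rough sketch of that definition in Python: the function name and the
response values below are made up for illustration and are not taken from
the survey. It only shows that, for the same spread of answers, more
responses give a smaller standard error.

```python
import math
import statistics


def standard_error(sample):
    """Sample standard deviation divided by the square root of the sample size."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    # statistics.stdev is the sample (n - 1) standard deviation.
    return statistics.stdev(sample) / math.sqrt(n)


# Hypothetical answers on a 1-5 scale, purely illustrative.
small = [1, 2, 3, 4, 5]
large = small * 20  # same spread of answers, 20 times as many responses

print("n =", len(small), "standard error =", standard_error(small))
print("n =", len(large), "standard error =", standard_error(large))
```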

"Religious" in the sense that, while some aspects of statistics are
simply matters of maths or logic, or in some cases philosophy (e.g.
whether it's meaningful to talk about posterior probabilities or not),
descriptive statistics is about data presentation, which is a matter
on which people argue a great deal.

For my money, standard deviation is a *pants* measure. Even for nice
(normally distributed) data, most people will read a measure like:

mean=50 +/- 10 (standard deviation)

as meaning that the range 40-60 covers pretty much all the cases of
interest, whereas all it means is that about 68% of the possible
values are in that range. Even 30-70 only gets you about 95% of the
possibilities (i.e. 1/20 fall outside, and 1/20 is quite a lot of
things sometimes - when I commuted to work, a 1-in-20 journey came
round about once a fortnight, or many times a year).
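
For anyone who wants to check those percentages, here is a small sketch
assuming exactly normal data. The function name is hypothetical; the only
library call is math.erf from the Python standard library.

```python
import math


def normal_coverage(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


mean, sd = 50, 10  # the example figures from the message
for k in (1, 2):
    inside = normal_coverage(k)
    low, high = mean - k * sd, mean + k * sd
    print(f"{low}-{high}: {inside:.1%} inside, {1 - inside:.1%} outside")
```

Two standard deviations actually cover slightly more than 95%, so "1 in
20 outside" is a round figure rather than an exact one.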

The only virtue of s.d. is that it's what everyone uses - so you can
just put it there without much more explanation.

A more useful comment: the MDS looks fine (it's a neat rough-and-ready
way to reduce data to 2D) but the "distance" is interesting. Simply
taking the difference between two % agreements has an odd effect.

Consider:

[1] Party A and party B have 0% and 50% agreement with a particular
policy - they score 50% difference. The probability that a pair of
people, one from each party, would agree with each other on that
topic is 50%.

[2] The figures are now 25% and 75%, and the probability of pairwise
agreement falls to about 38% (0.25 x 0.75 + 0.75 x 0.25 = 0.375), but
the score difference is still 50%.

Well, you see what I mean: it's not a linear measure - just something
worth bearing in mind.
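
One reading that reproduces the two figures above, sketched in Python.
It assumes one person is drawn at random from each party, that each holds
their view independently, and that "agreeing" means both are for the
policy or both are against it. The function name is made up for
illustration.

```python
def pairwise_agreement(p_a, p_b):
    """Probability that one random member of each party holds the same view.

    p_a and p_b are the fractions of each party agreeing with the policy.
    Assumes the two people are drawn independently.
    """
    both_for = p_a * p_b
    both_against = (1 - p_a) * (1 - p_b)
    return both_for + both_against


# The two cases from the message: same score difference, different agreement.
for p_a, p_b in [(0.0, 0.5), (0.25, 0.75)]:
    difference = abs(p_a - p_b)
    print(f"A={p_a:.0%} B={p_b:.0%} "
          f"difference={difference:.0%} "
          f"pairwise agreement={pairwise_agreement(p_a, p_b):.1%}")
```

Under those assumptions the score difference stays at 50% in both cases
while the pairwise agreement does not.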

-- 
Francis Davey

_______________________________________________
Mailing list [email protected]
Archive, settings, or unsubscribe:
https://secure.mysociety.org/admin/lists/mailman/listinfo/developers-public
