On 29 Oct 2003 19:49:52 -0800, [EMAIL PROTECTED] (Scott
Edwards  Tallahassee,FL) wrote:

> Rich,
> Thanks for your response.  See comments/questions interspersed below.
> Scott

I am deleting most of that.  Here are a couple of points.


ru > > 
> > You are asking for n, for the planning of a survey among N, 
> > and your formula is using Finite population correction.
> > You can check with groups.google and see how often 
> > I have told  people that FPC  is *usually*  a bad idea.   

se [ here, and all the later quotes] > 
>   I was unaware of this. I will check your former messages.
> This appeared to be the 'standard' formula for sample size calculation
> when you are interested in a proportion of items that pass/fail, *and
> you have independence* (e.g. political polls), so I'm afraid that many
> of us are making this error. However, I am definitely not tied to this
> formula and am just looking for a method to get the job done as
> accurately as possible.

The Finite Population Correction (FPC) is not used in 
standard political polls -- which only tap a tiny fraction
of the voting population.  It is used on election eve, 
when votes are coming in, and then it is used with great
caution;  or else folks can end up predicting (say) that
Florida has gone to Bush by 100,000 votes.
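To make the contrast concrete, here is a minimal sketch (my own illustration, not from the thread) of the standard sample-size formula for a proportion, with and without the FPC.  The function names and the example numbers are my assumptions; the point is that when N dwarfs n, the correction changes nothing, which is why ordinary polls skip it.

```python
import math

def sample_size(p, margin, z=1.96, N=None):
    """Sample size for estimating a proportion.
    Without N: the standard independent-sample formula.
    With N: the same formula with the finite population correction (FPC)."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    if N is None:
        return math.ceil(n0)
    # The FPC shrinks n only when n0 is a non-trivial fraction of N.
    return math.ceil(n0 / (1 + (n0 - 1) / N))

# A national poll: N is enormous, so the FPC changes essentially nothing.
print(sample_size(0.5, 0.03))                  # 1068
print(sample_size(0.5, 0.03, N=150_000_000))   # 1068
# A small, fixed population: now the FPC matters.
print(sample_size(0.5, 0.03, N=2000))          # 697
```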

I said before, I could not be sure that your application
did not justify using the FPC.  So, I am still interested in
what sort of application it is.

I learned in a previous interchange that the FPC   is
used in animal management, in studies that those officials
do  call 'research.'  Up until then, I had had the tentative
conclusion that nobody doing 'research'  had any business
using the FPC.    But it is still a tool for management, 
rather than *scientific*  inference.

[ ... ]
> Regarding your comment on regulation, this is simply a data analysis
> problem - I'm not clear why the issue of regulation is relevant.  If

Why is regulation relevant?
The FPC  is most likely to fit when there is a vote or 
fixed standard.  These are administrative applications,
where the requirement has a fixed time frame.  Laws 
often have to work that way.  I have not seen anyone
suggest it, for internal business management, except
for applications of 'quality control'.  

The internal 'statistical survey', I should think, would be 
more snoopy, on the one hand, and more informative to 
management, on the other.


> the methodology had been laid out in a regulation then I definitely
> wouldn't be wasting you guys' time asking help in formulating one. 
> The problem from a research design/analysis standpoint doesn't strike
> me as *that* unusual.  I've read many times of how analyses must be
> adjusted due to 'clusters' of data points that are not independent
> (e.g. the effect of temperature on the performance of athletes
> measured multiple times), I just haven't seen how to approach it from a sample
> size determination perspective.  Actually, it occurs to me that

Unfortunately, the theory is dull and tedious.  And it
depends so much on the exact details, that the application
you ask about is probably something that you can get 
only by iteration -- Try a sample, make the estimate, 
and then inflate the next sample N  in order to raise the 
power proportionately.   The first test might assume 
independence;  or if you know that independence won't
exist, boost the N  by an arbitrary fraction.  

The choice is between pilot data or dummy data.
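One common form of that "boost by an arbitrary fraction," sketched here as my own illustration rather than anything from the thread, is the design effect: inflate the independence-based n by deff = 1 + (m - 1)·rho, where m is the cluster size and rho the intraclass correlation (from pilot data, or a guessed dummy value).  The function name and example figures below are assumptions.

```python
import math

def inflated_n(n_independent, cluster_size, icc):
    """Inflate an independence-based sample size by the design effect
    deff = 1 + (m - 1) * icc, a standard adjustment for clustered data."""
    deff = 1 + (cluster_size - 1) * icc
    return math.ceil(n_independent * deff)

# Clusters of 10 with a modest icc of 0.05: deff = 1.45,
# so a nominal n of 400 becomes 580.
print(inflated_n(400, 10, 0.05))   # 580
# Clusters of size 1 are just independent observations: no inflation.
print(inflated_n(400, 1, 0.05))    # 400
```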

[ snip, some]
>                .  However, I posted it here for two additional important
> reasons.
...
> 
>     2.  I was under the impression that this group was for the purpose
> of discussing interesting statistical issues/problems that had
> applicability beyond the specific problem.  Perhaps I'm missing

Perhaps you do have something interesting to measure.
However, so far, *I*  don't sense much of the wider applicability.
The stats-groups will muse, a little, about new problems, but 
mostly we serve to direct folks to answers that are (at worst) 
a bit obscure.

If you have pilot data, you can compute the test-total, and  you 
can extrapolate to larger Ns and larger power.  If you don't 
have any numbers to start with, I don't think you can get very
far.  I don't see much of a theoretical problem, and I don't 
see detail that allows better referrals for the concrete problem.

> something, but I'm unable to see your perspective that this problem
> would only come up in the context of 'regulation' - to the contrary,

 - The general problem of group-dependencies, of course,
exists outside of 'regulation'.  There are some references 
concerning 'effective sample size'  in a post by Jon Volstad,
saved in my stats-FAQ  at 

  http://www.pitt.edu/~wpilib/statfaq/96sampn.html

> it seems to me it would come up in many instances of evaluating
> organizations as a whole, with many individuals, performing multiple
> tasks (e.g. a factory, with many employees, making many widgets each
> and you wanted to estimate *factory-wide* the proportion of defects in
> the widgets that was occurring - this is the *exact* same problem that
> I have)
[ ... ]

From this, it seems like someone wants a small-variance 
estimator of an overall total.  "Stratification of surveys" comes
to mind.   Maybe someone has a formula or a reference,
but you still would need to have an estimate of the 
dependency, measured as an intraclass correlation or 
something similar.
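If pilot data exist, one common way to get that estimate, sketched here as my own illustration under the assumption of equal-sized clusters, is the one-way ANOVA estimator of the intraclass correlation.  The function name and the toy data are assumptions.

```python
def icc_oneway(clusters):
    """One-way ANOVA estimate of the intraclass correlation for
    equal-sized clusters: (MSB - MSW) / (MSB + (m - 1) * MSW)."""
    k = len(clusters)           # number of clusters
    m = len(clusters[0])        # observations per cluster (assumed equal)
    grand = sum(sum(c) for c in clusters) / (k * m)
    means = [sum(c) / m for c in clusters]
    # Mean square between clusters and mean square within clusters.
    msb = m * sum((mu - grand) ** 2 for mu in means) / (k - 1)
    msw = sum((x - mu) ** 2
              for c, mu in zip(clusters, means) for x in c) / (k * (m - 1))
    return (msb - msw) / (msb + (m - 1) * msw)

# Perfectly homogeneous clusters: all variation is between clusters.
print(icc_oneway([[1, 1, 1], [2, 2, 2]]))   # 1.0
```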

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
"Taxes are the price we pay for civilization." 
=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
.                  http://jse.stat.ncsu.edu/                    .
=================================================================
