On Sun, Mar 08, 2020 at 07:08:25PM +1100, Bruce Kellett wrote:
> On Sun, Mar 8, 2020 at 6:14 PM Russell Standish <[email protected]> wrote:
> 
>     On Thu, Mar 05, 2020 at 09:45:38PM +1100, Bruce Kellett wrote:
>     > On Thu, Mar 5, 2020 at 5:26 PM Russell Standish <[email protected]> wrote:
>     >
>     >     But a very large proportion of them (→1 as N→∞) will report being
>     >     within ε (called a confidence interval) of 50% for any given ε>0
>     >     chosen at the outset of the experiment. This is simply the law of
>     >     large numbers theorem. You can't focus on the vanishingly small
>     >     population that lie outside the confidence interval.
>     >
>     >
>     > This is wrong.
> 
>     Them's fighting words. Prove it!
> 
> 
> I have, in other posts and below.

You didn't do it below, which is why I said prove it. What you wrote
below had little bearing on what I wrote.

> 
> 
>     > In the binary situation where both outcomes occur for every
>     > trial, there are 2^N binary sequences for N repetitions of the
>     > experiment. This set of binary sequences exhausts the
>     > possibilities, so the same sequence is obtained for any
>     > two-component initial state -- regardless of the amplitudes.
> 
>     > You appear to assume that the natural probability in this situation is
>     > p = 0.5 and, what is more, your appeal to the law of large numbers
>     > applies only for single-world probabilities, in which there is only
>     > one outcome on each trial.
> 
>     I didn't mention probability once in the above paragraph, not even
>     implicitly. I used the term "proportion". That the proportion will be
>     equal to the probability in a single universe case is a frequentist
>     assumption, and should be uncontroversial, but goes beyond what I
>     stated above.
> 
> 
> Sure. But the proportion of the 2^N sequences that exhibit any particular p
> value (proportion of 1's) decreases with N.
> 

So what?

> 
>     > In order to infer a probability of p = 0.5, your branch data must have
>     > approximately equal numbers of zeros and ones. The number of branches
>     > with equal numbers of zeros and ones is given by the binomial
>     > coefficient. For large even N = 2M trials, this coefficient is
>     > N!/(M!*M!). Using the Stirling approximation to the factorial for
>     > large N, this goes as 2^N/sqrt(N) (within factors of order one).
>     > Since there are 2^N sequences, the proportion with n_0 = n_1
>     > vanishes as 1/sqrt(N) for N large.
> 
>     I wasn't talking about that. I was talking about the proportion of
>     sequences whose ratio of 0 bits to 1 bits lies within ε of 0.5, rather
>     than the proportion of sequences that have exactly equal numbers of
>     0 and 1 bits. That proportion grows as sqrt N.
> 
> 
> 
> No, it falls as 1/sqrt(N). Remember, the confidence interval depends on the
> standard deviation, and that falls as 1/sqrt(N). Consequently deviations from
> equal numbers of zeros and ones for p to be within the CI of 0.5 must decline
> as N becomes large.
>

The value ε defined above is fixed at the outset. It is independent of
N. Maybe I incorrectly called it a confidence interval, although it is
surely related. 

The number of bitstrings having a ratio of 0 to 1 within ε of 0.5
grows as √N.
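
The fixed-ε claim can be checked numerically. The following is a minimal
Python sketch, not a proof: it counts exactly, using binomial coefficients,
the fraction of the 2^N bitstrings whose share of 1 bits lies within ε of
0.5. The function names and the value ε = 0.05 are illustrative assumptions,
not anything specified in this thread. The second function expresses the
fixed half-width ε in units of the standard deviation 0.5/√N, which comes to
2ε√N; that is the quantity that grows as √N, while the proportion itself
approaches 1.

```python
from math import comb, sqrt

def proportion_within(n, eps):
    """Exact fraction of the 2**n bit strings whose share of 1 bits is within eps of 0.5."""
    # |k/n - 0.5| <= eps is rewritten as |2k - n| <= 2*eps*n to keep the left side an integer.
    count = sum(comb(n, k) for k in range(n + 1) if abs(2 * k - n) <= 2 * eps * n)
    return count / 2**n

def eps_in_standard_deviations(n, eps):
    """The fixed half-width eps measured in units of the standard deviation 0.5/sqrt(n)."""
    return 2 * eps * sqrt(n)

if __name__ == "__main__":
    eps = 0.05  # illustrative value, chosen once and held fixed while N varies
    for n in (10, 100, 1000, 10000):
        print(n, proportion_within(n, eps), eps_in_standard_deviations(n, eps))
```

Running it for increasing N with ε held fixed prints a proportion that rises
towards 1.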

IIRC, a confidence interval is the interval containing a fixed proportion, i.e.
we can be 95% confident that strings will have a ratio between 49.5% and 50.5%.
The width of that interval will decrease as 1/√N for a fixed confidence level
(95%).
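
To illustrate how an interval at a fixed confidence level narrows, here is a
short Python sketch using the usual normal approximation. The multiplier
1.96 for a two-sided 95% interval is the standard textbook value; the
function name and the sample sizes are illustrative assumptions.

```python
from math import sqrt

Z_95 = 1.96  # normal-approximation multiplier for a two-sided 95% interval

def half_width_95(n, p=0.5):
    """Approximate half-width of the 95% interval for the share of 1 bits in n bits."""
    # Standard deviation of the observed share is sqrt(p*(1-p)/n), i.e. 0.5/sqrt(n) for p = 0.5.
    return Z_95 * sqrt(p * (1 - p) / n)

if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        hw = half_width_95(n)
        print(f"N={n}: {100 * (0.5 - hw):.3f}% to {100 * (0.5 + hw):.3f}%")
```

Each hundredfold increase in N narrows the interval tenfold, which is the
1/√N behaviour.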

> 
> 
>     > Now sequences with small departures from equal numbers will still give
>     > probabilities within the confidence interval of p = 0.5. But this
>     confidence
>     > interval also shrinks as 1/sqrt(N) as N increases, so these additional
>     > sequences do not contribute a growing number of cases giving p ~ 0.5 as 
> N
>     > increases.
> 
>     The confidence interval ε is fixed.
> 
> 
> No, it is not. The width of, say the 95% CI, decreases with N since the
> standard deviation falls as 1/sqrt(N).

Which only demonstrates my point. An increasing number of strings will
lie in the fixed interval ε. I apologise if I used the term "confidence
interval" in a nonstandard way.
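
For comparison, the 1/sqrt(N) figure quoted earlier in the thread concerns a
different quantity: the proportion of strings with exactly equal numbers of
zeros and ones. A minimal Python sketch, with hypothetical function names,
compares the exact value of that proportion with its leading-order Stirling
estimate sqrt(2/(πN)). It does fall as 1/sqrt(N), and that is compatible
with the proportion inside a fixed interval ε growing.

```python
from math import comb, pi, sqrt

def proportion_exactly_balanced(n):
    """Exact fraction of the 2**n bit strings with equal numbers of 0s and 1s (n even)."""
    if n % 2:
        raise ValueError("n must be even")
    return comb(n, n // 2) / 2**n

def stirling_estimate(n):
    """Leading-order Stirling estimate sqrt(2 / (pi * n)) of the same fraction."""
    return sqrt(2 / (pi * n))

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, proportion_exactly_balanced(n), stirling_estimate(n))
```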


-- 

----------------------------------------------------------------------------
Dr Russell Standish                    Phone 0425 253119 (mobile)
Principal, High Performance Coders     [email protected]
                      http://www.hpcoders.com.au
----------------------------------------------------------------------------
