Re: [R] Offtopic, HT vs. HH in coin flips

2009-09-01 Thread Ted Harding
On 31-Aug-09 19:16:33, Erik Iverson wrote:
 Dear R-help, 
 Could someone please try to explain this paradox to me? What is
 more likely to show up first in a string of coin tosses, Heads
 then Tails, or Heads then Heads?  
 
 ## generate 2500 strings of random coin flips
 ht <- replicate(2500,
                 paste(sample(c("H", "T"), 100, replace = TRUE),
                       collapse = ""))
 
 ## find first occurrence of HT
 mean(regexpr("HT", ht)) + 1   # mean of HT position, 4
 
 ## find first occurrence of HH
 mean(regexpr("HH", ht)) + 1   # mean of HH position, 6
 
 FYI, this is not homework, I have not been in school in years.
 I saw a similar problem posed in a blog post on the Revolutions R
 blog, and although I believe the answer, I'm having a hard time
 figuring out why this should be? 
 
 Thanks,
 Erik Iverson

Be very careful about the statement of the problem!

[1] The probability that HH will occur first (i.e. before HT)
is the same as the probability that HT will occur first (i.e.
before HH).

[2] However, the probability that the first occurrence of HT
begins at a given position (the position of its H) is generally
not the same as the probability that the first occurrence of HH
begins at that same position.

[1]: At the first occurrence of (either HH or HT), there is
an initial string S, ending in an H, followed by either an H
(for HH) or a T (for HT). Both are equally likely.

So the probability that the first occurrence of (either HH or HT)
is an HH is the same as the probability that it is an HT.
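
The symmetry in [1] can be checked exactly for short sequences. The
following is a minimal sketch (not part of the original post) that
enumerates every sequence of n = 10 flips and counts how often HH or HT
comes first. The names n, seqs, first_hh and first_ht are chosen here
purely for illustration.

```r
## Enumerate all 2^n sequences of n flips as strings such as "HTTHH..."
n <- 10
flips <- expand.grid(rep(list(c("H", "T")), n), stringsAsFactors = FALSE)
seqs <- do.call(paste0, flips)

## Start position of the first HH / HT in each string (Inf if absent)
first_hh <- as.numeric(regexpr("HH", seqs, fixed = TRUE))
first_ht <- as.numeric(regexpr("HT", seqs, fixed = TRUE))
first_hh[first_hh < 0] <- Inf
first_ht[first_ht < 0] <- Inf

hh_first <- sum(first_hh < first_ht)   # HH appears before HT
ht_first <- sum(first_ht < first_hh)   # HT appears before HH
neither  <- sum(is.infinite(first_hh) & is.infinite(first_ht))
c(HH = hh_first, HT = ht_first, neither = neither)
```

The two counts agree. The only sequences containing neither pattern are
all tails, and tails followed by a single final H, which is consistent
with the argument that the flip after the first H decides the outcome.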

[2]: (A) the first occurrence of an HH is in a sequence of
any collection of H and T provided there is no HH in the
sequence, and the last is H, followed by H.
However, HT is allowed to occur in the sequence.

But (B) the first occurrence of an HT is in a sequence of
(zero or more T) followed by (1 or more H) followed by T.
This is the only pattern in which HT does not occur prior to
the final HT.
Similarly, HH is allowed to occur in the sequence.

The reason that, in general, the probability of HH first occurring
at a given position is different from the probability of HT first
occurring at that position lies in the difference between the
number of possible sequences satisfying (A) and the number of
possible sequences satisfying (B).

The first few cases (HH or HT first occurring at (k+1), so
that the position of the first H in HH or HT is at k) are,
with their probabilities:

         first HH     first HT
k=1:     HH           HT
         1/4          1/4

k=2:     THH          HHT
                      THT
         1/8          2/8

k=3:     TTHH         HHHT
         HTHH         THHT
                      TTHT
         2/16         3/16

k=4:     TTTHH        HHHHT
         THTHH        THHHT
         HTTHH        TTHHT
                      TTTHT
         3/32         4/32

The HT case is simple:
  P.HT[k] = Prob(1st HT at (k+1)) = k/(2^(k+1))
Exercise for the reader: Sum(P.HT) = 1
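
For readers who would rather check the exercise numerically, here is a
small sketch. It truncates the infinite sum at k = 200 (an arbitrary
cutoff chosen here; the remaining tail is negligible) and also computes
the mean throw on which the first HT is completed. The names total_ht
and mean_ht are illustrative.

```r
## P.HT[k] = k / 2^(k+1): probability that the first HT is completed on throw k+1
k <- 1:200                          # truncation point is an assumption; the tail is tiny
P.HT <- k / 2^(k + 1)

total_ht <- sum(P.HT)               # total probability, numerically 1
mean_ht  <- sum((k + 1) * P.HT)     # mean throw on which the first HT is completed
c(total = total_ht, mean = mean_ht)
```

The mean comes out at 4, in line with the simulated mean for HT in the
original question.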

The HH case is more interesting. Experimental scribblings on
paper threw up an hypothesis, which I decided to explore in R.
Thanks to Gerrit Eichner for suggesting the use of expand.grid()!

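  ## Function to count sequences giving 1st HH on throw k+1
  ## (coded with 1 = H, 0 = T; requires k >= 2)
  countHH <- function(k){
    ## all 2^k sequences of the first k throws, one per row
    M <- as.matrix(expand.grid(rep(list(0:1),k)))
    ix <- (M[,k]==1) ## k must be an H (then k+1 will be H)
    ## exclude any sequence with an HH among the first k throws
    for(i in (1:(k-1))){ ix <- ix & ( !((M[,i]==1) & (M[,i+1]==1)) ) }
    sum(ix)
    ## list(Count=sum(ix),Which=M[ix,])
  }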

Now, ignoring the case k=1:

  HHcounts <- NULL
  for(i in (2:12)){ HHcounts <- c(HHcounts,countHH(i)) }
  rbind((3:13),HHcounts)

  #          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
  #             3    4    5    6    7    8    9   10   11    12    13
  # HHcounts    1    2    3    5    8   13   21   34   55    89   144

Lo and Behold, we have a Fibonacci sequence! Another exercise for
the reader ...
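
One way to follow up on that exercise is sketched below. It assumes
that the Fibonacci pattern in the counts continues for all k (the
counts above only cover throws 3 to 13). Under that assumption, the
number of sequences giving the first HH on throw k+1 is the k-th
Fibonacci number F(k), with F(1) = F(2) = 1, so that
P.HH[k] = F(k)/2^(k+1). The sum is truncated at K = 200, an arbitrary
cutoff, and the names total_hh and mean_hh are illustrative.

```r
## Fibonacci numbers F(1), ..., F(K) with F(1) = F(2) = 1
K <- 200                            # truncation point is an assumption
fib <- numeric(K)
fib[1:2] <- 1
for (i in 3:K) fib[i] <- fib[i - 1] + fib[i - 2]

## Conjectured P.HH[k] = F(k) / 2^(k+1)
k <- 1:K
P.HH <- fib / 2^(k + 1)

total_hh <- sum(P.HH)               # total probability, numerically 1
mean_hh  <- sum((k + 1) * P.HH)     # mean throw on which the first HH is completed
c(total = total_hh, mean = mean_hh)
```

The mean comes out at 6, in line with the simulated mean for HH in the
original question.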

Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 01-Sep-09   Time: 10:38:58
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Offtopic, HT vs. HH in coin flips

2009-08-31 Thread Greg Snow
Well,

If the first flip is H, then the HT pattern occurs with the first flip in the 
second run (after however long the 1st run of heads is).  If the first flip is 
T, then the second run will be H's and the HT pattern will be the first flip of 
the 3rd run.  So the HT pattern will occur after 1 or 2 runs (at the beginning 
of the 2nd or 3rd). 

On the other hand, the HH pattern can occur after any number of runs (given 
that the H runs are only 1 long).

This means that the HH pattern has a higher probability of being in the right 
tail of the distribution which will increase the mean.  The probability of HH 
or HT as the 1st pair is the same.

Just looking at the first 3 flips, the probability of HH occurring first at 
flips 2 and 3 has only 1 chance (THH, HHH means that the first HH was at 1 and 
2) and therefore has probability 1/8.

The probability of HT first occurring at flips 2 and 3 has 2 options, THT or 
HHT, and therefore is twice as likely.
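
The count for the first three flips can be reproduced by brute force.
A minimal sketch, with variable names chosen here for illustration:

```r
## All 8 sequences of three flips
flips <- expand.grid(rep(list(c("H", "T")), 3), stringsAsFactors = FALSE)
seqs <- do.call(paste0, flips)

## Sequences whose FIRST HH (or HT) starts at flip 2, i.e. occupies flips 2 and 3
hh_at_2 <- seqs[regexpr("HH", seqs, fixed = TRUE) == 2]
ht_at_2 <- seqs[regexpr("HT", seqs, fixed = TRUE) == 2]

hh_at_2                       # only "THH" (HHH has its first HH at flip 1)
ht_at_2                       # contains "HHT" and "THT"
length(hh_at_2) / 8           # 1/8
length(ht_at_2) / 8           # 2/8
```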

Does this help?



-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111




Re: [R] Offtopic, HT vs. HH in coin flips

2009-08-31 Thread David Winsemius

Case starting with H: Pr= 0.5
H first H second
Subcase  1a: Pr= 0.5 * 0.5 = 0.25

H first T second... leads to TH eventually
Subcase 1b: Pr = 0.5 * 0.5 = 0.25

===
Case T first: Pr = 0.5
all subcases lead to TH first


--  
David.



David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] Offtopic, HT vs. HH in coin flips

2009-08-31 Thread Erik Iverson
Part of my issue was that I was not answering my original question.  What is 
more likely to show up first, HT or HH? The answer to that turns out to be 
neither, or identical chances. 

ht <- replicate(2500,
                paste(sample(c("H", "T"), 100, replace = TRUE),
                      collapse = ""))

hts <- regexpr("HT", ht) + 1
hhs <- regexpr("HH", ht) + 1

## which is first?
table(hts < hhs)  # about 50/50 

summary(hts)  #mean of 4
summary(hhs)  #mean of 6

So, "What is more likely to show up first, HH or HT?" is of course a different 
question than "Are the expected values of the positions for the first HT or HH 
the same?"  I suppose that's where confusion set in.  It seems that if HH 
appears later in the string on average (i.e., after 6 tosses instead of 4), 
the probability of it being first should be lower than that of HT, but that 
intuition is wrong!

A quick graphic that helps show this (you must run the above code first):

library(lattice)

## one label per position: all the HT positions first, then all the HH positions
ht.df <- data.frame(count = c(hts, hhs),
                    type = gl(2, length(hts), labels = c("HT", "HH")))

barchart(prop.table(xtabs(~ count + type, data = ht.df)),
 stack = FALSE, horizontal = FALSE,
 box.ratio = .8, auto.key = TRUE)

Thanks to all those who replied, and also someone sent me the following link 
off list, it also clears up the confusion:

http://www.mit.edu/~emin/writings/coinGame.html

Best, 
Erik 



Re: [R] Offtopic, HT vs. HH in coin flips

2009-08-31 Thread William Dunlap
It gets even more interesting when you ask about which
of 2 triples of head/tail sequences appears first in an 
infinite sequence of heads and tails.  Martin Gardner
wrote about this in the early 1970's
 Martin Gardner, "Mathematical Games: The Paradox of the Nontransitive
Dice and the Elusive Principle of Indifference." Scientific American
223, 110-114, Dec. 1970
(and perhaps again in 1974).  His book, "The Colossal
Book of Mathematics: Classic Puzzles, Paradoxes, and
Problems", has that stuff reprinted and updated.
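
As an illustration of the triples version (a sketch added here, not
taken from Gardner's column), the following simulates which of two
triples appears first. The helper first_wins() and the pair THH versus
HHT are choices made for this example. For this pair the answer can be
reasoned out directly: HHT can only come first if the sequence opens
with HH, so THH should win about three times out of four.

```r
## Estimate the probability that pattern `a` appears before pattern `b`
## in a random sequence of fair coin flips (hypothetical helper).
## Both patterns must have the same length for the comparison to be fair.
first_wins <- function(a, b, n_sims = 10000, n_flips = 100) {
  s <- replicate(n_sims,
                 paste(sample(c("H", "T"), n_flips, replace = TRUE),
                       collapse = ""))
  pos_a <- as.numeric(regexpr(a, s, fixed = TRUE))
  pos_b <- as.numeric(regexpr(b, s, fixed = TRUE))
  pos_a[pos_a < 0] <- Inf    # pattern never appeared
  pos_b[pos_b < 0] <- Inf
  mean(pos_a < pos_b)
}

set.seed(1)
p_thh <- first_wins("THH", "HHT")
p_thh                        # estimated probability that THH comes first
```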


Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com  



Re: [R] Offtopic, HT vs. HH in coin flips

2009-08-31 Thread David M Smith
For those interested, the original puzzle was detailed in a TED presentation
by Peter Donnelly, which you can see at blog.revolution-computing.com here:
http://bit.ly/2l0ZwS
The original problem was: which coin-toss sequence do you expect to see
first, HTH or HTT? Peter's explanation comes at about 5:00 in the video, and
for a lay audience it's great. (The champagne metaphor is wonderful.)
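
A simulation along the lines of the code earlier in this thread,
adapted here to the triples HTH and HTT (the sample sizes, seed and
variable names are arbitrary choices for this sketch). The standard
theoretical waiting times are 10 tosses for HTH and 8 for HTT, so the
simulated means should land near those values.

```r
set.seed(42)
s <- replicate(20000,
               paste(sample(c("H", "T"), 200, replace = TRUE),
                     collapse = ""))

## Toss on which each pattern is first completed (start position + 2)
end_hth <- regexpr("HTH", s, fixed = TRUE) + 2
end_htt <- regexpr("HTT", s, fixed = TRUE) + 2

## Drop the (very unlikely) strings where a pattern never appears
mean_hth <- mean(end_hth[end_hth > 2])
mean_htt <- mean(end_htt[end_htt > 2])
c(HTH = mean_hth, HTT = mean_htt)
```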

I'm loving the R simulations of this and related problems in this thread!

Cheers,
# David





-- 
David M Smith da...@revolution-computing.com
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (San Francisco, USA)
