Re: [R] Offtopic, HT vs. HH in coin flips
On 31-Aug-09 19:16:33, Erik Iverson wrote:
> Dear R-help,
>
> Could someone please try to explain this paradox to me? What is
> more likely to show up first in a string of coin tosses, Heads then
> Tails, or Heads then Heads?
>
> ## generate 2500 strings of random coin flips
> ht <- replicate(2500,
>                 paste(sample(c("H", "T"), 100, replace = TRUE),
>                       collapse = ""))
>
> ## find first occurrence of HT
> mean(regexpr("HT", ht)) + 1   # mean of HT position, 4
>
> ## find first occurrence of HH
> mean(regexpr("HH", ht)) + 1   # mean of HH position, 6
>
> FYI, this is not homework, I have not been in school in years. I saw
> a similar problem posed in a blog post on the Revolutions R blog, and
> although I believe the answer, I'm having a hard time figuring out
> why this should be?
>
> Thanks,
> Erik Iverson

Be very careful about the statement of the problem!

[1] The probability that HH will occur first (i.e. before HT) is the same as the probability that HT will occur first (i.e. before HH).

[2] However, the probability that the first occurrence of HT has its H at a given position is generally not the same as the probability that the first occurrence of HH has its first H at that same position.

[1]: At the first occurrence of (either HH or HT), there is an initial string S, ending in an H, followed by either an H (for HH) or a T (for HT). Both are equally likely. So the probability that the first occurrence of (either HH or HT) is an HH is the same as the probability that it is an HT.

[2]: (A) The first occurrence of an HH comes at the end of a sequence that may be any collection of H and T, provided that it contains no HH and that its last element is H; this is then followed by H. HT, however, is allowed to occur in the sequence.

But (B) the first occurrence of an HT comes at the end of a sequence of (zero or more T) followed by (one or more H), which is then followed by T. This is the only pattern in which HT does not occur prior to the final HT. Similarly, HH is allowed to occur in the sequence.
The reason that, in general, the probability of HH first occurring at a given position is different from the probability of HT first occurring at that position lies in the difference between the number of possible sequences satisfying (A) and the number of possible sequences satisfying (B).

The first few cases (HH or HT first occurring at (k+1), so that the position of the first H in HH or HT is at k) are, with their probabilities:

         HH                      HT
  k=1:   HH                      HT
         1/4                     1/4

  k=2:   THH                     HHT THT
         1/8                     2/8

  k=3:   TTHH HTHH               HHHT THHT TTHT
         2/16                    3/16

  k=4:   TTTHH THTHH HTTHH       HHHHT THHHT TTHHT TTTHT
         3/32                    4/32

The HT case is simple:

  P.HT[k] = Prob(1st HT at (k+1)) = k/(2^(k+1))

Exercise for the reader: Sum(P.HT) = 1

The HH case is more interesting. Experimental scribblings on paper threw up a hypothesis, which I decided to explore in R. Thanks to Gerrit Eichner for suggesting the use of expand.grid()!

  ## Function to count sequences giving 1st HH on throw k+1
  countHH <- function(k){
    ## all 2^k sequences of k throws, coded 1 = H, 0 = T
    M <- as.matrix(expand.grid(rep(list(0:1),k)))
    ix <- (M[,k]==1)  ## k must be an H (then k+1 will be H)
    ## exclude any sequence with an HH within the first k throws
    for(i in (1:(k-1))){
      ix <- ix & ( !((M[,i]==1)&(M[,i+1]==1)) )
    }
    sum(ix)
    ## list(Count=sum(ix),Which=M[ix,])
  }

Now, ignoring the case k=1:

  HHcounts <- NULL
  for(i in (2:12)){
    HHcounts <- c(HHcounts,countHH(i))
  }
  rbind((3:13),HHcounts)
  #          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
  #             3    4    5    6    7    8    9   10   11    12    13
  # HHcounts    1    2    3    5    8   13   21   34   55    89   144

Lo and Behold, we have a Fibonacci sequence! Another exercise for the reader ...

Ted.

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 01-Sep-09 Time: 10:38:58
-- XFMail --

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
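Both exercises in Ted's message can be checked numerically. The following is only a sketch, not part of the original post. It assumes that the Fibonacci pattern seen for k = 2, ..., 12 continues for every k (with count 1 for k = 1), and it truncates the infinite sums at kmax = 200, an arbitrary cut-off chosen here.

```r
kmax <- 200                  # arbitrary truncation of the infinite sums (assumption)
k <- 1:kmax

## HT: exactly k sequences give the first HT on throw k+1
p.ht <- k / 2^(k + 1)

## HH: assume the counts are the Fibonacci numbers 1, 1, 2, 3, 5, ...
fib <- numeric(kmax)
fib[1:2] <- 1
for (i in 3:kmax) fib[i] <- fib[i - 1] + fib[i - 2]
p.hh <- fib / 2^(k + 1)

sum.ht  <- sum(p.ht)               # total probability, HT
sum.hh  <- sum(p.hh)               # total probability, HH
mean.ht <- sum((k + 1) * p.ht)     # mean throw on which HT is completed
mean.hh <- sum((k + 1) * p.hh)     # mean throw on which HH is completed
print(c(sum.ht = sum.ht, sum.hh = sum.hh, mean.ht = mean.ht, mean.hh = mean.hh))
```

If the Fibonacci assumption holds, both sums come out at 1 and the two means at 4 and 6, which agrees with the simulated means quoted in the original question.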
Re: [R] Offtopic, HT vs. HH in coin flips
Well,

If the first flip is H, then the HT pattern is completed by the first flip of the second run (however long the 1st run of heads is). If the first flip is T, then the second run will be H's, and the HT pattern is completed by the first flip of the 3rd run. So the HT pattern will occur after 1 or 2 runs (at the beginning of the 2nd or 3rd).

On the other hand, the HH pattern can occur after any number of runs (as long as the H runs so far are each only 1 long). This means that the HH pattern has a higher probability of being in the right tail of the distribution, which increases the mean.

The probability of HH or HT as the 1st pair is the same.

Just looking at the first 3 flips, HH first occurring at flips 2 and 3 can happen in only 1 way (THH; HHH means that the first HH was at flips 1 and 2) and therefore has probability 1/8. HT first occurring at flips 2 and 3 can happen in 2 ways, THT or HHT, and is therefore twice as likely.

Does this help?

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111
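The counting in Greg's last paragraph can be extended by brute force. The sketch below is an illustration added here, not code from the thread. The helper name first.pos and the choice n = 10 are made up for the example. It lists all 2^n equally likely sequences of n flips and tabulates where each pattern is first completed.

```r
## Flip number on which `pattern` (two characters) is first completed,
## for each of the 2^n equally likely sequences of n flips.
## first.pos is a hypothetical helper name used only in this example.
first.pos <- function(pattern, n) {
  flips <- expand.grid(rep(list(c("H", "T")), n), stringsAsFactors = FALSE)
  s <- apply(flips, 1, paste, collapse = "")
  pos <- as.integer(regexpr(pattern, s, fixed = TRUE)) + 1L
  pos[pos == 0L] <- NA     # regexpr() returns -1 when the pattern is absent
  pos
}

n <- 10
tab.hh <- table(first.pos("HH", n)) / 2^n
tab.ht <- table(first.pos("HT", n)) / 2^n
print(round(rbind(HH = tab.hh, HT = tab.ht), 4))
```

The column for flip 3 reproduces the 1/8 versus 2/8 comparison above, and the HH row decays more slowly than the HT row, which is the heavier right tail Greg describes.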
Re: [R] Offtopic, HT vs. HH in coin flips
Case starting with H: Pr = 0.5
   Subcase 1a: H first, H second
               Pr = 0.5 * 0.5 = 0.25
   Subcase 1b: H first, T second... leads to TH eventually
               Pr = 0.5 * 0.5 = 0.25
===
Case T first: Pr = 0.5
   all subcases lead to TH first

--
David.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
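As written, the case analysis above compares TH with HH rather than HT with HH. Adding up its branches gives 0.25 for HH first and 0.75 for TH first. The sketch below is an added illustration in the style of the simulations in this thread, with an arbitrary seed and replicate count; it estimates that split.

```r
set.seed(1)      # arbitrary seed, used only for reproducibility
n.rep <- 20000   # arbitrary number of simulated strings
flips <- replicate(n.rep,
                   paste(sample(c("H", "T"), 100, replace = TRUE),
                         collapse = ""))

th <- as.integer(regexpr("TH", flips, fixed = TRUE))
hh <- as.integer(regexpr("HH", flips, fixed = TRUE))

## keep only strings containing both patterns (virtually all of them)
ok <- th > 0 & hh > 0
p.th.first <- mean(th[ok] < hh[ok])   # share of strings with TH before HH
print(p.th.first)
```

The estimate should land close to 0.75, in contrast to the 50/50 split for HT against HH.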
Re: [R] Offtopic, HT vs. HH in coin flips
Part of my issue was that I was not answering my original question, "What is more likely to show up first, HT or HH?" The answer to that turns out to be neither: the chances are identical.

  ht <- replicate(2500,
                  paste(sample(c("H", "T"), 100, replace = TRUE),
                        collapse = ""))
  hts <- regexpr("HT", ht) + 1
  hhs <- regexpr("HH", ht) + 1

  ## which is first?
  table(hts < hhs)  # about 50/50
  summary(hts)      # mean of 4
  summary(hhs)      # mean of 6

So, "What is more likely to show up first, HH or HT?" is of course a different question than "Are the expected values of the positions for the first HT or HH the same?" I suppose that's where confusion set in. It seems that if HH appears later in the string on average (i.e., after 6 tosses instead of 4), then the probability of it being first should be lower than that of HT, which is obviously wrong!

A quick graphic that helps show this (you must run the above code first):

  library(lattice)
  ## one label per simulated position: all the HT values, then all the HH values
  ht.df <- data.frame(count = c(hts, hhs),
                      type = gl(2, length(hts), labels = c("HT", "HH")))
  barchart(prop.table(xtabs(~ count + type, data = ht.df)),
           stack = FALSE, horizontal = FALSE, box.ratio = .8,
           auto.key = TRUE)

Thanks to all those who replied. Someone also sent me the following link off list, which clears up the confusion as well:

http://www.mit.edu/~emin/writings/coinGame.html

Best,
Erik
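The simulated means of 4 and 6 can also be obtained exactly by first-step analysis. The sketch below is an addition, not code from the thread. It models each pattern as a small Markov chain with two non-absorbing states, "no progress" and "last flip was H", and solves (I - Q) e = 1 for the expected number of flips. The function and matrix names are chosen here for illustration.

```r
## Expected number of flips until absorption, from each non-absorbing state.
## Q holds the transition probabilities among the non-absorbing states only.
expected.flips <- function(Q) {
  solve(diag(nrow(Q)) - Q, rep(1, nrow(Q)))
}

## States: 1 = no progress, 2 = last flip was H.
## HT: from state 1, T stays put and H moves to state 2;
##     from state 2, H stays put and T completes the pattern.
Q.ht <- matrix(c(0.5, 0.5,
                 0.0, 0.5), nrow = 2, byrow = TRUE)
## HH: from state 1, T stays put and H moves to state 2;
##     from state 2, T falls back to state 1 and H completes the pattern.
Q.hh <- matrix(c(0.5, 0.5,
                 0.5, 0.0), nrow = 2, byrow = TRUE)

e.ht <- expected.flips(Q.ht)[1]   # start with no progress
e.hh <- expected.flips(Q.hh)[1]
print(c(HT = e.ht, HH = e.hh))
```

The only difference between the two matrices is what a "wrong" flip does in state 2. For HT, an extra H keeps the progress. For HH, a T throws it away. That is why HH takes longer on average even though neither pattern is more likely to come first.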
Re: [R] Offtopic, HT vs. HH in coin flips
It gets even more interesting when you ask which of 2 triples of head/tail sequences appears first in an infinite sequence of heads and tails. Martin Gardner wrote about this in the early 1970's:

  Martin Gardner, "Mathematical Games: The Paradox of the Nontransitive Dice and the Elusive Principle of Indifference." Scientific American 223, 110-114, Dec. 1970

(and perhaps again in 1974). His book, "The Colossal Book of Mathematics: classic puzzles, paradoxes, and problems", has that stuff reprinted and updated.

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
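The simulation approach used earlier in the thread carries over to triples. The sketch below is an added illustration. The pair THH versus HHT, the helper name first.of, the seed, and the sample sizes are all choices made here and do not come from the thread or from Gardner's column.

```r
set.seed(42)   # arbitrary seed, used only for reproducibility

## Estimated probability that pattern `a` appears before pattern `b`.
## Both patterns must have the same length, so that starting earlier
## is the same as finishing earlier. first.of is a hypothetical helper.
first.of <- function(a, b, n.rep = 10000, n.flips = 200) {
  flips <- replicate(n.rep,
                     paste(sample(c("H", "T"), n.flips, replace = TRUE),
                           collapse = ""))
  pa <- as.integer(regexpr(a, flips, fixed = TRUE))
  pb <- as.integer(regexpr(b, flips, fixed = TRUE))
  ok <- pa > 0 & pb > 0        # drop strings missing either pattern
  mean(pa[ok] < pb[ok])
}

p.thh <- first.of("THH", "HHT")
print(p.thh)
```

For this particular pair the estimate should be near 3/4, because HHT comes first only when the first two flips are both H. In every other case the first HH is preceded by a T, so THH is completed first.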
Re: [R] Offtopic, HT vs. HH in coin flips
For those interested, the original puzzle was detailed in a TED presentation by Peter Donnelly, which you can see at blog.revolution-computing.com here:

http://bit.ly/2l0ZwS

The original problem was: which coin-toss sequence do you expect to see first, HTH or HTT? Peter's explanation comes at about 5:00 in the video, and for a lay audience it's great. (The champagne metaphor is wonderful.)

I'm loving the R simulations of this and related problems in this thread!

Cheers,
# David

--
David M Smith da...@revolution-computing.com
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (San Francisco, USA)
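The expected waiting times for HTH and HTT, and for the pairs discussed earlier, can be computed from how much each pattern overlaps with itself. The sketch below relies on a standard result for a fair coin that is not derived anywhere in this thread: the expected number of flips until a pattern is first completed is the sum of 2^k over every k for which the first k characters of the pattern equal its last k characters. The function name expected.wait is made up for this example.

```r
## Expected number of fair-coin flips until `pattern` is first completed,
## using the self-overlap rule (a standard result, assumed here, not
## taken from the thread).
expected.wait <- function(pattern) {
  n <- nchar(pattern)
  k <- 1:n
  ## TRUE where the first k characters equal the last k characters
  overlap <- substring(pattern, 1, k) == substring(pattern, n - k + 1, n)
  sum(2^k[overlap])
}

w <- sapply(c("HT", "HH", "HTH", "HTT"), expected.wait)
print(w)
```

This gives 4 and 6 flips for HT and HH, matching the simulations above, and 10 and 8 flips for HTH and HTT. HTH overlaps with itself (its first and last characters are both H), so a failed attempt at HTH wastes more progress than a failed attempt at HTT.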