Part of my issue was that I was not answering my original question: "What is more likely to show up first, HT or HH?" The answer to that turns out to be "neither": the two have identical chances.
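One way to check the "identical chances" claim exactly, rather than by simulation, is to enumerate every possible sequence of a short, fixed length and record which pattern occurs first. This is only a sketch: the length n = 10 is an arbitrary choice, and the variable names (grid, seqs, pos_ht, pos_hh, first) are illustrative and not part of the original post.

```r
## Enumerate every sequence of n coin tosses (n = 10 is an arbitrary choice)
n <- 10
grid <- expand.grid(rep(list(c("H", "T")), n), stringsAsFactors = FALSE)
seqs <- do.call(paste0, grid)   # 2^n strings such as "HTTHHTHTTH"

## Start position of the first match; regexpr() returns -1 if there is none
pos_ht <- as.numeric(regexpr("HT", seqs, fixed = TRUE))
pos_hh <- as.numeric(regexpr("HH", seqs, fixed = TRUE))

## Treat a pattern that never occurs as occurring "at infinity"
pos_ht[pos_ht < 0] <- Inf
pos_hh[pos_hh < 0] <- Inf

## Classify each sequence by which pattern comes first
first <- ifelse(pos_ht < pos_hh, "HT first",
         ifelse(pos_hh < pos_ht, "HH first", "neither"))
table(first)
```

The two "first" counts come out equal, because once the first H has appeared the very next toss decides the race: T gives HT, H gives HH. The only sequences in the "neither" row are those with no H before the final toss.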
A quick simulation bears this out:

ht <- replicate(2500,
                paste(sample(c("H", "T"), 100, replace = TRUE),
                      collapse = ""))
hts <- regexpr("HT", ht) + 1  # toss on which the first HT is completed
hhs <- regexpr("HH", ht) + 1  # toss on which the first HH is completed

## which is first?
table(hts < hhs)  # about 50/50
summary(hts)      # mean of 4
summary(hhs)      # mean of 6

So, "What is more likely to show up first, HH or HT?" is of course a different question than "Are the expected values of the positions for the first HT or HH the same?" I suppose that's where the confusion set in. It seems that if HH appears later in the string on average (i.e., after 6 tosses instead of 4), then the probability of it being first should be lower than that of HT, which is obviously wrong!

A quick graphic that helps show this (you must run the above code first):

library(lattice)
## one label per position: all of hts first, then all of hhs
ht.df <- data.frame(count = c(hts, hhs),
                    type = gl(2, length(hts), labels = c("HT", "HH")))
barchart(prop.table(xtabs(~ count + type, data = ht.df)),
         stack = FALSE, horizontal = FALSE,
         box.ratio = .8, auto.key = TRUE)

Thanks to all those who replied. Someone also sent me the following link off list, which clears up the confusion as well:

http://www.mit.edu/~emin/writings/coinGame.html

Best,
Erik

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Erik Iverson
Sent: Monday, August 31, 2009 2:17 PM
To: r-help@r-project.org
Subject: [R] Offtopic, HT vs. HH in coin flips

Dear R-help,

Could someone please try to explain this paradox to me? What is more likely to show up first in a string of coin tosses, "Heads then Tails", or "Heads then Heads"?

## generate 2500 strings of random coin flips
ht <- replicate(2500,
                paste(sample(c("H", "T"), 100, replace = TRUE),
                      collapse = ""))

## find first occurrence of HT
mean(regexpr("HT", ht))+1  # mean of HT position, 4

## find first occurrence of HH
mean(regexpr("HH", ht))+1  # mean of HH position, 6

FYI, this is not homework, I have not been in school in years.
I saw a similar problem posed in a blog post on the Revolutions R blog, and although I believe the answer, I'm having a hard time figuring out why this should be.

Thanks,
Erik Iverson

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.