# Re: Another tedious hypothetical

rmiller wrote:

At 05:22 PM 6/8/2005, Jesse Mazer wrote:
rmiller wrote:

At 02:45 PM 6/7/2005, Jesse Mazer wrote:
(snip)

Of course in this example Feynman did not anticipate in advance what licence plate he'd see, but the kind of "hindsight bias" you are engaging in can be shown with another example. Suppose you pick 100 random words out of a dictionary, and then notice that the list contains the words "sun", "also", and "rises"...as it so happens, that particular 3-word "gestalt" is also part of the title of a famous book, "the sun also rises" by Hemingway. Is this evidence that Hemingway was able to anticipate the results of your word-selection through ESP? Would it be fair to test for ESP by calculating the probability that someone would title a book with the exact 3-word gestalt "sun, also, rises"? No, because this would be tailoring the choice of gestalt to Hemingway's book in order to make it seem more unlikely, in fact there are 970,200 possible 3-word gestalts you could pick out of a list of 100 possible words, so the probability that a book published earlier would contain *any* of these gestalts is a lot higher than the probability it would contain the precise gestalt "sun, also, rises". Selecting a precise target gestalt on the basis of the fact that you already know there's a book/story containing that gestalt is an example of hindsight bias--in the Heinlein example, you wouldn't have chosen the precise gestalt of Szilard/lens/beryllium/uranium/bomb from a long list of words associated with the Manhattan Project if you didn't already know about Heinlein's story.

RM wrote:
In two words: Conclusions first.
Can you really offer no scientific procedure to evaluate Heinlein's story? At the cookie jar level, can you at least grudgingly admit that the word "Szilard" sure looks like "Silard"? Sounds like it too. Or is that a coincidence as well? What are the odds? Should be calculable--how many stories written in 1939 include the names of Los Alamos scientists in conjunction with the words "bomb", "uranium"...

And that, in my view, is the heart of the problem. Rather than swallow hard and look at this in a non-biased fashion, you seem to be glued to the proposition that (1) it's intractable or (2) it's not worth analyzing because the answer is obvious.

I think you misunderstood what I was arguing in my previous posts. If you look them over again, you'll see that I wasn't making a broad statement about the impossibility of estimating the probability that this event would have happened by chance, I was making a specific criticism of *your* method of doing so, where you estimate the probability of the particular "gestalt" of Szilard/lens/beryllium/uranium/bomb, rather than trying to estimate the probability that a story would anticipate *any* possible gestalt associated with the Manhattan Project. By doing this, you are incorporating hindsight knowledge of Heinlein's story into your choice of the "target" whose probability you want to estimate, and in general this will always lead to estimates of the significance of a "hit" which are much too high. If you instead asked someone with no knowledge of Heinlein's story to come up with a list of as many words associated with the Manhattan Project as he could think of, then estimated the probability that a story would anticipate *any* combination of words on the list, then your method would not be vulnerable to this criticism (it might be flawed for other reasons, but I didn't address any of these other reasons in my previous posts).

Good starting premise. But words have meaning, and while "the sun also rises" may be interpreted to presage the bomb, it in fact is about bullfighting. No nukes there.

My example had nothing to do with nukes, it was just about the fact that Hemingway's book title "anticipated" three of the words on my random list of 100 words.

Heinlein's story is clearly about energy being derived from uranium--*and* has the name "Silard." These cannot be compared with random number associations, simply because these words involve more information. To use a crude example, in the science community the name "Szilard" conjures up one prime association.

This is a complete non sequitur--the fact that the words have meaning has nothing to do with calculating the probability that someone like Heinlein would guess them by chance (similarly, in my example it wouldn't really make a difference if the 100 words were part of a meaningful poem rather than being selected at random). The point of the analogy is just that there are lots of other words associated with the Manhattan Project ('Oppenheimer', 'mushroom', 'fat man', etc.), words which of course all have meaning too, and that calculating the probability of the *particular* words "Szilard/lens/uranium/etc." appearing in a story is not legitimate because that choice of target is completely based on your hindsight knowledge of Heinlein's story. You should instead calculate the probability that a story would contain *any* combination of meaningful words associated with the Manhattan Project. This is exactly analogous to the fact that in my example, you should have been calculating the probability that *any* combination of words from the list of 100 would appear in a book title, not the probability that the particular word combination "sun", "also", and "rises" would appear.

Look over the analogy I made in my last post again:

Suppose you pick 100 random words out of a
dictionary, and then notice that the list contains the words "sun", "also",
and "rises"...as it so happens, that particular 3-word "gestalt" is also
part of the title of a famous book, "the sun also rises" by Hemingway. Is
this evidence that Hemingway was able to anticipate the results of your
word-selection through ESP? Would it be fair to test for ESP by calculating
the probability that someone would title a book with the exact 3-word
gestalt "sun, also, rises"? No, because this would be tailoring the choice
of gestalt to Hemingway's book in order to make it seem more unlikely, in
fact there are 970,200 possible 3-word gestalts you could pick out of a list
of 100 possible words, so the probability that a book published earlier
would contain *any* of these gestalts is a lot higher than the probability
it would contain the precise gestalt "sun, also, rises".
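The count in that analogy is easy to check. Here is a minimal sketch in Python, assuming a "gestalt" means three distinct words taken from the 100-word list; the variable names are purely illustrative. The figure of 970,200 corresponds to counting ordered triples (100 x 99 x 98). If the order of the words is ignored, the count drops to 161,700, which is still a very large number of possible targets.

```python
import math

LIST_SIZE = 100    # words picked from the dictionary
GESTALT_SIZE = 3   # words per "gestalt"

# Ordered triples of distinct words: 100 * 99 * 98
ordered = math.perm(LIST_SIZE, GESTALT_SIZE)

# Unordered triples of distinct words: C(100, 3)
unordered = math.comb(LIST_SIZE, GESTALT_SIZE)

print(ordered)    # 970200
print(unordered)  # 161700
```

Either way of counting supports the same point: a match with *some* gestalt on the list is far more likely than a match with one gestalt picked out after the fact.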

To repeat, Heinlein's story is about uranium energy, the possibility of the factory blowing up, etc. The context is fairly clear. Hemingway's story is about Spain, bullfighting and affairs of the heart. No nukes there.

I thought it was pretty clear that my analogy was about general issues relating to calculations of probabilities, it wasn't meant to have anything to do with nukes specifically.

To simplify things even further, let's say you simply make a list of ten random numbers from 1 to 100, and before you make the list I make the prediction "the list will contain the numbers 23 and 89". If it turns out that those two numbers are indeed on your list, what is the significance of this result as evidence for precognition on my part? Your method would be like ignoring the other 8 numbers on the list and just finding the probability that I would hit the precise target of "23, 89" by chance, which (assuming order doesn't matter) would be only about a 1 in 5025 shot, if my math is right. But the probability that both the numbers I guess will be *somewhere* on the list of ten is significantly higher--I get that the probability of this would be about 1 in 121. So if this experiment is done in many alternate universes, then if in fact I have no precognitive abilities, in about 1 in 121 universes, both numbers I guess will happen to be on your list by luck. But then if you used the method of tailoring the choice of target to my guess, in each such universe you will conclude that I only had a 1 in 5025 chance of making that guess by chance. Clearly, then, you get bad conclusions if you use hindsight knowledge to tailor the choice of target to what you know was actually guessed in this way. But it's also clear that this example is sufficiently well-defined that I would have no general objection to estimating the probability that my "hit" could have occurred by chance, it's just that the correct answer is 1 in 121, not 1 in 5025.
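As a rough check on those figures, here is a minimal sketch in Python. The exact values depend on modelling assumptions the example above leaves open, so both are labeled in the code: whether the list of ten may contain repeated numbers, and that the "precise target" is treated as one unordered pair of distinct numbers. All names are illustrative.

```python
from fractions import Fraction
from math import comb

N = 100        # numbers run from 1 to N
LIST_LEN = 10  # how many numbers are on the list
GUESSES = 2    # how many numbers were predicted (23 and 89)

# Tailored target: the chance that a guess of two distinct numbers
# matches one specific unordered pair chosen after the fact.
p_exact_pair = Fraction(1, comb(N, GUESSES))

# Assumption A: the list holds 10 distinct numbers (no repeats).
# Count the lists that contain both guesses, out of all possible lists.
p_both_distinct = Fraction(
    comb(N - GUESSES, LIST_LEN - GUESSES), comb(N, LIST_LEN)
)

# Assumption B: each of the 10 numbers is drawn independently, so
# repeats are allowed. Inclusion-exclusion over "23 missing" and
# "89 missing" gives the chance that neither is missing.
p_miss_one = Fraction(N - 1, N) ** LIST_LEN
p_miss_both = Fraction(N - 2, N) ** LIST_LEN
p_both_repeats = 1 - 2 * p_miss_one + p_miss_both

print(f"exact pair:             1 in {float(1 / p_exact_pair):.0f}")
print(f"both on list (A):       1 in {float(1 / p_both_distinct):.0f}")
print(f"both on list (B): about 1 in {float(1 / p_both_repeats):.0f}")
```

Under these assumptions the tailored target comes out at 1 in 4950, and the chance that both guesses land somewhere on the list is 1 in 110 without repeats or roughly 1 in 120 with repeats. These are close to the figures quoted above, and the gap between the two kinds of estimate (a factor of around 40) is what matters for the argument.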

Sorry. In the raw sense, numbers merely represent values---unless you want to get into that weird set of coincidences about 1/139--i.e. Enrico Fermi's hospital room, etc. (And I sincerely hope you *don't*.)

Another non sequitur. When you talk about the probability of someone guessing something in advance by pure luck (i.e. under the null hypothesis of no ESP), it doesn't make a difference whether the thing he is supposed to be guessing is meaningful words, meaningless words, numbers, playing cards, Presidents, etc. (unless the nature of the thing is such that even without ESP, he can narrow down the options somehow by using information available to him--but there was no information available to Heinlein at the time that would allow him to reasonably anticipate that a name like "Szilard" was any more likely to be associated with a nuclear bomb than any other name).

Again, my concern is that scientists are too willing to prejudge something before diving into it.

OK, but this is a tangent that has nothing to do with the issue I raised in my posts about the wrongness of selecting the target (whose probability of guessing you want to calculate) using hindsight knowledge of what was actually guessed. If you don't want to discuss this specific issue then say so--I am not really interested in discussing the larger issue of what the "correct" way to calculate the probability of the Heinlein coincidences would be, I only wanted to talk about this specific way in which *your* method is obviously wrong. Like I said before, any method that could be invented by someone who didn't know in advance about Heinlein's story would avoid this particular mistake, although it might suffer from other flaws.

Jesse