On Dec 12, 2007 4:27 PM, Álvaro Begué <[EMAIL PROTECTED]> wrote:

> > Clearly I'm missing something, because I still don't understand. Let's
> > take a simple example of a move that is on the 3rd line and has a gamma
> > value of 1.75. What is the equation or sequence of discrete values that
> > I can take the derivative of?
>
> We start with a database of games, and we are trying to find a set of
> gamma values. For a given set of gamma values, we can compute the
> probability of all the moves happening exactly as they happened in the
> database. So if the first move is E4 and we had E4 as having a probability
> of 0.005, we start with that, then we take the next move, and multiply
> 0.005 by the probability of the second move, etc. By the end of the
> database, we'll have some number like 3.523E-9308 which is the probability
> of all of the moves in the database happening. This is the probability of
> the database if it had been generated by a random process following the
> probability distributions modeled by the set of gamma values. You can see
> this as a function of the gamma values. This function is usually called
> the "likelihood function". In order to pick the best gammas, we choose the
> ones with the maximum likelihood. Sometimes we use the logarithm of the
> likelihood instead, which has the interpretation of being "minus the amount
> of information in the database", plus it's not a number with a gazillion 0s
> after the decimal point.
>
> Now, around the point where the maximum likelihood happens, you can try to
> move one of the gammas and see how much it hurts the likelihood. For some
> features it will hurt a lot, which means that the value has to be very close
> to the one you computed, or you'll get a bad model, and for some features it
> will hurt very little, which means that there are other settings of the
> value that are sort of equivalent. The second derivative of the likelihood
> (or the log of the likelihood, I don't think it should matter much), will
> tell you how narrow a peak you are at.
>
> Does that make some sense?
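
To make the quoted description concrete, here is a minimal sketch in Python. It assumes a toy database in which the probability of each played move under the current gammas is already known; the list `move_probs` and all values in it except the 0.005 from the text are made up for illustration. It shows why the log is the practical choice: the plain product underflows to zero in double precision, while the sum of logs stays representable.

```python
import math

# Hypothetical per-move probabilities under some fixed set of gammas.
# 0.005 is the example value from the text; the others are invented.
move_probs = [0.005, 0.02, 0.01, 0.03] * 1000  # 4000 moves in total

# Likelihood: product of the probabilities of every move in the database.
likelihood = 1.0
for p in move_probs:
    likelihood *= p

# Log-likelihood: the same quantity on a log scale, computed as a sum.
log_likelihood = sum(math.log(p) for p in move_probs)

print(likelihood)      # too small for a double, so it collapses to zero
print(log_likelihood)  # large and negative, but perfectly usable
```

Maximizing the log-likelihood picks the same gammas as maximizing the likelihood, since the logarithm is monotonic.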
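It makes perfect sense. Thanks. If I do the math right, using the notation of
Remi's paper,

  d^2 log(L) / (dgamma_i)^2
      = sum_j [ C_ij^2 / (C_ij*gamma_i + D_ij)^2
              - A_ij^2 / (A_ij*gamma_i + B_ij)^2 ]
      = -1 / sigma_i^2

where sigma_i is the standard deviation of gamma_i. The minus sign is there
because the second derivative is negative at a maximum. Note that this sigma
is in units of gamma_i; getting a sigma for the ELO of gamma_i still needs
the change of variable from gamma to ELO.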
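
As a sanity check, here is a minimal sketch in Python that evaluates the second derivative above on toy data and compares it with a finite-difference estimate. The data are hypothetical: one feature that wins 7 and loses 4 competitions against an opponent of fixed strength 1, so the maximum is at gamma = 7/4 = 1.75. The function names are made up for illustration. Two assumptions are worth flagging: sigma is taken from the curvature of log(L) along gamma_i alone, ignoring correlations with the other gammas, and the ELO conversion assumes ELO = 400*log10(gamma).

```python
import math

def log_l(gamma, terms):
    """log(L) as a function of one gamma, with
    L = prod_j (A_j*gamma + B_j) / (C_j*gamma + D_j)."""
    return sum(math.log(A * gamma + B) - math.log(C * gamma + D)
               for A, B, C, D in terms)

def d2_log_l(gamma, terms):
    """Analytic second derivative of log(L) with respect to gamma."""
    return sum(C * C / (C * gamma + D) ** 2 - A * A / (A * gamma + B) ** 2
               for A, B, C, D in terms)

# Toy data (hypothetical): each tuple is (A, B, C, D) for one competition.
wins = [(1.0, 0.0, 1.0, 1.0)] * 7     # probability gamma / (gamma + 1)
losses = [(0.0, 1.0, 1.0, 1.0)] * 4   # probability 1 / (gamma + 1)
terms = wins + losses
gamma_hat = 7 / 4                     # maximum-likelihood gamma for this data

# Central finite difference as an independent check of the formula.
h = 1e-4
numeric = (log_l(gamma_hat + h, terms) - 2 * log_l(gamma_hat, terms)
           + log_l(gamma_hat - h, terms)) / h ** 2
analytic = d2_log_l(gamma_hat, terms)

# The curvature is negative at the maximum, so sigma^2 = -1 / curvature.
sigma_gamma = 1 / math.sqrt(-analytic)
# Change of variable, assuming ELO = 400*log10(gamma).
sigma_elo = sigma_gamma * 400 / (gamma_hat * math.log(10))

print(analytic, numeric)
print(sigma_gamma, sigma_elo)
```

With so few games the peak is wide, so the resulting sigma is large; adding more competitions to `terms` narrows it.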
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/
