Don Dailey: <[EMAIL PROTECTED]>:
>Hideki Kato wrote:
>> Don Dailey: <[EMAIL PROTECTED]>:
>>> [EMAIL PROTECTED] wrote:
>>>>> I experimented with something similar a while ago, using the
>>>>> publicly available mogo and manipulating komi between moves.
>>>>>
>>>>> If its win probability fell below a certain threshold (and the move
>>>>> number wasn't too high), I told it to play on the assumption that it
>>>>> would receive a few points more komi (and similarly when the win
>>>>> probability became high).
>>>>
>>>> That's certainly a nice solution!
>>>>
>>>> It's probably easy to implement, and relatively easy to tune.
>>>>
>>>> Now, the question is: did it seem stronger to our eyes because it played
>>>> more human-like?
>>>
>>> I experimented with this idea extensively a while back and never found
>>> an implementation that improved its playing ability. In fact every
>>> version played weaker. I tried versions that dynamically adjusted
>>> komi during the course of a game when things looked close to hopeless
>>> (or certain) and I came to the eventual conclusion that whenever you
>>> didn't use the correct komi, you were in fact decreasing its winning
>>> chances. If you tell it that it is not winning when it really IS
>>> winning by 1/2 point, then there will be games where it plays stupid
>>> desperate moves when it doesn't have to.
>>>
>>> If it is almost losing and you tell it (by adjusting komi) that it has a
>>> good chance, it will tend to lose with a higher score, but you have
>>> essentially treated it as a spoiled child, lowering your expectations
>>> for it and it is happy to play with the new goal of losing.
>>>
>>> This whole concept is based on what appears to be a flawed idea,
>>> deceive the program into believing something that isn't true in the
>>> hopes that it will somehow make it play better.
>>
>> I don't agree. If the estimated score by playouts had no error, you
>> would be correct.
>
>It's not about the error - it's about the side-effects of trying to
>second guess the cure. My grandmother used to think castor oil was
>the solution to every illness.

I don't think it is a universal solution either. You wrote that the idea
is flawed, but in fact the idea is effective for, at least, my program.

>It's naive to think some simplistic deception imposed upon the program
>is going to correct the error when you don't even know if the program is
>erring in the first place. How can you say, "the program thinks it
>is losing, but it MUST be in error and I'm going to change the komi so
>that it will 'play better'?" You must have some basis for knowing the
>program is in error before you can lower its goal.

I never said so. Perhaps my explanation made you misunderstand my point.
Please read my example below once more.

>You have basically 2 cases when losing. One case is that the program
>really is busted and is in a dead lost position. The other case is
>that the program THINKS it's lost but really is winning (or at least has
>excellent chances.) In the first case, we are simply playing for a
>miracle, hoping that if we lower our expectations the opponent will
>falter. This is at best a speculative and extremely minor
>enhancement if we can even make it work and I argue that it also imposes
>risks of its own. Unless the score is actually zero in every line,
>the program is going to play for any outside chance of a win, no matter
>how small. Telling it that a big loss is good enough will cause it to
>play safer, not more aggressively like you think. (Yes, you can
>hand pick specific positions where you can "trick" it into doing the
>right thing, but that is "cooking the books", it's not a real solution.)
>
>The other case is that the program really isn't losing even though it
>thinks it is dead busted. How does the program know this is really
>the case? If it knew that, your program would be better. So you
>don't even know why it thinks it's busted.
>Even so, I argue that telling it to accept less than what is required
>to win is risky and will lose more games than it will win (which isn't
>very many since it is almost certainly losing anyway.)
>
>So the bottom line is that you are trying to make it play better in
>positions where it is probably dead lost. That is a noble cause worth
>a tiny gain in strength if you are fine tuning your program and if you
>can even make it work, but I would rather focus on other things.
>
>I honestly believe this whole concept is a big waste of time if you are
>trying to do it to make your program stronger. It's slightly more
>worthwhile as a cosmetic improvement if that's what you want and you are
>willing to give up a little bit of strength.

It's a few days' hack, done immediately before the largest computer go
tournament in Japan. So, I'd like to say thanks, and don't worry.

>> I've applied my formula (in my previous post) when the estimated
>> winning probability is above 65% or below 45% or so, and got better
>> results. For example, without this, my program sometimes lost games
>> against weaker programs due to misunderstanding of L&D at corners
>> with nakade.
>>
>I'm impressed if you actually got a measurable improvement doing this.
>I have no doubt that in hand picked individual positions it will play
>better. When I test things like this I am not happy with anything
>less than thousands of games unless the improvement is large. How
>many ELO points of improvement do you think that you achieved and how
>many games did you test with?
>
>I would like to point out that even if your confidence is 95% in some
>improvement, you can be easily deceived. If I'm trying to make
>something work and I try enough different implementations sooner or
>later one of them will test as an improvement with high confidence (even
>though it isn't.) I have played 10 games in a row where one version
>wins the first 10 games and yet it turns out that it is the weaker
>version.

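The sample-size concern above can be made concrete with a rough
calculation. What follows is only a minimal sketch, assuming the standard
logistic Elo model and a normal approximation to the binomial; the
function names and the 19-wins-in-30-games figures are hypothetical and
are not taken from either poster's tests.

```python
import math


def elo_diff(win_rate: float) -> float:
    """Elo difference implied by a win rate under the logistic Elo model."""
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / win_rate - 1.0)


def elo_interval(wins: int, games: int, z: float = 1.96):
    """Return (low, estimate, high) Elo differences for a match result.

    Uses a normal approximation to the binomial; z = 1.96 corresponds to
    roughly 95% confidence. Draws are ignored (an assumption).
    """
    if not 0 < wins < games:
        raise ValueError("need 0 < wins < games for this approximation")
    p = wins / games
    se = math.sqrt(p * (1.0 - p) / games)
    # Clamp so the interval ends stay inside (0, 1).
    low = max(p - z * se, 1e-9)
    high = min(p + z * se, 1.0 - 1e-9)
    return elo_diff(low), elo_diff(p), elo_diff(high)


if __name__ == "__main__":
    # Hypothetical result: 19 wins out of 30 games.
    low, est, high = elo_interval(19, 30)
    print(f"estimate {est:.0f} Elo, interval [{low:.0f}, {high:.0f}]")
```

With a few tens of games the interval is wide enough to include zero or
negative differences even when the point estimate is close to 100 Elo,
which is why thousands of games are needed to confirm small gains.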
Here is a quote from my previous post that answers your question. (I
didn't want to duplicate it.)

>I've implemented and used the idea for the first UEC Cup Computer-Go
>tournament (19x19, Japanese rules) on Dec. 1-2 and placed 5th of 27
>participants.
>http://jsb.cs.uec.ac.jp/~igo/eng/ # No results on the English pages.
>
>It's easy to implement but, in my experiments, hard to tune.
>The point is how to calculate an appropriate bias or komi from
>current info, i.e., expected winning rate, number of moves, etc. I've
>tested some means and used the following very ad-hoc formula for the
>tournament.
>
>delta_komi = 10^(K * (number_of_empty_points / 400 - 1)),
>where K is 1 if winning and is 2 if losing. Also, if the expected
>winning rate is around 50%, komi is unmodified.
>
>In a few tens of benchmark games against GNU Go before the tournament,
>it improved about 100 ELO, but the number of games is not enough
>to believe this.

Clearly, as I myself wrote there, the number of samples is not enough.

>It's quite difficult to measure a small improvement. I have serious
>doubts you achieved a large improvement, but I will wait for your
>report and keep an open mind.

As I wrote at first, I don't think this idea gives a general
improvement. I just reported that it works well in my case, which is
described at the end of my previous mail. But perhaps you didn't read it
because I accidentally inserted my signature there. My fault. I have
decreased the quotation level by 2 in the following.

> We had this conversation recently but with the idea of changing the
> scoring algorithm, not the komi. I think the same principles apply.
> I actually think some variation of the idea would work if we understood
> the issues better. In SOME positions it might very well play
> objectively better if we lie to it by lowering or raising the komi or
> manipulating the scoring algorithm, but we have to pick and choose the
> types of positions where this doesn't lead to weaker play.
> For instance it's good if it makes it fight better, but in many
> positions lowering the komi to make it fight in a losing position will
> cause it to consolidate a losing position. UCT isn't going to fight
> when it can consolidate. It doesn't care how difficult a position is
> to play for you, only for it.

It depends on the properties of the programs' playouts, I guess. That
is, if the error of the estimated score decreases monotonically, this
idea won't help at all. BTW, please don't forget that it helps MC
programs play in a more human-like style.

-Hideki
--
[EMAIL PROTECTED] (Kato)
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
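The ad-hoc formula quoted in this message, delta_komi = 10^(K *
(number_of_empty_points / 400 - 1)), together with the 45% / 65%
thresholds, could be sketched roughly as below. This is not the posted
program's code: the function names, the sign convention (komi counted as
points credited to the program), and applying the adjustment once rather
than accumulating it over moves are all assumptions, since the post does
not specify them.

```python
def delta_komi(empty_points: int, winning: bool) -> float:
    """Magnitude of the komi adjustment from the quoted formula.

    K is 1 when the program thinks it is winning and 2 when losing.
    """
    k = 1 if winning else 2
    return 10.0 ** (k * (empty_points / 400.0 - 1.0))


def adjusted_komi(base_komi: float, win_rate: float, empty_points: int,
                  low: float = 0.45, high: float = 0.65) -> float:
    """Komi (in the program's favour) to assume for the next search.

    Thresholds default to the 45% / 65% values mentioned in the thread.
    Sign convention is an assumption: when losing, the program pretends
    to receive more komi; when winning, less.
    """
    if low <= win_rate <= high:
        return base_komi  # near 50%: komi is left unmodified
    if win_rate > high:
        return base_komi - delta_komi(empty_points, winning=True)
    return base_komi + delta_komi(empty_points, winning=False)


if __name__ == "__main__":
    for rate in (0.30, 0.50, 0.80):
        print(rate, adjusted_komi(6.5, rate, empty_points=200))
```

Note that on a 19x19 board the number of empty points never exceeds 361,
so the formula as written always yields a magnitude below one point.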