Re: [Computer-go] Learning related stuff
On 11/29/2017 6:15 PM, Dave Dyer wrote:
> My question is this: people have been messing around with neural nets
> and machine learning for 40 years; what was the breakthrough that made
> AlphaGo succeed so spectacularly?

Maybe it was residual networks: https://en.wikipedia.org/wiki/Vanishing_gradient_problem#Residual_networks. They are pretty new, I think. Or some combination of things.

thanks

--
Honesty is a very expensive gift. So, don't expect it from cheap people - Warren Buffett
http://tayek.com/

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
Re: [Computer-go] Learning related stuff
> My question is this; people have been messing around with neural nets
> and machine learning for 40 years; what was the breakthrough that made
> AlphaGo succeed so spectacularly?

5 or 6 orders of magnitude more CPU power (relative to the late 90s) (*). This means you can try out ideas to see if they work, and get the answer back in hours rather than years. After 10 hrs it was playing at an Elo somewhere between 0 and 1000 (Figure 3 in the AlphaGo Zero paper), i.e. idiot level. That is something like 1,100 years of effort on 1995 hardware.

They put together a large team (by hobbyist computer go standards) of top people, at least two of whom had made strong go programs before.

I'd name two other things: dropout (and other regularization techniques) allowed deeper networks, and the work on image recognition gave you production-ready CNNs, without having to work through all the dead ends yourself. Also better optimization techniques. Taken together, maybe the algorithmic advances are worth another order of magnitude.

Darren

*: The source is the intro to my own book ;-) From memory, I made the estimate as the average of top supercomputers 20 years apart, and a typical high-end PC 20 years apart. https://en.wikipedia.org/wiki/History_of_supercomputing#Historical_TOP500_table

--
Darren Cook, Software Researcher/Developer
My New Book: Practical Machine Learning with H2O:
http://shop.oreilly.com/product/0636920053170.do
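[Editor's note: the arithmetic behind the "1,100 years" figure checks out on the back of an envelope. The exact speedup factor below is an assumption for illustration (6 orders of magnitude, per Darren's estimate), not a published number.]

```python
# Sanity check of the "~1,100 years on 1995 hardware" estimate:
# 10 hours of modern compute, scaled by roughly 6 orders of magnitude.
hours_modern = 10
speedup = 10 ** 6                          # assumed ~6 orders of magnitude since ~1995
years_1995 = hours_modern * speedup / (24 * 365)
print(round(years_1995))                   # -> 1142, i.e. roughly 1,100 years
```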
Re: [Computer-go] Learning related stuff
My question is this: people have been messing around with neural nets and machine learning for 40 years; what was the breakthrough that made AlphaGo succeed so spectacularly?
Re: [Computer-go] Learning related stuff
It's nearly comic to imagine a player at 1-1 trying to figure things out. It's not a diss on you; I honestly want people to relax, take a minute, and treat badmouthing the AlphaGo team's ideas as a secondary consideration. They did good work. Arguing about the essentials probably won't prove that they're stupid in any way. So let's learn, move forward, and have no bad words about their ridiculously well-funded effort. Recreating their work at a smaller scale would be awesome.

s.

On Nov 29, 2017 4:33 PM, "Eric Boesch" wrote:
> Could you be reading too much into my comment? AlphaGo Zero is an amazing
> achievement, and I might guess its programmers will succeed in applying
> their methods to other fields. Nonetheless, I thought it was interesting,
> and it would appear the programmers did too, that before improving to
> superhuman level, AlphaGo was temporarily stuck in a rut of playing
> literally the worst first move on the board (excluding pass). That doesn't
> mean I think I could do better.
Re: [Computer-go] Learning related stuff
Could you be reading too much into my comment? AlphaGo Zero is an amazing achievement, and I might guess its programmers will succeed in applying their methods to other fields. Nonetheless, I thought it was interesting, and it would appear the programmers did too, that before improving to superhuman level, AlphaGo was temporarily stuck in a rut of playing literally the worst first move on the board (excluding pass). That doesn't mean I think I could do better.

On Tue, Nov 28, 2017 at 4:50 AM, uurtamo . wrote:
> This is starting to feel like asking along the lines of, "how can I
> explain this to myself or improve on what's already been done in a way that
> will make this whole process work faster on my hardware".
>
> It really doesn't look like there are a bunch of obvious shortcuts. That's
> the whole point of decision-trees imposed by humans for 20+ years on the
> game; it wasn't really better.
>
> Probably what would be good to convince oneself of these things would be
> to challenge each assumption in divergent branches (suggested earlier) and
> watch the resulting players' strength over time. Yes, this might take a
> year or more on your hardware.
>
> I feel like maybe a lot of this is sour grapes; let's please again
> acknowledge that the hobbyists aren't there yet without trying to tear down
> the accomplishments of others.
>
> s.
Re: [Computer-go] Learning related stuff
This is starting to feel like asking along the lines of, "how can I explain this to myself, or improve on what's already been done, in a way that will make this whole process work faster on my hardware?"

It really doesn't look like there are a bunch of obvious shortcuts. That's the whole point of the decision trees imposed by humans for 20+ years on the game; it wasn't really better.

Probably what would be good, to convince oneself of these things, would be to challenge each assumption in divergent branches (suggested earlier) and watch the resulting players' strength over time. Yes, this might take a year or more on your hardware.

I feel like maybe a lot of this is sour grapes; let's please again acknowledge that the hobbyists aren't there yet, without trying to tear down the accomplishments of others.

s.

On Nov 27, 2017 7:36 PM, "Eric Boesch" wrote:
> I imagine implementation determines whether transferred knowledge is
> helpful. It's like asking whether forgetting is a problem -- it often is,
> but evidently not for AlphaGo Zero.
Re: [Computer-go] Learning related stuff
I imagine implementation determines whether transferred knowledge is helpful. It's like asking whether forgetting is a problem -- it often is, but evidently not for AlphaGo Zero.

One crude way to encourage stability is to include an explicit or implicit age parameter that forces the program to perform smaller modifications to its state during later stages. If the parameters you copy from problem A to problem B also include that age parameter, so that the network acts old even though it is faced with a new problem, then its initial exploration may be inefficient. For an MCTS-based example: if an MCTS node is initialized to a 10877-6771 win/loss record based on evaluations under slightly different game rules, then with a naive implementation, even if the program discovers the right refutation under the new rules right away, it would still need to revisit that node thousands of times to convince itself the node is now probably a losing position.

But unlearning bad plans in a reasonable time frame is already a feature you need from a good learning algorithm. Even AlphaGo almost fell into trap states; from their paper, it appears that it stuck with 1-1 as an opening move for much longer than you would expect from a program probably already much better than 40 kyu. Even if it's unrealistic for Go specifically, you could imagine some other game where, after days of analysis, the program suddenly discovers a reliable trick that adds one point for white to every single game. The effect would be the same as your komi change -- a mature network now needs to adapt to a general shift in the final score. So the task of adapting to handle similar games may be similar to the task of adapting to analysis reversals within a single game, and improvements to one could lead to improvements to the other.
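[Editor's note: the naive-MCTS inertia described here can be made concrete with a toy loop. This is a sketch of the arithmetic only, not of any real MCTS implementation: starting from the inherited 10877-6771 record, count how many consecutive losing playouts the node needs before its win rate drops to 50%.]

```python
# Toy illustration of naive-MCTS inertia: a node carried over with a
# 10877-6771 win/loss record keeps looking like a win long after every
# new playout under the changed rules refutes it.
wins, visits = 10877, 10877 + 6771    # record inherited from the old rules
extra_visits = 0
while wins / visits > 0.5:            # node still rates as a winning position
    visits += 1                       # each new playout under the new rules
    extra_visits += 1                 # is a loss, yet the average moves slowly
print(extra_visits)                   # -> 4106 losing visits to cross 50%
```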
On Fri, Nov 24, 2017 at 7:54 AM, Stephan K wrote:
> 2017-11-21 23:27 UTC+01:00, "Ingo Althöfer" <3-hirn-ver...@gmx.de>:
> > My understanding is that the AlphaGo hardware is standing
> > somewhere in London, idle and waiting for new action...
>
> The announcement at
> https://deepmind.com/blog/applying-machine-learning-mammography/ seems
> to disagree.
Re: [Computer-go] Learning related stuff
2017-11-21 23:27 UTC+01:00, "Ingo Althöfer" <3-hirn-ver...@gmx.de>:
> My understanding is that the AlphaGo hardware is standing
> somewhere in London, idle and waiting for new action...
>
> Ingo.

The announcement at https://deepmind.com/blog/applying-machine-learning-mammography/ seems to disagree:

"Our partners in this project wanted researchers at both DeepMind and Google involved in this research so that the project could take advantage of the AI expertise in both teams, as well as Google’s supercomputing infrastructure - widely regarded as one of the best in the world, and the same global infrastructure that powered DeepMind’s victory over the world champion at the ancient game of Go."
Re: [Computer-go] Learning related stuff
On 21/11/2017 at 23:27, "Ingo Althöfer" wrote:
> But again: for instance, when an eight-year-old child starts
> to play violin, is it helpful or not when it had played,
> say, a trumpet before?

I believe the human brain is too different from the AlphaGo neural network for knowledge about the one to be transferred to the other.

> My understanding is that the AlphaGo hardware is standing
> somewhere in London, idle and waiting for new action...

Definitely not idle: "[They] needed the computers for something else."

source: https://techcrunch.com/2017/11/02/deepmind-has-yet-to-find-out-how-smart-its-alphago-zero-ai-could-be/

> Ingo.
Re: [Computer-go] Learning related stuff
In my experience, people who are first taught variant (a) and after a short while move on to (b) remain overly fixated on capturing and are much slower to grasp the real game. So in this case I would argue that people really do have trouble unlearning when the games are too close ... particularly when the first variant has such a simple and expected goal, which must be deprecated to be able to move from (a) to (b).

Cheers,
David G Doshay
ddos...@mac.com

> On 22, Nov 2017, at 6:23 AM, Ingo Althöfer <3-hirn-ver...@gmx.de> wrote:
>
> In teaching go, one possible path (even with 2 steps) is
> to start with
> (a) Atari-Go on 9x9 board
> then switch to
> (b) "true" Go on 9x9
> then switch to
> (c) Go on 19x19
>
> What are optimal lengths for phases (a) and (b) in doing so?
Re: [Computer-go] Learning related stuff
Hello Stephan,

> Another option for your experiment might be to take the 72-hour-old
> network, but only retain the first layers, and initialize randomly the
> last layers.

Yes, or many others. Not all of them have to be fantastic, but when you/we get some experience and have a new try every 3 or 4 days (by simply editing a few hundred bytes of code), at the end of the year or decade some pearls will be in the harvest.

Ingo.

PS. My wife will find a way to have the power bills paid ;-) At least this is my expectation.
Re: [Computer-go] Learning related stuff
2017-11-22 15:17 UTC+01:00, "Ingo Althöfer" <3-hirn-ver...@gmx.de>:
> For instance, with respect to the 72-hour run of AlphaGo Zero
> one might start several runs for Go (with komi=5.5),
> the first one starting from fresh, the second one from the
> 72-hour process after 1 hour, the next one after 2 hours ...
>
> Ingo

Another option for your experiment might be to take the 72-hour-old network, but only retain the first layers, and initialize the last layers randomly.

Stephan
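[Editor's note: as a rough sketch of what "retain the first layers, reinitialize the last" could look like. The layer shapes and plain-numpy representation are made up for illustration; a real policy/value network is far more complex.]

```python
import numpy as np

# Hypothetical transfer scheme: carry over the trained early layers,
# re-draw the final layer from a fresh small random initialization.
rng = np.random.default_rng(0)

def fresh_layer(n_in, n_out):
    # the usual small random weights used when training from scratch
    return rng.normal(0.0, 0.01, size=(n_in, n_out))

# stand-ins for weights that came out of the 72-hour run
trained = [fresh_layer(361, 256), fresh_layer(256, 256), fresh_layer(256, 362)]

# transfer: keep everything except the last layer
transferred = trained[:-1] + [fresh_layer(256, 362)]

print(transferred[0] is trained[0])                  # True: early layers kept
print(np.array_equal(transferred[-1], trained[-1]))  # False: head is fresh
```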
Re: [Computer-go] Learning related stuff
Hi Petri,

"Petri Pitkanen" wrote:
>> But again: For instance, when an eight-year-old child starts
>> to play violin, is it helpful or not when it had played,
>> say, a trumpet before?
>
> It would be, and this is well known in practice. The logic
> around the music is the same, so he would learn faster.
> In the very long run there might be unwanted effects,
> i.e. it can be hard to unlearn something too similar...

The question is: which intermediate point is optimal to switch from instrument/game 1 to game 2? Having in mind a complicated game 2, it might be helpful first to teach a simpler game 1 (for some limited time) and only then switch to game 2.

In teaching go, one possible path (even with 2 steps) is to start with
(a) Atari-Go on a 9x9 board,
then switch to
(b) "true" Go on 9x9,
then switch to
(c) Go on 19x19.

What are the optimal lengths for phases (a) and (b) in doing so?

Ingo.
Re: [Computer-go] Learning related stuff
Hi Alvaro,

From: "Álvaro Begué"
> The term you are looking for is "transfer learning":
> https://en.wikipedia.org/wiki/Transfer_learning

Thanks for that interesting hint. However, it is not exactly what I am looking at. My question is more about observing and understanding "transfer learning phenomena", let them be positive or negative.

For instance, with respect to the 72-hour run of AlphaGo Zero, one might start several runs for Go (with komi=5.5), the first one starting from fresh, the second one from the 72-hour process after 1 hour, the next one after 2 hours ...

Ingo.
Re: [Computer-go] Learning related stuff
> But again: For instance, when an eight-year-old child starts
> to play violin, is it helpful or not when it had played,
> say, a trumpet before?

It would be, and this is well known in practice. The logic around the music is the same, so he would learn faster. In the very long run there might be unwanted effects, i.e. it can be hard to unlearn something too similar. But in the case of trumpet and violin, no. But let's say 10 years of training violin, played the way a bluegrass player plays, and then switching to classical. That would be hard, due to the required unlearning, for which humans don't really have a mechanism: the new skill needs to be learned better than the old skill, or time needs to erase the untrained old skill. The DCNN, meanwhile, can learn and unlearn quite easily.

2017-11-22 0:48 GMT+02:00 Álvaro Begué :
> The term you are looking for is "transfer learning":
> https://en.wikipedia.org/wiki/Transfer_learning
Re: [Computer-go] Learning related stuff
The term you are looking for is "transfer learning": https://en.wikipedia.org/wiki/Transfer_learning

On Tue, Nov 21, 2017 at 5:27 PM, "Ingo Althöfer" <3-hirn-ver...@gmx.de> wrote:
> My intention with the question is a different one:
> I am thinking about how humans learn. Is it beneficial
> to have learnt related - but different - stuff before?
> The answer will depend on the case, of course.
Re: [Computer-go] Learning related stuff
Hi Darren,

> Can I correctly rephrase your question as: if you take a well-trained
> komi 7.5 network, then give it komi 5.5 training data, will it adapt
> quickly, or would it be faster/better to start over from scratch? (From
> the point of view of creating a strong komi 5.5 program.)

In principle yes, but the training should be only with self-generated data, not with master games from outside.

> Surely it would train much more quickly: all the early layers are about
> learning liberty counting, atari and then life/death, good shape, etc.
> (But, it would be fascinating if an experiment showed that wasn't the
> case, and starting from a fresh random network trained more quickly!)

Indeed. Of these two options, I would like to know which one is true.

Ingo.
Re: [Computer-go] Learning related stuff
Hi Erik,

> No need for AlphaGo hardware to find out; any
> toy problem will suffice to explore different
> initialization schemes...

I know that.

My intention with the question is a different one: I am thinking about how humans learn. Is it beneficial to have learnt related - but different - stuff before? The answer will depend on the case, of course.

And in my role as a voyeur, I want to understand whether having learnt a Go variant X helps before turning my interest to a "slightly" different Go variant Y. So, I want to combine the subject with some entertaining learning process. (For instance, looking at the AlphaGo Zero games from the 72 h experiment in steps of 2 hours was not only insightful but also entertaining.)

> you typically want to start with small weights so
> that the initial mapping is relatively smooth.

But again: for instance, when an eight-year-old child starts to play violin, is it helpful or not when it had played, say, a trumpet before?

My understanding is that the AlphaGo hardware is standing somewhere in London, idle and waiting for new action...

Ingo.
Re: [Computer-go] Learning related stuff
> Would it typically help or disrupt to start
> instead with values that are non-random?
> What I have in mind concretely:

Can I correctly rephrase your question as: if you take a well-trained komi 7.5 network, then give it komi 5.5 training data, will it adapt quickly, or would it be faster/better to start over from scratch? (From the point of view of creating a strong komi 5.5 program.)

Surely it would train much more quickly: all the early layers are about learning liberty counting, atari and then life/death, good shape, etc. (But it would be fascinating if an experiment showed that wasn't the case, and starting from a fresh random network trained more quickly!)

Darren
Re: [Computer-go] Learning related stuff
No need for AlphaGo hardware to find out; any toy problem will suffice to explore different initialization schemes... The main benefit of starting random is to break symmetries (otherwise individual neurons cannot specialize), but there are other approaches that can work even better. Further, you typically want to start with small weights so that the initial mapping is relatively smooth.

E.

On Tue, Nov 21, 2017 at 2:24 PM, "Ingo Althöfer" <3-hirn-ver...@gmx.de> wrote:
> AlphaGo Zero started with random values in
> its neural net - and reached top level
> within 72 hours.
>
> Would it typically help or disrupt to start
> instead with values that are non-random?
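[Editor's note: both points can be seen in a tiny numpy sketch (a toy example with made-up layer sizes, not anything from AlphaGo): an all-zero, perfectly symmetric weight matrix gives every hidden unit the identical activation, and each unit would also receive the identical gradient, so the units could never specialize; small random weights break the tie.]

```python
import numpy as np

# Symmetric vs. small random initialization for one hidden layer.
rng = np.random.default_rng(1)
x = rng.normal(size=5)                     # one arbitrary input vector

W_symmetric = np.zeros((5, 4))             # symmetric start: all units equal
W_random = rng.normal(0.0, 0.01, (5, 4))   # small random start breaks symmetry

h_symmetric = np.tanh(x @ W_symmetric)     # every unit computes the same thing
h_random = np.tanh(x @ W_random)           # units already differ

print(len(set(h_symmetric)))               # 1 distinct activation value
print(len(set(h_random)) > 1)              # True: units can specialize
```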
[Computer-go] Learning related stuff
AlphaGo Zero started with random values in its neural net - and reached top level within 72 hours.

Would it typically help or disrupt to start instead with values that are non-random? What I have in mind concretely:

Look at 19x19 Go with komi=5.5. In run A you start with random values in the net. In run B you start with the values that had emerged in the komi-7.5 net after 72 hours.

Would A or B typically learn better? Would there be a danger that B would not be able to leave the 7.5 "solution"?

It is a pity that I/we do not have the hardware of AlphaGo Zero at hand for such experiments.

Ingo.