Re: [Computer-go] Learning related stuff

2017-11-29 Thread Ray Tayek

On 11/29/2017 6:15 PM, Dave Dyer wrote:


> My question is this: people have been messing around with neural nets
> and machine learning for 40 years; what was the breakthrough that made
> AlphaGo succeed so spectacularly?



Maybe it was residual networks:
https://en.wikipedia.org/wiki/Vanishing_gradient_problem#Residual_networks
They are pretty new, I think. Or some combination of things.
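The gist, as a toy NumPy sketch of my own (not AlphaGo's actual
architecture, which uses convolutional layers and batch normalization):

import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    # The block only has to learn a correction F(x); the identity path
    # in x + ... lets gradients flow straight through, which is what
    # mitigates the vanishing-gradient problem in deep stacks.
    h = relu(x @ w1)
    return relu(x + h @ w2)

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
w1 = rng.standard_normal((8, 8)) * 0.1
w2 = rng.standard_normal((8, 8)) * 0.1
y = residual_block(x, w1, w2)  # stacking many such blocks stays trainable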


thanks

--
Honesty is a very expensive gift. So, don't expect it from cheap people 
- Warren Buffett

http://tayek.com/

Re: [Computer-go] Learning related stuff

2017-11-29 Thread Darren Cook
> My question is this: people have been messing around with neural nets
> and machine learning for 40 years; what was the breakthrough that made
> AlphaGo succeed so spectacularly?

5 or 6 orders of magnitude more CPU power (relative to the late 90s) (*).

This means you can try out ideas to see if they work, and get the answer
back in hours, rather than years.

After 10 hours it was playing at an Elo somewhere between 0 and 1000
(Figure 3 in the AlphaGo Zero paper), i.e. idiot level. That is
something like 1,100 years of effort on 1995 hardware.
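For concreteness, the back-of-the-envelope arithmetic (the 10^6 speedup
factor is my own estimate from above):

speedup = 10 ** 6            # assumed ~6 orders of magnitude since 1995
hours_on_2017_hw = 10        # time to reach idiot level (AGZ paper, fig. 3)
years_on_1995_hw = hours_on_2017_hw * speedup / (24 * 365)
print(round(years_on_1995_hw))  # -> 1142, i.e. "something like 1100 years"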

They put together a large team (by hobbyist computer go standards) of
top people, at least two of whom had made strong go programs before.

I'd name two other things: dropout (and other regularization techniques)
allowed deeper networks, and the work on image recognition gave you
production-ready CNNs, without having to work through all the dead ends
yourself. Add better optimization techniques, and taken together the
algorithmic advances are maybe worth another order of magnitude.
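(Dropout itself is tiny, by the way -- a sketch of the usual inverted
form in NumPy, not any particular library's implementation:)

import numpy as np

def dropout(h, p=0.5, rng=np.random.default_rng(0)):
    # zero each unit with probability p at training time, and rescale
    # the survivors so the expected activation is unchanged
    mask = rng.random(h.shape) >= p
    return h * mask / (1.0 - p)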

Darren

*: The source is the intro to my own book ;-) From memory, I made the
estimate as the average of the top supercomputer 20 years apart and a
typical high-end PC 20 years apart.
https://en.wikipedia.org/wiki/History_of_supercomputing#Historical_TOP500_table

-- 
Darren Cook, Software Researcher/Developer
My New Book: Practical Machine Learning with H2O:
  http://shop.oreilly.com/product/0636920053170.do

Re: [Computer-go] Learning related stuff

2017-11-29 Thread Dave Dyer

My question is this: people have been messing around with neural nets
and machine learning for 40 years; what was the breakthrough that made
AlphaGo succeed so spectacularly?



Re: [Computer-go] Learning related stuff

2017-11-29 Thread uurtamo .
It's nearly comic to imagine a player at 1-1 trying to figure things out.

It's not a diss on you; I honestly want people to relax, take a minute,
and treat badmouthing the AlphaGo team's ideas as a secondary
consideration. They did good work. Arguing about the essentials probably
won't prove that they're stupid in any way. So let's learn, move forward,
and have no bad words about their ridiculously well-funded effort.

Recreating their work at a smaller scale would be awesome.

s.

On Nov 29, 2017 4:33 PM, "Eric Boesch"  wrote:

> Could you be reading too much into my comment? AlphaGo Zero is an amazing
> achievement, and I might guess its programmers will succeed in applying
> their methods to other fields. Nonetheless, I thought it was interesting,
> and it would appear the programmers did too, that before improving to
> superhuman level, AlphaGo was temporarily stuck in a rut of playing
> literally the worst first move on the board (excluding pass). That doesn't
> mean I think I could do better.

Re: [Computer-go] Learning related stuff

2017-11-29 Thread Eric Boesch
Could you be reading too much into my comment? AlphaGo Zero is an amazing
achievement, and I might guess its programmers will succeed in applying
their methods to other fields. Nonetheless, I thought it was interesting,
and it would appear the programmers did too, that before improving to
superhuman level, AlphaGo was temporarily stuck in a rut of playing
literally the worst first move on the board (excluding pass). That doesn't
mean I think I could do better.


On Tue, Nov 28, 2017 at 4:50 AM, uurtamo .  wrote:

> This is starting to feel like asking along the lines of, "how can I
> explain this to myself or improve on what's already been done in a way that
> will make this whole process work faster on my hardware".
>
> It really doesn't look like there are a bunch of obvious shortcuts. That's
> the lesson of the decision trees imposed by humans on the game for 20+
> years; they weren't really better.
>
> Probably what would be good to convince oneself of these things would be
> to challenge each assumption in divergent branches (suggested earlier) and
> watch the resulting players' strength over time. Yes, this might take a
> year or more on your hardware.
>
> I feel like maybe a lot of this is sour grapes; let's please again
> acknowledge that the hobbyists aren't there yet without trying to tear down
> the accomplishments of others.
>
> s.

Re: [Computer-go] Learning related stuff

2017-11-28 Thread uurtamo .
This is starting to feel like asking along the lines of, "how can I explain
this to myself or improve on what's already been done in a way that will
make this whole process work faster on my hardware".

It really doesn't look like there are a bunch of obvious shortcuts. That's
the lesson of the decision trees imposed by humans on the game for 20+
years; they weren't really better.

Probably what would be good to convince oneself of these things would be to
challenge each assumption in divergent branches (suggested earlier) and
watch the resulting players' strength over time. Yes, this might take a
year or more on your hardware.

I feel like maybe a lot of this is sour grapes; let's please again
acknowledge that the hobbyists aren't there yet without trying to tear down
the accomplishments of others.

s.

On Nov 27, 2017 7:36 PM, "Eric Boesch"  wrote:

> I imagine implementation determines whether transferred knowledge is
> helpful. It's like asking whether forgetting is a problem -- it often is,
> but evidently not for AlphaGo Zero.
>
> One crude way to encourage stability is to include an explicit or implicit
> age parameter that forces the program to perform smaller modifications to
> its state during later stages. If the parameters you copy from problem A to
> problem B also include that age parameter, so the network acts old even
> though it is faced with a new problem, then its initial exploration may be
> inefficient. For an MCTS-based example, if an MCTS node is initialized to a
> 10877-6771 win/loss record based on evaluations under slightly different
> game rules, then with a naive implementation, even if the program discovers
> the right refutation under the new rules right away, it would still need to
> revisit that node thousands of times to convince itself the node is now
> probably a losing position.
>
> But unlearning bad plans in a reasonable time frame is already a feature
> you need from a good learning algorithm. Even AlphaGo almost fell into trap
> states; from their paper, it appears that it stuck with 1-1 as an opening
> move for much longer than you would expect from a program probably already
> much better than 40 kyu. Even if it's unrealistic for Go specifically, you
> could imagine some other game where after days of analysis, the program
> suddenly discovers a reliable trick that adds one point for white to every
> single game. The effect would be the same as your komi change -- a mature
> network now needs to adapt to a general shift in the final score. So the
> task of adapting to handle similar games may be similar to the task of
> adapting to analysis reversals within a single game, and improvements to
> one could lead to improvements to the other.

Re: [Computer-go] Learning related stuff

2017-11-27 Thread Eric Boesch
I imagine implementation determines whether transferred knowledge is
helpful. It's like asking whether forgetting is a problem -- it often is,
but evidently not for AlphaGo Zero.

One crude way to encourage stability is to include an explicit or implicit
age parameter that forces the program to perform smaller modifications to
its state during later stages. If the parameters you copy from problem A to
problem B also include that age parameter, so the network acts old even
though it is faced with a new problem, then its initial exploration may be
inefficient. For an MCTS-based example, if an MCTS node is initialized to a
10877-6771 win/loss record based on evaluations under slightly different
game rules, then with a naive implementation, even if the program discovers
the right refutation under the new rules right away, it would still need to
revisit that node thousands of times to convince itself the node is now
probably a losing position.
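A toy illustration of the arithmetic (hypothetical numbers and plain
average-value updates, not anyone's actual MCTS implementation):

# A node carries a stale 10877-6771 record from the old rules.
wins, visits = 10877.0, 10877 + 6771   # mean value ~0.616: looks winning

# Under the new rules the refutation makes this node a certain loss.
n = 0
while wins / visits > 0.5:             # until the node merely looks even
    wins += 0.0                        # every fresh visit is a loss
    visits += 1
    n += 1
print(n)  # ~4100 extra visits just to drag the mean down to 0.5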

But unlearning bad plans in a reasonable time frame is already a feature
you need from a good learning algorithm. Even AlphaGo almost fell into trap
states; from their paper, it appears that it stuck with 1-1 as an opening
move for much longer than you would expect from a program probably already
much better than 40 kyu. Even if it's unrealistic for Go specifically, you
could imagine some other game where after days of analysis, the program
suddenly discovers a reliable trick that adds one point for white to every
single game. The effect would be the same as your komi change -- a mature
network now needs to adapt to a general shift in the final score. So the
task of adapting to handle similar games may be similar to the task of
adapting to analysis reversals within a single game, and improvements to
one could lead to improvements to the other.



On Fri, Nov 24, 2017 at 7:54 AM, Stephan K  wrote:

> 2017-11-21 23:27 UTC+01:00, "Ingo Althöfer" <3-hirn-ver...@gmx.de>:
> > My understanding is that the AlphaGo hardware is standing
> > somewhere in London, idle and waiting for new action...
> >
> > Ingo.
>
> The announcement at
> https://deepmind.com/blog/applying-machine-learning-mammography/ seems
> to disagree:
>
> "Our partners in this project wanted researchers at both DeepMind and
> Google involved in this research so that the project could take
> advantage of the AI expertise in both teams, as well as Google’s
> supercomputing infrastructure - widely regarded as one of the best in
> the world, and the same global infrastructure that powered DeepMind’s
> victory over the world champion at the ancient game of Go."

Re: [Computer-go] Learning related stuff

2017-11-24 Thread Stephan K
2017-11-21 23:27 UTC+01:00, "Ingo Althöfer" <3-hirn-ver...@gmx.de>:
> My understanding is that the AlphaGo hardware is standing
> somewhere in London, idle and waiting for new action...
>
> Ingo.

The announcement at
https://deepmind.com/blog/applying-machine-learning-mammography/ seems
to disagree:

"Our partners in this project wanted researchers at both DeepMind and
Google involved in this research so that the project could take
advantage of the AI expertise in both teams, as well as Google’s
supercomputing infrastructure - widely regarded as one of the best in
the world, and the same global infrastructure that powered DeepMind’s
victory over the world champion at the ancient game of Go."
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Learning related stuff

2017-11-23 Thread Xavier Combelle


On 21/11/2017 at 23:27, "Ingo Althöfer" wrote:
> Hi Erik,
>
>> No need for AlphaGo hardware to find out; any 
>> toy problem will suffice to explore different 
>> initialization schemes... 
> I know that. 
>
> My intention with the question is a different one:
> I am thinking how humans are learning. Is it beneficial
> to have learnt related - but different - stuff before?
> The answer will depend on the case, of course.
>
> And in my role as a voyeur, I want to understand whether having
> learnt a Go variant X helps before turning my interest to a
> "slightly" different Go variant Y. So, I want to combine
> the subject with some entertaining learning process.
> (For instance, looking at the AlphaGo Zero games from the
> 72 h experiment in steps of 2 hours was not only insightful
> but also entertaining.)
>
>> you typically want to start with small weights so 
>> that the initial mapping is relatively smooth.
> But again: For instance, when an eight-year-old child starts
> to play violin, is it helpful or not if it has played,
> say, a trumpet before?
I believe the human brain is too different from the AlphaGo neural
network for knowledge about one to transfer to the other.
> My understanding is that the AlphaGo hardware is standing 
> somewhere in London, idle and waiting for new action...
Definitely not idle:
“[They] needed the computers for something else.”

source:
https://techcrunch.com/2017/11/02/deepmind-has-yet-to-find-out-how-smart-its-alphago-zero-ai-could-be/
> Ingo.
>

Re: [Computer-go] Learning related stuff

2017-11-23 Thread David Doshay
In my experience, people who are first taught variant (a) and after a short
while move on to (b) remain overly fixated on capturing and are much slower
to grasp the real game. So in this case I would argue that people really do
have trouble unlearning when the games are too close … particularly when the
first variant has such a simple and expected goal, one that must be
deprecated in order to move from (a) to (b).

Cheers,
David G Doshay

ddos...@mac.com





> On 22, Nov 2017, at 6:23 AM, Ingo Althöfer <3-hirn-ver...@gmx.de> wrote:
> 
> In teaching go, one possible path (even with 2 steps) is 
> to start with 
> (a) Atari-Go on 9x9 board
> then switch to
> (b) "true" Go on 9x9
> then switch to
> (c) Go on 19x19
> 
> What are optimal lengths for phases (a) and (b) in doing so?


Re: [Computer-go] Learning related stuff

2017-11-23 Thread Ingo Althöfer
Hello Stephan,

> Another option for your experiment might be to take the 72-hour-old
> network, but only retain the first layers, and initialize randomly the
> last layers.
 
yes, or many others. Not all of them have to be fantastic,
but when you/we get some experience and have a new try
every 3 or 4 days (by simply editing a few hundred bytes of code),
by the end of the year or decade some pearls will be in the harvest.

Ingo.

PS. My wife will find a way to have the power bills paid ;-)
At least this is my expectation.

Re: [Computer-go] Learning related stuff

2017-11-23 Thread Stephan K
2017-11-22 15:17 UTC+01:00, "Ingo Althöfer" <3-hirn-ver...@gmx.de>:
> For instance, with respect to the 72-hour run of AlphaGo Zero
> one might start several runs for Go (with komi=5.5),
> the first one starting from fresh, the second one from the
> 72-hour process after 1 hour, the next one after 2 hours ...
>
> Ingo

Another option for your experiment might be to take the 72-hour-old
network, but only retain the first layers, and initialize randomly the
last layers.
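Something like this, as a rough sketch (PyTorch-style; reinit_last_layers
and the layer split are my assumptions, not DeepMind's code):

import torch.nn as nn

def reinit_last_layers(net, n_keep):
    # keep the first n_keep child modules, freshly re-initialize the rest
    for layer in list(net.children())[n_keep:]:
        if hasattr(layer, "reset_parameters"):
            layer.reset_parameters()
    return net

# toy usage: keep the first two layers, re-initialize the head
net = nn.Sequential(nn.Linear(361, 128), nn.ReLU(), nn.Linear(128, 1))
net = reinit_last_layers(net, n_keep=2)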

Stephan

Re: [Computer-go] Learning related stuff

2017-11-22 Thread Ingo Althöfer
Hi Petri,
 
"Petri Pitkanen" 
>
>>But again: For instance, when a eight year old child starts
>>to play violin, is it helpful or not when it had played
>>say a trumpet before?
> 
> It would be, and this is well known in practice. The logic
> around the music is the same, so he would learn faster.
> In the very long run there might be unwanted effects,
> i.e. it is hard to unlearn something too similar...

the question is which intermediate point is optimal
for switching from instrument/game 1 to game 2.

Having in mind a complicated game 2, it might be helpful
first to teach a simpler game 1 (for some limited time)
and only then switch to game 2. 

In teaching go, one possible path (even with 2 steps) is 
to start with 
(a) Atari-Go on 9x9 board
then switch to
(b) "true" Go on 9x9
then switch to
(c) Go on 19x19

What are optimal lengths for phases (a) and (b) in doing so?

Ingo.

Re: [Computer-go] Learning related stuff

2017-11-22 Thread Ingo Althöfer
Hi Alvaro,

From: "Álvaro Begué"
> The term you are looking for is "transfer learning": 
> https://en.wikipedia.org/wiki/Transfer_learning
 
thanks for that interesting hint.
However, it is not exactly what I am looking for.

My question was more about observing and understanding
"transfer learning phenomena", let them be positive
or negative.

For instance, with respect to the 72-hour run of AlphaGo Zero
one might start several runs for Go (with komi=5.5),
the first one starting from fresh, the second one from the
72-hour process after 1 hour, the next one after 2 hours ...

Ingo.

Re: [Computer-go] Learning related stuff

2017-11-21 Thread Petri Pitkanen
>But again: For instance, when a eight year old child starts
>to play violin, is it helpful or not when it had played
>say a trumpet before?

It would be, and this is well known in practice. The logic around the music
is the same, so he would learn faster. In the very long run there might be
unwanted effects, i.e. it is hard to unlearn something too similar. But in
the case of trumpet and violin, no. But let's say ten years of training
violin, played the way a bluegrass player plays, and then switching to
classical. That would be hard, due to the required unlearning, for which
humans do not really have a mechanism. The new skill needs to be learned
better than the old skill, or time needs to erase the untrained old skill.
A DCNN, by contrast, can learn and unlearn quite easily.

2017-11-22 0:48 GMT+02:00 Álvaro Begué :

> The term you are looking for is "transfer learning":
> https://en.wikipedia.org/wiki/Transfer_learning

Re: [Computer-go] Learning related stuff

2017-11-21 Thread Álvaro Begué
The term you are looking for is "transfer learning":
https://en.wikipedia.org/wiki/Transfer_learning


On Tue, Nov 21, 2017 at 5:27 PM, "Ingo Althöfer" <3-hirn-ver...@gmx.de>
wrote:

> Hi Erik,
>
> > No need for AlphaGo hardware to find out; any
> > toy problem will suffice to explore different
> > initialization schemes...
>
> I know that.
>
> My intention with the question is a different one:
> I am thinking how humans are learning. Is it beneficial
> to have learnt related - but different - stuff before?
> The answer will depend on the case, of course.
>
> And in my role as a voyeur, I want to understand whether having
> learnt a Go variant X helps before turning my interest to a
> "slightly" different Go variant Y. So, I want to combine
> the subject with some entertaining learning process.
> (For instance, looking at the AlphaGo Zero games from the
> 72 h experiment in steps of 2 hours was not only insightful
> but also entertaining.)
>
>
> > you typically want to start with small weights so
> > that the initial mapping is relatively smooth.
>
> But again: For instance, when an eight-year-old child starts
> to play violin, is it helpful or not if it has played,
> say, a trumpet before?
>
> My understanding is that the AlphaGo hardware is standing
> somewhere in London, idle and waiting for new action...
>
> Ingo.

Re: [Computer-go] Learning related stuff

2017-11-21 Thread Ingo Althöfer
Hi Darren,

> Can I correctly rephrase your question as: if you take a well-trained
> komi 7.5 network, then give it komi 5.5 training data, will it adapt
> quickly, or would it be faster/better to start over from scratch? (From
> the point of view of creating a strong komi 5.5 program.) (?)

in principle yes, but the training should be only with self-generated
data, not with master games from outside.

> Surely it would train much more quickly: all the early layers are about
> learning liberty counting, atari and then life/death, good shape, etc.
> (But, it would be fascinating if an experiment showed that wasn't the
> case, and starting from a fresh random network trained more quickly!)

Indeed. Of these two options, I would like to
know which one is true.

Ingo.

Re: [Computer-go] Learning related stuff

2017-11-21 Thread Ingo Althöfer
Hi Erik,

> No need for AlphaGo hardware to find out; any 
> toy problem will suffice to explore different 
> initialization schemes... 

I know that. 

My intention with the question is a different one:
I am thinking how humans are learning. Is it beneficial
to have learnt related - but different - stuff before?
The answer will depend on the case, of course.

And in my role as a voyeur, I want to understand whether having
learnt a Go variant X helps before turning my interest to a
"slightly" different Go variant Y. So, I want to combine
the subject with some entertaining learning process.
(For instance, looking at the AlphaGo Zero games from the
72 h experiment in steps of 2 hours was not only insightful
but also entertaining.)


> you typically want to start with small weights so 
> that the initial mapping is relatively smooth.

But again: For instance, when an eight-year-old child starts
to play violin, is it helpful or not if it has played,
say, a trumpet before?

My understanding is that the AlphaGo hardware is standing 
somewhere in London, idle and waiting for new action...

Ingo.


Re: [Computer-go] Learning related stuff

2017-11-21 Thread Darren Cook
> Would it typically help or disrupt to start
> instead with values that are non-random?
> What I have in mind concretely:

Can I correctly rephrase your question as: if you take a well-trained
komi 7.5 network, then give it komi 5.5 training data, will it adapt
quickly, or would it be faster/better to start over from scratch? (From
the point of view of creating a strong komi 5.5 program.) (?)


Surely it would train much more quickly: all the early layers are about
learning liberty counting, atari and then life/death, good shape, etc.
(But, it would be fascinating if an experiment showed that wasn't the
case, and starting from a fresh random network trained more quickly!)

Darren

Re: [Computer-go] Learning related stuff

2017-11-21 Thread Erik van der Werf
No need for AlphaGo hardware to find out; any toy problem will suffice to
explore different initialization schemes... The main benefit of starting
random is to break symmetries (otherwise individual neurons cannot
specialize), but there are other approaches that can work even better.
Further, you typically want to start with small weights so that the initial
mapping is relatively smooth.
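A minimal sketch of the symmetry-breaking point (made-up layer sizes,
nothing AlphaGo-specific):

import numpy as np
rng = np.random.default_rng(0)

n_in, n_out = 361, 256   # e.g. a 19x19 input layer; sizes are arbitrary

# All-zero init: every neuron computes the same function and receives
# the same gradient, so no neuron can ever specialize.
w_zero = np.zeros((n_in, n_out))

# Small random init breaks the symmetry while keeping the initial
# mapping close to smooth/linear (He-style scaling).
w_small = rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in)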

E.

On Tue, Nov 21, 2017 at 2:24 PM, "Ingo Althöfer" <3-hirn-ver...@gmx.de>
wrote:

> AlphaGo Zero started with random values in
> its neural net - and reached top level
> within 72 hours.
>
> Would it typically help or disrupt to start
> instead with values that are non-random?
> What I have in mind concretely:
>
> Look at 19x19 Go with komi=5.5
> In run A you start with random values in the net.
> In another run B you start with the values that had
> emerged in the 7.5-NN after 72 hours.
>
> Would A or B typically learn better?
> Would there be a danger that B would not be able
> to leave the 7.5-"solution"?
>
> It is a pity that I/we do not have the hardware of
> AlphaGo Zero at hand for such experiments.
>
> Ingo.

[Computer-go] Learning related stuff

2017-11-21 Thread Ingo Althöfer
AlphaGo Zero started with random values in
its neural net - and reached top level
within 72 hours.

Would it typically help or disrupt to start
instead with values that are non-random?
What I have in mind concretely:

Look at 19x19 Go with komi=5.5
In run A you start with random values in the net.
In another run B you start with the values that had
emerged in the 7.5-NN after 72 hours. 

Would A or B typically learn better?
Would there be a danger that B would not be able 
to leave the 7.5-"solution"?

It is a pity that I/we do not have the hardware of 
AlphaGo Zero at hand for such experiments.
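But a toy analogue of A vs. B is cheap to run on any machine. Below, a
linear model where the komi change is just a shift of the target; this
of course says nothing about whether a deep net would get trapped:

import numpy as np
rng = np.random.default_rng(0)

def train(X, y, w, b, steps=20, lr=0.05):
    # plain batch gradient descent on mean squared error
    for _ in range(steps):
        err = X @ w + b - y
        w = w - lr * X.T @ err / len(y)
        b = b - lr * err.mean()
    return w, b, float(((X @ w + b - y) ** 2).mean())

X = rng.standard_normal((200, 5))
w_true = rng.standard_normal(5)
y_75 = X @ w_true + 7.5    # the "komi 7.5" task
y_55 = X @ w_true + 5.5    # the "komi 5.5" task: same structure, shifted

# Run A: fresh random start on the komi-5.5 task.
_, _, loss_a = train(X, y_55, rng.standard_normal(5) * 0.1, 0.0)

# Run B: warm start from a model well trained on the komi-7.5 task.
w0, b0, _ = train(X, y_75, rng.standard_normal(5) * 0.1, 0.0, steps=500)
_, _, loss_b = train(X, y_55, w0, b0)

print(loss_a, loss_b)  # in this toy, the warm start (B) adapts faster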

Ingo.