Re: [Computer-go] AlphaGo Zero

2017-10-20 Thread Robert Jasiek

On 20.10.2017 21:12, uurtamo . wrote:

do something like really careful experimental design across many dimensions
simultaneously (node weights) and several million experiments -- each of
which will require hundreds if not tens of thousands of games to find the
result of the change. Worse, there are probably tens of millions of neural
nets of this size that will perform equally well (isomorphisms plus minor
weight changes). So many changes will result in no change or a completely
useless game model.


It is possible that things turn out as complex as you describe...


"modeling through human knowledge" neural nets doesn't sound like a
sensible goal


...but I am not convinced. Researchers studying the human brain's thinking 
keep their optimism, too.


Nevertheless, alternative approaches can be imagined. E.g., while 
building a neural net of eventually great strength, also build its own 
semantic interpreter and semantic verifier (including exclusion of 
errors as far as computationally possible), plus a translator between 
the internal structure and a human (or programming) language 
representation. I do not know whether such dynamic self-representations 
of neural nets have already been described, but if not, this would be an 
interesting research topic.


--
robert jasiek
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Zero performance

2017-10-20 Thread Gian-Carlo Pascutto
I agree. Even on 19x19 you can use smaller searches. 400 iterations of MCTS is
probably already a lot stronger than the raw network, especially if you are
expanding every node (very different from a normal program at 400
playouts!). Some tuning of these mini-searches is important. Surely you
don't want to explore every child node right away, so first-play urgency
needs handling... I remember this little algorithmic detail was missing from
the first paper as well.
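For illustration, the child-selection rule with a first-play-urgency (FPU) adjustment can be sketched as below. The paper does not spell out AlphaGo Zero's FPU treatment, so the parent-value-minus-reduction term and both constants here are assumptions:

```python
import math

class Node:
    def __init__(self, prior):
        self.prior = prior        # P(s,a) from the policy head
        self.visits = 0           # N(s,a)
        self.value_sum = 0.0      # W(s,a)
        self.children = []

def select_child(node, c_puct=1.5, fpu_reduction=0.25):
    """Pick the child maximizing Q + U (the PUCT rule). Unvisited
    children get a pessimistic first-play-urgency value instead of
    Q = 0, so the search does not expand every child before deepening."""
    sqrt_total = math.sqrt(node.visits + 1)
    parent_q = node.value_sum / max(node.visits, 1)
    best, best_score = None, -float("inf")
    for ch in node.children:
        # Assumed FPU: parent's value minus a reduction, not 0.
        q = (ch.value_sum / ch.visits if ch.visits > 0
             else parent_q - fpu_reduction)
        u = c_puct * ch.prior * sqrt_total / (1 + ch.visits)
        if q + u > best_score:
            best, best_score = ch, q + u
    return best
```

With a large `fpu_reduction` the search sticks to already-visited children; with a small one it behaves like the "expand everything" search described above.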

So that's a factor of 32 gain. Because the network is smaller, it should learn
much faster too. Someone on reddit posted a comparison of 20 blocks vs 40
blocks.

With 10 people you can probably get some results in a few months. The
question is, how much Elo have we lost on the way...

Another advantage would be that, as long as you keep all the SGF, you can
bootstrap a bigger network from the data! So, nothing is lost from starting
small. You can "upgrade" if the improvements start to plateau.

On Fri, Oct 20, 2017, 23:32 Álvaro Begué  wrote:

> I suggest scaling down the problem until some experience is gained.
> [...]

-- 

GCP

Re: [Computer-go] Zero performance

2017-10-20 Thread John Tromp
> You can also start with 9x9 go. That way games are shorter, and you probably
> don't need 1600 network evaluations per move to do well.

Bonus points if you can have it play on GoQuest, where many
of us can enjoy watching its progress, or even challenge it...

regards,
-John

Re: [Computer-go] Zero performance

2017-10-20 Thread fotland
The paper describes 20- and 40-block networks, but the section on comparison 
says AlphaGo Zero uses 20 blocks. I think your protobuf describes a 40-block 
network. That's a factor of two.

If you only want pro strength rather than superhuman, you can train for half 
their time.

Your time looks reasonable as an estimate of the time to generate the 29M games 
at about 10 seconds per move. But that is only the time to generate the input data. 
Do you have an estimate of the additional time it takes to do the training? 
It's probably small in comparison, but it might not be.

My plan is to start out with a little supervised learning, since I'm not trying 
to prove a breakthrough. I experimented last year for a few months with 
res-nets for a policy network, and there are some things I discovered there that 
probably apply to this network. They should give perhaps a factor of 5 to 10 
speedup. For a commercial program I'll be happy with 7-dan amateur after about 6 
months of training using my two GPUs and sixteen i7 cores. 

David

-Original Message-
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of 
Gian-Carlo Pascutto
Sent: Friday, October 20, 2017 10:45 AM
To: computer-go@computer-go.org
Subject: [Computer-go] Zero performance

[...]

Re: [Computer-go] Zero performance

2017-10-20 Thread Sorin Gherman
Training of AlphaGo Zero has been done on thousands of TPUs, according to
this source:
https://www.reddit.com/r/baduk/comments/777ym4/alphago_zero_learning_from_scratch_deepmind/dokj1uz/?context=3

Maybe that explains the difference in orders of magnitude that you
noticed?


On Fri, Oct 20, 2017 at 10:44 AM, Gian-Carlo Pascutto  wrote:

> I reconstructed the full AlphaGo Zero network in Caffe:
> [...]

Re: [Computer-go] AlphaGo Zero

2017-10-20 Thread Gian-Carlo Pascutto
On Fri, Oct 20, 2017, 21:48 Petr Baudis  wrote:

>   Few open questions I currently have, comments welcome:
>
>   - there is no input representing the number of captures; is this
> information somehow implicit or can the learned winrate predictor
> never truly approximate the true values because of this?
>

They are using Chinese rules, so prisoners don't matter. There are simply
fewer stones of one color on the board.


>   - what ballpark values for c_{puct} are reasonable?
>

The original paper has the value they used. But this likely needs tuning. I
would tune with a supervised network to get started, but you need games for
that. Does it even matter much early on? The network is random :)


>   - why is the dirichlet noise applied only at the root node, if it's
> useful?
>

It's only used to get some randomness into the move selection, no? It's not
actually useful for anything besides that.
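Concretely, the root-noise mixing (with the paper's published values ε = 0.25 and Dirichlet α = 0.03 for 19x19) looks roughly like this sketch:

```python
import random

def add_root_noise(priors, alpha=0.03, epsilon=0.25):
    """Mix Dirichlet noise into the root priors so self-play tries
    moves the raw policy would essentially never pick. Applying it
    only at the root is enough to diversify the games."""
    # A Dirichlet sample is a normalized vector of Gamma(alpha) draws.
    gammas = [random.gammavariate(alpha, 1.0) for _ in priors]
    total = sum(gammas) or 1.0
    return [(1 - epsilon) * p + epsilon * g / total
            for p, g in zip(priors, gammas)]
```

With α this small, most of the noise mass lands on one or two arbitrary moves, which is exactly what forces occasional off-policy exploration.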


>   - the training process is quite lazy - it's not like the network sees
> each game immediately and adjusts, it looks at last 500k games and
> samples 1000*2048 positions, meaning about 4 positions per game (if
> I understood this right) - I wonder what would happen if we trained
> it more aggressively, and what AlphaGo does during the initial 500k
> games; currently, I'm training on all positions immediately, I guess
> I should at least shuffle them ;)
>

I think the laziness may be related to the concern that reinforcement
methods can easily "forget" things they had learned before. The value
network training also likes positions from distinct games.
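The scheme being described (sample a few positions per game from a window of the most recent games) can be sketched as follows; the 500k-game window and 2048-position batches are the paper's figures, the rest is assumption:

```python
import random
from collections import deque

replay = deque(maxlen=500_000)   # each entry: the positions of one game

def sample_batch(batch_size=2048):
    """Each training position comes from a distinct, uniformly chosen
    recent game, decorrelating value targets within a batch and keeping
    older (but not too old) games in the mix against forgetting."""
    games = random.sample(list(replay), min(batch_size, len(replay)))
    return [random.choice(positions) for positions in games]
```

Because `random.sample` draws without replacement, no batch contains two positions from the same game, which is the property the value network training wants.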


-- 

GCP

Re: [Computer-go] Zero performance

2017-10-20 Thread Álvaro Begué
I suggest scaling down the problem until some experience is gained.

You don't need the full-fledged 40-block network to get started. You can
probably get away with using only 20 blocks and maybe 128 features (down
from 256). That should save you about a factor of 8, plus you can use larger
mini-batches.
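The factor of 8 follows from the cost of the residual tower being roughly proportional to blocks × channels²; a quick check under that assumption:

```python
def tower_cost(blocks, channels):
    # FLOPs of the 3x3 convolutions scale as c_in * c_out per block.
    return blocks * channels * channels

print(tower_cost(40, 256) / tower_cost(20, 128))  # → 8.0
```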

You can also start with 9x9 go. That way games are shorter, and you
probably don't need 1600 network evaluations per move to do well.

Álvaro.


On Fri, Oct 20, 2017 at 1:44 PM, Gian-Carlo Pascutto  wrote:

> I reconstructed the full AlphaGo Zero network in Caffe:
> [...]

Re: [Computer-go] AlphaGo Zero

2017-10-20 Thread uurtamo .
This sounds like a nice idea, but a misguided project.

Keep in mind the number of weights to change, and the fact that "one factor
at a time" testing will tell you nearly nothing about the overall dynamics
in a system of tens of thousands of dimensions. So you're going to need to
do something like really careful experimental design across many dimensions
simultaneously (node weights) and several million experiments -- each of
which will require hundreds if not tens of thousands of games to find the
result of the change. Worse, there are probably tens of millions of neural
nets of this size that will perform equally well (isomorphisms plus minor
weight changes). So many changes will result in no change or a completely
useless game model.

"modeling through human knowledge" neural nets doesn't sound like a
sensible goal -- it sounds more like a need to understand a topic in a
language not equipped for it without a simultaneous desire to understand a
topic under its own fundamental requirements in its own language.

Or you could build a machine-learning model to try to model those
changes, except that you'd end up roughly where you started. Another
black box and another frustrated human.

Just accept that something awesome happened, and that studying the things
that make it work well is more interesting than translating coefficients
into a bad understanding for people.

I'm sorry that this NN can't teach anyone how to be a better player through
anything other than kicking their ass, but it wasn't built for that.

s.


On Fri, Oct 20, 2017 at 8:24 AM, Robert Jasiek  wrote:

> [...]

[Computer-go] Zero performance

2017-10-20 Thread Gian-Carlo Pascutto
I reconstructed the full AlphaGo Zero network in Caffe:
https://sjeng.org/dl/zero.prototxt

I did some performance measurements, with what should be
state-of-the-art on consumer hardware:

GTX 1080 Ti
NVIDIA-Caffe + CUDA 9 + cuDNN 7
batch size = 8

Memory use is about 2 GB. (It's much more for learning; the original
minibatch size of 32 wouldn't fit on this card!)

Running 2000 iterations takes 93 seconds.

In the AlphaGo paper, they claim 0.4 seconds to do 1600 MCTS
simulations, and they expand 1 node per visit (if I got it right) so
that would be 1600 network evaluations as well, or 200 of my iterations.

So it would take me ~9.3s to produce a self-play move, compared to 0.4s
for them.

I would like to extrapolate how long it will take to reproduce the
research, but I think I'm missing how many GPUs are in each self-play
worker (4 TPU or 64 GPU or ?), or perhaps the average length of the games.

Let's say the latter is around 200 moves. They generated 29 million
games for the final result, which means it's going to take me about 1700
years to replicate this. I initially estimated 7 years based on the
reported 64 GPU vs 1 GPU, but this seems far worse. Did I miss anything
in the calculations above, or was it really a *pile* of those 64 GPU
machines?
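The arithmetic behind those estimates, restated (the 200-move average game length is an assumption, as noted above):

```python
sec_per_iter = 93 / 2000          # measured: 2000 iterations in 93 s
iters_per_move = 1600 / 8         # 1600 net evaluations at batch size 8
sec_per_move = sec_per_iter * iters_per_move   # 9.3 s per self-play move

games, moves_per_game = 29_000_000, 200
years = games * moves_per_game * sec_per_move / (3600 * 24 * 365)
print(round(sec_per_move, 1), round(years))    # 9.3 s, roughly 1700 years
```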

Because the performance on playing seems reasonable (you would be able
to actually run the MCTS on a consumer machine, and hence end up with a
strong program), I would be interested in setting up a distributed
effort for this. But realistically there will be maybe 10 people
joining, 80 if we're very lucky (looking at Stockfish numbers). That
means it'd still take 20 to 170 years.

Someone please tell me I missed a factor of 100 or more somewhere. I'd
love to be wrong here.

-- 
GCP

Re: [Computer-go] Subject: Re: AlphaGo Zero

2017-10-20 Thread Robert Jasiek

On 20.10.2017 16:44, Hendrik Baier wrote:

Where is the respect and the appreciation for other people's
groundbreaking work without immediately having to make the discussion
about your own research, or otherwise derailing it into the irrelevant
or fantastical?


Instead of joining your meta-discussion, let me point out some motivation:
- Research should also proceed after groundbreaking work.
- Research should not be isolated; different research approaches can 
be combined. Research is not a one-way street.
- Research in go also serves as a model for research in other or more 
general fields, such as generalised AI, which includes using AI to 
provide a) an interchange with the semantics of human domains or b) 
assistance for or replacement of human activities, such as car driving, 
that profit from avoiding errors. Therefore I discuss these aspects.
- Ethical questions are becoming increasingly important in view of the 
fast progress of AI. Surely you are aware that Deepmind knows this.


--
robert jasiek

Re: [Computer-go] AlphaGo Zero

2017-10-20 Thread Gian-Carlo Pascutto
On 19-10-17 13:00, Aja Huang via Computer-go wrote:
> Hi Hiroshi,
> 
> I think these are good questions. You can ask them at 
> https://www.reddit.com/r/MachineLearning/comments/76xjb5/ama_we_are_david_silver_and_julian_schrittwieser/

It seems the question was indeed asked but not answered:
https://www.reddit.com/r/MachineLearning/comments/76xjb5/ama_we_are_david_silver_and_julian_schrittwieser/dol03aq/

-- 
GCP

Re: [Computer-go] AlphaGo Zero

2017-10-20 Thread Robert Jasiek

On 20.10.2017 15:07, adrian.b.rob...@gmail.com wrote:

1) Where is the semantic translation of the neural net to human theory
knowledge?

As far as (1), if we could do it, it would mean we could relate the
structures embedded in the net's weight patterns to some other domain --


The other domain can be "human go theory". It has various forms, from 
informal via textbook to mathematically proven. Sure, it is also 
incomplete but it can cope with additions.


The neural net's weights and whatnot are given. This raw data can be 
deciphered in principle. By humans, algorithms or a combination.


You do not know where to start? Why, that is easy: test! Modify ONE 
weight and study its effect on ONE aspect of human go theory, such as 
the occurrence (frequency) of independent life. No effect? Increase the 
modification, test a different weight, test a subset of adjacent weights, 
etc. It has been possible to study the semantics of parts of DNA, e.g., 
from differences related to illnesses. Modifying the weights is like 
creating causes for illnesses (or improved health).


There is no "we cannot do it", but maybe there is too much required 
effort for it to be financially worthwhile for the "too specialised" 
case of Go? As I say, a mathematical proof of a complete solution of Go 
will occur before an AI plays perfectly ;)



So far neural
nets have been trained and applied within single domains, and any
"generalization" means within that domain.


Yes.

--
robert jasiek

Re: [Computer-go] AlphaGo Zero

2017-10-20 Thread Álvaro Begué
When I did something like this for Spanish checkers (training a neural
network to be the evaluation function in an alpha-beta search, without any
human knowledge), I solved the problem of adding game variety by using UCT
for the opening moves. That means that I kept a tree structure with the
opening moves and I used the UCB1 formula to pick the next move as long as
the game was in the tree. Once outside the tree, I used alpha-beta search
to play a normal [very fast] game.

One important characteristic of this UCT opening-book builder is that the
last move inside the tree is basically random, so this explores a lot of
unbalanced positions.
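The in-tree move choice described here is plain UCB1; a minimal sketch (the exploration constant is an assumption):

```python
import math
import random

def ucb1_pick(node, c=1.4):
    """Choose the child maximizing UCB1. Unvisited children are taken
    first, which is why the last in-tree move is essentially random
    and the book ends up exploring unbalanced positions."""
    unvisited = [ch for ch in node.children if ch.visits == 0]
    if unvisited:
        return random.choice(unvisited)
    log_n = math.log(node.visits)
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
                              + c * math.sqrt(log_n / ch.visits))
```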

Álvaro.



On Fri, Oct 20, 2017 at 9:23 AM, Petr Baudis  wrote:

>   I tried to reimplement the system - in a simplified way [...]

Re: [Computer-go] AlphaGo Zero

2017-10-20 Thread Gian-Carlo Pascutto
On 19-10-17 13:23, Álvaro Begué wrote:
> Summing it all up, I get 22,837,864 parameters for the 20-block network
> and 46,461,544 parameters for the 40-block network.
> 
> Does this seem correct?

My Caffe model file is 185,887,898 bytes / 4 bytes per 32-bit float = 46,471,974 floats.

So yes, that seems pretty close. I'll send the model file and some
observations in a separate post.
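The byte arithmetic, for the record; attributing the small remainder to protobuf framing overhead is an assumption:

```python
file_bytes = 185_887_898
params = 46_461_544          # count reported for the 40-block network
weight_bytes = params * 4    # 32-bit floats
print(file_bytes - weight_bytes)  # → 41722 bytes of non-weight data
```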

-- 
GCP

[Computer-go] Subject: Re: AlphaGo Zero

2017-10-20 Thread Hendrik Baier
Where is the respect and the appreciation for other people's
groundbreaking work without immediately having to make the discussion
about your own research, or otherwise derailing it into the irrelevant
or fantastical?

Congratulations again to the AlphaGo team! Excellent work well
described, a pleasure to read.

Hendrik Baier


> So there is a superstrong neural net.
>
> 1) Where is the semantic translation of the neural net to human theory
> knowledge?
>
> 2) Where is the analysis of the neural net's errors in decision-making?
>
> 3) Where is the world-wide discussion preventing a combination of AI and
> (nano-)robots, which self-replicate or permanently ensure energy access,
> from causing extinction of mankind?
>
> --
> robert jasiek

Re: [Computer-go] AlphaGo Zero

2017-10-20 Thread Petr Baudis
  I tried to reimplement the system - in a simplified way, trying to
find the minimum that learns to play 5x5 in a few thousand self-play
games.  It turns out there are several components which are important
to avoid some obvious attractors (like the network predicting black
loses on every move from its second game on):

  - disabling resignation in a portion of games is essential not just
for tuning the resignation threshold (if you want to even do that), but
simply to correct the prediction signal by actual scoring rather than
starting to always resign early in the game

  - Dirichlet (or other) noise is essential to keep the network from
getting looped into the same game - which is also self-reinforcing

  - I have my doubts about the idea of high-temperature move choices
at the beginning, especially with T=1 ... maybe that's just bad
very early in the training
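The temperature scheme under discussion samples moves with probability proportional to N^(1/T); a sketch:

```python
import random

def pick_move(visit_counts, temperature=1.0):
    """Sample a move index with probability proportional to N^(1/T).
    T = 1 samples proportionally to visit counts; as T -> 0 this
    becomes a deterministic argmax, used after the opening moves."""
    if temperature < 1e-3:
        return max(range(len(visit_counts)),
                   key=visit_counts.__getitem__)
    weights = [n ** (1.0 / temperature) for n in visit_counts]
    return random.choices(range(len(weights)), weights=weights)[0]
```

The doubt above amounts to whether T=1 keeps too much randomness in the very first training generations, when the visit counts themselves are nearly uninformative.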

On Thu, Oct 19, 2017 at 02:23:41PM +0200, Petr Baudis wrote:
>   The order of magnitude matches my parameter numbers.  (My attempt to
> reproduce a simplified version of this is currently evolving at
> https://github.com/pasky/michi/tree/nnet but the code is a mess right
> now.)

-- 
Petr Baudis, Rossum
Run before you walk! Fly before you crawl! Keep moving forward!
If we fail, I'd rather fail really hugely.  -- Moist von Lipwig

Re: [Computer-go] AlphaGo Zero

2017-10-20 Thread Adrian . B . Robert
Robert Jasiek  writes:

> So there is a superstrong neural net.
>
> 1) Where is the semantic translation of the neural net to human theory
> knowledge?
>
> 2) Where is the analysis of the neural net's errors in decision-making?
>
> 3) Where is the world-wide discussion preventing a combination of AI
> and (nano-)robots, which self-replicate or permanently ensure energy
> access, from causing extinction of mankind?


As far as (1), if we could do it, it would mean we could relate the
structures embedded in the net's weight patterns to some other domain --
if nothing else, the domain of the meanings of words in some natural
language.  We cannot, and most certainly the net cannot.  So far neural
nets have been trained and applied within single domains, and any
"generalization" means within that domain.  A net may learn to recognize
and act similarly with respect to a certain eye pattern on different
parts of the board.  No one, as far as I know, has presented a net that
would be able to use a guideline like, "two eyes alive, one eye dead" to
help it speed learning of how to act on the board.  But a human can
apply "one"/"two", and "alive"/"dead" once it has been made clear that
"eye" in this context is standing for a recognizable structure of
same-color-surrounding-space, and thereby learn in one step what the net
learns in thousands of incremental iterations.

And (2) presupposes (1), since to understand why a situation was
mis-perceived or mis-acted upon requires some understanding of what
exactly the perception and judgment process was in the first place.

It may be that the recent successes in brute-force learning powered by
improved hardware together with improved crafting of the architecture
eventually play some role in understanding and recreating
"intelligence".  But so far, using the term "AI" in connection with 99%
of this kind of work is just hype.  Useful in accomplishing engineering
goals, yes.  But not so much to do with intelligence.


Re: [Computer-go] AlphaGo Zero

2017-10-20 Thread Dan Schmidt
On Fri, Oct 20, 2017 at 12:06 AM, Robert Jasiek  wrote:

>
> 3) Where is the world-wide discussion preventing a combination of AI and
> (nano-)robots, which self-replicate or permanently ensure energy access,
> from causing extinction of mankind?
>

You will find it if you Google for "artificial intelligence existential
threat". But the subject seems off-topic here.

Dan

Re: [Computer-go] AlphaGo Zero

2017-10-20 Thread Robert Jasiek

On 20.10.2017 09:38, Xavier Combelle wrote:

> What is currently called a nanorobot is simply hand-assembled molecules
> which have mechanical properties and need a huge framework simply to
> move.


Sure. But we must not wait until such a thing exists.

--
robert jasiek

Re: [Computer-go] AlphaGo Zero

2017-10-20 Thread Xavier Combelle
You seem to lack knowledge of what a nanorobot really is in current terms.

They are very far from being able to self-replicate, and much further
still from being able to dissolve the planet by doing so.

What is currently called a nanorobot is simply hand-assembled molecules
which have mechanical properties and need a huge framework simply to
move. So it is far from being a threat.


On 20/10/2017 at 08:33, Robert Jasiek wrote:
> On 20.10.2017 07:10, Petri Pitkanen wrote:
>>> 3) Where is the world-wide discussion preventing a combination of AI
>>> and (nano-)robots, which self-replicate or permanently ensure energy
>>> access, from causing extinction of mankind?
>> 3) Would it be a bad thing? All things considered, not just from a
>> human point of view.
>
> Have you realised the potential of one successful self-duplication of
> a nano-robot? Iterate, and the self-replicating nano-robots might
> dissolve the planet earth into elementary particles. Now discuss
> whether that might be good or bad. Not good for animals or plants, to
> start with.


Re: [Computer-go] AlphaGo Zero

2017-10-20 Thread Robert Jasiek

On 20.10.2017 07:10, Petri Pitkanen wrote:
>> 3) Where is the world-wide discussion preventing a combination of AI 
>> and (nano-)robots, which self-replicate or permanently ensure energy 
>> access, from causing extinction of mankind?

> 3) Would it be a bad thing? All things considered, not just from a
> human point of view.


Have you realised the potential of one successful self-duplication of a 
nano-robot? Iterate, and the self-replicating nano-robots might dissolve 
the planet earth into elementary particles. Now discuss whether that 
might be good or bad. Not good for animals or plants, to start with.
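For scale, the replication arithmetic can be made concrete with a
back-of-the-envelope sketch (my own numbers, not from the discussion:
the ~1e50 atom count for the Earth is a rough order-of-magnitude
assumption):

```python
import math

# How many doublings would unchecked self-replication need to reach
# planetary scale?  Earth contains very roughly 1e50 atoms (rough
# order-of-magnitude assumption for illustration only).
ATOMS_IN_EARTH = 1e50

doublings = math.ceil(math.log2(ATOMS_IN_EARTH))
print(doublings)  # 167 doublings from a single replicator
```

Under these assumptions, a few hundred doubling generations would
suffice, which is why exponential self-replication is the crux of the
scenario.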


--
robert jasiek

Re: [Computer-go] AlphaGo Zero

2017-10-20 Thread Petri Pitkanen
1) There is no such thing, and I doubt it ever will exist. Even humans
fail to elaborate on why they know certain things.
2) If we are talking about the new one: very few people have seen it
play, so I guess we lack the data. For the old one, we know it made
errors; I don't know whether analysis points to why. Neural nets tend to
be black boxes.
3) Would it be a bad thing? All things considered, not just from a human
point of view.

2017-10-20 7:06 GMT+03:00 Robert Jasiek :

> So there is a superstrong neural net.
>
> 1) Where is the semantic translation of the neural net to human theory
> knowledge?
>
> 2) Where is the analysis of the neural net's errors in decision-making?
>
> 3) Where is the world-wide discussion preventing a combination of AI and
> (nano-)robots, which self-replicate or permanently ensure energy access,
> from causing extinction of mankind?
>
> --
> robert jasiek
>