It's nearly comic to imagine a player opening at 1-1 trying to figure things out.

It's not a diss on you; I honestly want people to relax, take a minute,
and treat badmouthing the AlphaGo team's ideas as a secondary
consideration. They did good work. Arguing about the essentials probably
won't prove that they're stupid in any way. So let's learn, move forward,
and have no bad words about their ridiculously well-funded effort.

Recreating their work at a smaller scale would be awesome.

s.

On Nov 29, 2017 4:33 PM, "Eric Boesch" <ericboe...@gmail.com> wrote:

> Could you be reading too much into my comment? AlphaGo Zero is an amazing
> achievement, and I might guess its programmers will succeed in applying
> their methods to other fields. Nonetheless, I thought it was interesting,
> and it would appear the programmers did too, that before improving to
> superhuman level, AlphaGo was temporarily stuck in a rut of playing
> literally the worst first move on the board (excluding pass). That doesn't
> mean I think I could do better.
>
>
> On Tue, Nov 28, 2017 at 4:50 AM, uurtamo . <uurt...@gmail.com> wrote:
>
>> This is starting to feel like a question along the lines of, "how can I
>> explain this to myself, or improve on what's already been done, in a way
>> that will make this whole process work faster on my hardware?"
>>
>> It really doesn't look like there are a bunch of obvious shortcuts.
>> That's the whole lesson of the decision trees humans imposed on the game
>> for 20+ years; they weren't really better.
>>
>> Probably the best way to convince oneself of these things would be to
>> challenge each assumption in divergent branches (as suggested earlier) and
>> watch the resulting players' strength over time. Yes, this might take a
>> year or more on your hardware.
>>
>> I feel like maybe a lot of this is sour grapes; let's please again
>> acknowledge that the hobbyists aren't there yet without trying to tear down
>> the accomplishments of others.
>>
>> s.
>>
>> On Nov 27, 2017 7:36 PM, "Eric Boesch" <ericboe...@gmail.com> wrote:
>>
>>> I imagine implementation determines whether transferred knowledge is
>>> helpful. It's like asking whether forgetting is a problem -- it often is,
>>> but evidently not for AlphaGo Zero.
>>>
>>> One crude way to encourage stability is to include an explicit or
>>> implicit age parameter that forces the program to perform smaller
>>> modifications to its state during later stages. If the parameters you copy
>>> from problem A to problem B also include that age parameter, so the network
>>> acts old even though it is faced with a new problem, then its initial
>>> exploration may be inefficient. For an MCTS-based example, if an MCTS node
>>> is initialized to a 10877-6771 win/loss record based on evaluations under
>>> slightly different game rules, then with a naive implementation, even if
>>> the program discovers the right refutation under the new rules right away,
>>> it would still need to revisit that node thousands of times to convince
>>> itself the node is now probably a losing position.
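>>>
>>> To put a rough number on that, here is a minimal sketch of the naive case
>>> (purely illustrative, not anyone's actual implementation; the function and
>>> its name are made up): treat the carried-over 10877-6771 record as ordinary
>>> visit counts with a plain wins/visits value estimate, and count how many
>>> all-loss playouts it takes before the node's mean value even drops below 50%.
>>>
>>>     # Hypothetical illustration, assuming a simple wins/visits node value;
>>>     # not taken from any published implementation.
>>>     def playouts_to_flip(prior_wins, prior_losses, threshold=0.5):
>>>         """Extra all-loss playouts needed before wins/visits < threshold."""
>>>         wins = prior_wins
>>>         visits = prior_wins + prior_losses
>>>         extra = 0
>>>         while wins / visits >= threshold:
>>>             visits += 1   # each new playout under the new rules is a loss
>>>             extra += 1
>>>         return extra
>>>
>>>     print(playouts_to_flip(10877, 6771))  # -> 4107, just to reach 50%
>>>
>>> So even when every new playout refutes the move, thousands of visits go by
>>> before the node merely looks even, let alone clearly losing.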
>>>
>>> But unlearning bad plans in a reasonable time frame is already a feature
>>> you need from a good learning algorithm. Even AlphaGo almost fell into trap
>>> states; from their paper, it appears that it stuck with 1-1 as an opening
>>> move for much longer than you would expect from a program probably already
>>> much better than 40 kyu. Even if it's unrealistic for Go specifically, you
>>> could imagine some other game where after days of analysis, the program
>>> suddenly discovers a reliable trick that adds one point for white to every
>>> single game. The effect would be the same as your komi change -- a mature
>>> network now needs to adapt to a general shift in the final score. So the
>>> task of adapting to handle similar games may be similar to the task of
>>> adapting to analysis reversals within a single game, and improvements to
>>> one could lead to improvements to the other.
>>>
>>>
>>>
>>> On Fri, Nov 24, 2017 at 7:54 AM, Stephan K <stephan.ku...@gmail.com>
>>> wrote:
>>>
>>>> 2017-11-21 23:27 UTC+01:00, "Ingo Althöfer" <3-hirn-ver...@gmx.de>:
>>>> > My understanding is that the AlphaGo hardware is standing
>>>> > somewhere in London, idle and waiting for new action...
>>>> >
>>>> > Ingo.
>>>>
>>>> The announcement at
>>>> https://deepmind.com/blog/applying-machine-learning-mammography/ seems
>>>> to disagree:
>>>>
>>>> "Our partners in this project wanted researchers at both DeepMind and
>>>> Google involved in this research so that the project could take
>>>> advantage of the AI expertise in both teams, as well as Google’s
>>>> supercomputing infrastructure - widely regarded as one of the best in
>>>> the world, and the same global infrastructure that powered DeepMind’s
>>>> victory over the world champion at the ancient game of Go."
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
