"upgrade" your existing supervised learning approach

Wayne Joseph Thu, 05 Dec 2019 10:08:48 -0800

>> Why not try to "upgrade" your existing supervised learning approach?


Yes! This is something I really need to do!

Is this the right place to make a feature-request?

** I think it would be really useful if GNUBG could automagically save
screenshots of all of my errors worse than, say, -0.04 in a folder **

That's it! (..for now - could be more sophisticated, e.g. automatic position
categorisation into sub-folders: racing cube errors, PNPL, make 5 point or
not, break prime, trap plays etc. etc., but a good place to start ;-)

Is this stuff of dreams even possible in 2019 A.D.?

Might other bg students find such a personalised blunder folder useful?

Thanks if you can help!

Wayne

-- Sent from my Android phone

On Thu, 5 Dec 2019, 5:00 pm , <[email protected]> wrote:

>
> Today's Topics:
>
>    1. Re: current development (Øystein Schønning-Johansen)
>    2. Re: current development (Joseph Heled)
>    3. Re: current development (Myshkin LeVine)
>    4. Re: current development (Joseph Heled)
>    5. Re: current development (Joseph Heled)
>    6. Re: current development (Philippe Michel)
>    7. Re: current development (Philippe Michel)
>    8. Re: current development (Joseph Heled)
>    9. Re: current development (Russ Allbery)
>   10. Re: current development (Joseph Heled)
>   11. Re: current development (Philippe Michel)
>   12. Re: current development (Joseph Heled)
>   13. Re: current development (Joseph Heled)
>   14. Re: Alphazero / Deepmind backgammon project (Wayne Joseph)
>   15. Re: Alphazero / Deepmind backgammon project (Joseph Heled)
>   16. Re: current development (Nikos Papachristou)
>   17. Re: current development (Øystein Schønning-Johansen)
>   18. Re: current development (Timothy Y. Chow)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 4 Dec 2019 21:23:09 +0100
> From: Øystein Schønning-Johansen <[email protected]>
> To: Joseph Heled <[email protected]>
> Cc: Ralph Corderoy <[email protected]>, "[email protected]"
>         <[email protected]>
> Subject: Re: current development
> Message-ID:
>         <
> caozpfnrhzsazdvnqor5_vd845v5zre-rjxhdhqootqgm80j...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> But let's chat about the idea instead. What will it actually mean to 'apply
> "AlphaZero methods" to backgammon.' ?
>
> AlphaZero (and AlphaGo and Lc0 and SugaR NN) is just more or less the same
> thing as reinforcement learning in backgammon. So, from my understanding,
> it is rather AlphaZero that has applied the backgammon methods. Both the
> chess and go variants are trained with reinforcement learning, pretty
> much like the original GNU Backgammon, Jellyfish and Snowie. In Go they had
> to make a move selection subroutine based on human play and then add MCTS
> to train. Also the neural networks are deeper and more complex. The nn
> input features are also more complex and can to some extent resemble the
> convolutions known from convolutional neural networks (and the inputs
> are not properly described in the high-level articles).
>
> Apart from that, it is actually the same thing: reinforcement learning.
>
> But how can we improve? We believe (at least I do) that the current
> backgammon bots are so strong that they play close to perfectly in standard
> positions. It is in uncommon and long-term plan positions (like deep
> backgames and snake rolling prime positions) that bots can still improve.
> Let me throw some ideas up in the air for discussion:
>
> Can we make an RL algorithm that is so fast that it can learn on the fly?
> Say that during play we find a position where some indicator (that may be
> another challenge) indicates that this is a position that requires long
> term planning. If we then have the ability to RL train a neural net for
> that specific position, that could be a huge improvement in my opinion.
> (Lots of details missing.)
>
> And then, could the evaluations be improved if we specialize neural
> networks into specific position types, and then make a kind of nn
> selection system based on k-means of the input features? I tried that many
> years ago with only four classes. Those experiments showed that it's not a
> hopeless approach, and with faster computers one can easily create many more
> than just four classes (four was only the first number that popped into my
> head in those days).
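
A minimal sketch of the k-means net-selection idea described above, not the
code from those experiments. It assumes each position is already encoded as a
numeric feature vector; the function names, the farthest-first initialisation
and the stand-in "nets" (plain callables) are all hypothetical choices made
for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, features):
    """Index of the centroid closest to the given feature vector."""
    return min(range(len(centroids)),
               key=lambda k: sq_dist(centroids[k], features))


def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm) with farthest-first initialisation."""
    # Assumption: deterministic farthest-first start instead of random seeding.
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(c, p) for c in centroids))
        centroids.append(list(far))
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(centroids, p)].append(p)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def evaluate(centroids, nets, features):
    """Route a position to the net that owns its cluster, then evaluate it."""
    return nets[nearest(centroids, features)](features)
```

With k = 4 this would correspond to the four classes mentioned above; each
entry of `nets` would be a network trained only on the positions that fall
into its cluster.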
>
> Then the next idea: What about huge-scale distributed rollouts? Maybe we
> could have a system like BOINC to do rollouts on the fly? I'm not sure how
> this should be used in a practical sense, and I'm not sure how hard it
> would be to implement (with or without the BOINC framework) but I'm just
> kind of brainstorming here.
>
> -Øystein
>
>
> On Wed, Dec 4, 2019 at 6:47 PM Joseph Heled <[email protected]> wrote:
>
> > I was intentionally rude because I thought his original post was
> > inappropriate.
> >
> > -Joseph
> >
> > On Thu, 5 Dec 2019 at 06:42, Ralph Corderoy <[email protected]>
> wrote:
> > >
> > > Hi Joseph,
> > >
> > > > I thought so.
> > > >
> > > > I had the same idea the day I heard they cracked go, but just saying
> > > > something is a good idea is not helpful at all in my book.
> > >
> > > I think you're wrong.  And also a bit rude to boot.
> > >
> > > It's fine for Tim to suggest or ponder an idea to the list.  It may
> > > encourage another subscriber, or draw out news of what a lurker has
> been
> > > working on that's related.
> > >
> > > --
> > > Cheers, Ralph.
> > >
> >
> >
>
> ------------------------------
>
> Message: 2
> Date: Thu, 5 Dec 2019 09:34:45 +1300
> From: Joseph Heled <[email protected]>
> To: Øystein Schønning-Johansen <[email protected]>
> Cc: Ralph Corderoy <[email protected]>, "[email protected]"
>         <[email protected]>
> Subject: Re: current development
> Message-ID:
>         <CAG8x8-0mzJFO=_
> [email protected]>
> Content-Type: text/plain; charset="utf-8"
>
> The main difference, if I understand correctly (and I know very little
> here) is to bootstrap from the ground up. That is, no pre-computed inputs,
> and let the network figure it out by self-play.
>
> We have a great test case in that we can start with just racing.
>
> That said, I think we will need a net for each match score, since cubeless
> -> cubeful is where things get messy.
>
> Also, given that 0-ply rollouts are relatively fast, when playing against a
> human - if you can wait a second or two, you can play using cubeful 0-ply.
> Testing how good this is will be problematic.
>
> -Joseph
>
>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 4 Dec 2019 20:40:56 +0000
> From: Myshkin LeVine <[email protected]>
> To: Superfly Jon <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Subject: Re: current development
> Message-ID:
>         <
> ch2pr15mb3688dc8677ab1e126abf471ebf...@ch2pr15mb3688.namprd15.prod.outlook.com
> >
>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi Jon,
>            Perhaps you could also include build instructions for the Mac
> in addition to Linux and Windows. I figured out how to fulfill the cglm
> dependency, but I think many Mac users could not. I do not see cglm in
> MacPorts, but there is a formula for those Mac users who like Homebrew.
> The 3D board seems to be working fine for me, with the exception that
> clicking on the dice to end my turn has no effect, and I must instead
> use Control-F.
>
> Myshkin LeVine
>
>
> On Dec 4, 2019, at 1:56 PM, Superfly Jon wrote:
>
> > My changes are still under development, the dependency with cglm will be
> resolved in due course.  I'll update the "INSTALL" file with more detailed
> how-to build instructions for linux and windows shortly.
> >
> > Jon
> >
>
>
>
>
> ------------------------------
>
> Message: 4
> Date: Thu, 5 Dec 2019 09:40:48 +1300
> From: Joseph Heled <[email protected]>
> To: Ingo Macherius <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Subject: Re: current development
> Message-ID:
>         <CAG8x8-1ULLS2164rwH8pSGBKFF=
> [email protected]>
> Content-Type: text/plain; charset="utf-8"
>
> What is the matter with you people? bug-gnubg is synonymous with dev-gnubg.
>
> Perhaps it is time for me to stop being involved with GNUBG. Does not seem
> people here are interested in doing anything constructive. If this is not
> the case, prove me wrong.
>
> -Joseph
>
> On Thu, 5 Dec 2019 at 08:57, Ingo Macherius <[email protected]> wrote:
>
> > This is not a bug. While individual members of the gnubg team always had
> > setups for dependency management under various IDEs and OSes, there
> > never was a satisfying solution in the repository. There was a Linux
> > package management HOWTO, but it went away with the rest of the web
> > pages. And no, automake and its cryptic error messages do not really
> > qualify as a dependency management system. I'm a Java guy and used to
> > tool-based solutions such as maven or gradle. Picking and adding
> > something similar suitable for C, ideally something which works on all
> > major OSes, would greatly reduce the confusion you and everybody else
> > starting to work with the code has to go through.
> >
> > Ingo
> >
> > Am 04.12.19 um 19:00 schrieb Joseph Heled:
> > > And good riddance. This list is called bug-gnubg.
> > >
> > > -Joseph
> > >
> > > On Thu, 5 Dec 2019 at 06:59, Ralph Corderoy <[email protected]>
> > wrote:
> > >> Hi Joseph,
> > >>
> > >>> I was intentionally rude because I thought his original post was
> > >>> inappropriate.
> > >> How childish.  We put up with your many posts detailing your failure to
> > >> compile from source when a quick Google would have led to the method of
> > >> using the package manager to install the build dependencies.  No one was
> > >> rude.  You got civil help.
> > >>
> > >> I've no time for such antics.  This list has always been polite in my
> > >> experience.  I'm unsubscribing.
> > >>
> > >> --
> > >> Cheers, Ralph.
> > >>
> >
> >
>
> ------------------------------
>
> Message: 5
> Date: Thu, 5 Dec 2019 09:48:19 +1300
> From: Joseph Heled <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Subject: Re: current development
> Message-ID:
>         <
> cag8x8-17jj2g3n0x0-ehnnb1chch4n-uekmghrkyuoe+x5m...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Ralph was not cc'ed
>
> Subject: Re: current development
> From: Joseph Heled <[email protected]>
> To: Ingo Macherius <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Content-Type: multipart/alternative;
> boundary="000000000000e96d5e0598e6d365"
>
> On Thu, 5 Dec 2019 at 09:44, Ralph Corderoy <[email protected]> wrote:
>
> > Joseph, please desist from CC-ing me.
> >
> > --
> > Cheers, Ralph.
> >
>
> ------------------------------
>
> Message: 6
> Date: Wed, 4 Dec 2019 22:14:48 +0100
> From: Philippe Michel <[email protected]>
> To: Russ Allbery <[email protected]>
> Cc: Joseph Heled <[email protected]>,    "[email protected]"
>         <[email protected]>
> Subject: Re: current development
> Message-ID: <20191204211448.GA8359@genesis>
> Content-Type: text/plain; charset=us-ascii
>
> On Tue, Dec 03, 2019 at 03:51:12PM -0800, Russ Allbery wrote:
> >
> > Philippe Michel <[email protected]> writes:
> >
> > > Reasonably recent versions of gcc and clang have a feature (ifuncs) that
> > > should allow doing this in one single binary. I don't know how onerous
> > > it would be at the package building stage, but I think a few parts of
> > > Linux, for instance glibc, use that feature, so at least it wouldn't be
> > > unknown territory.
> >
> > Oh, interesting.  Is this something that I can just enable with a compiler
> > flag, or does it need code support in gnubg?
>
> This would be mostly in gnubg's code. Maybe something would be needed at
> configure stage as well.
>
> The post below shows a minimal example of how this is used:
> https://gcc.gnu.org/ml/gcc-help/2012-03/msg00209.html
>
>
>
> ------------------------------
>
> Message: 7
> Date: Wed, 4 Dec 2019 22:28:03 +0100
> From: Philippe Michel <[email protected]>
> To: Joseph Heled <[email protected]>
> Cc: Russ Allbery <[email protected]>, "[email protected]"
>         <[email protected]>
> Subject: Re: current development
> Message-ID: <20191204212803.GB8359@genesis>
> Content-Type: text/plain; charset=us-ascii
>
> On Wed, Dec 04, 2019 at 01:21:06PM +1300, Joseph Heled wrote:
>
> > Is that the right way to specify both?
> >
> > ./configure --enable-simd=avx --enable-simd=sse2
>
> It wasn't expected to specify both :-).
>
> I just checked what it does: the second option overrides the first one,
> so your example doesn't do what you hoped.
>
> Just use --enable-simd=yes, it will use avx if your computer supports it
> (plus some sse in places where there is no avx implementation), else
> sse2.
>
>
>
> ------------------------------
>
> Message: 8
> Date: Thu, 5 Dec 2019 10:31:44 +1300
> From: Joseph Heled <[email protected]>
> To: Philippe Michel <[email protected]>
> Cc: Russ Allbery <[email protected]>, "[email protected]"
>         <[email protected]>
> Subject: Re: current development
> Message-ID:
>         <
> cag8x8-3by+qvggpa-s5usavrq098v-9dtfqz51nbrce4vhs...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Thanks. will recompile. Also the other non-sse options should go somewhere
> else, but I don't know where they should go in the debian build process.
>
>
> ------------------------------
>
> Message: 9
> Date: Wed, 04 Dec 2019 13:36:15 -0800
> From: Russ Allbery <[email protected]>
> To: [email protected]
> Subject: Re: current development
> Message-ID: <[email protected]>
> Content-Type: text/plain
>
> Joseph Heled <[email protected]> writes:
>
> > Thanks. will recompile. Also the other non-sse options should go
> > somewhere else, but I don't know where they should go in the debian
> > build process.
>
> The override_dh_auto_configure target includes the invocation of
> ./configure.
>
> --
> Russ Allbery ([email protected])             <https://www.eyrie.org/~eagle/>
>
>
>
> ------------------------------
>
> Message: 10
> Date: Thu, 5 Dec 2019 10:42:35 +1300
> From: Joseph Heled <[email protected]>
> To: "[email protected]" <[email protected]>
> Subject: Re: current development
> Message-ID:
>         <
> cag8x8-1koju_epyecnd4jqnx7ysgeub4_fzvtzjjnye4vur...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> On Wed, 4 Dec 2019 at 20:59, Joseph Heled <[email protected]> wrote:
>
> > An 8 "core" machine, i.e. fake intel count number
> >
> > $ grep -m1 '^model name' /proc/cpuinfo
> > model name      : Intel(R) Core(TM) i7-4810MQ CPU @ 2.80GHz
> >
> > in debian rules file:
> > SSE = --enable-simd=avx --enable-simd=sse2 --enable-threads -with-gtk
> > --with-board3d --with-python
> > compiled with gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0  -O3
> >
> > position id: 4NvBGECYr8ELAA:MBngAAAAIAAE
> >
> > rollout cube action:
> >
> >
> So make this sse2.
>
>
> > 8 threads: 93 seconds
> > 4 threads: 100 seconds    On stock debian: 111 seconds
> > 2 threads: 172 seconds
> > 1 thread:   312
> >
>
> with avx,  8 threads: 81 seconds and 99 seconds with 4 threads, so maybe a
> small improvement, maybe not.
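
To make the scaling in these timings easier to read, here is a small sketch
that turns them into speedup and parallel efficiency. The numbers are the
sse2-build figures quoted above; treating the unlabelled "312" as seconds is
an assumption, and the function name is made up for illustration.

```python
# Wall-clock seconds for the sse2 build, keyed by thread count
# (assumption: the 1-thread figure "312" is also in seconds).
timings = {1: 312, 2: 172, 4: 100, 8: 93}


def scaling(timings):
    """Map thread count -> (speedup vs. 1 thread, efficiency per thread)."""
    base = timings[1]
    return {n: (base / t, base / t / n) for n, t in sorted(timings.items())}


for n, (speedup, efficiency) in scaling(timings).items():
    print(f"{n} threads: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

Going from 4 to 8 threads only moves the speedup from 3.12x to about 3.35x,
which fits the remark that the 8 "cores" on this CPU are not all real ones.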
>
>
> > -Joseph
> >
> >
> > On Wed, 4 Dec 2019 at 20:29, Ralph Corderoy <[email protected]>
> wrote:
> > >
> > > Hi Joseph,
> > >
> > > > What we really need is someone with access to some computing power
> > > > (aka grid) to run a set of reference positions - 0-ply cube decisions
> > > > vs 2-ply, and see what the difference is. That would give a hint as to
> > > > what to do.
> > >
> > > How about reporting your
> > >
> > >     grep -m1 '^model name' /proc/cpuinfo
> > >
> > > along with the stock Ubuntu package version's time on a reference
> > > position when given 1, 2, ... threads.  And then your self-compiled
> > > version for comparison, noting what you changed in debian/rules.
> > >
> > > It would be a start, and also offer some precision so if something is
> > > awry then others on the list may have data to judge by.
> > >
> > > --
> > > Cheers, Ralph.
> > >
> >
>
> ------------------------------
>
> Message: 11
> Date: Wed, 4 Dec 2019 23:04:00 +0100
> From: Philippe Michel <[email protected]>
> To: [email protected]
> Cc: "[email protected]" <[email protected]>
> Subject: Re: current development
> Message-ID: <20191204220400.GA44635@genesis>
> Content-Type: text/plain; charset=us-ascii
>
> On Wed, Dec 04, 2019 at 02:07:18PM -0500, Timothy Y. Chow wrote:
>
> > Also, it's my impression that many people *don't* think this is even a
> > worthwhile idea to pursue.  Backgammon is already "solved," is what they
> > will say.  It's true that "AlphaGammon" will surely not crush existing
> > bots in a series of (say) 11-point matches.  At most I would expect a
> > slight advantage.  But to me, that is the wrong way to look at the issue.
> > I would like to understand superbackgames for their own sake, even though
> > they arise rarely in practice.  Furthermore, if we know that bots don't
> > understand superbackgames, then the closer a position gets to being a
> > superbackgame, the less we can trust the bot verdict.
>
> I'm not sure how related it may be, but there is a group of Greek
> academics that have published some articles on their work on a bot,
> Palamedes, that plays backgammon but also variants that have different
> rules and starting positions and lead to positions that would be very
> uncommon in backgammon.
>
>
>
>
>
> ------------------------------
>
> Message: 12
> Date: Thu, 5 Dec 2019 11:12:39 +1300
> From: Joseph Heled <[email protected]>
> To: Philippe Michel <[email protected]>
> Cc: [email protected], "[email protected]" <[email protected]>
> Subject: Re: current development
> Message-ID:
>         <
> cag8x8-1pgzx4z_puvq2ggnkdrg1spewk79dvjee0tviavf0...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> A link to something? article? software? did they use alpha-like strategies?
>
> -Joseph
>
>
> ------------------------------
>
> Message: 13
> Date: Thu, 5 Dec 2019 11:23:04 +1300
> From: Joseph Heled <[email protected]>
> To: Philippe Michel <[email protected]>
> Cc: [email protected], "[email protected]" <[email protected]>
> Subject: Re: current development
> Message-ID:
>         <
> cag8x8-0+kbwyfq4sws_wjhpys7zond98+_7tsuau81esytz...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> I googled and found this:
>
>    https://hal.inria.fr/hal-01521393/document
>
> Seems very much like GNUBG, only a smaller net. No way to tell how it
> compares to (say) GNUBG.
>
> -Joseph
>
>
> ------------------------------
>
> Message: 14
> Date: Thu, 5 Dec 2019 06:35:18 +0000
> From: Wayne Joseph <[email protected]>
> To: [email protected]
> Subject: Re: Alphazero / Deepmind backgammon project
> Message-ID:
>         <CAH=
> [email protected]>
> Content-Type: text/plain; charset="utf-8"
>
> Hi Tim / Hi all,
>
> It might be worth reaching out to Jens Averkamp who I believe was in
> contact with a Dev team working this avenue.
>
> I also tried to get in touch with Demis, CEO of Deepmind (who almost
> certainly can play bg) a while ago, but I don't think my message completed
> its intended journey to him (via his P.A).
>
> After seeing what Deepmind has done to publicize Go and StarCraft, I was
> hoping the same might be possible for backgammon. Does anybody else fancy
> seeing Mochy beat Deepmind? ;)
>
> Good luck!
>
> -- Sent from my Android phone
>
> ------------------------------
>
> Message: 15
> Date: Thu, 5 Dec 2019 19:44:51 +1300
> From: Joseph Heled <[email protected]>
> To: Wayne Joseph <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Subject: Re: Alphazero / Deepmind backgammon project
> Message-ID:
>         <CAG8x8-3GQqDSe=
> [email protected]>
> Content-Type: text/plain; charset="utf-8"
>
> Sounds good, Wayne!!
>
> Personal opinion: Mochy will be lucky to win one match, à la Lee Sedol, if
> matches are long enough :)
>
> -Joseph
>
>
>
> ------------------------------
>
> Message: 16
> Date: Thu, 5 Dec 2019 13:30:17 +0200
> From: Nikos Papachristou <[email protected]>
> To: Joseph Heled <[email protected]>
> Cc: Philippe Michel <[email protected]>, [email protected],
>         "[email protected]" <[email protected]>
> Subject: Re: current development
> Message-ID:
>         <CAPF31MRT=
> [email protected]>
> Content-Type: text/plain; charset="utf-8"
>
> Hi everybody!
>
> You can view my research publications on backgammon variants at my website:
> https://nikpapa.com , or alternatively you can download my PhD thesis
> from:
> https://www.didaktorika.gr/eadd/handle/10442/43622?locale=en
>
> My personal view on improving GNUBG: Why not try to "upgrade" your existing
> supervised learning approach? There have been lots of advances in
> optimization/regularization algorithms for neural networks in the past
> years and it might be less demanding than trying a new RL self-play
> approach from scratch.
>
> Regarding expected results, I also believe that backgammon bots are very
> close to perfection and whatever improvements (from any approach) will be
> marginal.
>
>
>
>
> ------------------------------
>
> Message: 17
> Date: Thu, 5 Dec 2019 13:28:04 +0100
> From: Øystein Schønning-Johansen <[email protected]>
> To: Nikos Papachristou <[email protected]>
> Cc: Joseph Heled <[email protected]>, [email protected],
>         "[email protected]" <[email protected]>
> Subject: Re: current development
> Message-ID:
>         <CAOzpFnRQj7f7qpomYD9dNM3QPkjhHqLAKiQvtwQVocXn=
> [email protected]>
> Content-Type: text/plain; charset="utf-8"
>
> I have tried some experiments, and it looks like the training dataset (for
> contact positions), with the current input features, does indeed benefit
> from some of the more modern methods. Briefly summarized:
>
> Things that improve supervised learning on the dataset:
> * Deeper nets, 5-6 hidden layers combined with ReLU activation functions.
> * Adam (and AdamW) optimizer.
> * A tiny bit of weight decay.
> * Mini-batch training.
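
As a rough illustration of three items in the list above (AdamW, a little weight decay, mini-batches), here is a minimal pure-Python sketch. It is not the code used in these experiments: the linear model, the toy data, and every hyperparameter value are assumptions chosen only to keep the example short, and a real contact net would be a multi-layer ReLU network trained on rollout data.

```python
import math
import random


def adamw_step(w, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-4):
    """Apply one AdamW update in place; t is the 1-based step count."""
    for i in range(len(w)):
        m[i] = beta1 * m[i] + (1 - beta1) * grad[i]
        v[i] = beta2 * v[i] + (1 - beta2) * grad[i] ** 2
        m_hat = m[i] / (1 - beta1 ** t)   # bias-corrected first moment
        v_hat = v[i] / (1 - beta2 ** t)   # bias-corrected second moment
        # Decoupled weight decay: shrink the weight directly instead of
        # adding an L2 term to the gradient.
        w[i] -= lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * w[i])


def train(data, n_inputs, epochs=200, batch_size=8, seed=0):
    """Fit y ~ w.x with mini-batch AdamW on mean squared error."""
    rng = random.Random(seed)
    w = [0.0] * n_inputs
    m = [0.0] * n_inputs
    v = [0.0] * n_inputs
    t = 0
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)                      # new mini-batches each epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            grad = [0.0] * n_inputs
            for x, y in batch:
                err = sum(wi * xi for wi, xi in zip(w, x)) - y
                for i in range(n_inputs):
                    grad[i] += 2 * err * x[i] / len(batch)
            t += 1
            adamw_step(w, grad, m, v, t)
    return w


# Toy data (assumed): y = 2*x0 - x1, standing in for
# (input features, rollout equity) pairs.
toy_rng = random.Random(1)
toy = []
for _ in range(64):
    x = [toy_rng.uniform(-1, 1), toy_rng.uniform(-1, 1)]
    toy.append((x, 2 * x[0] - x[1]))

weights = train(toy, n_inputs=2)
print(weights)  # should end up close to [2, -1]
```

The point of the decoupled decay term is that the shrinkage is applied to the weight itself and is not rescaled by the adaptive denominator, which is the difference between AdamW and Adam with an L2 penalty.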
>
> Things that do not work:
> * Dropout.
> * PCA of inputs.
> * RMSProp optimizer (About the same performance as SGD).
>
> I've tried training with Keras and on GPUs, and the training is really
> fast. However, a plain CPU implementation of modern neural network training
> algorithms is actually not much slower for me. Also, porting GPU code over
> into the GNU Backgammon application might not be faster, as a lot of cycles
> would be spent shuffling data back and forth between main memory and GPU
> memory.
>
> So the process I ended up using was:
> 1. Test out what works with Keras+GPU.
> 2. Implement that working method in C code for CPU.
> 3. Train the NN with that code.
>
> I've only worked with the contact neural network, as I see some strange
> issues with the race dataset, and I think it requires a re-rollout.
>
> -Øystein
>
>
> ------------------------------
>
> Message: 18
> Date: Thu, 5 Dec 2019 11:32:00 -0500 (EST)
> From: "Timothy Y. Chow" <[email protected]>
> To: "[email protected]" <[email protected]>
> Subject: Re: current development
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=US-ASCII; format=flowed
>
> On Thu, 5 Dec 2019, Nikos Papachristou wrote:
> > My personal view on improving GNUBG: Why not try to "upgrade" your
> > existing supervised learning approach? There have been lots of advances
> > in optimization/regularization algorithms for neural networks in the
> > past years and it might be less demanding than trying a new RL self-play
> > approach from scratch.
> >
> > Regarding expected results, I also believe that backgammon bots are very
> > close to perfection and whatever improvements (from any approach) will
> > be marginal.
>
> In order to determine whether a new network is doing better than the old
> network, it helps to have examples of positions where the old network is
> clearly playing poorly.  Here's one example of a game that I played
> against eXtreme Gammon where the bot made a lot of obvious blunders:
>
> http://timothychow.net/cg/Games/7pt2015-05-24e%20Game%202.htm
>
> For example, search for "10/8 6/4(3)".  The bot's ridiculous play here
> would not be among the top 50 plays of any halfway decent human player.
> Admittedly this was XG but I would expect GNU to behave similarly, if not
> in these specific positions then in similar ones.
>
> Playing around with positions like this will quickly disabuse anyone of
> the illusion that "backgammon bots are very close to perfection."
>
> As I recall, in the past, people have tried specifically training neural
> nets on positions like these, as well as "snake" positions where you have
> to roll a prime for a long distance, and the problem was that it seemed to
> degrade performance on other types of positions.  It's possible that, as
> Papachristou suggests, recent incremental improvements in regularization
> algorithms might be good enough to overcome these difficulties.  Anecdotal
> evidence from Robert Wachtel's revised version of "In the Game Until the
> End" suggests that Xavier was able to improve eXtreme Gammon's post-coup
> classique play significantly, without a wholesale switch to modern deep
> learning methods.
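
One way to check for the kind of degradation described above is to score the old and new networks per position class rather than with a single overall number. The sketch below is only an assumption about how such a check might look: the category names, position ids, reference equities, and the lookup-table "evaluators" are all made up for illustration and do not come from GNU Backgammon or eXtreme Gammon.

```python
from collections import defaultdict


def mean_abs_error_by_category(benchmark, evaluate):
    """Return {category: mean |evaluate(position) - reference equity|}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for category, position, reference in benchmark:
        totals[category] += abs(evaluate(position) - reference)
        counts[category] += 1
    return {c: totals[c] / counts[c] for c in totals}


def regressions(benchmark, old_eval, new_eval, tolerance=0.0):
    """List categories where the new evaluator is worse than the old one."""
    old_err = mean_abs_error_by_category(benchmark, old_eval)
    new_err = mean_abs_error_by_category(benchmark, new_eval)
    return sorted(c for c in old_err if new_err[c] > old_err[c] + tolerance)


# Hypothetical benchmark rows: (category, position id, reference equity).
benchmark = [
    ("snake", "p1", 0.20), ("snake", "p2", -0.10),
    ("race", "p3", 0.50), ("race", "p4", 0.30),
]

# Stand-in evaluators backed by lookup tables instead of real networks.
old_net = {"p1": 0.60, "p2": 0.30, "p3": 0.50, "p4": 0.31}.get
new_net = {"p1": 0.25, "p2": -0.05, "p3": 0.45, "p4": 0.36}.get

worse = regressions(benchmark, old_net, new_net)
print(worse)  # categories where the new net lost accuracy
```

In this made-up data the new net improves a lot on the "snake" positions but gets slightly worse on "race", which is the pattern a single aggregate error figure would hide.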
>
> Tim
>
>
>
> ------------------------------
>
>
> End of Bug-gnubg Digest, Vol 201, Issue 5
> *****************************************
>