Re: [Computer-go] C++11; threads

2014-06-19 Thread Chun Sun
Thanks David,

I ask this because I can only get 2k/s on 19x19 with 200 moves per playout,
single thread, no playout termination check, and I saw people on this
forum usually have much better numbers.

Looks like I need to work on my performance more :)

Thanks,
Chun

From the empty board until the playout terminates.  I think for 9x9 it’s
typically around 110 moves.



David



*From:* computer-go-boun...@dvandva.org [mailto:
computer-go-boun...@dvandva.org] *On Behalf Of *Chun Sun
*Sent:* Wednesday, June 18, 2014 8:21 AM
*To:* computer-go@dvandva.org
*Subject:* Re: [Computer-go] C++11; threads



Hi all,



Sorry to ask this beginner question in this thread:



When you say playouts per second, how many moves does each playout have
on average? Do you play from empty until the board is full for each playout?



Thank you,

Chun



___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

Re: [Computer-go] C++11; threads

2014-06-19 Thread Erik van der Werf
2k/s doesn't sound too bad (if it's a rather heavy playout policy),
but I would expect a lot more than 200 moves for 19x19. On my phone I
only get a few hundred 19x19 playouts per thread per second...

Erik
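
For anyone comparing numbers across posts: a playout rate only means something together with the playout length, so it can help to convert both into moves per second. A minimal sketch using the figures quoted in this thread; the function name is made up for illustration and is not from any engine mentioned here:

```cpp
#include <cstdint>

// Moves executed per second, given a playout rate and an average playout
// length. Both inputs are whatever the engine reports; nothing is measured here.
constexpr std::uint64_t moves_per_second(std::uint64_t playouts_per_second,
                                         std::uint64_t moves_per_playout) {
    return playouts_per_second * moves_per_playout;
}
```

With the numbers above, 2000 playouts/s at 200 moves each is 400,000 moves/s. A 9x9 engine at about 110 moves per playout would need roughly 3,600 playouts/s to do the same amount of work.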



Re: [Computer-go] C++11; threads

2014-06-19 Thread Ben Ellis
I've got an implementation very similar to Orego, but less feature-rich.

I only get 4kpps single-threaded on my power-saver ultrabook.
Unfortunately I don't have access to a beefier machine to benchmark with,
but I like to -think- *dream* that I'd get playout speeds closer to
Orego's on a better machine.

 Ben



Re: [Computer-go] C++11; threads

2014-06-19 Thread Ben Ellis
That was 9x9; on 19x19 I only get 0.8kpps.



Re: [Computer-go] C++11; threads

2014-06-18 Thread folkert
It would be interesting to see what llvm/clang does for it.

On Thu, May 01, 2014 at 01:00:01PM +0200, Marc Landgraf wrote:
 Hey,
 I'm not talking about a 20% speed loss here with VC++.
 These are just the times for 1000 empty playouts on 9x9, not using any sort
 of multithreading:
 VS debug configuration: 15257
 VS release config (optimized): 756
 C::B mingw-w64 no optimizations: 498
 C::B mingw-w64 -O3 -fexpensive-optimizations -march=corei7-avx: 108
 
 This certainly looks like it is my fault... But right now I can't find
 what I'm doing wrong here, so I have to miss out on those handy VS comfort
 features and continue with C::B + mingw-w64.
 The VS profiler results look pretty much like what I got when I last
 used VerySleepy on my code compiled with mingw: no super drastic
 bottlenecks, just general slowness, it seems.
 Mingw-w64 makes it impossible to profile the code, but mingw has
 performance issues for me as well (not as drastic as VC++, but about a
 factor of 3), so I'm using it only when I need profile data.
 
 
 
 2014-04-30 23:24 GMT+02:00 Aja Huang ajahu...@gmail.com:
 
  I wrote my Go program Erica completely in Visual Studio and had no problem
  at all. It might be around 20% slower on Windows than on Linux, but
  compared to other more important factors 20% loss in speed is not really
  significant. Maybe VS profiler can tell why your program ran awfully slow
  in debug mode.
 
  Aja
 
  2014-04-30 21:38 GMT+01:00 Marc Landgraf mahrgel...@gmail.com:
 
  Hey,
  in the past I tried VS again and again, and in the end always returned
  to Code::Blocks... It really feels like VS and I just don't get along.
  Actually, after your comment I tried it again today, but even after
  spending a decent amount of time porting it, the program ran awfully
  slowly in debug mode and crashed as soon as the VC++ compiler tried to
  optimize it. (For reasonable performance I need optimization with mingw-w64
  as well.)
  Maybe it is just me and my terrible way of coding... But I can't handle
  Visual Studio and Visual C++ properly.
  With Code::Blocks, I fooled around with various versions of GCC and
  ended up with mingw-w64, which gave me by far the best performance among
  those supporting the C++11 features relevant to me.
 
  Marc
 
 
  2014-04-30 11:01 GMT+02:00 Aja Huang ajahu...@gmail.com:
 
  Hey Marc,
 
  2014-04-30 8:37 GMT+01:00 Marc Landgraf mahrgel...@gmail.com:
 
  Hi,
  my bot is still under construction, but written entirely in C++11.
  So a few comments:
  General:
  Most compilers, especially if you are using Windows, still have
  problems with C++11 and its new multithreading library. Right now I'm
  using mingw-w64-4.8.1, as it has the required support for threads (even
  though it is done with a workaround via winpthreads) and produces
  decently fast code. I'm also interested if anyone else can share their
  experience with other compilers (for Windows).
 
 
  Why don't you use Visual Studio 2013? CTP_Nov2013 supports a lot of new
  C++11 features.
 
 
  http://blogs.msdn.com/b/vcblog/archive/2013/11/18/announcing-the-visual-c-compiler-november-2013-ctp.aspx
 
  Aja
 
 



Folkert van Heusden



Re: [Computer-go] C++11; threads

2014-06-18 Thread Chun Sun
Hi all,

Sorry to ask this beginner question in this thread:

When you say playouts per second, how many moves does each playout have
on average? Do you play from empty until the board is full for each playout?

Thank you,
Chun

Re: [Computer-go] C++11; threads

2014-05-12 Thread Ben Ellis
Marc,

I managed to get mine down to 54ms per 1000 playouts; with 1 thread it
runs in 94ms (still checking for positional/situational superko for now).

My CPU has 2 physical and 4 logical cores. With 2 threads it uses
50% of the CPU and runs in 55ms; with 4 threads it uses 100% of the CPU
and still runs in 55ms.

I suspect that either the logical cores aren't being fully utilized, or the
bottleneck is memory access speed rather than CPU cycles (or both). Has anyone
else run into a similar situation when utilizing hyper-threading?

I think I'll move on to implementing GTP and then Monte Carlo so I can track
win/loss and accuracy, then try to adjust my random playouts to take
saving/capturing and identifying dead/alive groups into consideration.

What are the mogo 3x3 style patterns you mentioned?

 Ben
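
A measurement like the one described above (same number of playouts, 1 vs 2 vs 4 threads) can be sketched with C++11 `std::thread` and `std::chrono`. This is only an illustration under stated assumptions: `fake_playout` is a made-up stand-in that burns CPU with a private RNG, not a real Go playout, and `time_playouts` is a hypothetical helper name.

```cpp
#include <chrono>
#include <cstdint>
#include <random>
#include <thread>
#include <vector>

// Stand-in for one random playout (an assumption, not a real engine): burns
// CPU with a private RNG and returns a value so the work is not optimized away.
std::uint64_t fake_playout(std::mt19937_64& rng, int moves) {
    std::uint64_t acc = 0;
    for (int i = 0; i < moves; ++i) acc += rng() & 1u;
    return acc;
}

// Splits total_playouts evenly over num_threads threads and returns the
// elapsed wall-clock time in milliseconds. Each thread owns its RNG and
// writes only to its own result slot, so no locking is needed.
double time_playouts(int num_threads, int total_playouts, int moves_per_playout) {
    std::vector<std::uint64_t> results(num_threads, 0);
    std::vector<std::thread> workers;
    const int share = total_playouts / num_threads;
    const auto start = std::chrono::steady_clock::now();
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&results, t, share, moves_per_playout] {
            std::mt19937_64 rng(1234u + static_cast<unsigned>(t));
            std::uint64_t local = 0;
            for (int p = 0; p < share; ++p) local += fake_playout(rng, moves_per_playout);
            results[t] = local;
        });
    }
    for (auto& w : workers) w.join();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_playouts(n, 1000, 110)` for n = 1, 2, 4 and comparing the results shows where scaling stops on a given machine. Whether the extra logical cores help depends on the workload, as the replies in this thread discuss.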


On Fri, May 9, 2014 at 10:56 PM, Marc Landgraf mahrgel...@gmail.com wrote:

 Oh, my benchmarked numbers are single-threaded... quad-threaded, it actually
 runs about 3-3.5 times as fast.


 2014-05-09 23:55 GMT+02:00 Marc Landgraf mahrgel...@gmail.com:

 Simple ko checks are sufficient in random playouts ;) Limit the number of
 moves to something reasonable (like 3 times the number of points on the
 board) and you will catch that one-in-a-billion superko game. This should
 save a fair amount of time.
 In any case... Your speed sounds reasonable. You can optimize it further
 later on, once you know exactly what you need from your board
 implementation. Have fun with your tree :)

 My current implementation runs at 100k playouts 9x9 in 7 sec on an
 i7-3630, 8GB, but has a bit heavier playouts. (saving/capturing, mogo style
 3x3 patterns, basic dead shapes, some fun about keeping/destroying eyes
 properly)

 Marc
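
The advice above (simple ko check plus a move cap of 3 times the board size) can be sketched as a playout loop. Everything here is an illustrative assumption: `choose_move` and `apply_move` are hypothetical callbacks standing in for a real board and policy, `kPass` is a made-up sentinel, and passes are counted as moves for the purpose of the cap.

```cpp
#include <functional>

constexpr int kPass = -1;  // hypothetical sentinel for "pass" / "no ko point"

struct PlayoutResult {
    int moves_played;   // includes passes
    bool hit_move_cap;  // true if the cap ended the playout (e.g. a ko cycle)
};

// choose_move(ko_point): hypothetical policy; returns a point index or kPass
//   and is expected never to return ko_point (the simple ko check).
// apply_move(move): hypothetical board update; returns the point now forbidden
//   by simple ko, or kPass if there is none.
PlayoutResult run_playout(int board_points,
                          const std::function<int(int)>& choose_move,
                          const std::function<int(int)>& apply_move) {
    const int max_moves = 3 * board_points;  // cap suggested in the thread
    int ko_point = kPass;
    int consecutive_passes = 0;
    int moves = 0;
    while (moves < max_moves && consecutive_passes < 2) {
        const int move = choose_move(ko_point);
        ++moves;
        if (move == kPass) {
            ++consecutive_passes;
            ko_point = kPass;  // a pass lifts the ko restriction
            continue;
        }
        consecutive_passes = 0;
        ko_point = apply_move(move);
    }
    return {moves, moves >= max_moves && consecutive_passes < 2};
}
```

On 9x9 the cap is 243 moves, so a playout stuck in a multi-ko cycle ends there instead of running forever, and no superko history has to be kept inside the playout.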


 2014-05-09 23:35 GMT+02:00 Ben Ellis ben.el...@softweyr.co.uk:

 I've made a start on my first attempt at writing a go playing program
 using the .NET framework, and with 1000 empty 9x9 random playouts I'm
 getting the following benchmarks,

 Single Threaded (uses about 30% CPU)
 VS DEBUG 64bit (with Debugger) - 266ms per 1000 playouts.
 VS RELEASE 64bit - 182ms per 1000 playouts

 Thread Per Core (Uses 100% CPU) (Thread Per Core - 1 yielded similar
 results)
 VS DEBUG 64bit (with Debugger) - 154ms per 1000 playouts.
 VS RELEASE 64bit - 111ms per 1000 playouts

 System Specifications:
 Processor: Intel Core i7-4600U CPU @ 2.10Ghz
 8GB DDR3 RAM

 The random player won't play,

 - Positional super-ko moves (or optionally situational)
 - Suicide moves
 - Eye filling moves

 and continues to play until there are no good moves left (i.e. all empty
 intersections are an eye, suicide or unplayable due to ko)

 Am I missing any other checks/features in the random playouts that
 would normally be implemented?

 What sort of playout speeds are normal? Should I spend any more time
 optimizing the random playouts before moving on to a Monte Carlo
 implementation?

 Regards,

 Ben


 On Wed, May 7, 2014 at 11:09 PM, Jason House 
 jason.james.ho...@gmail.com wrote:

 Simple ko checks are required in playouts. Advanced ko checks are
 typically restricted to inside the search tree. With simple ko checks, I've
 had playouts get stuck in a 3 ko cycle. Ko cycles can be caught with a
 maximum playout length.
  On May 7, 2014 10:46 AM, Álvaro Begué alvaro.be...@gmail.com
 wrote:

 I believe you *have to* check for simple ko in playouts. Otherwise
 you'll end up with infinite playouts quite easily.





 On Wed, May 7, 2014 at 9:09 AM, Ben Ellis ben.el...@softweyr.co.uk wrote:

 All,

 When playing random playouts, do you (anyone) bother checking for
 KO or super KO? Does this have a negative impact on accuracy of the
 win:loss outcomes?

 Ben


 On Thu, May 1, 2014 at 4:52 PM, Marc Landgraf mahrgel...@gmail.com wrote:

 Now I feel stupid :(
 Thanks...
 So now I'm down to 126 on average with /O2 /Ot /favor:INTEL64 (+ the
 usual fluff).
 This is still about 15% slower than mingw-w64, but that is just for
 single-threaded playouts.
 It looks like this gets compensated when using 4 threads on the same
 tree, and we arrive at pretty much the same speed.





 2014-05-01 15:36 GMT+02:00 Harald Johnsen hjohn...@evc.net:

 On 01/05/2014 13:00, Marc Landgraf wrote:

  Hey,
 I'm not talking about a 20% speed loss here with VC++.
 These are just the times for 1000 empty playouts on 9x9, not using any
 sort of multithreading:
 VS debug configuration: 15257
 VS release config (optimized): 756
 C::B mingw-w64 no optimizations: 498
 C::B mingw-w64 -O3 -fexpensive-optimizations -march=corei7-avx: 108

 This certainly looks like it is my fault... But right now I can't find
 what I'm doing wrong here, so I have to miss out on those handy VS comfort
 features and continue with C::B + mingw-w64.
 The VS profiler results look pretty much like what I got when I last
 used VerySleepy on my code compiled with mingw: no super drastic
 bottlenecks, just general slowness, it seems.
 Mingw-w64 makes it impossible 

Re: [Computer-go] C++11; threads

2014-05-12 Thread Marc Landgraf
You can read about it here:
http://hal.inria.fr/docs/00/12/15/16/PDF/RR-6062.pdf

And don't worry too much about your speed right now ;) You will see things
differently once you have the bigger picture anyway.



Re: [Computer-go] C++11; threads

2014-05-12 Thread Mikko Aarnos
CPU cores are meant to be used by a single thread only. You can use
more, but this rests on the assumption that two (or more) threads can
effectively utilize a single core without too much competition over
resources. This assumption is true in most situations, e.g. when we often
have to wait for IO, server queries, or things like that, and in these
cases using HT can give a small performance boost.

Here, however, we are only utilizing the CPU. In this case the threads
only get in each other's way. The effects of this can be seen very
clearly with your program. It seems to scale perfectly, or very nearly so,
as long as you don't use more threads than you have actual cores on your
computer. When you go above that limit, the scaling goes to hell and you
get no improvement at all. The only way to solve your scaling problem is to
get rid of the competing threads by turning off HT. I did this and have
never looked back.


-Mikko Aarnos


Re: [Computer-go] C++11; threads

2014-05-12 Thread Erik van der Werf
This is not accurate. In my experience you should expect a substantial
performance increase from hyperthreading. (For my program on an
i7-3930 it was something like 40%, Zen got a similar number, and others
on this list have claimed even higher numbers; see, e.g.,
http://dvandva.org/pipermail/computer-go/2012-August/005298.html.)

Erik




Re: [Computer-go] C++11; threads

2014-05-12 Thread Brian Sheppard
I have the same experience as Erik. My quad core CPU gets about 40% to 50%
more output from 8 threads than from 4.



Re: [Computer-go] C++11; threads

2014-05-12 Thread Detlef Schmicker
Same for oakfoam, but I remember it was different before large patterns and
other features were added.

My working hypothesis is: while one thread on a core waits for a memory
read, the other thread can work (as far as I remember, two HTs share the
same integer and floating point units, for example).

If there are no bottlenecks, there might be no improvement.

By the way: displaying 50% CPU for 4 threads on a 4-core, 8-HT
processor is nonsense, of course. This happens because the operating system
calculates as if all 8 HTs were independent CPUs, but they are not!

Detlef
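
The arithmetic behind that misleading percentage can be written down as a tiny sketch. It assumes each CPU-bound thread saturates exactly one logical CPU and that the OS divides by the logical CPU count; `displayed_utilization` and `logical_cpu_count` are illustrative names, while `std::thread::hardware_concurrency()` is the standard C++11 call, which returns only a hint and may be 0.

```cpp
#include <thread>

// Percentage a task manager would show if busy_threads CPU-bound threads each
// saturate one logical CPU out of logical_cpus (assumption: the OS treats
// every hyperthread as an independent CPU).
constexpr double displayed_utilization(unsigned busy_threads, unsigned logical_cpus) {
    return logical_cpus == 0
               ? 0.0
               : 100.0 * (busy_threads < logical_cpus ? busy_threads : logical_cpus) /
                     logical_cpus;
}

// Number of hardware threads as reported by the C++11 runtime. This is
// typically the logical CPU count, but it is only a hint and may be 0.
inline unsigned logical_cpu_count() {
    return std::thread::hardware_concurrency();
}
```

Under these assumptions, 4 busy threads on 8 logical CPUs show as 50%, and 2 busy threads on a 2-core, 4-HT laptop also show as 50%, even though every physical core may already be fully occupied.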




Re: [Computer-go] C++11; threads

2014-05-12 Thread Ben Ellis
Mikko, this is pretty much what I expected; thanks for confirming it.

I agree. In my experience, because I'm doing pure random playouts
with very few lookups for patterns, MCTS trees, etc., my CPU utilization is
higher, so I will see less of a gain from HT until I start adding these
features.


On Mon, May 12, 2014 at 6:54 PM, Mikko Aarnos mikko.aar...@kolumbus.fi wrote:

 There is a big difference here: Ellis's program can only do light
 playouts. He doesn't have MCTS or patterns. That is parallelized extremely
 simply by just giving each thread an internal board state, doing a playout
 from that, resetting the board state to the original, doing a playout etc.
 There are no bottlenecks there, and that shouldn't get any increase in
 performance from HT as far as I know (also see the first sentence of
 Schmicker's comment). On the other hand, your programs all have MCTS and
 patterns (with emphasis being on patterns) and they both need constant
 memory reads. Thanks to that it's not a huge surprise that HT works better.
 Still, that doesn't change the fact that I was a bit off. I never actually
 expected that there was so much memory reading that it would actually make
 HT work. Guess I should implement patterns and see if I get similar results.

 Regards,

 Mikko Aarnos

 PS. Of course, all this rests on the assumption that you don't do your
 playouts exactly like Ellis, and if you do I am really, REALLY surprised
 with the performance of HT.

 PPS. And on the assumption that the 40%-50% performance increase was from
 going from 4 threads to 8 threads with HT on all the time, not from going
 from 4 threads with no HT to 8 threads with HT. Here as well, if the latter
 is true I am again honestly surprised.
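
The parallelization scheme described in this message (each thread has its own board state, plays out, resets to the original, repeats) might look roughly like the following. It is a sketch under assumptions: `Board` and `light_playout` are toy placeholders with no captures or ko, and `run_light_playouts` is a hypothetical name, not anyone's actual engine code.

```cpp
#include <array>
#include <cstdint>
#include <random>
#include <thread>
#include <vector>

// Toy 9x9 position (assumption): 0 = empty, 1 = black, 2 = white.
struct Board {
    std::array<std::uint8_t, 81> points{};
    int stones = 0;
};

// Placeholder light playout: drops stones on random empty points for a fixed
// number of attempts. It only stands in for real playout work.
void light_playout(Board& board, std::mt19937& rng) {
    std::uniform_int_distribution<int> pick(0, 80);
    for (int i = 0; i < 110; ++i) {
        const int p = pick(rng);
        if (board.points[p] == 0) {
            board.points[p] = static_cast<std::uint8_t>(1 + (i & 1));
            ++board.stones;
        }
    }
}

// Each thread copies the root position into a private scratch board, plays,
// and resets by copying again. The root is only read, never written, so the
// threads share nothing mutable. Returns the total number of playouts done.
long run_light_playouts(const Board& root, int num_threads, int playouts_per_thread) {
    std::vector<long> done(num_threads, 0);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&root, &done, t, playouts_per_thread] {
            std::mt19937 rng(42u + static_cast<unsigned>(t));
            Board scratch;
            long count = 0;
            for (int i = 0; i < playouts_per_thread; ++i) {
                scratch = root;  // reset to the original position
                light_playout(scratch, rng);
                ++count;
            }
            done[t] = count;
        });
    }
    for (auto& w : workers) w.join();
    long total = 0;
    for (long d : done) total += d;
    return total;
}
```

Because there is no shared tree or pattern table, the only memory traffic is each thread's own small board, which is the situation being debated in this thread.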


Re: [Computer-go] C++11; threads

2014-05-12 Thread Matthew Woodcraft
Mikko Aarnos wrote:
 There is a big difference here: Ellis's program can only do light
 playouts. He doesn't have MCTS or patterns. That is parallelized
 extremely simply by just giving each thread an internal board state,
 doing a playout from that, resetting the board state to the
 original, doing a playout etc. There are no bottlenecks there, and
 that shouldn't get any increase in performance from HT as far as I
 know (also see the first sentence of Schmicker's comment).

I don't think that's right.

I tried an experiment once with hyperthreading and 'light playouts' and
I got a 40% improvement from using two threads per core.


There are plenty of bottlenecks even in such simple code.

For example, any time you do something equivalent to following a linked
list (eg, finding the stones in a group that you're joining to another
group) the thread will have to wait three or four cycles per 'link' even
if all the data is in level-1 cache.
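
To make the linked-list point concrete, here is a minimal sketch of one common way to store the stones of a group: a circular list kept in a `next_stone` array. The layout and the names `Groups`, `merge` and `size` are assumptions for illustration, not taken from any program in this thread.

```cpp
#include <utility>
#include <vector>

// next_stone[p] is the next stone in the same group as point p, forming a
// circular list; a lone stone points to itself. (Assumed layout.)
struct Groups {
    std::vector<int> next_stone;

    explicit Groups(int points) : next_stone(points) {
        for (int p = 0; p < points; ++p) next_stone[p] = p;
    }

    // Joins the groups containing a and b, which must be different groups,
    // by swapping their successors: an O(1) splice of two circular lists.
    void merge(int a, int b) { std::swap(next_stone[a], next_stone[b]); }

    // Walks the whole group. Each load of next_stone[p] needs the result of
    // the previous load, so the loop is a chain of dependent memory reads.
    int size(int start) const {
        int n = 0;
        int p = start;
        do {
            ++n;
            p = next_stone[p];
        } while (p != start);
        return n;
    }
};
```

The walk in `size` is the kind of loop meant above: the address of each read is only known once the previous read has completed, so the core cannot overlap them even when everything sits in L1 cache.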


One way to tell whether code is likely to benefit from hyperthreading is
to use a tool that reports the processor's performance counters and look
at the 'instructions per clock' measure. If it's somewhere around 1 then
there are excellent chances of getting good results from hyperthreading.

-M-


Re: [Computer-go] C++11; threads

2014-05-12 Thread Marc Landgraf
Hm... I can't confirm that... I just ran a few tests, and I can do 10k
empty 19x19 playouts per second on 4 threads, and about 11-12k on 8 threads.
I did the test with only playouts: no tree search, additional
pattern matching or anything. With tree search, going to more than 4
search threads actually slows down my code, due to some blocking issues...
(This will be fixed at some point.)
But during tree search I'm also currently using 4 dedicated search/playout
threads, and some other threads doing minor work (non-blocking cleanup and
management of the tree, checking the GTP console, etc.).

So I guess it really depends on the implementation and everyone has to make
their own tests, how their bot does best.
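
For anyone who wants to run that kind of test, here is a rough C++11 sketch of a thread-scaling measurement with `std::thread`. Everything in it is an assumption for illustration: `fake_playout` is only a placeholder that keeps a core busy, and a real engine would call its own playout on a per-thread board instead.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>
#include <vector>

// Placeholder for one playout: a short xorshift loop, not a real Go playout.
std::uint64_t fake_playout(std::uint64_t seed) {
    std::uint64_t x = seed | 1;  // xorshift must not start from zero
    for (int i = 0; i < 400; ++i) {
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
    }
    return x;
}

// Runs playouts_per_thread playouts on each of num_threads threads and
// returns the aggregate number of playouts per second.
double playouts_per_second(unsigned num_threads, unsigned playouts_per_thread) {
    assert(num_threads > 0 && playouts_per_thread > 0);
    std::vector<std::uint64_t> sinks(num_threads, 0);
    std::vector<std::thread> workers;
    const auto start = std::chrono::steady_clock::now();
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&sinks, t, playouts_per_thread] {
            std::uint64_t acc = 0;  // private to this thread, no sharing in the loop
            for (unsigned i = 0; i < playouts_per_thread; ++i)
                acc ^= fake_playout(acc + t * 1000003u + i + 1);
            sinks[t] = acc;  // one write at the end so the work has a visible result
        });
    }
    for (auto& w : workers) w.join();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return (static_cast<double>(num_threads) * playouts_per_thread) / elapsed.count();
}
```

Calling `playouts_per_second(4, 100000)` and `playouts_per_second(8, 100000)` on the same machine and comparing the two figures gives the kind of 4-versus-8-thread number discussed here. The absolute values from the placeholder mean nothing; only the ratio between thread counts is of interest.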



Re: [Computer-go] C++11; threads

2014-05-12 Thread Ben Ellis
From the few articles I've read on hyperthreading (in particular this one:
http://msdn.microsoft.com/en-us/magazine/cc300701.aspx), each logical core
can have two concurrent instruction streams, and if one is waiting for a
resource (i.e. one either in use by another thread, or an access to main
memory), then the other instruction stream continues while the first stream
is blocked.

But because I use independent board states for each thread and the total
memory used is less than my L3 cache size, none of my threads wait long for
access to main memory or contend with each other, so they don't easily give
up control to the second stream. Also, because I'm using .NET, the CLR
arranges the memory in such a way that concurrent threads won't (or will be
less likely to) cross the cache line, resulting in less resource contention
in the L1/L2/L3 caches.

Unless the C/C++ compiler optimizes for a CPU with a specific number of
logical cores and physical cores, I doubt it would be as efficient as the
compiled .NET code with regard to how the memory is mapped to the L1/L2/L3
cache?

This is all new to me, so I'm likely wrong, but I felt like sharing my
thoughts :)





Re: [Computer-go] C++11; threads

2014-05-12 Thread Matthew Woodcraft
Ben Ellis wrote:
 from the few articles I've read on hyperthreading (in particular this one
 http://msdn.microsoft.com/en-us/magazine/cc300701.aspx). Each logical core
 can have two concurrent instruction streams and if one is waiting for a
 resource (i.e. either in use by another thread or accessing the main
 memory) then the other instruction stream continues while the first stream
 is blocked.

 But because I use independent board states for each thread and the total
 memory used is less than my L3 cache size, none of my threads wait long for
 access to main memory or contend with each other so they don't easily give
 up control to the second stream.

The two streams don't really 'give up control' to each other; they both
run at once. A modern Intel processor can in principle execute something
like four operations per cycle, and in each cycle they can be a mixture
of the two streams.

So it doesn't take anything nearly as 'heavy' as an access to main
memory to mean that you can get value from the second stream.

Even an access to L1 cache takes 4 cycles to complete, and if the
following instructions depend on the value being read then the processor
won't be doing anything else from that stream until the read completes.
And even if there are no reads from memory at all, it's pretty rare that
the processor can find enough parallel work to get close to keeping
four-ish execution units busy from only one instruction stream.


The main reasons why in practice you often don't get value from
hyperthreading are that the two threads are having to share the L1 and
L2 caches, and they also share the resources for instruction fetch and
decode (which can turn out to be a bottleneck disappointingly
frequently).


As long as you're doing light playouts you shouldn't have to worry about
the L3 cache; everything should fit very comfortably in L2 (and quite
possibly in L1, though I don't know what the CLR overhead is like).

-M-


Re: [Computer-go] C++11; threads

2014-05-09 Thread Ben Ellis
I've made a start on my first attempt at writing a go playing program using
the .NET framework, and with 1000 empty 9x9 random playouts I'm getting the
following benchmarks,

Single Threaded (uses about 30% CPU)
VS DEBUG 64bit (with Debugger) - 266ms per 1000 playouts.
VS RELEASE 64bit - 182ms per 1000 playouts

Thread Per Core (Uses 100% CPU) (Thread Per Core - 1 yielded similar
results)
VS DEBUG 64bit (with Debugger) - 154ms per 1000 playouts.
VS RELEASE 64bit - 111ms per 1000 playouts

System Specifications:
Processor: Intel Core i7-4600U CPU @ 2.10GHz
8GB DDR3 RAM

The random player won't play,

- Positional super-ko moves (or optionally situational)
- Suicide moves
- Eye filling moves

and continues to play until there are no good moves left (i.e. all empty
intersections are an eye, suicide or unplayable due to ko)

Am I missing any other checks/features in the random playouts that would
normally be implemented?

What sort of playout speeds are normal? Should I spend any more time
optimizing the random playouts before moving on to a Monte Carlo
implementation?

Regards,

Ben



Re: [Computer-go] C++11; threads

2014-05-09 Thread Marc Landgraf
Simple ko checks are sufficient in random playouts ;) Limit the number of
moves to something reasonable (like 3 times the number of points on the
board) and you will catch that one-in-a-billion-games superko. This should
save a fair amount of time.
In any case... your speed sounds reasonable. You can optimize it further
later on, once you know exactly what you need from your board
implementation. Have fun with your tree :)
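
The move cap can be sketched roughly like this. It is only an illustration under assumptions: `kPass`, `max_playout_moves` and `run_playout` are made-up names, and `next_move` stands in for whatever the engine's playout policy is (it is expected to play a move on the engine's own board and return it).

```cpp
#include <cassert>
#include <functional>

// Hypothetical encoding of a pass move.
constexpr int kPass = -1;

// Cap from the suggestion above: three times the number of points on the board.
constexpr int max_playout_moves(int board_size) {
    return 3 * board_size * board_size;
}

// Runs one playout and returns the number of moves played.
// Stops after two consecutive passes, or when the cap is reached, which also
// ends any playout caught in a long ko cycle.
int run_playout(int board_size, const std::function<int()>& next_move) {
    assert(board_size > 0);
    const int cap = max_playout_moves(board_size);
    int moves = 0;
    int consecutive_passes = 0;
    while (moves < cap && consecutive_passes < 2) {
        const int m = next_move();
        consecutive_passes = (m == kPass) ? consecutive_passes + 1 : 0;
        ++moves;
    }
    return moves;
}
```

On 9x9 the cap works out to 243 moves, so a policy that never passes is cut off there instead of looping forever. How a capped playout is scored (or whether it is discarded) is left open here, since the thread does not say.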

My current implementation runs 100k 9x9 playouts in 7 sec on an i7-3630,
8GB, but has somewhat heavier playouts (saving/capturing, MoGo-style 3x3
patterns, basic dead shapes, some fun about keeping/destroying eyes
properly).

Marc



Re: [Computer-go] C++11; threads

2014-05-09 Thread Marc Landgraf
Oh, my benchmark numbers are single-threaded... With four threads it
actually runs about 3-3.5 times as fast.



Re: [Computer-go] C++11; threads

2014-05-07 Thread Ben Ellis
All,

When playing random playouts, do you (anyone) bother checking for KO or
super KO? Does this have a negative impact on accuracy of the win:loss
outcomes?

Ben


On Thu, May 1, 2014 at 4:52 PM, Marc Landgraf mahrgel...@gmail.com wrote:

 Now I feel stupid :(
 Thanks...
 So now I'm down to 126 on average with /O2 /Ot /favor:INTEL64 (+the usual
 fluff)
 This is still about 15% slower than mingw-w64, but this is just for
 single-threaded playouts.
 And it looks like when using 4 threads on the same tree, this gets
 compensated for, and we arrive at pretty much the same speed.





 2014-05-01 15:36 GMT+02:00 Harald Johnsen hjohn...@evc.net:

 Le 01/05/2014 13:00, Marc Landgraf a écrit :

  Hey,
 I'm not talking about 20% speedloss here with VC++.
 Just the times for 1000 empty playouts on 9x9, not using any sort of
 multithreading:
 VS debug configuration: 15257
 VS release config (optimized): 756
 C::B mingw-w64 no optimizations: 498
 C::B mingw-w64 -O3 -fexpensive-optimizations -march=corei7-avx: 108

 This certainly looks like it is my own fault... But right now I can't
 find what I'm doing wrong here, so I have to miss out on those handy
 VS comfort features and continue with C::B + mingw-w64.
 And the VS profiler results look pretty much like what I got when I
 last used VerySleepy on my code compiled with mingw: no super drastic
 bottlenecks, just general slowness, it seems.
 Mingw-w64 makes it impossible to profile the code, and plain mingw has
 performance issues for me as well (not as drastic as VC++, but about a
 factor of 3), so I only use it when I need profile data.


  Are you doing any memory allocation or input/output? If that's the
 case then you should not start the code with F5 but with Ctrl+F5 (Start
 Without Debugging) from inside VS.

 hj.



Re: [Computer-go] C++11; threads

2014-05-07 Thread Álvaro Begué
I believe you *have to* check for simple ko in playouts. Otherwise you'll
end up with infinite playouts quite easily.






Re: [Computer-go] C++11; threads

2014-05-07 Thread Jason House
Simple ko checks are required in playouts. Advanced ko checks are typically
restricted to inside the search tree. With simple ko checks, I've had
playouts get stuck in a 3 ko cycle. Ko cycles can be caught with a maximum
playout length.
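
A sketch of what a simple ko check usually amounts to, under the assumption that the board code can report a few facts about the move just played. The `MoveResult` struct and both function names are hypothetical. The rule used is the common one: a move creates a ko only if it captured exactly one stone, and the capturing stone is a lone stone left with exactly one liberty.

```cpp
#include <cassert>

// Hypothetical marker for "no ko point is currently set".
constexpr int kNoKo = -1;

// Facts about the move just played, as a board implementation might report
// them (assumed interface, for illustration only).
struct MoveResult {
    int stones_captured;      // total number of stones removed by the move
    int captured_point;       // where the stone was, if exactly one was captured
    int own_group_size;       // size of the group the new stone belongs to
    int own_group_liberties;  // liberties of that group after the capture
};

// Returns the point the opponent may not play on the very next move,
// or kNoKo if the move did not create a simple ko.
int simple_ko_point(const MoveResult& r) {
    assert(r.stones_captured >= 0);
    if (r.stones_captured == 1 && r.own_group_size == 1 &&
        r.own_group_liberties == 1) {
        return r.captured_point;
    }
    return kNoKo;
}

// True if playing `move` would retake the ko immediately.
bool violates_simple_ko(int move, int ko_point) {
    return ko_point != kNoKo && move == ko_point;
}
```

This only remembers one point for one move, so it costs almost nothing per playout move. It does not catch longer cycles such as a triple ko, which is why a maximum playout length is still needed alongside it.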

Re: [Computer-go] C++11; threads

2014-05-01 Thread Marc Landgraf
Hey,
I'm not talking about 20% speedloss here with VC++.
Just the times for 1000 empty playouts on 9x9, not using any sort of
multithreading:
VS debug configuration: 15257
VS release config (optimized): 756
C::B mingw-w64 no optimizations: 498
C::B mingw-w64 -O3 -fexpensive-optimizations -march=corei7-avx: 108

This certainly looks like it is my own fault... But right now I can't find
what I'm doing wrong here, so I have to miss out on those handy VS comfort
features and continue with C::B + mingw-w64.
And the VS profiler results look pretty much like what I got when I last
used VerySleepy on my code compiled with mingw: no super drastic
bottlenecks, just general slowness, it seems.
Mingw-w64 makes it impossible to profile the code, and plain mingw has
performance issues for me as well (not as drastic as VC++, but about a
factor of 3), so I only use it when I need profile data.
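
For comparing builds like this, a small single-threaded timing helper along the following lines is enough. It is a sketch with made-up names (`time_playouts_ms`, `to_playouts_per_second`); the playout itself is passed in as any callable. The conversion helper assumes the timing is in milliseconds, which the figures above do not state explicitly.

```cpp
#include <cassert>
#include <chrono>

// Calls `playout` (any callable taking no arguments) `runs` times and
// returns the elapsed wall-clock time in milliseconds.
template <typename Playout>
double time_playouts_ms(Playout playout, int runs) {
    assert(runs > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < runs; ++i) playout();
    const std::chrono::duration<double, std::milli> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Converts "elapsed milliseconds for `runs` playouts" into playouts per second.
double to_playouts_per_second(double elapsed_ms, int runs) {
    assert(elapsed_ms > 0.0);
    return runs * 1000.0 / elapsed_ms;
}
```

As an example of the conversion, 1000 playouts in 100 ms corresponds to 10000 playouts per second. Running the same helper on a debug build, a release build and a build started outside the debugger makes differences like the ones listed above easy to reproduce.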



2014-04-30 23:24 GMT+02:00 Aja Huang ajahu...@gmail.com:

 I wrote my Go program Erica completely in Visual Studio and had no problem
 at all. It might be around 20% slower on Windows than on Linux, but
 compared to other more important factors 20% loss in speed is not really
 significant. Maybe VS profiler can tell why your program ran awfully slow
 in debug mode.

 Aja

 2014-04-30 21:38 GMT+01:00 Marc Landgraf mahrgel...@gmail.com:

 Hey,
 in the past I tried VS again and again, and in the end always returned
 to Code::Blocks... It really feels like VS and I just don't get along.
 Actually, after your comment I tried it again today, but even after
 spending a decent amount of time porting it, the program ran awfully
 slowly in debug mode, and crashed as soon as the VC++ compiler tried to
 optimize it. (For reasonable performance I need optimization with mingw-w64
 as well.)
 Maybe it is just me and my terrible way of coding... But I can't handle
 Visual Studio and Visual C++ properly.
 And with Code::Blocks, I fooled around with various versions of GCC, and
 ended up with mingw-w64, which gave me by far the best performance among
 those supporting the C++11 features relevant to me.

 Marc


 2014-04-30 11:01 GMT+02:00 Aja Huang ajahu...@gmail.com:

 Hey Marc,

 2014-04-30 8:37 GMT+01:00 Marc Landgraf mahrgel...@gmail.com:

 Hi,
 my bot is still under construction, but written entirely in C++11.
 So a few comments:
 General:
 Most compilers, especially if you are using Windows, still have
 problems with C++11 and its new multithreading library. Right now I'm
 using mingw-w64-4.8.1, as it has the required support for threads (even
 though this is done with a workaround via winpthreads) and produces
 decently fast code. But I'm also interested in whether anyone else can
 share their experience with other compilers (for Windows).


 Why don't you use Visual Studio 2013? CTP_Nov2013 supports a lot of new
 C++11 features.


 http://blogs.msdn.com/b/vcblog/archive/2013/11/18/announcing-the-visual-c-compiler-november-2013-ctp.aspx

 Aja



Re: [Computer-go] C++11; threads

2014-05-01 Thread uurtamo .
That is amazing.
On May 1, 2014 4:00 AM, Marc Landgraf mahrgel...@gmail.com wrote:

 Hey,
 I'm not talking about a 20% speed loss here with VC++.
 Here are the times for 1000 empty playouts on 9x9, without any sort of
 multithreading:
 VS debug configuration: 15257
 VS release config (optimized): 756
 C::B mingw-w64 no optimizations: 498
 C::B mingw-w64 -O3 -fexpensive-optimizations -march=corei7-avx: 108

 Of course, this looks like it is my own fault... But right now I can't
 find what I'm doing wrong here, so I have to do without those handy
 VS comfort features and continue with C::B + mingw-w64.
 The VS profiler results look pretty much like what I got when I last
 used VerySleepy on my code compiled with mingw: no drastic bottlenecks,
 just general slowness, it seems.
 Mingw-w64 makes it impossible to profile the code, but mingw has
 performance issues for me as well (not as drastic as VC++, but about a
 factor of 3), so I'm using it only when I need profile data.




Re: [Computer-go] C++11; threads

2014-05-01 Thread Harald Johnsen

On 01/05/2014 13:00, Marc Landgraf wrote:

Hey,
I'm not talking about a 20% speed loss here with VC++.
Here are the times for 1000 empty playouts on 9x9, without any sort of
multithreading:

VS debug configuration: 15257
VS release config (optimized): 756
C::B mingw-w64 no optimizations: 498
C::B mingw-w64 -O3 -fexpensive-optimizations -march=corei7-avx: 108

Of course, this looks like it is my own fault... But right now I can't
find what I'm doing wrong here, so I have to do without those handy
VS comfort features and continue with C::B + mingw-w64.
The VS profiler results look pretty much like what I got when I last
used VerySleepy on my code compiled with mingw: no drastic bottlenecks,
just general slowness, it seems.
Mingw-w64 makes it impossible to profile the code, but mingw has
performance issues for me as well (not as drastic as VC++, but about a
factor of 3), so I'm using it only when I need profile data.



Are you doing any memory allocation or input/output? If so, you should
not start the program with F5 (Start Debugging) from inside VS, but with
Ctrl+F5 (Start Without Debugging).


hj.



Re: [Computer-go] C++11; threads

2014-05-01 Thread Brian Sheppard
Debug configurations in VS have a tremendous amount of code verification built
in: buffer overwrites, API usage, use of uninitialized data. The coverage is
very extensive. It is expensive, too, but totally worthwhile because of the
developer time it saves. The speed of a debug build is not really that
important anyway.

Check your optimization settings for the Release build. There are a lot of
optional choices, some of which are not 'safe' in general, and you have to
measure each one. I recall a ~2:1 ratio between the best and worst speeds based
just on tweaking settings for the Release build of Pebbles.

 


Re: [Computer-go] C++11; threads

2014-05-01 Thread Marc Landgraf
Now I feel stupid :(
Thanks...
So now I'm down to 126 on average with /O2 /Ot /favor:INTEL64 (plus the
usual fluff).
This is still about 15% slower than mingw-w64, but that is just for
single-threaded playouts.
And it looks like this gets compensated when using 4 threads on the same
tree, where we arrive at pretty much the same speed.
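
For anyone wanting to reproduce this kind of compiler comparison, the
measurement itself can be sketched with std::chrono. This is only a minimal
sketch: dummy_playout and time_playouts_ms are hypothetical names, the playout
is a placeholder that just draws random points, and the figure of 110 moves per
9x9 playout is an assumption taken from numbers mentioned elsewhere in this
thread, not from the bot discussed here.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <random>

// Hypothetical stand-in for a real playout: draws `moves` pseudo-random
// points on a 9x9 board and returns their sum, so the loop cannot be
// optimized away entirely.
std::uint64_t dummy_playout(std::mt19937& rng, int moves = 110) {
    std::uniform_int_distribution<int> point(0, 80);  // 81 points on 9x9
    std::uint64_t checksum = 0;
    for (int i = 0; i < moves; ++i)
        checksum += static_cast<std::uint64_t>(point(rng));
    return checksum;
}

// Runs `n` playouts on a single thread and returns the elapsed wall-clock
// time in milliseconds. The accumulated checksum is written to checksum_out.
double time_playouts_ms(int n, std::uint64_t& checksum_out) {
    assert(n >= 0);
    std::mt19937 rng(12345);  // fixed seed so repeated runs do the same work
    const auto start = std::chrono::steady_clock::now();
    std::uint64_t checksum = 0;
    for (int i = 0; i < n; ++i)
        checksum += dummy_playout(rng);
    const auto stop = std::chrono::steady_clock::now();
    checksum_out = checksum;
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Building the same file with each toolchain and configuration, and running it
outside the debugger, gives numbers that can be compared directly.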






Re: [Computer-go] C++11; threads

2014-04-30 Thread Marc Landgraf
Hi,
my bot is still under construction, but it is written entirely in C++11.
So, a few comments:
General:
Most compilers, especially on Windows, still have problems with C++11 and
its new multithreading library. Right now I'm using mingw-w64-4.8.1, as it
has the required support for threads (even though it is implemented with a
workaround via winpthreads) and produces decently fast code. But I'm also
interested in whether anyone else can share their experience with other
compilers (for Windows).
Multithreading:
The new thread library fulfills all the requirements of my bot. But I
haven't tested boost::thread, so I can't draw conclusions about which is
faster.
Synchronization:
The improved atomics for base types are as fast as the base types
themselves and really useful when running multiple threads on the same
tree. This is not true for mutexes. Those can cause a dramatic slowdown if
used too often. For that reason I reduced the usage of mutexes to an
absolute minimum. With that reduction, condition_variable disappeared
entirely from my code, so I can't tell how well it performs. (I initially
had a producer-consumer queue to spread playouts over multiple threads; now
I have switched to all threads doing full exploration + playout runs.)
Other:
To me, the most noticeable other features are the reworked auto keyword,
which mostly improves code readability, and, more importantly, the new
range-based for loops that iterate over any STL container. Those new loops
not only improve readability even more, but on some occasions can yield
drastic speed improvements compared to old-style looping.
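
The synchronization approach described above (atomic counters instead of a
mutex, every thread doing complete runs on one shared tree) might look roughly
like the following. It is a sketch under assumptions, not the bot's actual
code: NodeStats, worker and run_search are hypothetical names, the coin flip
stands in for a real playout, and relaxed memory ordering is assumed to be
sufficient for plain statistics counters.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <functional>
#include <random>
#include <thread>
#include <vector>

// Hypothetical per-node statistics shared by all search threads.
struct NodeStats {
    std::atomic<std::uint32_t> visits{0};
    std::atomic<std::uint32_t> wins{0};
};

// Each worker does its own complete runs and records the results with
// atomic increments, so no mutex is needed for the counters.
void worker(NodeStats& node, int playouts, unsigned seed) {
    std::mt19937 rng(seed);
    std::bernoulli_distribution win(0.5);  // placeholder for a playout result
    for (int i = 0; i < playouts; ++i) {
        if (win(rng))
            node.wins.fetch_add(1, std::memory_order_relaxed);
        node.visits.fetch_add(1, std::memory_order_relaxed);
    }
}

// Starts `num_threads` workers on the same node and waits for all of them.
void run_search(NodeStats& node, int num_threads, int playouts_per_thread) {
    assert(num_threads > 0 && playouts_per_thread >= 0);
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t)
        threads.emplace_back(worker, std::ref(node), playouts_per_thread,
                             1000u + t);
    for (auto& th : threads)  // C++11 auto and range-based for
        th.join();
}
```

After run_search returns, node.visits holds exactly num_threads *
playouts_per_thread, because no increment can be lost. On GCC-based toolchains
the program typically needs to be built with -pthread.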

Best Regards, Marc



Re: [Computer-go] C++11; threads

2014-04-30 Thread Aja Huang
Hey Marc,

2014-04-30 8:37 GMT+01:00 Marc Landgraf mahrgel...@gmail.com:

 Hi,
 my bot is still under construction, but it is written entirely in C++11.
 So, a few comments:
 General:
 Most compilers, especially on Windows, still have problems with C++11
 and its new multithreading library. Right now I'm using mingw-w64-4.8.1,
 as it has the required support for threads (even though it is
 implemented with a workaround via winpthreads) and produces decently fast
 code. But I'm also interested in whether anyone else can share their
 experience with other compilers (for Windows).


Why don't you use Visual Studio 2013? CTP_Nov2013 supports a lot of new
C++11 features.

http://blogs.msdn.com/b/vcblog/archive/2013/11/18/announcing-the-visual-c-compiler-november-2013-ctp.aspx

Aja

Re: [Computer-go] C++11; threads

2014-04-30 Thread Marc Landgraf
Hey,
in the past I tried VS again and again, and in the end I always returned
to Code::Blocks... It really feels like VS and I just don't get along.
Actually, after your comment I tried it again today, but even after
spending a decent amount of time porting my program, it ran awfully
slowly in debug mode and crashed as soon as the VC++ compiler tried to
optimize it. (I need optimization for reasonable performance with
mingw-w64 as well.)
Maybe it is just me and my terrible way of coding... but I can't handle
Visual Studio and Visual C++ properly.
With Code::Blocks, I fooled around with various versions of GCC and ended
up with mingw-w64, which gave me by far the best performance among those
supporting the C++11 features relevant to me.

Marc



Re: [Computer-go] C++11; threads

2014-04-30 Thread Aja Huang
I wrote my Go program Erica completely in Visual Studio and had no problem
at all. It might be around 20% slower on Windows than on Linux, but
compared to other, more important factors, a 20% loss in speed is not
really significant. Maybe the VS profiler can tell why your program ran
awfully slowly in debug mode.

Aja


Re: [Computer-go] C++11; threads

2014-04-30 Thread uurtamo .
What are the extra features? I ask because if you consider something like
OpenSSL, it tries crazily hard to optimize for your CPU. We should be
so lucky...

s.

[Computer-go] C++11; threads

2014-04-29 Thread Darren Cook
A round of compiler/OS upgrades means I'm finally able to use C++11 in
real-world projects, so I've been re-studying all the new features. I
know the computer go community really cares about its CPU cycles, and
also that C++ is widely used, so I'd love to hear stories from the
front line about which C++11 things you are using (or not) and why.

I'm especially interested in whether the threading library is good enough
for the high-performance parallel tree searches people are using. It is
mostly boost::thread, which I was already using, but futures are new
(*), and then there is the low-level memory model and atomics.

Darren

*: I've been using futures to handle the async complexity in
JavaScript/jQuery. For certain kinds of problems they can be a very
graceful solution.
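
As an illustration of the C++11 futures mentioned above, here is a minimal
sketch that runs independent batches of playouts with std::async and collects
the results through std::future. run_batch and parallel_wins are hypothetical
names and the coin flip is a placeholder for a real playout; whether this
style is fast enough for a parallel tree search is exactly the open question
of this thread.

```cpp
#include <cassert>
#include <future>
#include <random>
#include <vector>

// Hypothetical batch of playouts; returns the number of wins in the batch.
int run_batch(int playouts, unsigned seed) {
    std::mt19937 rng(seed);
    std::bernoulli_distribution win(0.5);  // placeholder for a playout result
    int wins = 0;
    for (int i = 0; i < playouts; ++i)
        wins += win(rng) ? 1 : 0;
    return wins;
}

// Launches one asynchronous batch per task and sums the results.
int parallel_wins(int tasks, int playouts_per_task) {
    assert(tasks > 0 && playouts_per_task >= 0);
    std::vector<std::future<int>> results;
    for (int t = 0; t < tasks; ++t)
        results.push_back(std::async(std::launch::async, run_batch,
                                     playouts_per_task, 42u + t));
    int total = 0;
    for (auto& f : results)
        total += f.get();  // get() blocks until that batch has finished
    return total;
}
```

Because every batch owns its random generator and shares no state, no atomics
or mutexes are needed here; the futures are the only synchronization points.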

-- 
Darren Cook, Software Researcher/Developer
My new book: Data Push Apps with HTML5 SSE
Published by O'Reilly: (ask me for a discount code!)
  http://shop.oreilly.com/product/0636920030928.do
Also on Amazon and at all good booksellers!