Re: [Computer-go] C++11; threads
Thanks David,

I ask this because I can only get 2k/s on 19x19 with 200 moves per playout (single thread, no playout termination check), and I saw that people on this forum usually have much better numbers. Looks like I need to work on my performance more :)

Thanks,
Chun

David wrote:
> From the empty board until the playout terminates. I think for 9x9 it's typically around 110 moves.

Chun Sun wrote (Wednesday, June 18, 2014 8:21 AM):
> Hi all, Sorry to ask this beginner question in this thread: When you say playouts per second, how many moves does each playout have on average? Do you play from empty until the board is full for each playout?
Re: [Computer-go] C++11; threads
2k/s doesn't sound too bad (if it's a rather heavy playout policy), but I would expect a lot more than 200 moves for 19x19. On my phone I only get a few hundred 19x19 playouts per thread per second...

Erik
Re: [Computer-go] C++11; threads
I've got an implementation that is very similar to Orego but less feature-rich. I only get 4kpps single-threaded on my power-saver ultrabook. Unfortunately I don't have access to a beefier machine to benchmark with, but I like to -think- *dream* that I would get playout speeds closer to Orego's on a better machine.

Ben
Re: [Computer-go] C++11; threads
That was 9x9; on 19x19 I only get 0.8kpps.
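The speeds quoted in this thread use different units (playouts per second, milliseconds per 1000 playouts, total time for a batch), and playout length differs by board size. A minimal sketch of the conversions follows; the helper names `playouts_per_second` and `moves_per_second` are made up for illustration and do not come from any program mentioned here.

```cpp
#include <cassert>

// Hypothetical helpers for comparing benchmark figures quoted in different units.

// Playouts per second from a measured batch: `playouts` playouts took `millis` ms.
double playouts_per_second(double playouts, double millis) {
    assert(millis > 0.0);
    return playouts * 1000.0 / millis;
}

// Moves per second, which stays comparable when playout length differs
// (for example 9x9 versus 19x19).
double moves_per_second(double playouts_per_sec, double moves_per_playout) {
    return playouts_per_sec * moves_per_playout;
}
```

For example, 2k playouts per second at 200 moves per playout works out to `moves_per_second(2000, 200)`, that is 400,000 moves per second, which is probably the fairer number to compare across board sizes.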
Re: [Computer-go] C++11; threads
It would be interesting to see what llvm/clang does for it.

Folkert van Heusden

On Thu, May 01, 2014 at 01:00:01PM +0200, Marc Landgraf wrote:
> Hey, I'm not talking about a 20% speed loss here with VC++. Just the times for 1000 empty playouts on 9x9, not using any sort of multithreading:
>
>   VS debug configuration: 15257
>   VS release config (optimized): 756
>   C::B mingw-w64, no optimizations: 498
>   C::B mingw-w64, -O3 -fexpensive-optimizations -march=corei7-avx: 108
>
> This of course looks as if it is certainly my fault... But right now I can't find what I'm doing wrong here, so I have to miss out on those handy VS comfort features and continue with C::B + mingw-w64. And the VS profiler results look pretty much like what I got when I last used VerySleepy on my code compiled with mingw. No super drastic bottlenecks, just general slowness it seems. Mingw-w64 makes it impossible to profile the code, but mingw has performance issues as well for me, so I'm using it only when I need profile data (not as drastic as VC++, but about a factor of 3).

2014-04-30 23:24 GMT+02:00 Aja Huang:
> I wrote my Go program Erica completely in Visual Studio and had no problem at all. It might be around 20% slower on Windows than on Linux, but compared to other more important factors a 20% loss in speed is not really significant. Maybe the VS profiler can tell why your program ran awfully slow in debug mode.
>
> Aja

2014-04-30 21:38 GMT+01:00 Marc Landgraf:
> Hey, in the past I tried VS again and again, and in the end always returned to Code::Blocks... It really feels like VS and I won't get along. Actually, after your comment I tried it again today, but even after spending a decent amount of time porting it, the program ran awfully slow in debug mode, and crashed as soon as the VC++ compiler tried to optimize it. (For reasonable performance I need optimization with mingw-w64 as well.) Maybe it is just me and my terrible way of coding... But Visual Studio and Visual C++ I can't handle properly. And with Code::Blocks, I fooled around with various versions of GCC, and ended up with mingw-w64, which gave me by far the best performance among those supporting the C++11 features relevant for me.
>
> Marc

2014-04-30 11:01 GMT+02:00 Aja Huang:
> Hey Marc, why don't you use Visual Studio 2013? CTP_Nov2013 supports a lot of new C++11 features. http://blogs.msdn.com/b/vcblog/archive/2013/11/18/announcing-the-visual-c-compiler-november-2013-ctp.aspx
>
> Aja

2014-04-30 8:37 GMT+01:00 Marc Landgraf:
> Hi, my bot is still under construction, but written entirely in C++11. So a few comments. General: most compilers, especially if you are using Windows, still have problems with C++11 and its new multithreading library. Right now I'm using mingw-w64-4.8.1, as it has the required support for threads, even though it is done with a workaround via winpthreads, and gives decently fast code. But I'm also interested if anyone else can share their experience with other compilers (for Windows).
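The timings for 1000 empty playouts quoted above depend heavily on build configuration, so it helps to measure them the same way under every compiler. The following is a minimal sketch of such a measurement, not the harness anyone in the thread actually used; `time_playouts_ms` and its `run_one` parameter are hypothetical names, and `run_one` stands for whatever function plays a single playout in your engine.

```cpp
#include <cassert>
#include <chrono>

// Times `count` calls of `run_one` (a hypothetical callable that plays one
// playout) and returns the elapsed wall-clock time in milliseconds.
template <typename Playout>
double time_playouts_ms(Playout run_one, int count) {
    assert(count >= 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < count; ++i) {
        run_one();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

When comparing optimized builds, make sure each playout has an observable result (for example, add the score to a total that is printed afterwards); otherwise the optimizer may remove part of the work and the numbers stop being comparable.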
Re: [Computer-go] C++11; threads
Hi all,

Sorry to ask this beginner question in this thread: When you say playouts per second, how many moves does each playout have on average? Do you play from empty until the board is full for each playout?

Thank you,
Chun
Re: [Computer-go] C++11; threads
Marc,

I managed to get mine down to 54ms per 1000 playouts; with 1 thread it runs in 94ms (still checking for positional/situational superko for now). My CPU has 2 physical and 4 logical cores. When I use 2 threads it uses 50% of the CPU and runs in 55ms, and when I use 4 threads it uses 100% CPU and still runs in 55ms. I suspect that either the logical cores aren't being fully utilized, or the bottleneck is memory access speed rather than CPU cycles, or both. Has anyone else run into a similar situation when utilizing hyper-threading?

I think I'll move on to implementing GTP and then Monte Carlo so I can track win/loss and accuracy, then try to adjust my random playouts to take saving/capturing and identifying dead/alive groups into consideration. What are the MoGo-style 3x3 patterns you mentioned?

Ben

On Fri, May 9, 2014 at 10:56 PM, Marc Landgraf wrote:
> oh, my benchmarked numbers are single-threaded... quad-threaded it actually runs about 3-3.5 times as fast

2014-05-09 23:55 GMT+02:00 Marc Landgraf:
> simple ko checks are sufficient in random playouts ;) Limit the number of moves to something reasonable (like 3 times the number of fields on the board) and you will catch that one-in-a-billion-games superko. This should save a fair amount of time. In any case... your speed sounds reasonable. You can optimize it further later on, once you know what exactly you need from your board implementation. Have fun with your tree :) My current implementation runs 100k 9x9 playouts in 7 sec on an i7-3630, 8GB, but has somewhat heavier playouts (saving/capturing, MoGo-style 3x3 patterns, basic dead shapes, some fun about keeping/destroying eyes properly).
>
> Marc

2014-05-09 23:35 GMT+02:00 Ben Ellis:
> I've made a start on my first attempt at writing a Go-playing program using the .NET framework, and with 1000 empty 9x9 random playouts I'm getting the following benchmarks.
>
> Single threaded (uses about 30% CPU):
>   VS DEBUG 64bit (with debugger): 266ms per 1000 playouts
>   VS RELEASE 64bit: 182ms per 1000 playouts
>
> Thread per core (uses 100% CPU; thread per core - 1 yielded similar results):
>   VS DEBUG 64bit (with debugger): 154ms per 1000 playouts
>   VS RELEASE 64bit: 111ms per 1000 playouts
>
> System specifications: Intel Core i7-4600U CPU @ 2.10GHz, 8GB DDR3 RAM
>
> The random player won't play:
> - Positional super-ko moves (or optionally situational)
> - Suicide moves
> - Eye-filling moves
> and continues to play until there are no good moves left (i.e. every empty intersection is an eye, a suicide, or unplayable due to ko).
>
> Am I missing any other checks/features in the random playouts that would normally be implemented? What sort of playout speeds are normal? Should I spend any more time optimizing the random playouts before moving on to a Monte Carlo implementation?
>
> Regards, Ben

On Wed, May 7, 2014 at 11:09 PM, Jason House wrote:
> Simple ko checks are required in playouts. Advanced ko checks are typically restricted to inside the search tree. With simple ko checks, I've had playouts get stuck in a 3-ko cycle. Ko cycles can be caught with a maximum playout length.

On May 7, 2014 10:46 AM, Álvaro Begué wrote:
> I believe you *have to* check for simple ko in playouts. Otherwise you'll end up with infinite playouts quite easily.

On Wed, May 7, 2014 at 9:09 AM, Ben Ellis wrote:
> All, when playing random playouts, do you (anyone) bother checking for ko or superko? Does this have a negative impact on the accuracy of the win:loss outcomes?
>
> Ben

On Thu, May 1, 2014 at 4:52 PM, Marc Landgraf wrote:
> Now I feel stupid :( Thanks... So now I'm down to 126 on average with /O2 /Ot /favor:INTEL64 (+ the usual fluff). This is still about 15% slower than mingw-w64, but this is just for single-threaded playouts. And it looks like this gets compensated when using 4 threads on the same tree, and we arrive at pretty much the same speed.
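The advice quoted above (check simple ko only, and cap the playout length at roughly 3 times the number of points so that rare cycles cannot loop forever) can be sketched as follows. This is only an illustration under stated assumptions: `CyclingBoard`, `has_legal_move`, `play_random_move` and `run_playout` are hypothetical names, and the toy board simply never runs out of moves so that the cap is what ends the playout.

```cpp
#include <cassert>

// Hypothetical stand-in for a real board: it always reports a legal move,
// mimicking a playout stuck in a cycle that a simple-ko check cannot detect.
struct CyclingBoard {
    int moves_played = 0;
    bool has_legal_move() const { return true; }
    void play_random_move() { ++moves_played; }
};

// Plays until no legal move remains or the move cap is reached.
// Returns true if the playout ended on its own, false if the cap cut it off.
template <typename Board>
bool run_playout(Board& board, int board_size) {
    assert(board_size > 0);
    const int max_moves = 3 * board_size * board_size;  // cap suggested in the thread
    for (int move = 0; move < max_moves; ++move) {
        if (!board.has_legal_move()) {
            return true;
        }
        board.play_random_move();
    }
    return false;
}
```

On 9x9 the cap is 3 * 81 = 243 moves, comfortably above the typical playout length of around 110 moves mentioned elsewhere in the thread, so normal playouts should never hit it.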
Re: [Computer-go] C++11; threads
You can read about it here: http://hal.inria.fr/docs/00/12/15/16/PDF/RR-6062.pdf

And don't bother too much about your speed right now ;) You will see things differently once you have the bigger picture anyway.
Re: [Computer-go] C++11; threads
CPU cores are meant to be used by a single thread only. You can use more, but this rests on the assumption that two (or more) threads can effectively utilize a single core without too much competition over resources. This assumption holds in many situations, e.g. when we often have to wait for IO or server queries or things like that, and in these cases using HT can give a small performance boost.

Now, here we are only utilizing the CPU. In this case the threads are only getting in each other's way. The effects of this can be seen very clearly with your program. It seems to scale perfectly, or very nearly so, as long as you don't use more threads than you have actual cores on your computer. When you go above that limit the scaling goes to hell and you get no improvement at all.

The only way to solve your scaling problem is to get rid of the competing threads by turning off HT. I did this and have never looked back.

-Mikko Aarnos
Re: [Computer-go] C++11; threads
This is not accurate. In my experience you should expect a substantial performance increase from hyperthreading. (For my program on an i7-3930 it was something like 40%, Zen got a similar number, and others on this list have claimed even higher numbers, e.g., see: http://dvandva.org/pipermail/computer-go/2012-August/005298.html).

Erik
Re: [Computer-go] C++11; threads
I have the same experience as Erik. My quad-core CPU gets about 40% to 50% more output from 8 threads than from 4.
Re: [Computer-go] C++11; threads
oakfoam too, but I remember it was different before large patterns and other features were added.

My working hypothesis is: if one thread on a core has to wait for a memory read, the other thread can work (as far as I remember, the two hyperthreads share the same integer and floating point units, for example). If there are no bottlenecks there might be no improvement.

By the way: a display of 50% CPU in the case of 4 threads on a 4-core, 8-HT processor is nonsense, of course. This is because the operating system calculates as if all 8 HTs were independent CPUs, but they are not!

Detlef
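The 50% figure discussed here is just arithmetic over logical processors. A tiny sketch of that calculation, assuming the operating system simply divides busy threads by logical processors (the function name `reported_cpu_percent` is made up for illustration):

```cpp
#include <cassert>

// Share of logical processors that are busy, as a task manager would show it.
// It says nothing about how much of each physical core is actually in use.
double reported_cpu_percent(int busy_threads, int logical_processors) {
    assert(logical_processors > 0);
    return 100.0 * busy_threads / logical_processors;
}
```

With 4 busy threads on 8 logical processors this gives 50, and with 2 busy threads on 4 logical processors it also gives 50, which matches the 2-core, 4-thread laptop reported earlier in the thread even though both physical cores were already occupied.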
Re: [Computer-go] C++11; threads
Mikko,

This is pretty much what I expected, thanks for re-affirming this. I agree from my experience: because I'm doing pure random playouts with very few lookups on patterns, MCTS trees, etc., my CPU utilization is higher until I start adding these features, so I will see less of a gain when using HT.

On Mon, May 12, 2014 at 6:54 PM, Mikko Aarnos wrote:
> There is a big difference here: Ellis's program can only do light playouts. He doesn't have MCTS or patterns. That is parallelized extremely simply by just giving each thread an internal board state, doing a playout from that, resetting the board state to the original, doing a playout, etc. There are no bottlenecks there, and that shouldn't get any increase in performance from HT as far as I know (also see the first sentence of Schmicker's comment). On the other hand, your programs all have MCTS and patterns (with the emphasis on patterns), and both need constant memory reads. Given that, it's not a huge surprise that HT works better.
>
> Still, that doesn't change the fact that I was a bit off. I never expected that there was so much memory reading that it would actually make HT work. I guess I should implement patterns and see if I get similar results.
>
> Regards, Mikko Aarnos
>
> PS. Of course, all this rests on the assumption that you don't do your playouts exactly like Ellis, and if you do I am really, REALLY surprised by the performance of HT.
>
> PPS. And on the assumption that the 40%-50% performance increase was from going from 4 threads to 8 threads with HT on all the time, not from going from 4 threads with no HT to 8 threads with HT. Here as well, if the latter is true I am again honestly surprised.
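The simple parallelization described in the quoted message (each thread owns its state and runs playouts independently) can be sketched with C++11 `std::thread` as below. This is an assumption-laden illustration, not anyone's actual engine: `toy_playout`, `PlayoutTotals` and `run_parallel_playouts` are hypothetical names, and the "playout" just consumes random numbers where a real program would play on its private board copy and then reset it.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <thread>
#include <vector>

// Hypothetical stand-in for a light playout: draws 110 random bits and
// reports a pseudo-result. A real engine would play on its own board here.
inline int toy_playout(std::mt19937& rng) {
    int ones = 0;
    for (int move = 0; move < 110; ++move) {
        ones += static_cast<int>(rng() & 1u);
    }
    return ones > 55 ? 1 : 0;
}

struct PlayoutTotals {
    std::int64_t playouts = 0;
    std::int64_t black_wins = 0;
};

// Runs `per_thread` playouts on each of `num_threads` threads.
inline PlayoutTotals run_parallel_playouts(unsigned num_threads, int per_thread) {
    assert(num_threads > 0 && per_thread >= 0);
    std::vector<PlayoutTotals> per_worker(num_threads);
    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&per_worker, t, per_thread] {
            std::mt19937 rng(12345u + t);  // private generator, distinct seed
            PlayoutTotals local;           // private counters, nothing shared in the loop
            for (int i = 0; i < per_thread; ++i) {
                local.black_wins += toy_playout(rng);
                ++local.playouts;
            }
            per_worker[t] = local;  // one write per thread, after the loop
        });
    }
    for (std::thread& worker : workers) worker.join();
    PlayoutTotals total;
    for (const PlayoutTotals& w : per_worker) {
        total.playouts += w.playouts;
        total.black_wins += w.black_wins;
    }
    return total;
}
```

To test the hyperthreading question for your own program, time this kind of loop once with the number of physical cores and once with `std::thread::hardware_concurrency()`, which is only a hint and on hyperthreaded machines typically reports logical processors. Keeping counters thread-local until the end avoids threads writing to neighbouring memory inside the hot loop, which could otherwise distort the comparison.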
Re: [Computer-go] C++11; threads
Mikko Aarnos wrote:
> There is a big difference here: Ellis's program can only do light playouts. He doesn't have MCTS or patterns. That is parallelized extremely simply by just giving each thread an internal board state, doing a playout from that, resetting the board state to the original, doing a playout etc. There are no bottlenecks there, and that shouldn't get any increase in performance from HT as far as I know (also see the first sentence of Schmicker's comment).

I don't think that's right. I tried an experiment once with hyperthreading and 'light playouts' and I got a 40% improvement from using two threads per core.

There are plenty of bottlenecks even in such simple code. For example, any time you do something equivalent to following a linked list (e.g., finding the stones in a group that you're joining to another group) the thread will have to wait three or four cycles per 'link' even if all the data is in level-1 cache.

One way to tell whether code is likely to benefit from hyperthreading is to use a tool that reports the processor's performance counters and look at the 'instructions per clock' measure. If it's somewhere around 1 then there are excellent chances of getting good results from hyperthreading.

-M-
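The linked-list point above can be made concrete with a small sketch. It assumes a common board representation in which the stones of a group form a circular list through a `next_in_group` array; that layout and the function name `count_group_stones` are assumptions for illustration, not something taken from a specific program in this thread.

```cpp
#include <cassert>
#include <vector>

// Counts the stones in the group containing `start`. The stones of a group are
// assumed to form a circular linked list through `next_in_group`, indexed by point.
int count_group_stones(const std::vector<int>& next_in_group, int start) {
    assert(start >= 0 && start < static_cast<int>(next_in_group.size()));
    int count = 0;
    int point = start;
    do {
        ++count;
        // The next index is only known once this load completes, so the loads
        // form a dependency chain and cannot overlap, even on L1 cache hits.
        point = next_in_group[point];
    } while (point != start);
    return count;
}
```

Each iteration has to wait for the previous load before it can issue the next one, which is the kind of stall a second hardware thread on the same core may be able to fill.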
Re: [Computer-go] C++11; threads
Hm... I can't confirm that. I just ran a few tests, and I can do 10k empty 19x19 playouts per second on 4 threads, and about 11-12k on 8 threads. I did the test with only playouts: no tree search, no additional pattern matching or anything else.

With tree search, going to more than 4 search threads actually slows down my code, due to some blocking issues. (This will be fixed at some point.) But during tree search right now I'm also using 4 dedicated search/playout threads, plus some other threads doing minor work (non-blocking cleanup and management of the tree, checking the GTP console, etc.).

So I guess it really depends on the implementation, and everyone has to run their own tests to see how their bot does best.

2014-05-12 22:10 GMT+02:00 Matthew Woodcraft matt...@woodcraft.me.uk:
> I tried an experiment once with hyperthreading and 'light playouts' and I got a 40% improvement from using two threads per core.
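Since the advice here is to measure on your own machine, the following is a rough C++11 sketch of a thread-scaling harness. `fake_playout` is only a stand-in workload (an assumption, not a real playout), and `run_benchmark` and `BenchResult` are hypothetical names; a real engine would run its own playout on a thread-local board instead.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <random>
#include <thread>
#include <vector>

// Stand-in for a real playout: draws `moves` pseudo-random numbers from a
// private RNG and folds them together (placeholder workload only).
inline std::uint32_t fake_playout(std::mt19937& rng, int moves) {
    std::uint32_t acc = 0;
    for (int i = 0; i < moves; ++i) acc ^= static_cast<std::uint32_t>(rng());
    return acc;
}

struct BenchResult {
    std::uint64_t playouts;  // total playouts completed by all threads
    double seconds;          // wall-clock time for the whole run
    std::uint32_t checksum;  // keeps the work observable to the optimizer
};

inline BenchResult run_benchmark(unsigned threads, int playouts_per_thread, int moves) {
    assert(threads > 0);
    std::atomic<std::uint64_t> total(0);
    std::atomic<std::uint32_t> sink(0);
    std::vector<std::thread> pool;
    const auto start = std::chrono::steady_clock::now();
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&total, &sink, t, playouts_per_thread, moves] {
            std::mt19937 rng(1234u + t);  // per-thread state, nothing shared
            std::uint32_t local = 0;
            for (int i = 0; i < playouts_per_thread; ++i)
                local ^= fake_playout(rng, moves);
            sink.fetch_xor(local);
            total.fetch_add(static_cast<std::uint64_t>(playouts_per_thread));
        });
    }
    for (auto& th : pool) th.join();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return BenchResult{total.load(), elapsed.count(), sink.load()};
}
```

Calling `run_benchmark` with the number of physical cores and then with `std::thread::hardware_concurrency()` threads, and comparing `playouts / seconds`, gives the kind of 4-versus-8-thread comparison reported above. Run it several times; single measurements are noisy.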
Re: [Computer-go] C++11; threads
From the few articles I've read on hyperthreading (in particular this one: http://msdn.microsoft.com/en-us/magazine/cc300701.aspx), each logical core can have two concurrent instruction streams, and if one is waiting for a resource (i.e. one that is either in use by another thread or requires an access to main memory) then the other instruction stream continues while the first stream is blocked.

But because I use independent board states for each thread and the total memory used is less than my L3 cache size, none of my threads wait long for access to main memory or contend with each other, so they don't easily give up control to the second stream.

Also, because I'm using .NET, the CLR arranges the memory in such a way that concurrent threads won't (or will be less likely to) cross cache lines, resulting in less resource contention in the L1/L2/L3 caches. Unless the C/C++ compiler optimizes for a CPU with a specific number of logical and physical cores, I doubt it would be as efficient as the compiled .NET code with regard to how the memory is mapped to the L1/L2/L3 caches.

This is all new to me, so I'm likely wrong, but I feel like sharing my thoughts :)

On Mon, May 12, 2014 at 9:10 PM, Matthew Woodcraft matt...@woodcraft.me.uk wrote:
> I tried an experiment once with hyperthreading and 'light playouts' and I got a 40% improvement from using two threads per core. There are plenty of bottlenecks even in such simple code.
Re: [Computer-go] C++11; threads
Ben Ellis wrote:
> From the few articles I've read on hyperthreading (in particular this one: http://msdn.microsoft.com/en-us/magazine/cc300701.aspx), each logical core can have two concurrent instruction streams, and if one is waiting for a resource (i.e. one that is either in use by another thread or requires an access to main memory) then the other instruction stream continues while the first stream is blocked.
>
> But because I use independent board states for each thread and the total memory used is less than my L3 cache size, none of my threads wait long for access to main memory or contend with each other, so they don't easily give up control to the second stream.

The two streams don't really 'give up control' to each other; they both run at once. A modern Intel processor can in principle execute something like four operations per cycle, and in each cycle they can be a mixture of the two streams.

So it doesn't take anything nearly as 'heavy' as an access to main memory to mean that you can get value from the second stream. Even an access to L1 cache takes 4 cycles to complete, and if the following instructions depend on the value being read then the processor won't be doing anything else from that stream until the read completes.

And even if there are no reads from memory at all, it's pretty rare that the processor can find enough parallel work to get close to keeping four-ish execution units busy from only one instruction stream.

The main reasons why in practice you often don't get value from hyperthreading are that the two threads have to share the L1 and L2 caches, and they also share the resources for instruction fetch and decode (which can turn out to be a bottleneck disappointingly frequently).

As long as you're doing light playouts you shouldn't have to worry about the L3 cache; everything should fit very comfortably in L2 (and quite possibly in L1, though I don't know what the CLR overhead is like).

-M-
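The dependency argument can be tried out within a single thread. The sketch below is only an analogy for what a second hardware thread provides, not a hyperthreading test, and the names `make_ring`, `chase_one` and `chase_two` are made up for illustration. `chase_one` follows one chain of dependent loads; `chase_two` follows two independent chains in the same loop. If the loop is latency-bound, the second chain comes close to free, because its loads can overlap with the first chain's waiting time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build a table where next[i] = (i + step) % n, so every lookup tells us
// where to look next (a stand-in for following links in a data structure).
inline std::vector<std::size_t> make_ring(std::size_t n, std::size_t step) {
    assert(n > 0);
    std::vector<std::size_t> next(n);
    for (std::size_t i = 0; i < n; ++i) next[i] = (i + step) % n;
    return next;
}

// One dependent chain: each load needs the result of the previous load.
inline std::size_t chase_one(const std::vector<std::size_t>& next,
                             std::size_t start, std::size_t hops) {
    std::size_t p = start;
    for (std::size_t i = 0; i < hops; ++i) p = next[p];
    return p;
}

struct Pair {
    std::size_t a;
    std::size_t b;
};

// Two independent chains in one loop: the loads for `a` and `b` do not
// depend on each other, so the processor can overlap them.
inline Pair chase_two(const std::vector<std::size_t>& next,
                      std::size_t start_a, std::size_t start_b,
                      std::size_t hops) {
    std::size_t a = start_a;
    std::size_t b = start_b;
    for (std::size_t i = 0; i < hops; ++i) {
        a = next[a];
        b = next[b];
    }
    return Pair{a, b};
}
```

Timing `chase_one` against `chase_two` for the same number of hops on a table that fits in L1 is one way to see the effect; the exact ratio depends on the processor and on compiler settings.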
Re: [Computer-go] C++11; threads
I've made a start on my first attempt at writing a Go-playing program using the .NET framework, and with 1000 empty 9x9 random playouts I'm getting the following benchmarks:

Single Threaded (uses about 30% CPU)
  VS DEBUG 64bit (with Debugger) - 266ms per 1000 playouts
  VS RELEASE 64bit - 182ms per 1000 playouts

Thread Per Core (uses 100% CPU) (Thread Per Core - 1 yielded similar results)
  VS DEBUG 64bit (with Debugger) - 154ms per 1000 playouts
  VS RELEASE 64bit - 111ms per 1000 playouts

System Specifications:
  Processor: Intel Core i7-4600U CPU @ 2.10GHz
  8GB DDR3 RAM

The random player won't play:
- Positional super-ko moves (or optionally situational)
- Suicide moves
- Eye-filling moves

and continues to play until there are no good moves left (i.e. all empty intersections are an eye, suicide or unplayable due to ko).

Am I missing any other checks/features in the random playouts that would normally be implemented? What sort of playout speeds are normal? Should I spend any more time optimizing the random playouts before moving on to a Monte Carlo implementation?

Regards, Ben

On Wed, May 7, 2014 at 11:09 PM, Jason House jason.james.ho...@gmail.com wrote:
> Simple ko checks are required in playouts. Advanced ko checks are typically restricted to inside the search tree. With simple ko checks, I've had playouts get stuck in a 3 ko cycle. Ko cycles can be caught with a maximum playout length.
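The message does not say how "eye-filling" is detected, so the following C++11 sketch shows one widely used heuristic as an assumption: a point counts as eye-like for a colour when every orthogonal neighbour is that colour or off the board, and the opponent holds fewer than two diagonals (none if the point is on the edge or in a corner). `Board`, `at` and `is_eye_like` are hypothetical names.

```cpp
#include <array>
#include <cassert>

constexpr int kSize = 9;
enum Color : char { Empty = 0, Black = 1, White = 2 };
using Board = std::array<Color, kSize * kSize>;

inline bool on_board(int x, int y) {
    return x >= 0 && x < kSize && y >= 0 && y < kSize;
}

inline Color at(const Board& b, int x, int y) { return b[y * kSize + x]; }

// Heuristic one-point-eye test for colour `c` at (x, y).
inline bool is_eye_like(const Board& b, int x, int y, Color c) {
    assert(on_board(x, y) && c != Empty);
    if (at(b, x, y) != Empty) return false;
    static const int ortho[4][2] = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
    static const int diag[4][2] = {{1, 1}, {1, -1}, {-1, 1}, {-1, -1}};
    // Every on-board orthogonal neighbour must be a stone of colour c.
    for (const auto& d : ortho) {
        const int nx = x + d[0];
        const int ny = y + d[1];
        if (on_board(nx, ny) && at(b, nx, ny) != c) return false;
    }
    // Count enemy stones on the diagonals and note whether we are at the edge.
    const Color opp = (c == Black) ? White : Black;
    int enemy = 0;
    bool at_edge = false;
    for (const auto& d : diag) {
        const int nx = x + d[0];
        const int ny = y + d[1];
        if (!on_board(nx, ny)) {
            at_edge = true;
        } else if (at(b, nx, ny) == opp) {
            ++enemy;
        }
    }
    return at_edge ? enemy == 0 : enemy <= 1;
}
```

A random player would skip any candidate move for which `is_eye_like` returns true for its own colour. The heuristic misjudges some shapes, which is usually accepted in light playouts in exchange for speed.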
Re: [Computer-go] C++11; threads
Simple ko checks are sufficient in random playouts ;) Limit the number of moves to something reasonable (like 3 times the number of points on the board) and you will catch that one-in-a-billion-games superko. This should save a fair amount of time.

In any case, your speed sounds reasonable. You can optimize it further later on, once you know what exactly you need from your board implementation. Have fun with your tree :)

My current implementation runs 100k 9x9 playouts in 7 sec on an i7-3630, 8GB, but has somewhat heavier playouts (saving/capturing, MoGo-style 3x3 patterns, basic dead shapes, some fun about keeping/destroying eyes properly).

Marc

2014-05-09 23:35 GMT+02:00 Ben Ellis ben.el...@softweyr.co.uk:
> Am I missing any other checks/features in the random playouts that would normally be implemented? What sort of playout speeds are normal? Should I spend any more time optimizing the random playouts before moving on to a Monte Carlo implementation?
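The move cap and the speed figures in this message are simple arithmetic, sketched here in C++11. The function names and the `play_one_move` callback are hypothetical; the callback stands in for whatever the engine does to play one move and is expected to return false once the playout has ended normally.

```cpp
#include <cassert>
#include <functional>

// Cap suggested above: 3 times the number of points on the board.
inline int max_playout_moves(int board_size) {
    assert(board_size > 0);
    return 3 * board_size * board_size;
}

struct PlayoutOutcome {
    int moves_played;  // moves actually played
    bool hit_cap;      // true if the playout was cut off by the cap
};

// Runs one playout, stopping either when the callback reports the end of
// the game or when the move cap is reached (e.g. in a rare ko cycle).
inline PlayoutOutcome run_capped_playout(int board_size,
                                         const std::function<bool()>& play_one_move) {
    const int cap = max_playout_moves(board_size);
    int moves = 0;
    while (moves < cap) {
        if (!play_one_move()) return PlayoutOutcome{moves, false};
        ++moves;
    }
    return PlayoutOutcome{moves, true};
}

// Converts "N playouts in T seconds" into playouts per second.
inline double playouts_per_second(double playouts, double seconds) {
    assert(seconds > 0.0);
    return playouts / seconds;
}
```

With these definitions the cap is 243 moves on 9x9 and 1083 on 19x19. The figures quoted in the thread work out to roughly 14,300 playouts per second for 100k playouts in 7 seconds, and roughly 5,500 per second for 182 ms per 1000 playouts.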
Re: [Computer-go] C++11; threads
Oh, my benchmarked numbers are single-threaded. Quad-threaded, it actually runs about 3-3.5 times as fast.

2014-05-09 23:55 GMT+02:00 Marc Landgraf mahrgel...@gmail.com:
> My current implementation runs 100k 9x9 playouts in 7 sec on an i7-3630, 8GB, but has somewhat heavier playouts (saving/capturing, MoGo-style 3x3 patterns, basic dead shapes, some fun about keeping/destroying eyes properly).
Re: [Computer-go] C++11; threads
All,

When playing random playouts, do you (anyone) bother checking for ko or superko? Does this have a negative impact on the accuracy of the win:loss outcomes?

Ben

On Thu, May 1, 2014 at 4:52 PM, Marc Landgraf mahrgel...@gmail.com wrote:
> Now I feel stupid :( Thanks...
>
> So now I'm down to 126 on average with /O2 /Ot /favor:INTEL64 (+ the usual fluff). This is still about 15% slower than mingw-w64, but this is just for single-threaded playouts. And it looks like when using 4 threads on the same tree, this gets compensated, and we arrive at pretty much the same speed.
Re: [Computer-go] C++11; threads
I believe you *have to* check for simple ko in playouts. Otherwise you'll end up with infinite playouts quite easily.

On Wed, May 7, 2014 at 9:09 AM, Ben Ellis ben.el...@softweyr.co.uk wrote:
> All, When playing random playouts, do you (anyone) bother checking for KO or super KO? Does this have a negative impact on accuracy of the win:loss outcomes? Ben
Re: [Computer-go] C++11; threads
Simple ko checks are required in playouts. Advanced ko checks are typically restricted to inside the search tree. Even with simple ko checks, I've had playouts get stuck in a 3-ko cycle. Ko cycles can be caught with a maximum playout length.

On May 7, 2014 10:46 AM, Álvaro Begué alvaro.be...@gmail.com wrote:
> I believe you *have to* check for simple ko in playouts. Otherwise you'll end up with infinite playouts quite easily.
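For readers unsure what a "simple ko check" involves, here is a minimal C++11 sketch under the usual rule: a move creates a ko when it captures exactly one stone and the capturing stone is left as a single stone with exactly one liberty. The `MoveResult` struct is a hypothetical summary of what a board implementation would know right after placing a stone; none of these names come from the thread.

```cpp
#include <cassert>

constexpr int kNoKo = -1;

// Hypothetical facts a board implementation knows right after a move.
struct MoveResult {
    int captured_stones;      // enemy stones removed by the move
    int captured_point;       // where the stone was, if exactly one was removed
    int own_group_size;       // stones in the group of the stone just played
    int own_group_liberties;  // liberties of that group after the captures
};

// Returns the point the opponent may not play on the very next move,
// or kNoKo if the move did not create a simple ko.
inline int simple_ko_point(const MoveResult& r) {
    assert(r.captured_stones >= 0 && r.own_group_size >= 1);
    const bool is_ko = r.captured_stones == 1 &&
                       r.own_group_size == 1 &&
                       r.own_group_liberties == 1;
    return is_ko ? r.captured_point : kNoKo;
}

// A candidate move is illegal under simple ko if it retakes immediately.
inline bool is_simple_ko_violation(int candidate_point, int ko_point) {
    return ko_point != kNoKo && candidate_point == ko_point;
}
```

The ko point only has to be remembered for one move, which is why this check is cheap enough for playouts. Longer cycles such as triple ko are not caught by it and are left to the maximum playout length mentioned above.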
Re: [Computer-go] C++11; threads
Hey, I'm not talking about a 20% speed loss here with VC++. Just the times for 1000 empty playouts on 9x9, not using any sort of multithreading:

VS debug configuration: 15257
VS release config (optimized): 756
C::B mingw-w64 no optimizations: 498
C::B mingw-w64 -O3 -fexpensive-optimizations -march=corei7-avx: 108

Of course, this clearly looks like it is my fault... But right now I can't find what I'm doing wrong here, and so I have to miss out on those handy VS comfort features and continue with C::B + mingw-w64.

And the VS profiler results look pretty much like what I got when I last used VerySleepy on my code compiled with mingw. No super drastic bottlenecks, just general slowness it seems. Mingw-w64 makes it impossible to profile the code, but mingw has performance issues as well for me, so I'm using it only when I need profile data (not as drastic as VC++, but about a factor of 3).

2014-04-30 23:24 GMT+02:00 Aja Huang ajahu...@gmail.com:
> I wrote my Go program Erica completely in Visual Studio and had no problem at all. It might be around 20% slower on Windows than on Linux, but compared to other more important factors 20% loss in speed is not really significant. Maybe VS profiler can tell why your program ran awfully slow in debug mode.
>
> Aja
>
> 2014-04-30 21:38 GMT+01:00 Marc Landgraf mahrgel...@gmail.com:
>> Hey, in the past I tried VS again and again, and in the end always returned to Code::Blocks... It really feels like VS and I won't get along. Actually, after your comment I tried it again today, but even after spending a decent amount of time porting it, the program ran awfully slow in debug mode, and crashed as soon as the VC++ compiler tried to optimize it. (For reasonable performance I need optimization with mingw-w64 as well.) Maybe it is just me and my terrible way of coding... But Visual Studio and Visual C++ I can't handle properly.
>>
>> And with Code::Blocks, I fooled around with various versions of GCC, and ended up with mingw-w64, which gave me by far the best performance among those supporting the C++11 features relevant to me.
>>
>> Marc
>>
>> 2014-04-30 11:01 GMT+02:00 Aja Huang ajahu...@gmail.com:
>>> Hey Marc,
>>>
>>> 2014-04-30 8:37 GMT+01:00 Marc Landgraf mahrgel...@gmail.com:
>>>> Hi, my bot is still under construction, but written entirely in C++11. So a few comments:
>>>>
>>>> General: Most compilers, especially if you are using Windows, still have problems with C++11 and its new multithreading library. Right now I'm using mingw-w64-4.8.1 as it has the required support for threads, even though it is done with a workaround via winpthreads, and gives decently fast code. But I'm also interested if anyone else can share their experience with other compilers (for Windows).
>>>
>>> Why don't you use Visual Studio 2013? CTP_Nov2013 supports a lot of new C++11 features.
>>> http://blogs.msdn.com/b/vcblog/archive/2013/11/18/announcing-the-visual-c-compiler-november-2013-ctp.aspx
>>>
>>> Aja
Re: [Computer-go] C++11; threads
That is amazing.

On May 1, 2014 4:00 AM, Marc Landgraf mahrgel...@gmail.com wrote:
> Just the times for 1000 empty playouts on 9x9, not using any sort of multithreading:
> VS debug configuration: 15257
> VS release config (optimized): 756
> C::B mingw-w64 no optimizations: 498
> C::B mingw-w64 -O3 -fexpensive-optimizations -march=corei7-avx: 108
Re: [Computer-go] C++11; threads
On 01/05/2014 at 13:00, Marc Landgraf wrote:

Just the times for 1000 empty playouts on 9x9, not using any sort of multithreading: VS debug configuration: 15257 VS release config (optimized): 756 C::B mingw-w64 no optimizations: 498 C::B mingw-w64 -O3 -fexpensive-optimizations -march=corei7-avx: 108

Are you doing any memory allocation or input/output? If so, you should not start the program with F5 from inside VS, but with Ctrl+F5 (Start Without Debugging).

hj.
Re: [Computer-go] C++11; threads
Debug configurations in VS have a tremendous amount of code verification built in. They check for buffer overwrites, API usage, and use of uninitialized data, and the coverage is very extensive. It is expensive, too, but totally worthwhile because of the time it saves developers. The speed of a debug build is not really that important anyway.

Check your optimization settings for the Release build. There are a lot of optional choices, some of which are not 'safe' in general, and you have to measure each one. I recall a ~2:1 ratio between the best and worst speeds based just on tweaking settings for the Release build of Pebbles.
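The advice to measure each Release setting, and the "time for 1000 playouts" figures quoted in this thread, suggest a small timing harness along the following lines. This is only a sketch under assumptions: `dummy_playout` and `time_playouts` are hypothetical names, the xorshift loop merely stands in for a real playout, and the thread does not state the unit of the posted timings, so the milliseconds returned here are a choice made for the example.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for one playout: a cheap xorshift loop so that the
// harness has something to time. Replace it with the real playout function.
inline std::uint32_t dummy_playout(std::uint32_t seed) {
    std::uint32_t x = seed | 1u;              // xorshift state must be non-zero
    for (int move = 0; move < 110; ++move) {  // ~110 moves, the 9x9 figure mentioned in the thread
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
    }
    return x;
}

// Runs `playouts` playouts and returns the elapsed wall-clock time in ms.
// `sink` accumulates the results so the optimizer cannot discard the loop.
inline double time_playouts(int playouts, std::uint32_t& sink) {
    assert(playouts > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < playouts; ++i) {
        sink += dummy_playout(static_cast<std::uint32_t>(i));
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_playouts(1000, sink)` from the same source file built once per configuration (Debug vs. Release in VS, or different optimization levels in GCC) gives numbers that can be compared directly, which is the kind of per-setting measurement recommended above.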
Re: [Computer-go] C++11; threads
Now I feel stupid :( Thanks...

So now I'm down to 126 on average with /O2 /Ot /favor:INTEL64 (plus the usual fluff). This is still about 15% slower than mingw-w64, but that is only for single-threaded playouts. It looks like this gets compensated when using 4 threads on the same tree, where we arrive at pretty much the same speed.

2014-05-01 15:36 GMT+02:00 Harald Johnsen hjohn...@evc.net:

Are you doing any memory allocation or input/outputs ?
Re: [Computer-go] C++11; threads
Hi,

my bot is still under construction, but it is written entirely in C++11. So a few comments:

General: Most compilers, especially on Windows, still have problems with C++11 and its new multithreading library. Right now I'm using mingw-w64-4.8.1, as it has the required support for thread, even though this is done with a workaround via winpthreads, and it produces decently fast code. But I'm also interested in whether anyone else can share their experience with other compilers (for Windows).

Multithreading: The new thread library fulfills all the requirements of my bot. I haven't compared it against boost::thread, though, so I can't say which is faster.

Synchronization: The improved atomics for base types are as fast as the base types themselves and really useful when running multiple threads on the same tree. This is not true for mutex: mutexes can cause a dramatic slowdown if used too often, so I reduced their use to an absolute minimum. With that reduction, condition_variable disappeared entirely from my code, so I can't tell how well it performs. (I initially had a producer-consumer queue to spread playouts over multiple threads; now I have switched to all threads doing full exploration + playout runs.)

Other: To me, the most noticeable other feature is the reworked auto keyword, which mostly improves code readability, and, more importantly, the new for loops that iterate over any STL container. These loops not only improve readability even more, but on some occasions can bring drastic speed improvements compared to old-style looping.

Best Regards, Marc
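The synchronization approach described above (atomics for shared counters, no mutex on the hot path, every thread running its own full loop) can be sketched roughly as follows. This is an illustration only, not Marc's actual code: `NodeStats`, `worker`, and `run_workers` are hypothetical names, the win rule is a placeholder for a real playout, and relaxed memory ordering is an assumption that is only reasonable for pure statistics counters.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical per-node statistics shared by all search threads.
struct NodeStats {
    std::atomic<std::uint32_t> visits{0};
    std::atomic<std::uint32_t> wins{0};
};

// Each worker runs its own loop and updates the shared node without a mutex.
// The win rule below is a placeholder, not a real playout.
inline void worker(NodeStats& node, int playouts, int worker_id) {
    for (int i = 0; i < playouts; ++i) {
        const bool win = ((i + worker_id) % 2) == 0;
        node.visits.fetch_add(1, std::memory_order_relaxed);
        if (win) node.wins.fetch_add(1, std::memory_order_relaxed);
    }
}

// Starts `threads` workers on the same node and returns the total visit count.
inline std::uint32_t run_workers(int threads, int playouts_per_thread) {
    assert(threads > 0 && playouts_per_thread >= 0);
    NodeStats node;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back(worker, std::ref(node), playouts_per_thread, t);
    }
    for (auto& th : pool) th.join();  // joining makes all updates visible to this thread
    return node.visits.load();
}
```

Because every increment is an atomic read-modify-write, `run_workers(4, 1000)` counts exactly 4000 visits however the threads interleave. The range-based for over `pool` also shows the `auto` and new loop syntax mentioned in the message.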
Re: [Computer-go] C++11; threads
Hey Marc,

2014-04-30 8:37 GMT+01:00 Marc Landgraf mahrgel...@gmail.com:

But I'm also interested if anyone else can share his experience with other compilers. (for windows)

Why don't you use Visual Studio 2013? CTP_Nov2013 supports a lot of new C++11 features.

http://blogs.msdn.com/b/vcblog/archive/2013/11/18/announcing-the-visual-c-compiler-november-2013-ctp.aspx

Aja
Re: [Computer-go] C++11; threads
Hey,

in the past I tried VS again and again, and in the end I always returned to Code::Blocks... It really feels like VS and I don't get along. Actually, after your comment I tried it again today, but even after spending a decent amount of time porting my code, the program ran awfully slowly in debug mode and crashed as soon as the VC++ compiler tried to optimize it. (For reasonable performance I need optimization with mingw-w64 as well.)

Maybe it is just me and my terrible way of coding... But I can't handle Visual Studio and Visual C++ properly. With Code::Blocks, I fooled around with various versions of GCC and ended up with mingw-w64, which gave me by far the best performance among those supporting the C++11 features relevant to me.

Marc

2014-04-30 11:01 GMT+02:00 Aja Huang ajahu...@gmail.com:

Why don't you use Visual Studio 2013?
Re: [Computer-go] C++11; threads
I wrote my Go program Erica completely in Visual Studio and had no problem at all. It might be around 20% slower on Windows than on Linux, but compared to other more important factors 20% loss in speed is not really significant. Maybe VS profiler can tell why your program ran awfully slow in debug mode.

Aja
Re: [Computer-go] C++11; threads
What are the extra features? I ask because if you consider something like openssl, it tries crazily hard to optimize against your cpu. We should be so lucky..

s.
[Computer-go] C++11; threads
A round of compiler/OS upgrades means I'm finally able to use C++11 in real-world projects, so I've been re-studying all the new features. I know the computer go community really cares about its CPU cycles, and also that C++ is widely used, so I'd love to hear stories from the frontline about which C++11 things you are using (or not) and why.

I'm especially interested in whether the threading library is good enough for the high-performance parallel tree searches people are using. It is mostly boost::thread, which I was already using, but futures are new (*), and then there is the low-level memory model and atomics.

Darren

*: I've been using futures to handle the async complexity in JavaScript/jQuery. For certain kinds of problems they can be a very graceful solution.

-- Darren Cook, Software Researcher/Developer
My new book: Data Push Apps with HTML5 SSE
Published by O'Reilly: (ask me for a discount code!) http://shop.oreilly.com/product/0636920030928.do
Also on Amazon and at all good booksellers!
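For readers who have not used the C++11 futures mentioned above, here is a minimal sketch of how `std::async` and `std::future` could split independent batches of playouts across threads and collect the results. It is an assumption-laden illustration, not a recommendation from the thread: `run_batch` and `parallel_wins` are hypothetical names, and a small linear congruential generator stands in for real playouts.

```cpp
#include <cassert>
#include <cstdint>
#include <future>
#include <vector>

// Hypothetical batch of playouts: returns how many of `playouts` were "wins".
// A small linear congruential generator stands in for the real playout code.
inline int run_batch(std::uint32_t seed, int playouts) {
    int wins = 0;
    std::uint32_t x = seed;
    for (int i = 0; i < playouts; ++i) {
        x = x * 1664525u + 1013904223u;
        if (x & 0x80000000u) ++wins;  // the top bit decides the placeholder result
    }
    return wins;
}

// Launches one asynchronous batch per task and sums the results.
inline int parallel_wins(int tasks, int playouts_per_task) {
    assert(tasks > 0 && playouts_per_task >= 0);
    std::vector<std::future<int>> results;
    for (int t = 0; t < tasks; ++t) {
        results.push_back(std::async(std::launch::async, run_batch,
                                     static_cast<std::uint32_t>(t + 1),
                                     playouts_per_task));
    }
    int total = 0;
    for (auto& f : results) total += f.get();  // get() blocks until that batch is done
    return total;
}
```

Each batch here is fully independent, so no shared state needs protecting. That is a much simpler setting than several threads searching one shared tree, where atomics or locks come into play.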