RE: [racket-users] Re: Sequential vs. Parallel 13-Queens program

Jos Koot Mon, 14 Mar 2016 10:59:52 -0700

I have looked up a N-queens program I made a long time ago. It finds all
solutions.
My timings are faster I think. I need 2 seconds (in DrRacket) for N=12.
14200 solutions of which 1787 are not symmetrically equivalent.
If you want I send you the code privately. 83 lines.
Finding the solutions takes less lines, but after finding the solutions
an analysys of symmetries is made, requiring more lines and more time.
This allows me to report one solution only for each set of up to 8 solutions
that can be transformed into each other by symmetry operations.


It would not be difficult to parallelize the basic code such as to use up to
N processors.
I try every rank in the first file to start with.
In fact the code can be speeded up by almost a factor 2 by
trying the first (ceiling (/ N 2)) ranks only.
The solutions starting from another rank in the first file
are easily found by symmetry (reflection) for solutions that do not have
this symmetry.
These trials can be done parallelly.
Jos
 

-----Original Message-----
From: racket-users@googlegroups.com [mailto:racket-users@googlegroups.com]
On Behalf Of Brian Adkins
Sent: sábado, 12 de marzo de 2016 18:51
To: Racket Users
Subject: [racket-users] Re: Sequential vs. Parallel 13-Queens program

On Friday, March 11, 2016 at 9:01:38 PM UTC-5, Brian Adkins wrote:
> On Friday, March 11, 2016 at 7:42:22 PM UTC-5, Brian Adkins wrote:
> > I coded up a sequential and parallel version of N-Queens, then did a ton
of benchmark runs of 13-Queens to compare the time. For each configuration
(sequential or parallel w/ M workers), I ran the programs 6 times, threw out
the high two & low two and averaged the middle two numbers.
> > 
> > The spreadsheet with timings is here:
> > 
> >
https://docs.google.com/spreadsheets/d/1LFwdZbBveaARY_AquGXY9jgaSJlOA03NQCV9
TeYQ-l8/edit?usp=sharing
> > 
> > The code is here:
> > 
> > https://gist.github.com/lojic/aef0aec491d3dc9cb40b
> > 
> > I didn't spend any time refining/optimizing, so it's fairly crude, but
informative nonetheless.
> > 
> > The executive summary of timings for the parallel version:
> > 
> > # Places    Time
> > 1   34.9
> > 2   19.7
> > 3   13.8
> > 4   12.3
> > 5   11.9
> > 6   12.9
> > 7   12.1
> > 8   12.2
> > 
> > The sequential version took 31.3 seconds.
> > 
> > The basic idea for the parallel version is to place the first 13
starting positions in a queue that the place workers will pull from and
produce a set of solutions with that starting position. Both the parallel
and sequential versions collect all 73,712 solutions in a list and compute
the length of it. I hardcoded the number of solutions as a quick & dirty way
of determining completion in the parallel version just to allow me to get
the timings easily.
> > 
> > It was great to finally write some Racket code that got all my cores
heated up :)
> > 
> > Brian
> 
> Interesting. My first parallel version had the workers write each
solutions (list of N queen positions) to the parent as they were computed. I
just changed the program to collect all the solutions for a given prefix
(closer to the way the sequential version works) and send that list to the
parent.
> 
> The result is a modest 4% speeedup. I had assumed that creating the extra
intermediate lists would be slower.
> 
> I thought I read something about the implementation of Places that allows
data to be moved efficiently from one Place to another via virtual memory
pages - maybe something like that is going on. Regardless, I'm happy that
the more natural algorithm (to me anyway) is also more efficient. I guess
the overhead of many place-channel-put/get calls is more expensive than the
allocation & garbage collection of the intermediate lists.
> 
> The second parallel version referred to above is here:
https://gist.github.com/lojic/957828ce7c67c0376a23

I cleaned up the code a bit (removed hardcoded completion checks, etc.) and
moved the code to a git repo:

https://github.com/lojic/LearningRacket/tree/master/miscellaneous

I was hoping to get more than a 3x speedup, but my final results show a 2.8x
speedup on 4 cores. I suppose the amount of work per solution isn't large
enough for a better speedup.

In the N=13 case that takes 11 seconds, there are 6,700 solutions per second
being sent across a place-channel with each solution being a list of 13
structs of 2 elements.

Still, 2.8x is appreciated :)

-- 
You received this message because you are subscribed to the Google Groups
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

RE: [racket-users] Re: Sequential vs. Parallel 13-Queens program

Reply via email to