On 23/03/2012 20:47, Jonathan Street wrote:
I've just put some code together to test the advantage seen with
skipping combinations which aren't sorted. It's significantly slower
compared to pass-ing everything but putting a quick method together to
do some actual work shows the improvement. Unfortunately doing some
actual work pushes up the time taken. My quick check (running this code
https://gist.github.com/2174656 ) suggests skipping unsorted iterations
brings down the total running time from 7000 to 4000 days.
My computer shows a similar speedup from 4500 days to 2000 days. Still
too slow. Then I discovered, once again that itertools is awesome (in
this case in 2.7 (and probably 3.x) only). Replacing sorted generator with
def sorted_exhaustive(num):
for h in itertools.combinations_with_replacement(range(1,7), num):
stub_comparison(h)
Cuts the time down to about a day.
There is some bad news however. The way you're using this to generate
dice sets, building one list for the whole set and splitting it, doesn't
work with this optimization. You'll get a 'low dice', 'higher dice' and
'highest dice', not what's needed at all.
I suspect to see 'reasonable' run times we would need to fundamentally
rethink the approach we're using. At the moment I don't think we can
even brute force it by 'bursting' to the cloud. I don't know how the
other providers compare but taking aws
http://aws.amazon.com/ec2/#instance as an example the individual cores
aren't anything special. The advantage comes when you can use lots of
them, either on a instance with lots of cores or on multiple instances.
Keeping the fast die generation technique and generating sets of dice with
def split_dice(num, high=7):
pool = itertools.combinations_with_replacement(range(1,high), num)
for a, b in itertools.combinations(pool, 2):
dice.simple(a, b)
should let us trade some ram for speed.
On my computer running this with num = 9 (which would be sets of two
nine sided dice with numbers from 1-6 on each side) takes about 35 seconds.
Trying something similar with sets of three dice:
def full_split_dice(num, high=7, size=3):
pool = itertools.combinations_with_replacement(range(1,high), num)
for selection in itertools.combinations(pool, size):
for a, b in itertools.combinations(selection, 2)
dice.simple(a, b)
on my computer takes:
num time
3 0.3sec
4 5sec
5 50 sec
and running it with num=3 and high=10 (which the example dice
effectively had as each number from 1 to 9 was used twice the same die)
takes 7 seconds.
Fast enough for you? :)
https://gist.github.com/2176168
It still needs a way to identify valid sets. Also scaling it up to all
sets of six sided dice (num=6 and high=19) or sets of more than three
dice is probably going to take more work. If I can find the time those
are problems for Sunday.
Given the discussion here I'm tempted by the tutorials on high
performance python but at 3 hours they might be a little long.
http://pyvideo.org/video/614/high-performance-python-i
http://pyvideo.org/video/620/high-performance-python-ii
http://pyvideo.org/video/618/optimize-performance-and-scalability-with-paralle
That depends, it's only two of us so far. Are other people following
along interestedly?
Ed
--
To post: [email protected]
To unsubscribe: [email protected]
Feeds: http://groups.google.com/group/python-north-west/feeds
More options: http://groups.google.com/group/python-north-west