Hi, I am working through Nim track in exercism. The task to solve is: "Given a
number, find the sum of all the unique multiples of particular numbers up to
but not including that number." I am curious about the performance of HashSet.
Here is the summary of running times for a range of input numbers; the size of
the HashSet[int] was approx 7_000_000 items. I compiled the code (see below)
using Nim Compiler Version 1.2.0 [Linux: amd64] with -d:release and -d:danger
flags. I expected O(log N) performance, but this is not what happened:
CPU Time [8350000] 0.531s
CPU Time [8400000] 0.444s
CPU Time [8450000] 0.795s
CPU Time [8500000] 1.828s
CPU Time [8550000] 3.533s
CPU Time [8600000] 5.789s
CPU Time [8650000] 8.634s
CPU Time [8700000] 12.119s
CPU Time [8750000] 16.176s
CPU Time [8800000] 20.857s
CPU Time [8850000] 26.108s
CPU Time [8900000] 32.016s
CPU Time [8950000] 38.528s
CPU Time [9000000] 45.560s
Run
What is the reason for the decreasing performance?
## Given a number, find the sum of all the unique(!) multiples of
particular numbers up to but not including that number.
## If we list all the natural numbers below 20 that are multiples of 3 or
5, we get 3, 5, 6, 9, 10, 12, 15, and 18.
## The sum of these multiples is 78.
import sets, intsets
import times,strutils
proc sumA*(x:int, numbers:seq[int]):int =
var multiples:HashSet[int]
#var multiples:IntSet
for n in numbers:
if n == 0:
continue
for m in countup(n, x-1, n):
if not multiples.containsOrIncl(m):
result += m
template benchmark(benchmarkName: string, code: untyped) =
block:
let t0 = epochTime()
code
let elapsed = epochTime() - t0
let elapsedStr = elapsed.formatFloat(format = ffDecimal, precision = 3)
echo "CPU Time [", benchmarkName, "] ", elapsedStr, "s"
when isMainModule:
for x in countup(8_350_000, 9_000_000, 50000):
benchmark $x:
discard sumA(x, @[2, 3, 5, 7, 11])
Run
Similar test with Python 3.6 code yields more reasonable running times:
8350000 cpu= 1.9215092658996582
8400000 cpu= 1.8640367984771729
8450000 cpu= 1.8687024116516113
8500000 cpu= 1.867283821105957
8550000 cpu= 1.8652291297912598
8600000 cpu= 1.9187562465667725
8650000 cpu= 1.8985896110534668
8700000 cpu= 1.8902392387390137
8750000 cpu= 1.8779652118682861
8800000 cpu= 1.940896987915039
8850000 cpu= 1.9335484504699707
8900000 cpu= 1.9375865459442139
8950000 cpu= 1.9178028106689453
9000000 cpu= 1.9903161525726318
Run
Here is the Python code
import time
def sum(x, numbers):
collected = set()
result = 0
for n in numbers:
if n == 0:
continue
for m in range(n, x-1, n):
if not m in collected:
result += m
collected.add(m)
return result
for x in range(8350000, 9050000, 50000):
t0 = time.time()
s= sum(x, [2,3,5,7,11])
t1 = time.time()
print(f"{x} cpu= {t1-t0}")
Run