Hi Andrei,

That is a good catch, indeed: that makes all the difference and made the
comparison unfair.

If I take Jimmie's code and add

def sum2(l):
  sum = 0
  for i in range(0,len(l)):
    sum = sum + l[i]
  return sum

def average(l):
  return sum2(l)/len(l)

and replace the other calls of sum with sum2 in loop1 and loop2, I get the
following for 1 iteration:

>>> doit(1)
Tue Jan 11 10:34:24 2022
Creating list
createList(n), na[-1]:   0.28800000000000003
reps:  1
inside at top loop1: start:  Tue Jan 11 10:34:24 2022
Loop1 time:  1.5645889163017273
nsum: 11242.949400371168
navg: 0.3903801875128878
loop2: start:  Tue Jan 11 10:35:58 2022
Loop2 time:  -27364895.977849767
nsum: 10816.16871440453
navg: 0.3755614136946017
finished:  Tue Jan 11 10:37:33 2022
start time: 1641893664.795651
end time: 1641893853.597397
total time: 1614528959.1841362
nsum: 10816.16871440453
navg: 0.3755614136946017

The total time is calculated wrongly, but doing the calculation in Pharo:

(1641893853.597397 - 1641893664.795651) seconds. "0:00:03:08.80174613"

so 3 minutes.
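The negative Loop2 time and the epoch-sized total look like a start timestamp being reused or subtracted across mismatched clocks (a guess; I don't have the script in front of me). A minimal sketch of interval timing that avoids this class of bug, using time.perf_counter(), which is monotonic and intended for measuring durations (the timed helper is my own name, not from Jimmie's code):

```python
import time

def timed(label, fn, *args):
    """Run fn(*args), print its wall-clock duration in seconds, and
    return (result, elapsed). perf_counter() differences are always
    non-negative, unlike mixing time.time() stamps from different points."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.3f}s")
    return result, elapsed

# Example: time the built-in sum over a million floats.
_, secs = timed("sum of 1M floats", sum, [0.5] * 1_000_000)
```

Keeping each loop's own start/end pair local to a helper like this makes it impossible to subtract the wrong pair of timestamps.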

Jimmie's unmodified Pharo code gives for 1 iteration:

[ (LanguageTest newSize: 60*24*5*4 iterations: 1) run ] timeToRun. 
"0:00:01:00.438"

Starting test for array size: 28800  iterations: 1

Creating array of size: 28800   timeToRun: 0:00:00:00.035

Starting loop 1 at: 2022-01-11T10:53:53.423313+01:00
1: 2022-01-11T10:53:53   innerttr: 0:00:00:30.073   averageTime: 0:00:00:30.073
Loop 1 time: nil
nsum: 11242.949400371168
navg: 0.3903801875128878

Starting loop 2 at: 2022-01-11T10:54:23.497281+01:00
1: 2022-01-11T10:54:23   innerttr: 0:00:00:30.306   averageTime: 0:00:00:30.306
Loop 2 time: 0:00:00:30.306
nsum: 10816.168714404532
navg: 0.3755614136946018

End of test.  TotalTime: 0:00:01:00.416

which would seem to be 3 times faster!
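The gap matches Andrei's observation below: CPython's built-in sum runs in C, while an explicit loop like sum2 executes interpreted bytecode per element. A quick timeit sketch to see the difference on any machine (no numbers claimed, they vary by hardware and Python version):

```python
import timeit

def sum2(l):
    # Explicit per-element loop, comparable to what the Pharo code does.
    s = 0
    for i in range(len(l)):
        s = s + l[i]
    return s

data = [0.5] * 100_000

# Same workload, built-in C implementation vs. interpreted loop.
t_builtin = timeit.timeit(lambda: sum(data), number=50)
t_loop = timeit.timeit(lambda: sum2(data), number=50)
print(f"builtin sum: {t_builtin:.3f}s  explicit loop: {t_loop:.3f}s  "
      f"(~{t_loop / t_builtin:.1f}x slower)")
```

With the loop version substituted, both languages are doing per-element work in their own interpreter, which is the fairer comparison.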

Benchmarking is a black art.

Sven

> On 11 Jan 2022, at 10:07, Andrei Chis <chisvasileand...@gmail.com> wrote:
> 
> Hi Jimmie,
> 
> I was scanning through this thread and saw that the Python call uses
> the sum function. If I remember correctly, in Python the built-in sum
> function is directly implemented in C [1] (unless Python is compiled
> with SLOW_SUM set to true). In that case on large arrays the function
> can easily be several times faster than just iterating over the
> individual objects as the Pharo code does. The benchmark seems to
> compare summing numbers in C with summing numbers in Pharo. Would be
> interesting to modify the Python code to use a similar loop as in
> Pharo for doing the sum.
> 
> Cheers,
> Andrei
> 
> [1] 
> https://github.com/python/cpython/blob/135cabd328504e1648d17242b42b675cdbd0193b/Python/bltinmodule.c#L2461
> 
> On Mon, Jan 10, 2022 at 9:06 PM Jimmie Houchin <jlhouc...@gmail.com> wrote:
>> 
>> Some experiments and discoveries.
>> 
>> I am running my full language test every time. It is the only way I can 
>> compare results. It is also what fully stresses the language.
>> 
>> The reason I wrote the test as I did is that I wanted to know a couple of 
>> things. First, is the language sufficiently performant on basic maths? I am not 
>> doing any high PolyMath-level math, just simple things like moving averages over 
>> portions of arrays.
>> 
>> The other is the efficiency of array iteration and access. This is why #sum is the 
>> best test of that attribute: #sum iterates over and accesses every element of the 
>> array. It will reveal if there are any problems.
>> 
>> The default test: Julia 1m15s, Python 24.5 minutes, Pharo 2 hours 4 minutes.
>> 
>> When I comment out the #sum and #average calls, Pharo completes the test in 
>> 3.5 seconds. So almost all the time is spent in those two calls.
>> 
>> So most of this conversation has focused on why #sum is as slow as it is or 
>> how to improve the performance of #sum with other implementations.
>> 
>> 
>> 
>> So I decided to breakdown the #sum and try some things.
>> 
>> Starting with the initial implementation, SequenceableCollection's 
>> default #sum, the time was 02:04:03.
>> 
>> 
>> "This implementation does no work. Only iterates through the array.
>> It completed in 00:10:08"
>> sum
>>    | sum |
>>    sum := 1.
>>    1 to: self size do: [ :each | ].
>>    ^ sum
>> 
>> 
>> "This implementation does no work, but adds to iteration, accessing the 
>> value of the array.
>> It completed in 00:32:32.
>> Quite a bit of time for simply iterating and accessing."
>> sum
>>    | sum |
>>    sum := 1.
>>    1 to: self size do: [ :each | self at: each ].
>>    ^ sum
>> 
>> 
>> "This implementation I had in my initial email as an experiment and also 
>> several other did the same in theirs.
>> A naive simple implementation.
>> It completed in 01:00:53.  Half the time of the original."
>> sum
>>    | sum |
>>    sum := 0.
>>    1 to: self size do: [ :each |
>>        sum := sum + (self at: each) ].
>>    ^ sum
>> 
>> 
>> 
>> "This implementation I also had in my initial email as an experiment I had 
>> done.
>> It completed in 00:50:18.
>> It reduces the iterations and increases the accesses per iteration.
>> It is the fastest implementation so far."
>> sum
>>    | sum |
>>    sum := 0.
>>    1 to: ((self size quo: 10) * 10) by: 10 do: [ :i |
>>        sum := sum + (self at: i) + (self at: (i + 1)) + (self at: (i + 2))
>>            + (self at: (i + 3)) + (self at: (i + 4)) + (self at: (i + 5))
>>            + (self at: (i + 6)) + (self at: (i + 7)) + (self at: (i + 8))
>>            + (self at: (i + 9)) ].
>>    ((self size quo: 10) * 10 + 1) to: self size do: [ :i |
>>        sum := sum + (self at: i) ].
>>    ^ sum
>> 
>> Summary
>> 
>> For whatever reason, iterating over and accessing an Array is expensive. That 
>> alone took longer than Python's entire test.
>> 
>> I had allowed this knowledge of how much slower Pharo was to stop me from 
>> using Pharo, and it encouraged me to explore other options.
>> 
>> I have the option to use any language I want. I like Pharo. I do not like 
>> Python at all. Julia is unexciting to me. I don't like their anti-OO 
>> approach.
>> 
>> At one point I had a fairly complete Pharo implementation, which is where I 
>> got frustrated with backtesting taking days.
>> 
>> That implementation is gone. I had not switched to Iceberg. I had a problem 
>> with my hard drive. So I am starting over.
>> 
>> I am not a computer scientist, language expert, vm expert or anyone with the 
>> skills to discover and optimize arrays. So I will end my tilting at 
>> windmills here.
>> 
>> I value all the other things that Pharo brings, that I miss when I am using 
>> Julia or Python or Crystal, etc. Those languages do not have the vision to 
>> do what Pharo (or any Smalltalk) does.
>> 
>> Pharo may not optimize my app as much as x,y or z. But Pharo optimized me.
>> 
>> That said, I have made the decision to go all in with Pharo. Set aside all 
>> else.
>> In that regard I went ahead and put my money in with my decision and joined 
>> the Pharo Association last week.
>> 
>> Thanks for all of your help in exploring the problem.
>> 
>> 
>> Jimmie Houchin
