Yes that would certainly be useful.

But, AFAIU, FloatArray consists of 32-bit floats; I think we also need a 
DoubleFloatArray, since 64-bit floats are the default nowadays.
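To illustrate why the 32-bit vs. 64-bit distinction matters, here is a minimal sketch in Python (chosen only because the thread already benchmarks against Python; the point is language-independent): packing a 64-bit float into 32 bits and back loses precision, which is exactly what happens when 64-bit values are stored in a 32-bit FloatArray.

```python
import struct

def round_trip_float32(x):
    """Pack a Python float (64-bit) into a 32-bit float and back,
    showing the precision lost by a 32-bit storage format."""
    return struct.unpack('f', struct.pack('f', x))[0]

value = 0.1
as32 = round_trip_float32(value)
print(value == as32)  # False: 0.1 is not exactly representable in 32 bits
```

Values that fit exactly in 32 bits (like 0.5) survive the round trip; most decimal fractions do not.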

> On 12 Jan 2022, at 16:31, Henrik Sperre Johansen 
> <[email protected]> wrote:
> 
> We could also try modifying Pharo to use C by reintroducing the FloatArray 
> plugin ;)
> 
> | fa r |
> fa := FloatArray new: 28800.
> r := Random new.
> 1 to: fa size do: [ :i | fa at: i put: r next ].
> [ 1 to: fa size do: [ :i | fa sum ] ] timeToRun
> 
> Pharo 9, no plugin:
> 0:00:01:14.777
> Pharo 5, with plugin:
> 0:00:00:00.526
> 
> Cheers,
> Henry
> 
> 
>> On 11 Jan 2022, at 10:08, Andrei Chis <[email protected]> wrote:
>> 
>> Hi Jimmie,
>> 
>> I was scanning through this thread and saw that the Python call uses
>> the sum function. If I remember correctly, in Python the built-in sum
>> function is directly implemented in C [1] (unless Python is compiled
>> with SLOW_SUM set to true). In that case on large arrays the function
>> can easily be several times faster than just iterating over the
>> individual objects as the Pharo code does. The benchmark seems to
>> compare summing numbers in C with summing numbers in Pharo. Would be
>> interesting to modify the Python code to use a similar loop as in
>> Pharo for doing the sum.
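Andrei's point can be sketched in plain Python (illustrative only; the actual speedup depends on the interpreter build): the built-in sum runs its accumulation loop inside the C implementation, while an explicit Python loop pays interpreter dispatch per element, so the two give the same result at very different cost — much like the Pharo loop being compared against it.

```python
import random

data = [random.random() for _ in range(28_800)]

# Built-in sum: the accumulation loop runs in C inside the interpreter.
fast = sum(data)

# Explicit loop: one interpreted iteration per element, closer to what
# the Pharo benchmark actually measures.
slow = 0.0
for x in data:
    slow = slow + x

# Same mathematical result either way; the difference is purely speed.
print(abs(fast - slow) < 1e-6)
```

Timing both versions (e.g. with `timeit`) on a large array would give a fairer baseline for the cross-language comparison the thread discusses.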
>> 
>> Cheers,
>> Andrei
>> 
>> [1] 
>> https://github.com/python/cpython/blob/135cabd328504e1648d17242b42b675cdbd0193b/Python/bltinmodule.c#L2461
>> 
>>> On Mon, Jan 10, 2022 at 9:06 PM Jimmie Houchin <[email protected]> wrote:
>>> 
>>> Some experiments and discoveries.
>>> 
>>> I am running my full language test every time. It is the only way I can 
>>> compare results. It is also what fully stresses the language.
>>> 
>>> The reason I wrote the test as I did is because I wanted to know a couple 
>>> of things. Is the language sufficiently performant on basic maths? I am not 
>>> doing any high PolyMath-level math, just simple things like moving averages 
>>> over portions of arrays.
>>> 
>>> The other is the efficiency of array iteration and access. This is why #sum 
>>> is the best test of this attribute: #sum iterates over and accesses every 
>>> element of the array. It will reveal if there are any problems.
>>> 
>>> The default test: Julia 1m15s, Python 24.5 minutes, Pharo 2 hours 4 minutes.
>>> 
>>> When I comment out the #sum and #average calls, Pharo completes the test in 
>>> 3.5 seconds. So almost all the time is spent in those two calls.
>>> 
>>> So most of this conversation has focused on why #sum is as slow as it is or 
>>> how to improve the performance of #sum with other implementations.
>>> 
>>> 
>>> 
>>> So I decided to break down #sum and try some things.
>>> 
>>> Starting point: the initial implementation, SequenceableCollection's 
>>> default #sum, with a time of 02:04:03.
>>> 
>>> 
>>> "This implementation does no work; it only iterates through the array.
>>> It completed in 00:10:08."
>>> sum
>>>     | sum |
>>>     sum := 1.
>>>     1 to: self size do: [ :each | ].
>>>     ^ sum
>>> 
>>> 
>>> "This implementation still does no summing, but adds accessing each 
>>> element of the array to the iteration.
>>> It completed in 00:32:32.
>>> Quite a bit of time for simply iterating and accessing."
>>> sum
>>>     | sum |
>>>     sum := 1.
>>>     1 to: self size do: [ :each | self at: each ].
>>>     ^ sum
>>> 
>>> 
>>> "This implementation I had in my initial email as an experiment, and 
>>> several others did the same in theirs. A naive, simple implementation.
>>> It completed in 01:00:53, half the time of the original."
>>> sum
>>>     | sum |
>>>     sum := 0.
>>>     1 to: self size do: [ :each |
>>>         sum := sum + (self at: each) ].
>>>     ^ sum
>>> 
>>> 
>>> 
>>> "This implementation I also had in my initial email as an experiment.
>>> It completed in 00:50:18.
>>> It reduces the number of iterations and increases the accesses per 
>>> iteration. It is the fastest implementation so far."
>>> sum
>>>     | sum |
>>>     sum := 0.
>>>     1 to: ((self size quo: 10) * 10) by: 10 do: [ :i |
>>>         sum := sum + (self at: i) + (self at: (i + 1)) + (self at: (i + 2))
>>>             + (self at: (i + 3)) + (self at: (i + 4)) + (self at: (i + 5))
>>>             + (self at: (i + 6)) + (self at: (i + 7)) + (self at: (i + 8))
>>>             + (self at: (i + 9)) ].
>>>     ((self size quo: 10) * 10 + 1) to: self size do: [ :i |
>>>         sum := sum + (self at: i) ].
>>>     ^ sum
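The same partial-unrolling idea can be sketched in Python for reference (a hypothetical port, not the benchmark code): sum the bulk of the array ten elements per loop iteration, then finish the remainder with a scalar tail loop.

```python
def unrolled_sum(xs):
    """Sum xs ten elements per iteration, mirroring the partially
    unrolled Smalltalk #sum above: fewer loop iterations, more
    element accesses per iteration, plus a tail loop for leftovers."""
    total = 0.0
    limit = (len(xs) // 10) * 10
    for i in range(0, limit, 10):
        # One iteration adds ten consecutive elements.
        total += (xs[i] + xs[i + 1] + xs[i + 2] + xs[i + 3] + xs[i + 4]
                  + xs[i + 5] + xs[i + 6] + xs[i + 7] + xs[i + 8] + xs[i + 9])
    for i in range(limit, len(xs)):
        # Tail loop: the remaining len(xs) mod 10 elements.
        total += xs[i]
    return total

print(unrolled_sum([1.0, 2.0, 3.0, 4.0]))  # 10.0
```

In CPython this buys little, since each element access is still interpreted, but it makes the structure of the Smalltalk version easy to follow.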
>>> 
>>> Summary
>>> 
>>> For whatever reason, iterating over and accessing an Array is expensive. 
>>> That alone took longer than Python took to complete the entire test.
>>> 
>>> I had allowed this knowledge of how much slower Pharo was to stop me from 
>>> using Pharo, and it encouraged me to explore other options.
>>> 
>>> I have the option to use any language I want. I like Pharo. I do not like 
>>> Python at all. Julia is unexciting to me. I don't like their anti-OO 
>>> approach.
>>> 
>>> At one point I had a fairly complete Pharo implementation, which is where I 
>>> got frustrated with backtesting taking days.
>>> 
>>> That implementation is gone. I had not switched to Iceberg. I had a problem 
>>> with my hard drive. So I am starting over.
>>> 
>>> I am not a computer scientist, language expert, vm expert or anyone with 
>>> the skills to discover and optimize arrays. So I will end my tilting at 
>>> windmills here.
>>> 
>>> I value all the other things that Pharo brings, that I miss when I am using 
>>> Julia or Python or Crystal, etc. Those languages do not have the vision to 
>>> do what Pharo (or any Smalltalk) does.
>>> 
>>> Pharo may not optimize my app as much as x,y or z. But Pharo optimized me.
>>> 
>>> That said, I have made the decision to go all in with Pharo. Set aside all 
>>> else.
>>> In that regard I went ahead and put my money in with my decision and joined 
>>> the Pharo Association last week.
>>> 
>>> Thanks for all of your help in exploring the problem.
>>> 
>>> 
>>> Jimmie Houchin
