True! It’s a bit of a naming conundrum, since «Float» in Pharo is already 64-bit, but since we’re speaking of «native» arrays, DoubleArray would be the best fit, I guess.
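The 32-vs-64-bit storage issue is easy to see in Python, whose array module has the same split between single- and double-precision slots (this is just an illustration of the storage semantics, not Pharo code):

```python
from array import array

# 'f' slots hold 32-bit IEEE 754 floats (like FloatArray),
# 'd' slots hold 64-bit floats (like the proposed DoubleArray).
single = array('f', [0.1])
double = array('d', [0.1])

# Storing a 64-bit value in a 32-bit slot rounds it:
print(single[0] == 0.1)  # False: 0.1 is not exactly representable in 32 bits
print(double[0] == 0.1)  # True: the full 64-bit value round-trips
```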
Speaking of, the related (and new, to me anyway) DoubleByteArray/DoubleWordArray classes have incorrect definitions in Pharo 9, AFAICT: they are defined with variableByteSubclass:/variableWordSubclass: instead of variableDoubleByteSubclass:/variableDoubleWordSubclass:…

| dwa |
dwa := DoubleWordArray new: 1.
dwa at: 1 put: 1 << 32.

and

| dba |
dba := DoubleByteArray new: 1.
dba at: 1 put: 256.

*should* work…

Cheers,
Henry

> On 12 Jan 2022, at 16:51, Sven Van Caekenberghe <[email protected]> wrote:
>
> Yes, that would certainly be useful.
>
> But, AFAIU, FloatArray consists of 32-bit Float numbers; I think we also need
> a DoubleFloatArray, since 64-bit Floats are the default nowadays.
>
>> On 12 Jan 2022, at 16:31, Henrik Sperre Johansen
>> <[email protected]> wrote:
>>
>> We could also try modifying Pharo to use C by reintroducing the FloatArray
>> plugin ;)
>>
>> | fa r |
>> fa := FloatArray new: 28800.
>> r := Random new.
>> 1 to: fa size do: [ :i | fa at: i put: r next ].
>> [ 1 to: fa size do: [ :i | fa sum ] ] timeToRun
>>
>> Pharo 9, no plugin:
>> 0:00:01:14.777
>> Pharo 5, with plugin:
>> 0:00:00:00.526
>>
>> Cheers,
>> Henry
>>
>>
>>> On 11 Jan 2022, at 10:08, Andrei Chis <[email protected]> wrote:
>>>
>>> Hi Jimmie,
>>>
>>> I was scanning through this thread and saw that the Python call uses
>>> the sum function. If I remember correctly, in Python the built-in sum
>>> function is implemented directly in C [1] (unless Python is compiled
>>> with SLOW_SUM set to true). In that case, on large arrays the function
>>> can easily be several times faster than just iterating over the
>>> individual objects, as the Pharo code does. The benchmark seems to
>>> compare summing numbers in C with summing numbers in Pharo. It would be
>>> interesting to modify the Python code to use a loop similar to the
>>> Pharo one for doing the sum.
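The comparison Andrei describes could be sketched in Python like this (hypothetical code, not the original benchmark; `data` stands in for the benchmark's array):

```python
import random
import timeit

data = [random.random() for _ in range(28800)]

def manual_sum(xs):
    # Element-by-element loop at the Python level, roughly what the
    # Pharo #sum does, instead of the C-implemented built-in sum().
    total = 0.0
    for x in xs:
        total += x
    return total

t_builtin = timeit.timeit(lambda: sum(data), number=100)
t_manual = timeit.timeit(lambda: manual_sum(data), number=100)
print(f"built-in sum: {t_builtin:.3f}s  manual loop: {t_manual:.3f}s")
```

On CPython the built-in typically wins by a wide margin, which is Andrei's point: the original benchmark partly measures C against interpreted code.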
>>>
>>> Cheers,
>>> Andrei
>>>
>>> [1] https://github.com/python/cpython/blob/135cabd328504e1648d17242b42b675cdbd0193b/Python/bltinmodule.c#L2461
>>>
>>>> On Mon, Jan 10, 2022 at 9:06 PM Jimmie Houchin <[email protected]> wrote:
>>>>
>>>> Some experiments and discoveries.
>>>>
>>>> I am running my full language test every time. It is the only way I can
>>>> compare results. It is also what fully stresses the language.
>>>>
>>>> The reason I wrote the test as I did is because I wanted to know a couple
>>>> of things. Is the language sufficiently performant on basic maths? I am
>>>> not doing any high PolyMath-level math. Simple things like moving averages
>>>> over portions of arrays.
>>>>
>>>> The other is efficiency of array iteration and access. This is why #sum is
>>>> the best test of this attribute. #sum iterates over and accesses every
>>>> element of the array. It will reveal if there are any problems.
>>>>
>>>> The default test: Julia 1m 15s, Python 24.5 minutes, Pharo 2 hours 4 minutes.
>>>>
>>>> When I comment out the #sum and #average calls, Pharo completes the test
>>>> in 3.5 seconds. So almost all the time is spent in those two calls.
>>>>
>>>> So most of this conversation has focused on why #sum is as slow as it is,
>>>> or how to improve the performance of #sum with other implementations.
>>>>
>>>> So I decided to break down #sum and try some things.
>>>>
>>>> Starting with the initial implementation and SequenceableCollection's
>>>> default #sum: a time of 02:04:03.
>>>>
>>>> "This implementation does no work. It only iterates through the array.
>>>> It completed in 00:10:08."
>>>> sum
>>>>     | sum |
>>>>     sum := 1.
>>>>     1 to: self size do: [ :each | ].
>>>>     ^ sum
>>>>
>>>> "This implementation does no work, but adds to the iteration an access of
>>>> each value of the array.
>>>> It completed in 00:32:32.
>>>> Quite a bit of time for simply iterating and accessing."
>>>> sum
>>>>     | sum |
>>>>     sum := 1.
>>>>     1 to: self size do: [ :each | self at: each ].
>>>>     ^ sum
>>>>
>>>> "This implementation I had in my initial email as an experiment, and
>>>> several others did the same in theirs.
>>>> A naive, simple implementation.
>>>> It completed in 01:00:53. Half the time of the original."
>>>> sum
>>>>     | sum |
>>>>     sum := 0.
>>>>     1 to: self size do: [ :each |
>>>>         sum := sum + (self at: each) ].
>>>>     ^ sum
>>>>
>>>> "This implementation I also had in my initial email, as an experiment I
>>>> had done.
>>>> It completed in 00:50:18.
>>>> It reduces the iterations and increases the accesses per iteration.
>>>> It is the fastest implementation so far."
>>>> sum
>>>>     | sum |
>>>>     sum := 0.
>>>>     1 to: ((self size quo: 10) * 10) by: 10 do: [ :i |
>>>>         sum := sum + (self at: i) + (self at: (i + 1)) + (self at: (i + 2))
>>>>             + (self at: (i + 3)) + (self at: (i + 4)) + (self at: (i + 5))
>>>>             + (self at: (i + 6)) + (self at: (i + 7)) + (self at: (i + 8))
>>>>             + (self at: (i + 9)) ].
>>>>     ((self size quo: 10) * 10 + 1) to: self size do: [ :i |
>>>>         sum := sum + (self at: i) ].
>>>>     ^ sum
>>>>
>>>> Summary
>>>>
>>>> For whatever reason, iterating over and accessing an Array is expensive.
>>>> That alone took longer than Python took to complete the entire test.
>>>>
>>>> I had allowed this knowledge of how much slower Pharo is to stop me from
>>>> using Pharo, and to encourage me to explore other options.
>>>>
>>>> I have the option to use any language I want. I like Pharo. I do not like
>>>> Python at all. Julia is unexciting to me; I don't like their anti-OO
>>>> approach.
>>>>
>>>> At one point I had a fairly complete Pharo implementation, which is where
>>>> I got frustrated with backtesting taking days.
>>>>
>>>> That implementation is gone. I had not switched to Iceberg, and I had a
>>>> problem with my hard drive. So I am starting over.
>>>>
>>>> I am not a computer scientist, language expert, VM expert, or anyone with
>>>> the skills to discover and optimize arrays.
>>>> So I will end my tilting at windmills here.
>>>>
>>>> I value all the other things that Pharo brings, which I miss when I am
>>>> using Julia or Python or Crystal, etc. Those languages do not have the
>>>> vision to do what Pharo (or any Smalltalk) does.
>>>>
>>>> Pharo may not optimize my app as much as x, y or z. But Pharo optimized me.
>>>>
>>>> That said, I have made the decision to go all in with Pharo and set aside
>>>> all else. In that regard, I went ahead and put my money behind my decision
>>>> and joined the Pharo Association last week.
>>>>
>>>> Thanks for all of your help in exploring the problem.
>>>>
>>>> Jimmie Houchin
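Jimmie's fastest variant is a classic partial loop unrolling: ten explicit accesses per loop iteration over the bulk of the array, plus a clean-up loop for the remainder. A Python sketch of the same technique (hypothetical, just to show the structure and that it computes the same total as a naive sum):

```python
def unrolled_sum(xs):
    # Ten explicit accesses per iteration over the bulk of the array,
    # mirroring the Pharo version that adds (self at: i) through
    # (self at: (i + 9)) inside one loop body.
    total = 0.0
    limit = (len(xs) // 10) * 10
    for i in range(0, limit, 10):
        total += (xs[i] + xs[i + 1] + xs[i + 2] + xs[i + 3] + xs[i + 4]
                  + xs[i + 5] + xs[i + 6] + xs[i + 7] + xs[i + 8] + xs[i + 9])
    for i in range(limit, len(xs)):  # leftovers when size isn't a multiple of 10
        total += xs[i]
    return total

data = [float(i) for i in range(28801)]  # size deliberately not a multiple of 10
print(unrolled_sum(data) == sum(data))  # True: every element counted exactly once
```

The win in Pharo comes from amortizing the loop overhead (bounds check, block activation) over ten element accesses instead of one; the clean-up loop keeps the result correct for any array size.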
