We could also try modifying Pharo to use C by reintroducing the FloatArray 
plugin ;)

| fa r |
fa := FloatArray new: 28800.
r := Random new.
1 to: fa size do: [ :i | fa at: i put: r next ].
[ 1 to: fa size do: [ :i | fa sum ] ] timeToRun

Pharo 9, no plugin:
0:00:01:14.777
Pharo 5, with plugin:
0:00:00:00.526
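
As a rough illustration of the same C-primitive-versus-interpreted-loop gap (Andrei's point below about Python's built-in sum being implemented in C), here is a hedged Python sketch: the built-in sum against an explicit loop over the same data. Timings are illustrative only and will vary by machine.

```python
import random
import time

# Same element count as the FloatArray benchmark above
data = [random.random() for _ in range(28800)]

def loop_sum(xs):
    # Explicit element-by-element loop, analogous to iterating in image code
    total = 0.0
    for x in xs:
        total += x
    return total

t0 = time.perf_counter()
a = sum(data)        # built-in sum, implemented in C
t1 = time.perf_counter()
b = loop_sum(data)   # interpreted Python loop
t2 = time.perf_counter()

print(abs(a - b) < 1e-6)  # both compute the same result
print(f"builtin: {t1 - t0:.6f}s  loop: {t2 - t1:.6f}s")
```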

Cheers,
Henry


> On 11 Jan 2022, at 10:08, Andrei Chis <[email protected]> wrote:
> 
> Hi Jimmie,
> 
> I was scanning through this thread and saw that the Python call uses
> the sum function. If I remember correctly, in Python the built-in sum
> function is directly implemented in C [1] (unless Python is compiled
> with SLOW_SUM set to true). In that case on large arrays the function
> can easily be several times faster than just iterating over the
> individual objects as the Pharo code does. The benchmark seems to
> compare summing numbers in C with summing numbers in Pharo. Would be
> interesting to modify the Python code to use a similar loop as in
> Pharo for doing the sum.
> 
> Cheers,
> Andrei
> 
> [1] 
> https://github.com/python/cpython/blob/135cabd328504e1648d17242b42b675cdbd0193b/Python/bltinmodule.c#L2461
> 
>> On Mon, Jan 10, 2022 at 9:06 PM Jimmie Houchin <[email protected]> wrote:
>> 
>> Some experiments and discoveries.
>> 
>> I am running my full language test every time. It is the only way I can 
>> compare results. It is also what fully stresses the language.
>> 
>> The reason I wrote the test as I did is that I wanted to know a couple of
>> things. Is the language sufficiently performant at basic maths? I am not
>> doing any high-level PolyMath math, just simple things like moving averages
>> over portions of arrays.
>> 
>> The other is the efficiency of array iteration and access. This is why #sum
>> is the best test of that attribute: #sum iterates over and accesses every
>> element of the array, so it will reveal any problems.
>> 
>> The default test: Julia 1m15s, Python 24.5 minutes, Pharo 2 hours 4 minutes.
>> 
>> When I comment out the #sum and #average calls, Pharo completes the test in 
>> 3.5 seconds. So almost all the time is spent in those two calls.
>> 
>> So most of this conversation has focused on why #sum is as slow as it is or 
>> how to improve the performance of #sum with other implementations.
>> 
>> 
>> 
>> So I decided to breakdown the #sum and try some things.
>> 
>> Starting with the initial implementation, SequenceableCollection's default
>> #sum, at a time of 02:04:03.
>> 
>> 
>> "This implementation does no work. Only iterates through the array.
>> It completed in 00:10:08"
>> sum
>>    | sum |
>>    sum := 1.
>>    1 to: self size do: [ :each | ].
>>    ^ sum
>> 
>> 
>> "This implementation still does no work, but adds to the iteration an access
>> of each element of the array.
>> It completed in 00:32:32.
>> Quite a bit of time for simply iterating and accessing."
>> sum
>>    | sum |
>>    sum := 1.
>>    1 to: self size do: [ :each | self at: each ].
>>    ^ sum
>> 
>> 
>> "This implementation I had in my initial email as an experiment, and several
>> others did the same in theirs.
>> A naive, simple implementation.
>> It completed in 01:00:53, half the time of the original."
sum
>>    | sum |
>>    sum := 0.
>>    1 to: self size do: [ :each |
>>        sum := sum + (self at: each) ].
>>    ^ sum
>> 
>> 
>> 
>> "This implementation I also had in my initial email as an experiment.
>> It completed in 00:50:18.
>> It reduces the iterations and increases the accesses per iteration.
>> It is the fastest implementation so far."
>> sum
>>    | sum |
>>    sum := 0.
>>    1 to: ((self size quo: 10) * 10) by: 10 do: [ :i |
>>        sum := sum + (self at: i) + (self at: (i + 1)) + (self at: (i + 2))
>>            + (self at: (i + 3)) + (self at: (i + 4)) + (self at: (i + 5))
>>            + (self at: (i + 6)) + (self at: (i + 7)) + (self at: (i + 8))
>>            + (self at: (i + 9)) ].
>> 
>>    ((self size quo: 10) * 10 + 1) to: self size do: [ :i |
>>        sum := sum + (self at: i) ].
>>    ^ sum
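
The same partial-unrolling pattern, sketched in Python for comparison (an illustrative translation of the idea, not the code used in the benchmark): process ten elements per loop iteration, then mop up the remaining `n mod 10` elements in a plain loop.

```python
def unrolled_sum(xs):
    """Sum xs ten elements per iteration, with a remainder loop (manual unrolling)."""
    n = len(xs)
    limit = (n // 10) * 10  # largest multiple of 10 <= n
    total = 0.0
    i = 0
    while i < limit:
        total += (xs[i] + xs[i + 1] + xs[i + 2] + xs[i + 3] + xs[i + 4]
                  + xs[i + 5] + xs[i + 6] + xs[i + 7] + xs[i + 8] + xs[i + 9])
        i += 10
    for j in range(limit, n):  # the last n mod 10 elements
        total += xs[j]
    return total

print(unrolled_sum(list(range(25))))  # → 300.0
```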
>> 
>> Summary
>> 
>> For whatever reason, iterating over and accessing elements of an Array is
>> expensive. That alone took longer than it takes Python to complete the
>> entire test.
>> 
>> I had allowed this knowledge of how much slower Pharo was to stop me from
>> using Pharo, and it encouraged me to explore other options.
>> 
>> I have the option to use any language I want. I like Pharo. I do not like 
>> Python at all. Julia is unexciting to me. I don't like their anti-OO 
>> approach.
>> 
>> At one point I had a fairly complete Pharo implementation, which is where I 
>> got frustrated with backtesting taking days.
>> 
>> That implementation is gone. I had not switched to Iceberg. I had a problem 
>> with my hard drive. So I am starting over.
>> 
>> I am not a computer scientist, language expert, VM expert, or anyone with the
>> skills to discover and optimize array performance. So I will end my tilting
>> at windmills here.
>> 
>> I value all the other things that Pharo brings, that I miss when I am using 
>> Julia or Python or Crystal, etc. Those languages do not have the vision to 
>> do what Pharo (or any Smalltalk) does.
>> 
>> Pharo may not optimize my app as much as x,y or z. But Pharo optimized me.
>> 
>> That said, I have made the decision to go all in with Pharo and set aside all
>> else.
>> In that regard, I put my money behind my decision and joined the Pharo
>> Association last week.
>> 
>> Thanks for all of your help in exploring the problem.
>> 
>> 
>> Jimmie Houchin
