Yes, that would certainly be useful. But, AFAIU, FloatArray holds 32-bit Floats, so I think we would also need a DoubleFloatArray, since 64-bit Floats are the default nowadays.
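
To make the 32-bit versus 64-bit point concrete, here is a minimal sketch in Python rather than Pharo. It assumes that Python's standard `array` module is a fair stand-in: typecode 'f' plays the role of a 32-bit FloatArray and typecode 'd' the role of the proposed (hypothetical) DoubleFloatArray. The helper name `roundtrip_error` is made up for illustration.

```python
from array import array

# Assumption: 'f' is a C float (32-bit on mainstream platforms) and
# 'd' is a C double (64-bit), standing in for the two Pharo array kinds.
singles = array('f', [0.1])
doubles = array('d', [0.1])


def roundtrip_error(values, original):
    """Absolute difference between the stored first element and the original float."""
    return abs(values[0] - original)


# A Python float is a 64-bit double, so storing it in 'd' is lossless,
# while storing it in 'f' rounds it to the nearest 32-bit value.
single_error = roundtrip_error(singles, 0.1)
double_error = roundtrip_error(doubles, 0.1)

print("bytes per element:", singles.itemsize, "vs", doubles.itemsize)
print("round-trip error: ", single_error, "vs", double_error)
```

If the same holds for a 32-bit FloatArray in Pharo, every 64-bit Float stored into it would be rounded on the way in, which is why a 64-bit variant seems worth having.
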
> On 12 Jan 2022, at 16:31, Henrik Sperre Johansen
> <[email protected]> wrote:
>
> We could also try modifying Pharo to use C by reintroducing the FloatArray
> plugin ;)
>
> | fa r |
> fa := FloatArray new: 28800.
> r := Random new.
> 1 to: fa size do: [ :i | fa at: i put: r next ].
> [ 1 to: fa size do: [ :i | fa sum ] ] timeToRun
>
> Pharo 9, no plugin:
> 0:00:01:14.777
> Pharo 5, with plugin:
> 0:00:00:00.526
>
> Cheers,
> Henry
>
>
>> On 11 Jan 2022, at 10:08, Andrei Chis <[email protected]> wrote:
>>
>> Hi Jimmie,
>>
>> I was scanning through this thread and saw that the Python call uses
>> the sum function. If I remember correctly, in Python the built-in sum
>> function is directly implemented in C [1] (unless Python is compiled
>> with SLOW_SUM set to true). In that case on large arrays the function
>> can easily be several times faster than just iterating over the
>> individual objects as the Pharo code does. The benchmark seems to
>> compare summing numbers in C with summing numbers in Pharo. Would be
>> interesting to modify the Python code to use a similar loop as in
>> Pharo for doing the sum.
>>
>> Cheers,
>> Andrei
>>
>> [1]
>> https://github.com/python/cpython/blob/135cabd328504e1648d17242b42b675cdbd0193b/Python/bltinmodule.c#L2461
>>
>>> On Mon, Jan 10, 2022 at 9:06 PM Jimmie Houchin <[email protected]> wrote:
>>>
>>> Some experiments and discoveries.
>>>
>>> I am running my full language test every time. It is the only way I can
>>> compare results. It is also what fully stresses the language.
>>>
>>> The reason I wrote the test as I did is because I wanted to know a couple
>>> of things. Is the language sufficiently performant on basic maths. I am not
>>> doing any high PolyMath level math. Simple things like moving averages over
>>> portions of arrays.
>>>
>>> The other is efficiency of array iteration and access. This is why #sum is
>>> the best test of this attribute. #sum iterates and accesses every element
>>> of the array.
>>> It will reveal if there are any problems.
>>>
>>> The default test: Julia 1m15s, Python 24.5 minutes, Pharo 2 hours 4 minutes.
>>>
>>> When I comment out the #sum and #average calls, Pharo completes the test in
>>> 3.5 seconds. So almost all the time is spent in those two calls.
>>>
>>> So most of this conversation has focused on why #sum is as slow as it is or
>>> how to improve the performance of #sum with other implementations.
>>>
>>>
>>>
>>> So I decided to break down the #sum and try some things.
>>>
>>> Starting with the initial implementation and SequenceableCollection's
>>> default #sum time of 02:04:03
>>>
>>>
>>> "This implementation does no work. Only iterates through the array.
>>> It completed in 00:10:08"
>>> sum
>>>     | sum |
>>>     sum := 1.
>>>     1 to: self size do: [ :each | ].
>>>     ^ sum
>>>
>>>
>>> "This implementation does no work, but adds to iteration, accessing the
>>> value of the array.
>>> It completed in 00:32:32.
>>> Quite a bit of time for simply iterating and accessing."
>>> sum
>>>     | sum |
>>>     sum := 1.
>>>     1 to: self size do: [ :each | self at: each ].
>>>     ^ sum
>>>
>>>
>>> "This implementation I had in my initial email as an experiment and also
>>> several others did the same in theirs.
>>> A naive simple implementation.
>>> It completed in 01:00:53. Half the time of the original."
>>> sum
>>>     | sum |
>>>     sum := 0.
>>>     1 to: self size do: [ :each |
>>>         sum := sum + (self at: each) ].
>>>     ^ sum
>>>
>>>
>>>
>>> "This implementation I also had in my initial email as an experiment I had
>>> done.
>>> It completed in 00:50:18.
>>> It reduces the iterations and increases the accesses per iteration.
>>> It is the fastest implementation so far."
>>> sum
>>>     | sum |
>>>     sum := 0.
>>>     1 to: ((self size quo: 10) * 10) by: 10 do: [ :i |
>>>         sum := sum + (self at: i) + (self at: (i + 1)) + (self at: (i + 2)) +
>>>             (self at: (i + 3)) + (self at: (i + 4)) + (self at: (i + 5)) +
>>>             (self at: (i + 6)) + (self at: (i + 7)) + (self at: (i + 8)) +
>>>             (self at: (i + 9)) ].
>>>
>>>     ((self size quo: 10) * 10 + 1) to: self size do: [ :i |
>>>         sum := sum + (self at: i) ].
>>>     ^ sum
>>>
>>> Summary
>>>
>>> For whatever reason iterating and accessing on an Array is expensive. That
>>> alone took longer than Python to complete the entire test.
>>>
>>> I had allowed this knowledge of how much slower Pharo was to stop me from
>>> using Pharo. Encouraged me to explore other options.
>>>
>>> I have the option to use any language I want. I like Pharo. I do not like
>>> Python at all. Julia is unexciting to me. I don't like their anti-OO
>>> approach.
>>>
>>> At one point I had a fairly complete Pharo implementation, which is where I
>>> got frustrated with backtesting taking days.
>>>
>>> That implementation is gone. I had not switched to Iceberg. I had a problem
>>> with my hard drive. So I am starting over.
>>>
>>> I am not a computer scientist, language expert, vm expert or anyone with
>>> the skills to discover and optimize arrays. So I will end my tilting at
>>> windmills here.
>>>
>>> I value all the other things that Pharo brings, that I miss when I am using
>>> Julia or Python or Crystal, etc. Those languages do not have the vision to
>>> do what Pharo (or any Smalltalk) does.
>>>
>>> Pharo may not optimize my app as much as x,y or z. But Pharo optimized me.
>>>
>>> That said, I have made the decision to go all in with Pharo. Set aside all
>>> else.
>>> In that regard I went ahead and put my money in with my decision and joined
>>> the Pharo Association last week.
>>>
>>> Thanks for all of your help in exploring the problem.
>>>
>>>
>>> Jimmie Houchin
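
Andrei's suggestion in the quoted thread, to have the Python side sum with an explicit loop like the Pharo code instead of the C-implemented built-in, could be sketched roughly as below. This is only an illustration, not Jimmie's actual benchmark: the function names `loop_sum` and `time_call`, the repeat count of 100, and the random test data are all assumptions, and the array size of 28800 is simply borrowed from Henrik's snippet.

```python
import random
import time


def loop_sum(values):
    """Sum with an explicit indexed loop, mirroring the naive Pharo #sum."""
    total = 0.0
    for i in range(len(values)):
        total = total + values[i]
    return total


def time_call(fn, values, repeats):
    """Call fn(values) `repeats` times; return (last result, elapsed seconds)."""
    start = time.perf_counter()
    result = None
    for _ in range(repeats):
        result = fn(values)
    return result, time.perf_counter() - start


# Hypothetical test data: 28800 random floats, seeded for repeatability.
rng = random.Random(42)
data = [rng.random() for _ in range(28800)]

builtin_result, builtin_seconds = time_call(sum, data, 100)
loop_result, loop_seconds = time_call(loop_sum, data, 100)

print(f"built-in sum:  {builtin_seconds:.4f} s")
print(f"explicit loop: {loop_seconds:.4f} s")
```

The two results can differ in the last few digits, because recent CPython versions use a more accurate summation for floats in the built-in, so they should be compared with a tolerance rather than for exact equality. Timings will vary by machine and Python version; the point is only that the explicit loop is the fairer counterpart to the Pharo implementations quoted above.
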
