Can you come up with a simple "base case" so we can find the bottleneck/problem?
I'm not sure about what you're trying to do.
What do you get if you try this in a workspace? (Adjust the value of n to 
what you want; I tested it with 10 million items.)
Let's take this one step at a time!


| floatArray n rng t1 t2 t3 r1 r2 r3 |
n := 10000000.
rng := Random new.
floatArray := Array new: n.
floatArray doWithIndex: [:each :idx | floatArray at: idx put: rng next].
t1 := Time millisecondsToRun: [r1 := floatArray sum].
t2 := Time millisecondsToRun: [| total |
    total := 0.
    floatArray do: [:each | total := total + each].
    r2 := total].
t3 := Time millisecondsToRun: [r3 := floatArray inject: 0 into: [:total :each | total + each]].
Transcript cr.
Transcript cr; show: 'Test with ', n printString, ' elements'.
Transcript cr; show: 'Original #sum -> Time: ', t1 printString, ' milliseconds, Total: ', r1 printString.
Transcript cr; show: 'Naive #sum -> Time: ', t2 printString, ' milliseconds, Total: ', r2 printString.
Transcript cr; show: 'Inject #sum -> Time: ', t3 printString, ' milliseconds, Total: ', r3 printString.
--------------------------
Here are the results I get on Squeak 5.3:

Test with 10000000 elements
Original #sum -> Time: 143 milliseconds, Total: 4.999271889099622e6
Naive #sum -> Time: 115 milliseconds, Total: 4.999271889099622e6
Inject #sum -> Time: 102 milliseconds, Total: 4.999271889099622e6
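Since the data is all Floats, one more variant that may be worth timing is FloatArray, which stores unboxed 32-bit floats and, in Squeak, backs #sum with a primitive when the FloatArrayPlugin is present. This is only a sketch under those assumptions (and note the reduced precision of 32-bit floats):

```smalltalk
| fa rng t r |
rng := Random new.
fa := FloatArray new: 10000000.
1 to: fa size do: [ :i | fa at: i put: rng next ].
"With the plugin present this avoids boxing a Float per element."
t := Time millisecondsToRun: [ r := fa sum ].
Transcript cr; show: 'FloatArray #sum -> Time: ', t printString, ' milliseconds, Total: ', r printString.
```

Whether the primitive is actually taken depends on the VM you are running, so compare the timings in your own image.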


----------------- 
Benoît St-Jean 
Yahoo! Messenger: bstjean 
Twitter: @BenLeChialeux 
Pinterest: benoitstjean 
Instagram: Chef_Benito
IRC: lamneth 
GitHub: bstjean
Blogue: endormitoire.wordpress.com 
"A standpoint is an intellectual horizon of radius zero".  (A. Einstein) 

On Thursday, January 6, 2022, 03:38:22 p.m. EST, Jimmie Houchin <jlhouc...@gmail.com> wrote:

I have written a micro benchmark that stresses a language in areas 
crucial to my application.

I have written this micro benchmark in Pharo, Crystal, Nim, Python, 
PicoLisp, C, C++, Java and Julia.

On my i7 laptop, Julia completes it in about 1 minute and 15 seconds; 
amazing magic they have done.

Crystal and Nim do it in about 5 minutes. Python in about 25 minutes. 
Pharo takes over 2 hours. :(

In my benchmark, if I comment out the sum and average of the array, it 
completes in 3.5 seconds.
And when I do sum the array it gives the correct results, so I can 
verify its validity.

To illustrate, below is some sample code of what I am doing. I iterate 
over the array, do calculations on each value, update the array, and 
sum and average at each value, simply to stress array access, sum, and 
average.

28800 is simply derived from a time series of one-minute values for 5 
days, 4 weeks.

randarray := Array new: 28800.

1 to: randarray size do: [ :i | randarray at: i put: Number random ].

randarrayttr := [ 1 to: randarray size do: [ :i |
    "other calculations here."
    randarray sum. randarray average ] ] timeToRun.

randarrayttr. "0:00:00:36.135"


I do 2 loops with 100 iterations each.

randarrayttr * 200. "0:02:00:27"
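Note that #sum and #average each rescan all 28800 elements on every one of the 28800 iterations, so the inner loop does on the order of 28800 × 28800 additions. If the surrounding calculation only changes one element per step, a running total avoids the rescan entirely. A hedged sketch of that idea (the doubling is just a stand-in for the real per-element calculation, which is not shown in the post):

```smalltalk
| randarray runningSum average ttr |
randarray := Array new: 28800.
1 to: randarray size do: [ :i | randarray at: i put: Number random ].
runningSum := randarray sum.  "one full scan, up front"
ttr := [ 1 to: randarray size do: [ :i |
    | old new |
    old := randarray at: i.
    new := old * 2.  "stand-in for the real calculation"
    randarray at: i put: new.
    runningSum := runningSum - old + new.  "update instead of rescanning"
    average := runningSum / randarray size ] ] timeToRun.
```

This turns the quadratic rescan into constant work per step; whether it applies depends on what the "other calculations" actually touch.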


I learned early on in this adventure with compiled languages that if 
you don't do a lot of work, the test may not run long enough to give 
meaningful times.

Pharo is my preference. But this is an awfully big gap in performance. 
When doing backtesting this is huge: does my backtest take minutes, 
hours, or days?

I am not a computer scientist nor an expert in Pharo or Smalltalk, so I 
do not know if there is anything which can improve this.


However, I have played around with several experiments on my own #sum: method.

This implementation reduces the time on the above randarray in half.

sum: col
    | sum |
    sum := 0.
    1 to: col size do: [ :i |
        sum := sum + (col at: i) ].
    ^ sum

randarrayttr2 := [ 1 to: randarray size do: [ :i |
    "other calculations here."
    ltsa sum: randarray. ltsa sum: randarray ] ] timeToRun.
randarrayttr2. "0:00:00:18.563"

And this one reduces it a little more.

sum10: col
    | sum |
    sum := 0.
    1 to: ((col size quo: 10) * 10) by: 10 do: [ :i |
        sum := sum + (col at: i) + (col at: (i + 1)) + (col at: (i + 2))
            + (col at: (i + 3)) + (col at: (i + 4)) + (col at: (i + 5))
            + (col at: (i + 6)) + (col at: (i + 7)) + (col at: (i + 8))
            + (col at: (i + 9)) ].
    ((col size quo: 10) * 10 + 1) to: col size do: [ :i |
        sum := sum + (col at: i) ].
    ^ sum

randarrayttr3 := [ 1 to: randarray size do: [ :i |
    "other calculations here."
    ltsa sum10: randarray. ltsa sum10: randarray ] ] timeToRun.
randarrayttr3. "0:00:00:14.592"

It closes the gap with plain Python3 no numpy. But that is a pretty low 
standard.

Any ideas, thoughts, wisdom, or directions to pursue?

Thanks

Jimmie

  
