Be careful, though: that way of benchmarking will show a lot of variation
and noise. Remember there is an OS, other applications are open, and even
the CPU getting hot or cold can introduce performance differences...

At the very least, the snippet should be run many times (I generally do
100 iterations), and the averages should be compared taking the standard
deviation into account.
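
For example, a minimal sketch of that idea in Pharo (here floatArray
stands in for whatever collection your snippet under test uses):

| times mean stdDev |
times := (1 to: 100) collect: [ :i |
    Time millisecondsToRun: [ floatArray sum ] ].
"mean and (population) standard deviation of the 100 timings"
mean := (times inject: 0 into: [ :a :t | a + t ]) / times size.
stdDev := ((times inject: 0 into: [ :a :t | a + ((t - mean) squared) ])
    / times size) sqrt.
Transcript cr; show: mean asFloat printString, ' ms +/- ', stdDev printString.

If I recall correctly, Pharo also offers a #bench message on blocks that
repeatedly runs the block and reports a rate, which automates part of this.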



On Sun, Jan 9, 2022, 10:14, Stéphane Ducasse <stephane.duca...@inria.fr>
wrote:

> On my machine, so this is the same machine:
>
> Squeak 5.3
>
> Test with 10000000 elements
> Original #sum -> Time: 196 milliseconds, Total: 5.001448710680429e6
> Naive #sum -> Time: 152 milliseconds, Total: 5.001448710680429e6
> Inject #sum -> Time: 143 milliseconds, Total: 5.001448710680429e6
>
>
>
> On 8 Jan 2022, at 21:47, stephane ducasse <stephane.duca...@inria.fr>
> wrote:
>
> Thanks Benoit for the snippet.
> I ran it in Pharo 10 and I got:
>
> Test with 10000000 elements
> Original #sum -> Time: 195 milliseconds, Total: 4.999452880735064e6
> Naive #sum -> Time: 153 milliseconds, Total: 4.999452880735063e6
> Inject #sum -> Time: 198 milliseconds, Total: 4.999452880735063e6
>
>
> in Pharo 9
> Test with 10000000 elements
> Original #sum -> Time: 182 milliseconds, Total: 4.999339450212771e6
> Naive #sum -> Time: 148 milliseconds, Total: 4.999339450212771e6
> Inject #sum -> Time: 203 milliseconds, Total: 4.999339450212771e6
>
> I’m interested to understand why Pharo is slower. Maybe this is the
> impact of the new full blocks.
> We started to play with the idea of regression benchmarks.
>
> S
>
>
> On 7 Jan 2022, at 16:36, Benoit St-Jean via Pharo-dev <
> pharo-dev@lists.pharo.org> wrote:
>
> Can you come up with a simple "base case" so we can find the
> bottleneck/problem?
>
> I'm not sure about what you're trying to do.
>
> What do you get if you try this in a workspace? (Adjust the value of n
> to what you want; I tested it with 10 million items.)
>
> Let's get this one step at a time!
>
>
>
> | floatArray n rng t1 t2 t3 r1 r2 r3 |
>
> n := 10000000.
>
> rng := Random new.
>
> floatArray := Array new: n.
> floatArray doWithIndex: [:each :idx | floatArray at: idx put: rng next].
>
> t1 := Time millisecondsToRun: [r1 := floatArray sum].
> t2 := Time millisecondsToRun: [| total |
>     total := 0.
>     floatArray do: [:each | total := total + each].
>     r2 := total].
> t3 := Time millisecondsToRun: [
>     r3 := floatArray inject: 0 into: [:total :each | total + each]].
>
> Transcript cr.
> Transcript cr; show: 'Test with ', n printString, ' elements'.
> Transcript cr; show: 'Original #sum -> Time: ', t1 printString, ' milliseconds, Total: ', r1 printString.
> Transcript cr; show: 'Naive #sum -> Time: ', t2 printString, ' milliseconds, Total: ', r2 printString.
> Transcript cr; show: 'Inject #sum -> Time: ', t3 printString, ' milliseconds, Total: ', r3 printString.
>
> --------------------------
>
> Here are the results I get on Squeak 5.3
>
> Test with 10000000 elements
> Original #sum -> Time: 143 milliseconds, Total: 4.999271889099622e6
> Naive #sum -> Time: 115 milliseconds, Total: 4.999271889099622e6
> Inject #sum -> Time: 102 milliseconds, Total: 4.999271889099622e6
>
>
>
> -----------------
> Benoît St-Jean
> Yahoo! Messenger: bstjean
> Twitter: @BenLeChialeux
> Pinterest: benoitstjean
> Instagram: Chef_Benito
> IRC: lamneth
> GitHub: bstjean
> Blogue: endormitoire.wordpress.com
> "A standpoint is an intellectual horizon of radius zero".  (A. Einstein)
>
>
> On Thursday, January 6, 2022, 03:38:22 p.m. EST, Jimmie Houchin <
> jlhouc...@gmail.com> wrote:
>
>
> I have written a micro benchmark which stresses a language in areas
> which are crucial to my application.
>
> I have written this micro benchmark in Pharo, Crystal, Nim, Python,
> PicoLisp, C, C++, Java and Julia.
>
> On my i7 laptop Julia completes it in about 1 minute and 15 seconds,
> amazing magic they have done.
>
> Crystal and Nim do it in about 5 minutes. Python in about 25 minutes.
> Pharo takes over 2 hours. :(
>
> In my benchmark, if I comment out the sum and average of the array, it
> completes in 3.5 seconds. And when I do sum the array it gives the
> correct results, so I can verify its validity.
>
> To illustrate, below is some sample code of what I am doing. I iterate
> over the array, do calculations on each value, update the array, and sum
> and average at each value, simply to stress array access, #sum and
> #average.
>
> 28800 is simply derived from a time series of one-minute values over 5
> days a week for 4 weeks (60 * 24 * 5 * 4 = 28800).
>
> randarray := Array new: 28800.
>
> 1 to: randarray size do: [ :i | randarray at: i put: Number random ].
>
> randarrayttr := [ 1 to: randarray size do: [ :i |
>     "other calculations here."
>     randarray sum. randarray average ] ] timeToRun.
>
> randarrayttr. "0:00:00:36.135"
>
>
> I do 2 loops with 100 iterations each.
>
> randarrayttr * 200. "0:02:00:27"
>
>
> I learned early on in this adventure, when dealing with compiled
> languages, that if you don’t do a lot, the test may not run long enough
> to give any meaningful times.
>
> Pharo is my preference. But this is an awful big gap in performance.
> When doing backtesting this is huge. Does my backtest take minutes,
> hours or days?
>
> I am not a computer scientist nor expert in Pharo or Smalltalk. So I do
> not know if there is anything which can improve this.
>
>
> However I have played around with several experiments of my #sum: method.
>
> This implementation reduces the time on the above randarray in half.
>
> sum: col
>     | sum |
>     sum := 0.
>     1 to: col size do: [ :i |
>         sum := sum + (col at: i) ].
>     ^ sum
>
> randarrayttr2 := [ 1 to: randarray size do: [ :i |
>     "other calculations here."
>     ltsa sum: randarray. ltsa sum: randarray ] ] timeToRun.
> randarrayttr2. "0:00:00:18.563"
>
> And this one reduces it a little more.
>
> sum10: col
>     | sum |
>     sum := 0.
>     1 to: ((col size quo: 10) * 10) by: 10 do: [ :i |
>         sum := sum + (col at: i) + (col at: (i + 1)) + (col at: (i + 2))
>             + (col at: (i + 3)) + (col at: (i + 4)) + (col at: (i + 5))
>             + (col at: (i + 6)) + (col at: (i + 7)) + (col at: (i + 8))
>             + (col at: (i + 9)) ].
>     ((col size quo: 10) * 10 + 1) to: col size do: [ :i |
>         sum := sum + (col at: i) ].
>     ^ sum
>
> randarrayttr3 := [ 1 to: randarray size do: [ :i |
>     "other calculations here."
>     ltsa sum10: randarray. ltsa sum10: randarray ] ] timeToRun.
> randarrayttr3. "0:00:00:14.592"
>
> It closes the gap with plain Python 3 (no numpy), but that is a pretty
> low standard.
>
> Any ideas, thoughts, wisdom, or directions to pursue?
>
> Thanks
>
> Jimmie
>
>
>
>
