> On 7 Jan 2022, at 16:05, Jimmie Houchin <[email protected]> wrote:
>
> Hello Sven,
>
> I went and removed the Stdouts that you mention and other timing code from
> the loops.
>
> I am running the test now, to see if that makes much difference. I do not
> think it will.
>
> The reason I put that in there is because it takes so long to run. It can be
> frustrating to wait and wait and not know if your test is doing anything or
> not. So I put the code in to let me know.
>
> One of your parameters is incorrect. It is 100 iterations not 10.
Ah, I misread the Python code: at the top it says reps = 10, while at the bottom
it does indeed call doit(100).
So the time should be multiplied by 10.
The logging, especially the #flush, will slow things down. But removing the
MessageTally spy is important too.
The general implementation of #sum is not optimal in the case of a fixed array.
Consider:
data := Array new: 1e5 withAll: 0.5.

[ data sum ] bench. "'494.503 per second'"

[ | sum | sum := 0.
  data do: [ :each | sum := sum + each ].
  sum ] bench. "'680.128 per second'"

[ | sum | sum := 0.
  1 to: 1e5 do: [ :each | sum := sum + (data at: each) ].
  sum ] bench. "'1033.180 per second'"
As others have remarked: doing #average right after #sum is doing the same
thing twice. But maybe that is not the point.
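
To make the "same thing twice" remark concrete, here is a minimal Python sketch (the helper name sum_and_average is hypothetical, not taken from the published test code). It walks the list once and derives the average from the total, instead of summing a second time inside a separate average call. It assumes a non-empty list.

```python
import random


def sum_and_average(values):
    """Return (total, mean) from a single pass over a non-empty list."""
    total = 0.0
    for v in values:
        total += v
    # The mean reuses the total, so the list is not traversed again.
    return total, total / len(values)


# Same array size as in the test: 60*24*5*4 = 28800 random floats.
data = [random.random() for _ in range(28800)]
nsum, navg = sum_and_average(data)
print(nsum, navg)
```

If the benchmark is meant to time #sum and #average as separate calls, this change would alter what is measured, so it would have to be applied to every language version alike.
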
> I learned early on in this experiment that I have to do a large number of
> iterations or C, C++, Java, etc are too fast to have comprehensible results.
>
> I can tell if any of the implementations is incorrect by the final nsum. All
> implementations must produce the same result.
>
> Thanks for the comments.
>
> Jimmie
>
>
> On 1/7/22 07:40, Sven Van Caekenberghe wrote:
>> Hi Jimmie,
>>
>> I loaded your code in Pharo 9 on my MacBook Pro (Intel i5) macOS 12.1
>>
>> I commented out the Stdio logging from the 2 inner loops (#loop1, #loop2)
>> (not done in Python either) as well as the MessageTally spyOn: from #run
>> (slows things down).
>>
>> Then I ran your code with:
>>
>> [ (LanguageTest newSize: 60*24*5*4 iterations: 10) run ] timeToRun.
>>
>> which gave me "0:00:09:31.338"
>>
>> The console output was:
>>
>> ===
>> Starting test for array size: 28800 iterations: 10
>>
>> Creating array of size: 28800 timeToRun: 0:00:00:00.031
>>
>> Starting loop 1 at: 2022-01-07T14:10:35.395394+01:00
>> Loop 1 time: nil
>> nsum: 11234.235001659388
>> navg: 0.39007760422428434
>>
>> Starting loop 2 at: 2022-01-07T14:15:22.108433+01:00
>> Loop 2 time: 0:00:04:44.593
>> nsum: 11245.697629561537
>> navg: 0.3904756121375534
>>
>> End of test. TotalTime: 0:00:09:31.338
>> ===
>>
>> Which would be twice as fast as Python, if I got the parameters correct.
>>
>> Sven
>>
>>> On 7 Jan 2022, at 13:19, Jimmie Houchin <[email protected]> wrote:
>>>
>>> As I stated this is a micro benchmark and very much not anything resembling
>>> a real app. Your comments are true if you are writing your app. But if you
>>> want to stress the language you are going to do things which are seemingly
>>> nonsensical and abusive.
>>>
>>> Also, as I stated, the test has to be sufficient to stress faster languages
>>> or it is meaningless.
>>>
>>> If I remove the #sum and the #average calls from the inner loops, this is
>>> what we get.
>>>
>>> Julia 0.2256 seconds
>>> Python 5.318 seconds
>>> Pharo 3.5 seconds
>>>
>>> This test does not sufficiently stress the language. Nor does it provide
>>> any valuable insight into summing and averaging which is done a lot, in
>>> lots of places in every iteration.
>>>
>>> Notice that the inner loop changes the array on every iteration, so every
>>> call to #sum and #average is getting different data.
>>>
>>> Full Test
>>>
>>> Julia    1.13 minutes
>>> Python  24.02 minutes
>>> Pharo    2:09:04 (h:mm:ss)
>>>
>>> Code for the above is now published. You can let me know if I am doing
>>> something unequal to the various languages.
>>>
>>> And just remember anything you do which sufficiently changes the test has
>>> to be done in all the languages to give a fair test. This isn't a "let's make
>>> Pharo look good" test. I do want Pharo to look good, but honestly.
>>>
>>> Yes, I know that I can bind to BLAS or other external libraries. But that
>>> is not a test of Pharo. The Python is plain Python3 no Numpy, just using
>>> the default list [] for the array.
>>>
>>> Julia is a whole other world. It is faster than Numpy. This is their domain
>>> and they optimize, optimize, optimize all the math. In fact they have
>>> reached the point that some pure Julia code beats pure Fortran.
>>>
>>> In all of this I just want Pharo to do the best it can.
>>>
>>> With the above results unless you already had an investment in Pharo, you
>>> wouldn't even look. :(
>>>
>>> Thanks for exploring this with me.
>>>
>>>
>>> Jimmie
>>>
>>>
>>>
>>>
>>> On 1/6/22 18:24, John Brant wrote:
>>>> On Jan 6, 2022, at 4:35 PM, Jimmie Houchin <[email protected]> wrote:
>>>>> No, it is an array of floats. The only integers in the test are in the
>>>>> indexes of the loops.
>>>>>
>>>>> Number random. "generates a float 0.8188008774329387"
>>>>>
>>>>> So in the randarray below it is an array of 28800 floats.
>>>>>
>>>>> It just felt so wrong to me that Python3 was so much faster. I don't care
>>>>> if Nim, Crystal, Julia are faster. But...
>>>>>
>>>>>
>>>>> I am new to Iceberg and have never shared anything on Github so this is
>>>>> all new to me. I uploaded my language test so you can see what it does.
>>>>> It is a micro-benchmark. It does things that are not realistic in an app.
>>>>> But it does stress a language in areas important to my app.
>>>>>
>>>>>
>>>>> https://github.com/jlhouchin/LanguageTestPharo
>>>>>
>>>>>
>>>>> Let me know if there is anything else I can do to help solve this problem.
>>>>>
>>>>> I am a lone developer in my spare time. So my apologies for any ugly code.
>>>>>
>>>> Are you sure that you have the same algorithm in Python? You are calling
>>>> sum and average inside the loop where you are modifying the array:
>>>>
>>>> 1 to: nsize do: [ :j || n |
>>>> n := narray at: j.
>>>> narray at: j put: (self loop1calc: i j: j n: n).
>>>> nsum := narray sum.
>>>> navg := narray average ]
>>>>
>>>> As a result, you are calculating the sum of the 28,800 size array 28,800
>>>> times (plus another 28,800 times for the average). If I write a similar
>>>> loop in Python, it looks like it would take almost 9 minutes on my machine
>>>> without using numpy to calculate the sum. The Pharo code takes ~40
>>>> seconds. If this is really how the code should be, then I would change it
>>>> to not call sum twice (once for sum and once in average). This will result
>>>> in almost a 2x speedup. You could also modify the algorithm to update the
>>>> nsum value in the loop instead of summing the array each time. I think the
>>>> updating would require <120,000 math ops vs the >1.6 billion that you are
>>>> performing.
>>>>
>>>>
>>>> John Brant
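
The running-sum idea John describes can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and `update` stands in for whatever the real loop1calc computes. The first function re-sums the whole list after every element change, as the loop quoted above does; the second adjusts the total by the difference between the new and old element. Both assume a non-empty list.

```python
def resum_every_step(values, update):
    """Baseline: re-sum the whole list after each element changes, O(n^2)."""
    values = list(values)
    n = len(values)
    total = avg = 0.0
    for j in range(n):
        values[j] = update(values[j])
        total = sum(values)   # full pass over the list on every step
        avg = total / n
    return total, avg


def running_sum(values, update):
    """Incremental: keep a running total and adjust it per change, O(n)."""
    values = list(values)
    n = len(values)
    total = sum(values)       # one full pass up front
    avg = total / n
    for j in range(n):
        old = values[j]
        new = update(old)
        values[j] = new
        total += new - old    # only the changed element affects the total
        avg = total / n
    return total, avg


# `halve` is a placeholder for the real per-element calculation.
def halve(x):
    return x * 0.5


print(resum_every_step([1.0, 2.0, 3.0, 4.0], halve))
print(running_sum([1.0, 2.0, 3.0, 4.0], halve))
```

With floating-point data the two versions can differ in the last few digits, because the additions happen in a different order. That matters if the final nsum is used to check that all language implementations agree.
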