Updated benchmarks with pre-calculated collection of numbers (as suggested by
Sven):
Benchmarks of the current versions:
[ (1 to: 1000000) asArray sum ] benchFor: 10 seconds.
124 iterations, 12.389 per second
[ (1 to: 1000000) asArray sum: [ :e | e ] ] benchFor: 10 seconds.
109 iterations, 10.801 per second)
[ (1 to: 100) asArray sum ] benchFor: 10 seconds.
4,166,154 iterations, 416,532 per second
[ (1 to: 100) asArray sum: [ :e | e ] ] benchFor: 10 seconds.
3,115,121 iterations, 311,481 per second
Benchmarks of the new versions:
[ (1 to: 1000000) asArray sum ] benchFor: 10 seconds.
125 iterations, 12.490 per second
[ (1 to: 1000000) asArray sum: [ :e | e ] ] benchFor: 10 seconds.
119 iterations, 11.842 per second
[ (1 to: 1000000) asArray sum: [ :e | e ] ifEmpty: [ 0 ] ] benchFor: 10 seconds.
130 iterations, 12.947 per second
[ (1 to: 100) asArray sum ] benchFor: 10 seconds.
2,985,085 iterations, 298,449 per second
[ (1 to: 100) asArray sum: [ :e | e ] ] benchFor: 10 seconds.
3,260,280 iterations, 325,963 per second
[ (1 to: 100) asArray sum: [ :e | e ] ifEmpty: [ 0 ] ] benchFor: 10 seconds.
3,143,812 iterations, 314,287 per second
The benchmark for #sum suggests that there’s indeed a benefit of duplicating
the code for that particular case (the delegations would make #sum 30% slower).
I still think we should use delegation there instead of duplicate code. #sum is
not a “critical” method in my opinion.
Cheers,
Max
> On 20 Dec 2015, at 14:01, Sven Van Caekenberghe <[email protected]> wrote:
>
>>
>> On 20 Dec 2015, at 12:59, Max Leske <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>> I would like to wrap up this discussion.
>
> Good idea.
>
>>> On 05 Dec 2015, at 18:14, stepharo <[email protected]> wrote:
>>>
>>> So what is the conclusion?
>>> I like the idea of Esteban M to have iterator because it moves some
>>> behavior out of core classes.
>>>
>>> [[[
>>>
>>> aCollection arithmetic sum: [...] or.... aCollection
>>> arithmetic avg.
>>>
>>> ]]]
>>>
>>
>> While I think that iterators are an intriguing idea I also believe that they
>> are beyond the scope of this issue. If anybody wants to follow up on
>> iterators (or unit types for that matter) please start a new thread / issue.
>>
>>
>> I propose to use Sven’s version for #sum:ifEmpty:. The result would be these
>> three methods:
>>
>> sum
>> ^ self
>> sum: [ :each | each ]
>> ifEmpty: [ 0 ]
>>
>> sum: aBlock
>> ^ self
>> sum: aBlock
>> ifEmpty: [ self errorEmptyCollection ]
>>
>> sum: aBlock ifEmpty: emptyBlock
>> | sum sample |
>> self isEmpty ifTrue: [ ^ emptyBlock value ].
>> sample := self anyOne.
>> sum := self
>> inject: sample
>> into: [ :accum :each | accum + (aBlock value: each) ].
>> ^ sum - sample
>>
>>
>> I’ve attached a couple of benchmark results below. To me they show that
>> 1. the new implementation is maybe a tiny bit slower but insignificantly so
>> (if you’re going for performance you’ll probably write your own optimised
>> version anyway)
>> 2. there is no need to duplicate the code (like #sum and #sum: currently
>> do). It’s perfectly fine to delegate to #sum:ifEmpty:
>>
>>
>>
>> In addition to the above changes I would like to remove #detectSum: (->
>> #sum:) and #sumNumbers (-> #sum).
>>
>>
>> Note that once we agree on changing this API, we will need to also change
>> #detectMin:, #detectMax:, #min, #max as well as all overrides (e.g.
>> RunArray, Interval) of these and of #sum et. al. to stay consistent. The
>> changes would of course be in line with this change, such that every
>> operation has a unary selector with a sensible default, one that takes a
>> block and throws an error for empty collections and a third that takes a
>> block for the iteration and one for the empty case.
>
> Excellent summary, thanks.
>
>> Please cast your vote for these changes:
>>
>> 1. Do you agree to changing #sum and #sum: in the suggested way?
>
> yes
>
>> 2. Do you agree to the removal of #detectSum:?
>
> yes
>
>> 3. Do you agree to the removal of #sumNumbers?
>
> yes
>
>> 4. Do you agree that the #max and #min selectors also need to be adapted?
>
> probably yes.
>
>> Thanks for you help.
>>
>> Cheers,
>> Max
>>
>>
>>
>>
>>
>> Benchmarks
>> ============
>> (Note that these aren’t very good benchmarks. There’s quite some variation
>> on each run.)
>
> You should exclude your setup out of your actual benchmark, like this:
>
> | numbers |
> numbers := (1 to: 1000000) asArray.
> [ numbers sum ] benchFor: 10 seconds.
>
> "a BenchmarkResult(126 iterations in 10 seconds 77 milliseconds. 12.504 per
> second)"
>
>> Machine:
>> MacBook Pro (15-inch, Early 2011)
>> CPU: 2.2 GHz Intel Core i7
>> Memory: 8 GB 1333 MHz DDR3
>> Disk: APPLE SSD TS512C (500 GB)
>>
>>
>> Benchmarks of the current versions:
>>
>> [ (1 to: 1000000) asArray sum ] benchFor: 10 seconds.
>> 75 iterations, 7.470 per second
>>
>> [ (1 to: 1000000) asArray sum: [ :e | e ] ] benchFor: 10 seconds.
>> 72 iterations, 7.128 per second
>>
>>
>> [ (1 to: 100) asArray sum ] benchFor: 10 seconds.
>> 1,189,477 iterations, 118,912 per second
>>
>>
>> [ (1 to: 100) asArray sum: [ :e | e ] ] benchFor: 10 seconds.
>> 1,171,467 iterations, 117,112 per second
>>
>>
>>
>> Benchmarks of the new versions:
>>
>> [ (1 to: 1000000) asArray sum ] benchFor: 10 seconds.
>> 73 iterations, 7.244 per second
>>
>> [ (1 to: 1000000) asArray sum: [ :e | e ] ] benchFor: 10 seconds.
>> 75 iterations, 7.480 per second
>>
>> [ (1 to: 1000000) asArray sum: [ :e | e ] ifEmpty: [ 0 ] ] benchFor: 10
>> seconds.
>> 72 iterations, 7.141 per second
>>
>>
>> [ (1 to: 100) asArray sum ] benchFor: 10 seconds.
>> 1,115,827 iterations, 111,560 per second
>>
>> [ (1 to: 100) asArray sum: [ :e | e ] ] benchFor: 10 seconds.
>> 1,154,595 iterations, 115,425 per second
>>
>> [ (1 to: 100) asArray sum: [ :e | e ] ifEmpty: [ 0 ] ] benchFor: 10 seconds.
>> 1,102,358 iterations, 110,203 per second