Hello,
Yes, I understand this is a micro-benchmark. And yes I know that
micro-benchmarks can be dangerous and sometimes worthless. However,
financial trading, which is the app I am working on, does this a lot.
This is precisely where most of the time is spent. Adding data (numbers)
or accessing data in arrays or matrices. This is where my app lives. And
performance is important.
Regarding my running of the test and the variety of loop sizes. From
what I see in the test there is only one loop size. 10000
There are differing initial OrderedCollection sizes. Unless I am
misreading the code.
OrderedCollection new: 100. (capacity 100).
I don't understand how this could be a capacity of 100 by what seems to
me to be a normal understanding when its initial size is 0 and it can
grow infinitely beyond 100.
To me if it has a capacity of 100, immediately after creation its size
should be 100
and its initial 100 elements should be accessible.
oc at: 1 put: 1 should work.
So I do not understand where this supposed capacity enters into
anything. It is neither immediately accessible, nor does it limit.
When I create an Array, it has a capacity with an understanding that I
have.
a := Array new: 100.
size a. "100"
a at: 1. "nil" "works"
a at: 1 put: 1. "works"
I naively expect the same from another Collection which looks like
oc := OrderedCollection new: 100.
Since it is a collection with an initial size (capacity). But because it
is an OrderedCollection and not an Array, it is not limited by its
initial capacity.
Apparently my understanding faulty. I just wanted to express how this
would potentially/probably be understood by someone who does not know
this peculiarity of OrderedCollection.
Regarding the comparison to Python and Julia.
Mea Culpa, mea culpa, maxima mea culpa.
While I thought I duplicated the Pharo code. I did not.
I duplicated the inner loop which created and populated the array.
I completely left off the outer loop which ran the inner loop 10000 times.
The Python code is simple.
def t(n):
s = time.time()
for x in range(0,10000):
l = []
for i in range(1,10001):
l.append(i)
e = time.time()
print(e-s, len(l))
return l
list_test(10000) # == 15.5 seconds
Julia
function test_array(asize)
for i = 1:10000
a = Array(Int64, asize)
for i = 1:asize
a[i]=i
end
end
end
@time(test_array(10000)) # 0.9 seconds
So Pharo at 1.6 seconds (Array time) is respectable compared to Julia's
highly optimized preallocated arrays and thoroughly keeps Python in its
place.
And if I start Julia off with an Array(Int64,0) and do append!(a,[i,]).
Julia's time plummets to 19seconds.
I feel much better. Thanks for challenging my numbers and insuring my
comparisons were correct.
Jimmie
On 1/15/2015 1:13 PM, Sven Van Caekenberghe wrote:
Jimmy,
On 15 Jan 2015, at 19:46, Jimmie Houchin <[email protected]> wrote:
I do not understand what is happening here.
oc := OrderedCollection new: 10000.
oc size. "0"
a := Array new: 10000.
a size. "10000"
So it seems that the #new: isn't creating an OrderedCollection of the size we
pass in.
I discovered this when I attempted to do the identical test but with an Array
instead of an OrderedCollection. Since the Array doesn't #add: I had to
change the code to #at:put:.
This is all normal and by design.
OrderedCollection is a growable collection, designed so you can add elements at
either end. The thread's discussion is about growing strategies.
Consider,
(OrderedCollection new: 100) capacity = 100
it means it has preallocated room for a 100 elements, but the way it is managed
is tricky, as people discovered.
When I noticed all of the OC code did #add: even if supposedly setting its size
to 10000(0). I tried to use #at:put: in the OrderedCollection but it failed
because their was no index 1, because the size was still 0.
I naively would have thought that OrderedCollection new: 10000 would return an
OrderedCollection of 10000 empty/nil elements and that I could #at:put: into
any of those 10000 slots and #add if I needed to go beyond those 10000.
Tests done on a laptop with Windows 7, Pharo 4 latest image, latest stable vm,
3rd gen i7 processor, 12gb ram.
I did all of the tests in Pharo 4 and added my Array test.
0:00:00:04.37
0:00:00:03.183
0:00:00:03.674
0:00:00:03.753
0:00:00:01.647 and with an Array instead of OrderedCollection
So we can see if we know the max size and create the collection with that size
it should improve performance.
I also do not understand why Pharo is so slow on this?
With the preallocated Array above it took 1.647 seconds.
Python 3.4.2, 0.003 seconds with 10,000 ints, and 0.17 with 1,000,000 ints.
Julia 3.5, 0.005 seconds with 1,000,000 Int64.
Now I don't expect Pharo to keep up with Julia. But we aren't even in the
neighborhood with Python.
I really would like to use Pharo. It really is my favorite language and
environment. But these performance numbers give me pause.
There are differently sized loops in this thread, what exactly did you try ?
What is the Python code that you ran ?
Benchmarking is always dangerous, you must make absolutely clear what you are
running and comparing.
Sven
Jimmie
On 1/14/2015 3:53 AM, Max Leske wrote:
Hi Alex
I thought the following little experiment might interest you.
I was trying to find out (as per your suggestion) how big the difference would
be between:
- setting the array in OrderedCollection to nil and initializing it to size 10
on first access
- initializing the array in OrderedCollection to size 0.
Expectation: #grow will take time, ergo, using an empty array should be slower
than using
a preallocated array.
Experiment (Pharo 1.3 on SqueakVM):
[ 10000 timesRepeat: [ | c | c := OrderedCollection new: 100000.
1 to: 10000 do: [ :i | c add: i ] ] ] timeToRun 17566
[ 10000 timesRepeat: [ | c | c := OrderedCollection new: 10000.
1 to: 10000 do: [ :i | c add: i ] ] ] timeToRun 19717
[ 10000 timesRepeat: [ | c | c := OrderedCollection new: 0.
1 to: 10000 do: [ :i | c add: i ] ] ] timeToRun 17598
[ 10000 timesRepeat: [ | c | c := OrderedCollection new: 1000.
1 to: 10000 do: [ :i | c add: i ] ] ] timeToRun 18084
Result: growing a collection is always faster (or at least nearly equally fast)!
Explanation: What makes preinitialized collections so slow is the fact the the
firstIndex variable will be set to
the first third of the array (3333 in this case). When the last
position in the array is reached, all elements
are copied to the beginning of the array (2n operations). This copy
operation is really slow (which can be seen
from the first example with an oversized collection that does not
require copying because the last position
of the array is never reached).
Experiment (Pharo 4 on PharoVM):
(I’m using bigger numbers here to compensate for the faster VM :) )
[ 10000 timesRepeat: [ | c | c := OrderedCollection new: 1000000.
1 to: 100000 do: [ :i | c add: i ] ] ] timeToRun "0:00:01:09.225"
[ 10000 timesRepeat: [ | c | c := OrderedCollection new: 100000.
1 to: 100000 do: [ :i | c add: i ] ] ] timeToRun "0:00:00:37.653"
[ 10000 timesRepeat: [ | c | c := OrderedCollection new: 0.
1 to: 100000 do: [ :i | c add: i ] ] ] timeToRun "0:00:00:40.152"
[ 10000 timesRepeat: [ | c | c := OrderedCollection new: 10000.
1 to: 100000 do: [ :i | c add: i ] ] ] timeToRun "0:00:00:42.121”
Result: growing a collection is easily fast enough, although minute performance
gains are possible with preinitialized collections.
Explanation: #makeRoomAtLast has been improved. Also: firstIndex is now set to
the beginning of the array, preventing the copy
operation that made the former measurements so slow.
The run with the oversized collection is only so slow because it
apparently takes a lot of time to assign the large array (?).
Consequence of my little experiment: My initial implementation of the idea in
you paper set the array in OrderedCollection to nil
under the assumption that grow operations are very costly. That choice
leads to many changes in methods of OrderedCollection
where it is necessary to check if the array has been initialized yet.
This introduces overhead and makes the code more error prone.
I see now that it makes much more sense to use an empty array, the
performance lost by growing ist neglectable.
It is also interesting to note that there almost never seems to be a
good reason to initialize an OrderedCollection with a large array.
Cheers,
Max