no time to think about it but I found that interesting.

Begin forwarded message:

> From: Paul Baumann <[email protected]>
> Subject: [vwnc] [Tuning] Set class>>new:
> Date: January 11, 2012 1:11:19 AM GMT+01:00
> To: "'VWNC'" <[email protected]>
> 
> Inspect this code to see hashed collection instances grouped by their free 
> space percentage:
>  
>                 ^((ObjectMemory allInstancesOfAny: Set withAllSubclasses 
> asArray) asOrderedCollection
>                                 reject: [:ea | ea size < 10 ])
>                                                 groupedBy: [:ea | ea capacity 
> / ea size - 1.0 ]
>  
> Hashing collections in VW need to have extra space allocated to act as search 
> limits between logical groups. Sets performance drops there are aren't many 
> free slots allocated.  When there are only 25% free slots allocated then the 
> collection grows. VW rounds up to a prime number for instance creation and 
> then targets 33% rounded up to a prime for growths afterwards. A target of 
> 33% rounded to prime skimps on memory at the expense of performance.
>  
> As a collection grows to the minimum 25% free space goal (in #fullCheck) 
> there is a 75% chance of crossing into the next logical bucket before finding 
> a nil value. Crossing into the next logical bucket just once can double the 
> number of slots searched for a miss. Four logical buckets may be traversed 
> for each search-miss by the time the 25% minimum is reached.
>  
> When VW skimps on allocation then it has to grow collections more frequently 
> and has to do more comparisons for each lookup. That is a general performance 
> problem, but there is another problem with how VW allocates memory for 
> pre-sized collections...
>  
> When you see collections created for an expected size like these:
>  
>                 anArray := Array new: 1000.
>                 aSet := Set new: 1000.
>                 anOc := OrderedCollection new: 1000.
>                 aDict := Dictionary new: 1000.
>  
> How many items would you expect each of these collections to efficiently 
> contain? Clearly the Array will be exactly 1000. The OrderedCollection will 
> take up to 1000 additions before it needs to grow. Pre-sized collections are 
> created to avoid incremental growth costs. Is it not common and natural to 
> think that the Set and Dictionary would also be created to efficiently 
> contain 1000 items too? Would you be surprised to learn that all subclasses 
> of Set in VW (and likely only VW) are inefficient with this kind of 
> pre-allocation when you add 1000 objects to them?
>  
> No application code should be expected to know how Sets are implemented and 
> to pad sizes having knowledge of VW implementation details! Look at the 
> capacity variances that VW creates even though the Set has been created each 
> time to anticipate 1000 additions:
>  
> (Set withAll: (Array new: 1000)) capacity
> 1511
>  
> (Set withAll: ((1 to: 1000) collect: [:i | Object new ])) capacity
> 1511
>  
> (Set new: 1000) capacity
> 1009
>  
> ((Set new: 1000) addAll: ((1 to: 1000) collect: [:i | Object new ]); 
> yourself) capacity
> 2027
>  
> ((Set new) addAll: ((1 to: 1000) collect: [:i | Object new ]); yourself) 
> capacity
> 1361
>  
> The first four should all have the same result. The last one can be different 
> because it is not pre-sized. The third example shows that only 0.9% free 
> space is allocated in addition to the requested size. The fourth example 
> shows that the under-allocation caused a growth that then over-allocated to 
> end at 102.7% free space. The costs of the third and fourth example can be 
> seen in more detail with this test:
>  
> | objects set timings cap0 t0 t1 |
> objects := (1 to: ObjectMemory maximumIdentityHashValue) collect: [:i | 
> Object new].
> timings := OrderedCollection new: objects size.
> t0 := t1 := Time microsecondClock.
> set := Set new: objects size.
> cap0 := set capacity.
> objects size to: 1 by: -1 do: [ :i |
>                 set add: (objects at: i).
>                 i \\ 1000 = 1 ifTrue: [timings add: (t1 negated + (t1 :=Time 
> microsecondClock))].
> ].
> (Array new: 5)
>                 at: 1 put: cap0; "Starting capacity"
>                 at: 2 put: set capacity;    "Ending capacity"
>                 at: 3 put: (1 - (set capacity / set size) * -100.0);   
> "Percentage of free slots"
>                 at: 4 put: timings asArray;             "Timings of each 1000"
>                 at: 5 put: t1 - t0;                "Total time"
>                 yourself
>  
> #(17417 34939 113.264 #(147 264 262 261 262 262 265 261 261 260 261 260 260 
> 2474 184 184 184) 6312)       "primes"
>  
> A difference between the first and second numbers indicates growth costs were 
> incurred while adding. The third number shows there were 1.13 free slots 
> allocated for each logical bucket. The last number is total execution time. 
> Notice that performance of each group improves 41% just after the collection 
> grows.
>  
> Load a package named ICXVwHashTuning from the Cincom Public Code Repository 
> and try the test again.
>  
> #(28670 28670 74.9985 #(104 182 179 179 180 263 180 183 185 184 190 189 188 
> 188 189 254 188) 3205)        "After loading ICXVwHashTuning"
>  
> Total execution time was cut by 49% while consuming 18% less memory and 
> ending with a (very efficient) 75% free slots.
>  
> When application code says "Set new: 1000" then they mean that they want a 
> Set instance that should expect to contain 1000 additions. It is the 
> responsibility of the Set to know how much physical space should be allocated 
> to contain 1000 items.  These are simple changes that make a broad 
> performance improvement while also having intuitive performance. Cincom 
> should take notice of opportunities like this but at least you can improve 
> the performance of your existing code if they don't.
>  
> You might want to also execute "(ObjectMemory allInstancesOfAny: Set 
> withAllSubclasses asArray) do: [:ea | ea trim ]." to optimize free space 
> allocation for all the hashed instances. You may notice a performance boost 
> from doing that once after loading ICXVwHashTuning.
>  
> Paul Baumann
>  
> 
> This message may contain confidential information and is intended for 
> specific recipients unless explicitly noted otherwise. If you have reason to 
> believe you are not an intended recipient of this message, please delete it 
> and notify the sender. This message may not represent the opinion of 
> IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and 
> does not constitute a contract or guarantee. Unencrypted electronic mail is 
> not secure and the recipient of this message is expected to provide 
> safeguards from viruses and pursue alternate means of communication where 
> privacy or a binding message is desired.
> _______________________________________________
> vwnc mailing list
> [email protected]
> http://lists.cs.uiuc.edu/mailman/listinfo/vwnc




Reply via email to