Nice - any thoughts on where to focus for improved performance?
geir
Rana Dasgupta wrote:
> Since the allocation helper is inlined now, I reran the old allocation
> rate test (with the default heapsize 256M). While gc_gen and gc_cc are
> in the same ballpark, there is still some way to go to catch up with
> the RI. Log attached.
>
>
>
>
> On 12/5/06, Mikhail Fursov <[EMAIL PROTECTED]> wrote:
>>
>> If you are comparing allocation performance, the allocation fast-path
>> helper code is all you need.
>> And we need to check performance with real benchmarks, not microtests.
>> Microtests can hide cache misses in our example.
>>
>> On 12/5/06, Ivan Volosyuk <[EMAIL PROTECTED]> wrote:
>> >
>> > Helper code is equal. GC code is not. Let's compare apples with
>> > oranges.
>> > --
>> > Ivan
>> >
>> > On 12/5/06, Mikhail Fursov <[EMAIL PROTECTED]> wrote:
>> > > The helper code is equal, except for this load. So if we have
>> > > different performance, this extra memory access is the cause.
>> > >
>> > > On 12/5/06, Ivan Volosyuk <[EMAIL PROTECTED]> wrote:
>> > > >
>> > > > I think that for this comparison to be valid, other conditions
>> > > > should be equal. Comparing a helper with 1 dependent load in gc_cc
>> > > > against a helper with 2 dependent loads in gc_v5 makes no sense
>> > > > to me.
>> >
>>
>>
>>
>> --
>> Mikhail Fursov
>>
>>
>
>
> ------------------------------------------------------------------------
>
> gcgen default heapsize 256M
> =============================
> Timing 50 million total object allocations
> Varying number of threads and number of objects retained
>
> Timing 1 threads, retaining 64 Objects:
> 3.625 seconds 210.20689655172413 MB/sec
> Timing 1 threads, retaining 128 Objects:
> 3.593 seconds 212.0790425827999 MB/sec
> Timing 1 threads, retaining 256 Objects:
> 3.579 seconds 212.90863369656327 MB/sec
> Timing 1 threads, retaining 512 Objects:
> 3.578 seconds 212.96813862493013 MB/sec
> Timing 1 threads, retaining 1024 Objects:
> 3.578 seconds 212.96813862493013 MB/sec
> Timing 1 threads, retaining 2048 Objects:
> 3.578 seconds 212.96813862493013 MB/sec
> Timing 1 threads, retaining 4096 Objects:
> 3.688 seconds 206.61605206073753 MB/sec
> Timing 1 threads, retaining 8192 Objects:
> 3.687 seconds 206.67209113100083 MB/sec
> Timing 2 threads, retaining 64 Objects:
> 5.344 seconds 142.58982035928142 MB/sec
> Timing 2 threads, retaining 128 Objects:
> 5.484 seconds 138.94967177242887 MB/sec
> Timing 2 threads, retaining 256 Objects:
> 5.485 seconds 138.92433910665451 MB/sec
> Timing 2 threads, retaining 512 Objects:
> 5.14 seconds 148.24902723735408 MB/sec
> Timing 2 threads, retaining 1024 Objects:
> 5.204 seconds 146.42582628747118 MB/sec
> Timing 2 threads, retaining 2048 Objects:
> 5.312 seconds 143.4487951807229 MB/sec
> Timing 2 threads, retaining 4096 Objects:
> 5.219 seconds 146.00498179727916 MB/sec
> Timing 2 threads, retaining 8192 Objects:
> 5.219 seconds 146.00498179727916 MB/sec
> Timing 4 threads, retaining 64 Objects:
> 6.265 seconds 121.62809257781325 MB/sec
> Timing 4 threads, retaining 128 Objects:
> 5.672 seconds 134.3441466854725 MB/sec
> Timing 4 threads, retaining 256 Objects:
> 5.531 seconds 137.76893870909421 MB/sec
> Timing 4 threads, retaining 512 Objects:
> 5.454 seconds 139.71397139713972 MB/sec
> Timing 4 threads, retaining 1024 Objects:
> 5.422 seconds 140.53854666174843 MB/sec
> Timing 4 threads, retaining 2048 Objects:
> 5.593 seconds 136.24173073484712 MB/sec
> Timing 4 threads, retaining 4096 Objects:
> 5.109 seconds 149.14856136230182 MB/sec
> Timing 4 threads, retaining 8192 Objects:
> 5.391 seconds 141.34668892598776 MB/sec
> Timing 8 threads, retaining 64 Objects:
> 5.594 seconds 136.21737575974257 MB/sec
> Timing 8 threads, retaining 128 Objects:
> 5.5 seconds 138.54545454545453 MB/sec
> Timing 8 threads, retaining 256 Objects:
> 5.516 seconds 138.14358230601886 MB/sec
> Timing 8 threads, retaining 512 Objects:
> 5.515 seconds 138.16863100634635 MB/sec
> Timing 8 threads, retaining 1024 Objects:
> 5.5 seconds 138.54545454545453 MB/sec
> Timing 8 threads, retaining 2048 Objects:
> 5.438 seconds 140.1250459727841 MB/sec
> Timing 8 threads, retaining 4096 Objects:
> 5.547 seconds 137.3715521903732 MB/sec
> Timing 8 threads, retaining 8192 Objects:
> 5.89 seconds 129.37181663837012 MB/sec
> Timing 16 threads, retaining 64 Objects:
> 5.828 seconds 130.7481125600549 MB/sec
> Timing 16 threads, retaining 128 Objects:
> 5.86 seconds 130.03412969283275 MB/sec
> Timing 16 threads, retaining 256 Objects:
> 5.859 seconds 130.0563236047107 MB/sec
> Timing 16 threads, retaining 512 Objects:
> 5.828 seconds 130.7481125600549 MB/sec
> Timing 16 threads, retaining 1024 Objects:
> 5.641 seconds 135.0824321928736 MB/sec
> Timing 16 threads, retaining 2048 Objects:
> 5.781 seconds 131.81110534509602 MB/sec
> Timing 16 threads, retaining 4096 Objects:
> 5.719 seconds 133.24007693652734 MB/sec
> Timing 16 threads, retaining 8192 Objects:
> 5.672 seconds 134.3441466854725 MB/sec
> Timing 32 threads, retaining 64 Objects:
> 5.688 seconds 133.9662447257384 MB/sec
> Timing 32 threads, retaining 128 Objects:
> 5.656 seconds 134.72418670438472 MB/sec
> Timing 32 threads, retaining 256 Objects:
> 5.656 seconds 134.72418670438472 MB/sec
> Timing 32 threads, retaining 512 Objects:
> 5.516 seconds 138.14358230601886 MB/sec
> Timing 32 threads, retaining 1024 Objects:
> 6.062 seconds 125.70108874958758 MB/sec
> Timing 32 threads, retaining 2048 Objects:
> 6.25 seconds 121.92 MB/sec
> Timing 32 threads, retaining 4096 Objects:
> 5.672 seconds 134.3441466854725 MB/sec
> Timing 32 threads, retaining 8192 Objects:
> 5.859 seconds 130.0563236047107 MB/sec
> Total: 252.845 seconds
>
> gc4.1 default heapsize 256 M
> ===============================
> Timing 50 million total object allocations
> Varying number of threads and number of objects retained
>
> Timing 1 threads, retaining 64 Objects:
> 3.516 seconds 216.7235494880546 MB/sec
> Timing 1 threads, retaining 128 Objects:
> 3.484 seconds 218.71412169919634 MB/sec
> Timing 1 threads, retaining 256 Objects:
> 3.485 seconds 218.65136298421808 MB/sec
> Timing 1 threads, retaining 512 Objects:
> 3.484 seconds 218.71412169919634 MB/sec
> Timing 1 threads, retaining 1024 Objects:
> 3.5 seconds 217.71428571428572 MB/sec
> Timing 1 threads, retaining 2048 Objects:
> 3.531 seconds 215.80288870008496 MB/sec
> Timing 1 threads, retaining 4096 Objects:
> 3.516 seconds 216.7235494880546 MB/sec
> Timing 1 threads, retaining 8192 Objects:
> 3.594 seconds 212.02003338898163 MB/sec
> Timing 2 threads, retaining 64 Objects:
> 5.547 seconds 137.3715521903732 MB/sec
> Timing 2 threads, retaining 128 Objects:
> 5.406 seconds 140.9544950055494 MB/sec
> Timing 2 threads, retaining 256 Objects:
> 5.297 seconds 143.85501227109685 MB/sec
> Timing 2 threads, retaining 512 Objects:
> 5.687 seconds 133.98980130121328 MB/sec
> Timing 2 threads, retaining 1024 Objects:
> 5.282 seconds 144.2635365391897 MB/sec
> Timing 2 threads, retaining 2048 Objects:
> 5.593 seconds 136.24173073484712 MB/sec
> Timing 2 threads, retaining 4096 Objects:
> 5.032 seconds 151.4308426073132 MB/sec
> Timing 2 threads, retaining 8192 Objects:
> 5.765 seconds 132.17692974848222 MB/sec
> Timing 4 threads, retaining 64 Objects:
> 5.703 seconds 133.61388742766965 MB/sec
> Timing 4 threads, retaining 128 Objects:
> 5.375 seconds 141.7674418604651 MB/sec
> Timing 4 threads, retaining 256 Objects:
> 5.422 seconds 140.53854666174843 MB/sec
> Timing 4 threads, retaining 512 Objects:
> 5.532 seconds 137.74403470715836 MB/sec
> Timing 4 threads, retaining 1024 Objects:
> 5.375 seconds 141.7674418604651 MB/sec
> Timing 4 threads, retaining 2048 Objects:
> 5.359 seconds 142.19070722149655 MB/sec
> Timing 4 threads, retaining 4096 Objects:
> 5.531 seconds 137.76893870909421 MB/sec
> Timing 4 threads, retaining 8192 Objects:
> 5.422 seconds 140.53854666174843 MB/sec
> Timing 8 threads, retaining 64 Objects:
> 5.985 seconds 127.31829573934836 MB/sec
> Timing 8 threads, retaining 128 Objects:
> 6.406 seconds 118.95098345301281 MB/sec
> Timing 8 threads, retaining 256 Objects:
> 5.828 seconds 130.7481125600549 MB/sec
> Timing 8 threads, retaining 512 Objects:
> 5.61 seconds 135.82887700534758 MB/sec
> Timing 8 threads, retaining 1024 Objects:
> 5.593 seconds 136.24173073484712 MB/sec
> Timing 8 threads, retaining 2048 Objects:
> 5.625 seconds 135.46666666666667 MB/sec
> Timing 8 threads, retaining 4096 Objects:
> 5.625 seconds 135.46666666666667 MB/sec
> Timing 8 threads, retaining 8192 Objects:
> 5.625 seconds 135.46666666666667 MB/sec
> Timing 16 threads, retaining 64 Objects:
> 5.954 seconds 127.9811891165603 MB/sec
> Timing 16 threads, retaining 128 Objects:
> 5.625 seconds 135.46666666666667 MB/sec
> Timing 16 threads, retaining 256 Objects:
> 5.437 seconds 140.15081846606583 MB/sec
> Timing 16 threads, retaining 512 Objects:
> 5.438 seconds 140.1250459727841 MB/sec
> Timing 16 threads, retaining 1024 Objects:
> 5.719 seconds 133.24007693652734 MB/sec
> Timing 16 threads, retaining 2048 Objects:
> 5.953 seconds 128.00268772047707 MB/sec
> Timing 16 threads, retaining 4096 Objects:
> 5.422 seconds 140.53854666174843 MB/sec
> Timing 16 threads, retaining 8192 Objects:
> 5.484 seconds 138.94967177242887 MB/sec
> Timing 32 threads, retaining 64 Objects:
> 5.484 seconds 138.94967177242887 MB/sec
> Timing 32 threads, retaining 128 Objects:
> 5.563 seconds 136.97645155491642 MB/sec
> Timing 32 threads, retaining 256 Objects:
> 5.469 seconds 139.33077345035656 MB/sec
> Timing 32 threads, retaining 512 Objects:
> 5.422 seconds 140.53854666174843 MB/sec
> Timing 32 threads, retaining 1024 Objects:
> 5.422 seconds 140.53854666174843 MB/sec
> Timing 32 threads, retaining 2048 Objects:
> 5.406 seconds 140.9544950055494 MB/sec
> Timing 32 threads, retaining 4096 Objects:
> 5.391 seconds 141.34668892598776 MB/sec
> Timing 32 threads, retaining 8192 Objects:
> 5.563 seconds 136.97645155491642 MB/sec
> Total: 250.502 seconds
>
>
> RI heapsize 256M
> =====================
> Timing 50 million total object allocations
> Varying number of threads and number of objects retained
>
> Timing 1 threads, retaining 64 Objects:
> 0.922 seconds 826.4642082429501 MB/sec
> Timing 1 threads, retaining 128 Objects:
> 0.906 seconds 841.0596026490066 MB/sec
> Timing 1 threads, retaining 256 Objects:
> 0.938 seconds 812.3667377398721 MB/sec
> Timing 1 threads, retaining 512 Objects:
> 0.953 seconds 799.5802728226653 MB/sec
> Timing 1 threads, retaining 1024 Objects:
> 1.031 seconds 739.0882638215326 MB/sec
> Timing 1 threads, retaining 2048 Objects:
> 1.172 seconds 650.1706484641638 MB/sec
> Timing 1 threads, retaining 4096 Objects:
> 1.422 seconds 535.8649789029536 MB/sec
> Timing 1 threads, retaining 8192 Objects:
> 3.0 seconds 254.0 MB/sec
> Timing 2 threads, retaining 64 Objects:
> 1.047 seconds 727.7936962750716 MB/sec
> Timing 2 threads, retaining 128 Objects:
> 1.015 seconds 750.7389162561577 MB/sec
> Timing 2 threads, retaining 256 Objects:
> 1.031 seconds 739.0882638215326 MB/sec
> Timing 2 threads, retaining 512 Objects:
> 1.079 seconds 706.2094531974051 MB/sec
> Timing 2 threads, retaining 1024 Objects:
> 1.156 seconds 659.1695501730104 MB/sec
> Timing 2 threads, retaining 2048 Objects:
> 1.265 seconds 602.3715415019764 MB/sec
> Timing 2 threads, retaining 4096 Objects:
> 1.344 seconds 566.9642857142857 MB/sec
> Timing 2 threads, retaining 8192 Objects:
> 2.797 seconds 272.43475151948513 MB/sec
> Timing 4 threads, retaining 64 Objects:
> 1.047 seconds 727.7936962750716 MB/sec
> Timing 4 threads, retaining 128 Objects:
> 1.062 seconds 717.5141242937852 MB/sec
> Timing 4 threads, retaining 256 Objects:
> 1.156 seconds 659.1695501730104 MB/sec
> Timing 4 threads, retaining 512 Objects:
> 1.125 seconds 677.3333333333334 MB/sec
> Timing 4 threads, retaining 1024 Objects:
> 1.141 seconds 667.8352322524102 MB/sec
> Timing 4 threads, retaining 2048 Objects:
> 1.281 seconds 594.8477751756441 MB/sec
> Timing 4 threads, retaining 4096 Objects:
> 1.328 seconds 573.7951807228916 MB/sec
> Timing 4 threads, retaining 8192 Objects:
> 1.563 seconds 487.5239923224568 MB/sec
> Timing 8 threads, retaining 64 Objects:
> 1.187 seconds 641.9545071609098 MB/sec
> Timing 8 threads, retaining 128 Objects:
> 1.188 seconds 641.4141414141415 MB/sec
> Timing 8 threads, retaining 256 Objects:
> 1.156 seconds 659.1695501730104 MB/sec
> Timing 8 threads, retaining 512 Objects:
> 1.156 seconds 659.1695501730104 MB/sec
> Timing 8 threads, retaining 1024 Objects:
> 1.109 seconds 687.1055004508567 MB/sec
> Timing 8 threads, retaining 2048 Objects:
> 1.313 seconds 580.3503427265804 MB/sec
> Timing 8 threads, retaining 4096 Objects:
> 1.359 seconds 560.7064017660044 MB/sec
> Timing 8 threads, retaining 8192 Objects:
> 1.407 seconds 541.5778251599147 MB/sec
> Timing 16 threads, retaining 64 Objects:
> 1.343 seconds 567.3864482501862 MB/sec
> Timing 16 threads, retaining 128 Objects:
> 1.282 seconds 594.383775351014 MB/sec
> Timing 16 threads, retaining 256 Objects:
> 1.25 seconds 609.6 MB/sec
> Timing 16 threads, retaining 512 Objects:
> 1.203 seconds 633.4164588528678 MB/sec
> Timing 16 threads, retaining 1024 Objects:
> 1.219 seconds 625.1025430680886 MB/sec
> Timing 16 threads, retaining 2048 Objects:
> 1.171 seconds 650.7258753202391 MB/sec
> Timing 16 threads, retaining 4096 Objects:
> 1.297 seconds 587.5096376252892 MB/sec
> Timing 16 threads, retaining 8192 Objects:
> 1.344 seconds 566.9642857142857 MB/sec
> Timing 32 threads, retaining 64 Objects:
> 1.438 seconds 529.90264255911 MB/sec
> Timing 32 threads, retaining 128 Objects:
> 1.609 seconds 473.586078309509 MB/sec
> Timing 32 threads, retaining 256 Objects:
> 1.469 seconds 518.720217835262 MB/sec
> Timing 32 threads, retaining 512 Objects:
> 1.437 seconds 530.2713987473903 MB/sec
> Timing 32 threads, retaining 1024 Objects:
> 1.266 seconds 601.8957345971564 MB/sec
> Timing 32 threads, retaining 2048 Objects:
> 1.282 seconds 594.383775351014 MB/sec
> Timing 32 threads, retaining 4096 Objects:
> 1.344 seconds 566.9642857142857 MB/sec
> Timing 32 threads, retaining 8192 Objects:
> 1.312 seconds 580.7926829268292 MB/sec
> Total: 61.969 seconds
>
>
>
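
A note on reading the log above: every MB/sec figure appears to be a fixed volume of 762 MB divided by the elapsed seconds. That volume matches 50 million objects of 16 bytes each, truncated to whole MB, but the 16-byte object size is an assumption inferred from the numbers and is not stated in the log. The sketch below checks the relationship against a few entries copied from the log and compares the three totals; the names in it are illustrative only.

```python
# Assumption: 16 bytes per object, volume truncated to whole MB (2**20 bytes).
# Only the resulting 762 MB is actually implied by the log entries.
OBJECTS = 50_000_000
ASSUMED_OBJECT_BYTES = 16
TOTAL_MB = (OBJECTS * ASSUMED_OBJECT_BYTES) // (1024 * 1024)


def throughput(seconds):
    """MB/sec for one run that allocates TOTAL_MB in the given time."""
    return TOTAL_MB / seconds


# (seconds, reported MB/sec) pairs copied from the log.
samples = [
    (3.625, 210.20689655172413),  # gcgen, 1 thread, 64 objects retained
    (3.516, 216.7235494880546),   # gc4.1, 1 thread, 64 objects retained
    (0.922, 826.4642082429501),   # RI, 1 thread, 64 objects retained
    (6.25, 121.92),               # gcgen, 32 threads, 2048 objects retained
    (3.0, 254.0),                 # RI, 1 thread, 8192 objects retained
]
for seconds, reported in samples:
    assert abs(throughput(seconds) - reported) < 1e-6

# "Total" lines from the log, in seconds.
totals = {"gcgen": 252.845, "gc4.1": 250.502, "RI": 61.969}
slowdown = {name: t / totals["RI"] for name, t in totals.items()}

for name, factor in slowdown.items():
    print(f"{name}: {factor:.2f}x the RI total time")
```

On the totals, both Harmony collectors take roughly four times as long as the RI for the whole run.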