Certainly interesting. We'd be interested in incorporating some ideas from your experiments if they translate over.
One thing to caution: you showed no numbers for space utilization, miss speed, whether your map supports iteration, etc. It is possible to trade one of these off against the others, so it is not immediately obvious that your techniques would apply to the builtin map. For instance, if you're using 10x the space to get a 10% speedup, we probably wouldn't want that for the general map. (Not saying that's what you did, just pointing out that raw speed is not the only consideration.)

On Thursday, June 19, 2025 at 4:11:28 AM UTC-7 christoph...@gmail.com wrote:
> Hello,
>
> Trying to implement a fast cache, I inadvertently entered a rabbit hole
> that led me to implement my own map. In the process I tried to make it
> faster than the Go map, just to see whether it is possible. I worked on it
> for weeks, trying out various architectures and methods.
>
> On my MacBook Air M2, I get the following benchmarks for a Get operation.
> The numbers are the number of items inserted in the table. My keys are
> 8-byte-long strings.
>
> goos: darwin
> goarch: arm64
> pkg: fastCache/map
> cpu: Apple M2
>                        │ dirstr12/stats_arm64.txt │     puremapstr/stats_arm64.txt      │
>                        │          sec/op          │    sec/op      vs base              │
> Cache2Hit/_______1-8               6.151n ±  6%     7.087n ±  1%   +15.22% (p=0.002 n=6)
> Cache2Hit/______10-8               8.491n ±  0%     8.156n ± 29%         ~ (p=0.394 n=6)
> Cache2Hit/_____100-8               8.141n ±  7%    14.185n ± 13%   +74.24% (p=0.002 n=6)
> Cache2Hit/____1000-8               8.252n ±  3%    10.635n ± 39%   +28.89% (p=0.002 n=6)
> Cache2Hit/___10000-8               10.45n ±  2%     20.99n ±  4%  +100.81% (p=0.002 n=6)
> Cache2Hit/__100000-8               12.16n ±  1%     19.11n ± 10%   +57.05% (p=0.002 n=6)
> Cache2Hit/_1000000-8               42.28n ±  2%     47.90n ±  2%   +13.29% (p=0.002 n=6)
> Cache2Hit/10000000-8               56.38n ± 12%     61.82n ±  6%         ~ (p=0.065 n=6)
> geomean                            13.44n           17.86n         +32.91%
>
> On my amd64 i5 11th gen I get the following benchmarks.
>
> goos: linux
> goarch: amd64
> pkg: fastCache/map
> cpu: 11th Gen Intel(R) Core(TM) i5-11400 @ 2.60GHz
>                         │ dirstr12/stats_amd64.txt │     puremapstr/stats_amd64.txt     │
>                         │          sec/op          │    sec/op     vs base              │
> Cache2Hit/_______1-12               9.207n ±  1%     7.506n ±  3%  -18.48% (p=0.002 n=6)
> Cache2Hit/______10-12               9.223n ±  0%     8.806n ±  6%        ~ (p=0.058 n=6)
> Cache2Hit/_____100-12               9.279n ±  2%    10.175n ±  3%   +9.66% (p=0.002 n=6)
> Cache2Hit/____1000-12               10.45n ±  2%     11.29n ±  3%   +8.04% (p=0.002 n=6)
> Cache2Hit/___10000-12               16.00n ±  2%     17.21n ±  5%   +7.59% (p=0.002 n=6)
> Cache2Hit/__100000-12               22.20n ± 17%     24.73n ± 22%  +11.42% (p=0.026 n=6)
> Cache2Hit/_1000000-12               87.75n ±  2%     91.05n ±  5%   +3.76% (p=0.009 n=6)
> Cache2Hit/10000000-12               104.2n ±  2%     105.6n ±  5%        ~ (p=0.558 n=6)
> geomean                             20.11n           20.49n         +1.90%
>
> On amd64 the performance is on par with the Go map, but the Go map uses
> inlined SIMD instructions, which I don't use because I don't have access
> to them in pure Go. I use xxh3 right out of the box for the hash function;
> the differences are not due to the hash function.
>
> If there is interest in this, please let me know.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group. To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com. To view this discussion visit https://groups.google.com/d/msgid/golang-nuts/66541eae-b5c1-46e0-a843-1e972e7e632bn%40googlegroups.com.