When I benchmarked C/C++ code in the past, I noticed that lookup operations on vectors were extremely fast, so fast that I wondered whether the code had run at all. Each of those tests performed only one or two simple operations in a loop, similar to what you are doing. I think it has to do with how efficiently the CPU cache works when a contiguous structure like this is accessed in a tight loop.
In more realistic code, where the lookups are mixed with other work and other memory accesses, your results could be different from what this benchmark-style code shows.
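If you want to check whether caching is what you are seeing, here is a rough sketch of how I would test it in C++. It times the same vector lookups twice: once with indices in order and once with the same indices shuffled. All the names here (`sum_lookups`, `sequential_indices`, `shuffled_indices`, `time_lookups_ms`) are made up for this example, and it is only an illustration, not a rigorous benchmark. The sum is returned so the compiler cannot throw the loop away as dead code.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Sums data[i] for every index i in idx. Returning the sum keeps the
// result observable, so the loop is not removed as unused work.
std::uint64_t sum_lookups(const std::vector<std::uint32_t>& data,
                          const std::vector<std::size_t>& idx) {
    std::uint64_t sum = 0;
    for (std::size_t i : idx) {
        sum += data[i];
    }
    return sum;
}

// Indices 0, 1, ..., n-1 in order: neighbouring elements are read
// one after another.
std::vector<std::size_t> sequential_indices(std::size_t n) {
    std::vector<std::size_t> idx(n);
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    return idx;
}

// The same indices in a shuffled order (fixed seed, so runs are
// repeatable): reads jump around the vector.
std::vector<std::size_t> shuffled_indices(std::size_t n, unsigned seed) {
    std::vector<std::size_t> idx = sequential_indices(n);
    std::mt19937 rng(seed);
    std::shuffle(idx.begin(), idx.end(), rng);
    return idx;
}

// Times one pass of sum_lookups in milliseconds and hands the sum
// back through sum_out.
double time_lookups_ms(const std::vector<std::uint32_t>& data,
                       const std::vector<std::size_t>& idx,
                       std::uint64_t& sum_out) {
    const auto start = std::chrono::steady_clock::now();
    sum_out = sum_lookups(data, idx);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

To try it, call `time_lookups_ms` from `main` with a vector that is much larger than your CPU cache (tens of millions of elements, say), once with `sequential_indices` and once with `shuffled_indices`, and compare the two times. Both passes do the same number of lookups and produce the same sum, so any gap comes from the access order. I would expect the shuffled pass to be slower on large vectors, but I have not measured it on your setup, so treat that as a guess until you run it.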
