> Anyway, why are i32 integers the fastest when performing calculations?
Some CPUs have no 64-bit registers or instructions at all, for example 32-bit ARM cores and many embedded chips. On those, an int64 addition takes at least two operations: a plain addition of the low halves and an add-with-carry of the high halves. Even on a CPU with 64-bit support, a full multiplication produces a 128-bit result, which may not be a native type. And finally, even when the 64-bit type is fully supported, you have to consider memory bandwidth and cache size: twice as many int32 values as int64 values fit in the cache. For types smaller than the native word size, like int16 or int8, there may exist CPUs that first have to extend the value to the native word size (32 or 64 bit) and then do the math. I think that is not true for x86 CPUs. For float types -- well, when there is an FPU, add and mul can be fast too; division is generally not that fast and has noticeable latency.
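
To make the "two operations" point concrete, here is a rough sketch in Rust of what a 32-bit-only CPU effectively has to do for a 64-bit addition, plus the widening multiply mentioned above. The function names `add64_via_32` and `mul_wide` are made up for illustration, and a real compiler emits its own instruction sequence, so take it as a model of the idea rather than the actual generated code.

```rust
/// Add two 64-bit values using only 32-bit arithmetic, mimicking the
/// "add, then add-with-carry" pair a 32-bit CPU would use.
/// (Hypothetical helper for illustration.)
fn add64_via_32(a: u64, b: u64) -> u64 {
    // Split both operands into low and high 32-bit halves.
    let (a_lo, a_hi) = (a as u32, (a >> 32) as u32);
    let (b_lo, b_hi) = (b as u32, (b >> 32) as u32);

    // Operation 1: plain addition of the low halves, remembering the carry.
    let (lo, carry) = a_lo.overflowing_add(b_lo);

    // Operation 2: addition of the high halves plus the carry.
    let hi = a_hi.wrapping_add(b_hi).wrapping_add(carry as u32);

    ((hi as u64) << 32) | (lo as u64)
}

/// Full 64 x 64 -> 128-bit product, returned as (high, low) 64-bit halves.
/// (Hypothetical helper for illustration.)
fn mul_wide(a: u64, b: u64) -> (u64, u64) {
    let p = (a as u128) * (b as u128);
    ((p >> 64) as u64, p as u64)
}

fn main() {
    let (a, b) = (0xFFFF_FFFF_u64, 1_u64);
    // The emulated addition agrees with the native wrapping 64-bit addition.
    assert_eq!(add64_via_32(a, b), a.wrapping_add(b));

    let (hi, lo) = mul_wide(u64::MAX, u64::MAX);
    println!("high = {hi:#x}, low = {lo:#x}");

    // Assuming a 64-byte cache line (common, but not universal):
    let line = 64;
    println!(
        "i32 per line: {}, i64 per line: {}",
        line / std::mem::size_of::<i32>(),
        line / std::mem::size_of::<i64>()
    );
}
```

The last two lines of `main` just restate the cache argument in numbers: under the assumed line size, twice as many `i32` values as `i64` values fit in each line.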
