A couple of notes:
* Your stack size of 10_000_000 is very big. That only works with global
arrays, as those are statically allocated when the program starts. If the
array is declared inside a function, you will almost certainly overflow the
stack. For example, a stack of int takes 80_000_000 bytes = 80MB (with 8-byte
ints), but the default max stack size is about 8MB on most machines. If this
is used within a function, you will have to allocate it on the heap with a
ref array, or just use a seq.
* In terms of speed, if the memory behind the seq and the memory behind the
array are both hot in cache, there is no difference. Data structures on the
stack are almost always hot in cache within a function.
* The `base_addr_offset + index * size` computation needed to address each
element of a seq/array doesn't matter in general, for 2 reasons:
  * The size of an object is often a power of 2 (due to padding and
    alignment), and x86 can do offset + index * pow2 addressing in a single
    instruction without a multiplication, via something called SIB addressing
    (Scale-Index-Base byte), for types of size 1, 2, 4 or 8:
    [https://wiki.osdev.org/X86-64_Instruction_Encoding#SIB](https://wiki.osdev.org/X86-64_Instruction_Encoding#SIB)
  * It is very likely that the bottleneck is either memory or branch
    prediction, and that your CPU has plenty of free time to do those
    computations while waiting for data from the L1/L2 cache.
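
To make the first note concrete, here is a minimal Nim sketch of the three placements: a global array, a heap-allocated ref array, and a seq. The names `StackCapacity`, `BigArray`, `globalStack` and `useHeapStorage` are hypothetical, chosen only for illustration, and the sizes assume a 64-bit target where `int` is 8 bytes.

```nim
const StackCapacity = 10_000_000   # capacity from the question; name is hypothetical

type
  BigArray = array[StackCapacity, int]   # ~80 MB when int is 8 bytes

# Global array: storage is reserved statically, not on the call stack.
var globalStack: BigArray

proc useHeapStorage(): int =
  # A local `var buf: BigArray` here would need ~80 MB of stack space
  # and would most likely overflow it.
  var viaRef: ref BigArray
  new(viaRef)                               # heap-allocated array
  var viaSeq = newSeq[int](StackCapacity)   # heap-allocated, zero-initialized
  viaRef[][StackCapacity - 1] = 1           # explicit dereference, then index
  viaSeq[StackCapacity - 1] = 2
  result = viaRef[][StackCapacity - 1] + viaSeq[StackCapacity - 1]

globalStack[0] = 42   # touching the global is fine at any call depth
echo useHeapStorage()
```

Unless you need the fixed-size array type specifically, the seq is probably the simpler of the two heap options.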