Looking at NDArray, it seems to me that the code would be cleaner if
we used dynamic dispatch and kept the different implementations, such as
MKLDNN, in different compilation units instead of having the ifdef code
intermingled. Is there any reason I'm not aware of not to do it
that way?
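
A minimal sketch of the kind of structure meant here, not taken from the
actual code base. All names (TensorBackend, DefaultBackend,
MkldnnLikeBackend, MakeBackend) are hypothetical, and the "MKLDNN-like"
class is a stand-in that does not call any real MKLDNN API:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical interface: callers only ever see this type, so no ifdefs
// are needed at the call sites.
class TensorBackend {
 public:
  virtual ~TensorBackend() = default;
  virtual std::string Name() const = 0;
  virtual std::vector<float> Add(const std::vector<float>& a,
                                 const std::vector<float>& b) const = 0;
};

// Default implementation, always compiled.
class DefaultBackend : public TensorBackend {
 public:
  std::string Name() const override { return "default"; }
  std::vector<float> Add(const std::vector<float>& a,
                         const std::vector<float>& b) const override {
    assert(a.size() == b.size());
    std::vector<float> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] + b[i];
    return out;
  }
};

// Stand-in for a library-specific implementation. In a real tree this
// class would live in its own compilation unit that the build system
// includes only when the library is enabled. The plain loop below is a
// placeholder for the library calls.
class MkldnnLikeBackend : public TensorBackend {
 public:
  std::string Name() const override { return "mkldnn-like"; }
  std::vector<float> Add(const std::vector<float>& a,
                         const std::vector<float>& b) const override {
    assert(a.size() == b.size());
    std::vector<float> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] + b[i];
    return out;
  }
};

// The one place where a backend is selected. Any build-time or run-time
// switch would be confined to this factory.
std::unique_ptr<TensorBackend> MakeBackend(bool use_accelerated) {
  if (use_accelerated) {
    return std::unique_ptr<TensorBackend>(new MkldnnLikeBackend());
  }
  return std::unique_ptr<TensorBackend>(new DefaultBackend());
}
```

With a layout along these lines, adding another library would mean adding
one more subclass in its own file and one more branch in the factory,
at the cost of a virtual call per operation.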

What happens when we integrate the next tensor library from the next


