I have not run any experiments, but I would be very surprised if it were not at least 50%. You only have a few very small functions for all the infers, whereas `MSHADOW_TYPE_SWITCH` needs at least 8 variants. On the NumPy side especially, there are functions that nest those switches (so 64+ variants), not to mention that those functions tend to be larger.
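To illustrate the multiplication, here is a minimal C++14 sketch. It does not use the real `MSHADOW_TYPE_SWITCH` macro: `TypeFlag`, `type_switch`, and the kernel functions are hypothetical stand-ins, and the 8 listed types are only an assumed example set. It just shows that one switch over 8 types instantiates the body 8 times, and nesting two of them instantiates it 8 x 8 = 64 times.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <typeindex>
#include <typeinfo>
#include <utility>

// Hypothetical dtype flag with 8 entries (not mshadow's actual enum).
enum class TypeFlag { kFloat32, kFloat64, kInt8, kUint8, kInt16, kInt32, kInt64, kBool };
constexpr int kNumTypes = 8;

// Simplified stand-in for a type switch: every case calls the generic body
// with a different type, so the compiler instantiates the body once per case,
// whether or not that case is ever taken at runtime.
template <typename F>
void type_switch(TypeFlag flag, F&& body) {
  switch (flag) {
    case TypeFlag::kFloat32: body(float{}); break;
    case TypeFlag::kFloat64: body(double{}); break;
    case TypeFlag::kInt8:    body(std::int8_t{}); break;
    case TypeFlag::kUint8:   body(std::uint8_t{}); break;
    case TypeFlag::kInt16:   body(std::int16_t{}); break;
    case TypeFlag::kInt32:   body(std::int32_t{}); break;
    case TypeFlag::kInt64:   body(std::int64_t{}); break;
    case TypeFlag::kBool:    body(bool{}); break;
  }
}

using TypePair = std::pair<std::type_index, std::type_index>;

// Hypothetical kernels that only record which instantiation was called.
template <typename A>
void unary_kernel(std::set<std::type_index>& seen) {
  seen.insert(std::type_index(typeid(A)));
}

template <typename A, typename B>
void binary_kernel(std::set<TypePair>& seen) {
  seen.insert(std::make_pair(std::type_index(typeid(A)), std::type_index(typeid(B))));
}

// One switch: walk all flags and count the distinct kernel variants reached.
std::size_t count_single_switch_variants() {
  std::set<std::type_index> seen;
  for (int i = 0; i < kNumTypes; ++i) {
    type_switch(static_cast<TypeFlag>(i), [&](auto a) {
      using A = decltype(a);
      unary_kernel<A>(seen);
    });
  }
  return seen.size();
}

// Two nested switches: the inner body is instantiated for every (A, B) pair.
std::size_t count_nested_switch_variants() {
  std::set<TypePair> seen;
  for (int i = 0; i < kNumTypes; ++i) {
    for (int j = 0; j < kNumTypes; ++j) {
      type_switch(static_cast<TypeFlag>(i), [&](auto a) {
        using A = decltype(a);
        type_switch(static_cast<TypeFlag>(j), [&](auto b) {
          using B = decltype(b);
          binary_kernel<A, B>(seen);
        });
      });
    }
  }
  return seen.size();
}
```

The counts here are gathered at runtime only to make the numbers visible. The compiler has to emit all of these instantiations regardless of which type combination is actually used.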