bgawrych opened a new pull request #20745: URL: https://github.com/apache/incubator-mxnet/pull/20745
## Description ##

This change optimizes the `take` operator with a new CPU kernel that performs fewer memory operations (I/O), especially when the selected axis is in the middle of the shape.

Benchmark on CLX-8280 (28 cores)

Script:

```
import mxnet
import mxnet.gluon.nn as nn
import mxnet.numpy as np
import time

dims = [128, 512, 1024, 4096]
print("shape;axis;time")
# ndim == 0 benchmarks 2-D inputs; ndim == 1 adds a leading dimension of 32 (3-D inputs)
for ndim in range(2):
    for dim1 in dims:
        for dim2 in dims:
            shape = (dim1, dim2) if ndim == 0 else (32, dim1, dim2)
            a = np.random.uniform(-1.0, 1.0, shape).astype(np.float32)
            # Benchmark every axis of the input in turn
            for axis in range(2 + ndim):
                # Pick half as many random indices as the axis is long
                indices = np.random.uniform(0, shape[axis], (shape[axis]//2,)).astype(np.int32)
                tic = time.time()
                for i in range(200):
                    out = np.take(a, indices, axis=axis)
                    # Block until the asynchronous computation has finished
                    out.wait_to_read()
                toc = time.time()
                # Total wall-clock time for 200 calls, in seconds
                print(f"{shape};{axis};{toc-tic}")
```

## Checklist ##

### Essentials ###

- [ ] PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
- [ ] Changes are complete (i.e. I finished coding on this PR)
- [ ] All changes have test coverage
- [ ] Code is well-documented

### Changes ###

- [ ] Feature1, tests, (and when applicable, API doc)
- [ ] Feature2, tests, (and when applicable, API doc)

## Comments ##

- If this change is a backward incompatible change, why must this change be made.
- Interesting edge cases to note here
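To illustrate why the position of the axis matters for the `take` operator discussed in the Description, here is a minimal pure-Python sketch of `take` on a row-major flat buffer. It is not the kernel added in this PR; the function name `take_flat` and the outer/inner decomposition are assumptions made only for illustration. It shows that each selected index along the axis maps to one contiguous block of `inner` elements, so a middle axis can be handled by copying whole blocks rather than individual elements.

```python
def take_flat(data, shape, indices, axis):
    """Illustrative take() on a row-major flat list (hypothetical helper).

    Returns (flat_output, output_shape). Not the MXNet implementation.
    """
    shape = tuple(shape)

    # Number of independent slices before the axis
    outer = 1
    for d in shape[:axis]:
        outer *= d

    # Number of contiguous elements after the axis
    inner = 1
    for d in shape[axis + 1:]:
        inner *= d

    axis_len = shape[axis]
    out = []
    for o in range(outer):
        base = o * axis_len * inner
        for idx in indices:
            if not 0 <= idx < axis_len:
                raise IndexError(f"index {idx} out of range for axis of length {axis_len}")
            start = base + idx * inner
            # One contiguous block copy per (outer slice, index) pair
            out.extend(data[start:start + inner])

    out_shape = shape[:axis] + (len(indices),) + shape[axis + 1:]
    return out, out_shape


# Usage: take indices [2, 0] along the middle axis of a (2, 3, 2) array
flat, new_shape = take_flat(list(range(12)), (2, 3, 2), [2, 0], axis=1)
```

With this layout, the last axis is the worst case for block copies (`inner == 1`), while leading and middle axes copy `inner` contiguous elements at a time.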
