ptrendx opened a new pull request #15167: [WIP] Pointwise fusion for GPU
URL: https://github.com/apache/incubator-mxnet/pull/15167

## Description ##
This PR enables fusion of pointwise ops for GPU context, using runtime compilation (NVRTC). I will make and post a design doc in the near future. Work done by me and @Caenorst.

**Important note**: this is **NOT** meant to compete with other long-term compilation strategies for MXNet (like integration with TVM etc.). It is intended as a short-term solution while those long-term solutions are in development, and so it does not tackle harder problems such as fusion with convolutions/gemms.

This is the first stage of this effort, focusing on pointwise ops that do not change the shape of the output. Future PRs will tackle broadcast ops and slices.

FYI @zheng-da @junrushao1994 @eric-haibin-lin @szha @KellenSunderland @nvchai

## Checklist ##
### Essentials ###
Please feel free to remove inapplicable items for your PR.
- [ ] Changes are complete (i.e. I finished coding on this PR)
- [ ] All changes have test coverage:
  - Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
- [ ] Code is well-documented:
  - For user-facing API changes, API doc string has been updated.
  - For new C++ functions in header files, their functionalities and arguments are documented.
  - For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  - Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
- [x] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

### Changes ###
- [ ] Feature1, tests, (and when applicable, API doc)
- [ ] Feature2, tests, (and when applicable, API doc)

## Comments ##
- The fusion is controlled by the MXNET_USE_FUSION env variable and is currently on by default for GPU context (mostly to see how it will behave in the CI environment). I will probably change it to opt-in later in the review process.
- It supports both forward and backward pass.
- It currently does not support rebinding with different shapes; working on it.
- It currently supports the Symbol and Module APIs; support for Gluon is WIP.
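
For readers unfamiliar with the idea, here is a rough pure-Python toy model of what fusing shape-preserving pointwise ops means and how an environment switch could gate it. This is not the code in this PR: the op chain, `run_unfused`, `run_fused`, `fusion_enabled` and `run` are hypothetical names made up for illustration. It also assumes that setting `MXNET_USE_FUSION` to `0` turns fusion off; the description above only states that the variable controls fusion and that it is currently on by default.

```python
import os

# Toy pointwise ops: each maps one value to one value,
# so the output always has the same shape as the input.
def relu(x):
    return x if x > 0.0 else 0.0

def scale(x):
    return 2.0 * x

def shift(x):
    return x + 1.0

OPS = [relu, scale, shift]

def run_unfused(data, ops):
    """One full pass over the data per op, each producing an
    intermediate buffer (analogous to one kernel launch per op)."""
    passes = 0
    for op in ops:
        data = [op(x) for x in data]
        passes += 1
    return data, passes

def run_fused(data, ops):
    """A single pass: the whole op chain is applied to each element,
    so no intermediate buffers are materialized."""
    def chain(x):
        for op in ops:
            x = op(x)
        return x
    return [chain(x) for x in data], 1

def fusion_enabled():
    # Assumption: "0" disables fusion; an unset variable means "on",
    # mirroring the on-by-default behaviour described in this PR.
    return os.environ.get("MXNET_USE_FUSION", "1") != "0"

def run(data, ops=OPS):
    """Dispatch to the fused or unfused path based on the env variable."""
    if fusion_enabled():
        return run_fused(data, ops)
    return run_unfused(data, ops)
```

Both paths return the same values; only the number of passes over the data differs. In the real feature the fused chain would be generated as GPU code and compiled at runtime with NVRTC, which this toy does not attempt to model.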
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
