ptrendx opened a new pull request #15167: [WIP] Pointwise fusion for GPU
URL: https://github.com/apache/incubator-mxnet/pull/15167
 
 
   ## Description ##
   This PR enables fusion of pointwise ops for the GPU context, using runtime 
compilation (NVRTC). I will write and post a design doc in the near future.
   
   Work done by me and @Caenorst.
   
   **Important note**: this is **NOT** meant to compete with other long-term 
compilation strategies for MXNet (such as integration with TVM). It is 
intended as a short-term solution (and so, for example, does not tackle harder 
problems like fusion with convolutions/GEMMs) while those long-term solutions 
are in development.
   
   This is the first stage of this effort, focusing on pointwise ops that do 
not change the shape of the output. Future PRs will tackle broadcast ops and 
slices.
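
As a rough illustration of what fusing shape-preserving pointwise ops means, the sketch below composes a chain of elementwise ops into the source of a single CUDA kernel, which is the kind of string a runtime compiler such as NVRTC takes as input. It is not the code generator from this PR: the `OPS` table, the `fused_kernel_source` helper, and the kernel name are all hypothetical, and only `float` inputs are assumed.

```python
# Hypothetical illustration only: NOT the code generator used in this PR.
# Each entry maps an op name to a CUDA expression template over one input.
OPS = {
    "relu": "fmaxf({x}, 0.0f)",
    "exp": "expf({x})",
    "negative": "-({x})",
}


def fused_kernel_source(op_names, kernel_name="fused_pointwise"):
    """Return CUDA source for one kernel that applies op_names in order."""
    expr = "in[i]"
    for name in op_names:
        # Nest the previous expression inside the next op, so intermediate
        # results stay in registers instead of going through global memory.
        expr = OPS[name].format(x=expr)
    return (
        f'extern "C" __global__ void {kernel_name}'
        "(const float* in, float* out, int n) {\n"
        "  int i = blockIdx.x * blockDim.x + threadIdx.x;\n"
        f"  if (i < n) out[i] = {expr};\n"
        "}\n"
    )


src = fused_kernel_source(["exp", "negative", "relu"])
print(src)
```

Without fusion, each of the three ops would launch its own kernel and read and write a full tensor; the generated kernel reads the input once and writes the output once.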
   
   FYI @zheng-da @junrushao1994 @eric-haibin-lin @szha @KellenSunderland 
@nvchai 
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing 
distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a 
new build option with NCCL)
   - [ ] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments 
are documented. 
   - For new examples, README.md is added to explain what the example does, 
the source of the dataset, expected performance on test set and reference to 
the original paper if applicable
   - Check the API doc at 
http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [x] To the best of my knowledge, examples are either not affected by this 
change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - The fusion is controlled by the `MXNET_USE_FUSION` environment variable 
and is currently on by default for the GPU context (mostly to see how it 
behaves in the CI environment). I will probably change it to opt-in later in 
the review process. 
   - It supports both the forward and backward passes.
   - It currently does not support rebinding with different shapes (working on 
it).
   - It currently supports the Symbol and Module APIs; support for Gluon is WIP.
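
A minimal sketch of toggling the feature from Python follows. Only the variable name `MXNET_USE_FUSION` comes from this PR; the `"0"`/`"1"` encoding, the `set_fusion` helper, and the need to set the variable before MXNet is imported are assumptions.

```python
import os


def set_fusion(enabled: bool) -> str:
    """Set MXNET_USE_FUSION for this process (hypothetical helper).

    The "1"/"0" values are an assumption; the PR only names the variable.
    """
    value = "1" if enabled else "0"
    os.environ["MXNET_USE_FUSION"] = value
    return value


# Assumed usage: opt out of fusion before the library is loaded.
set_fusion(False)
# import mxnet as mx  # would follow here in a real script
```

The same effect can presumably be had by exporting the variable in the shell before launching the script.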
   

