ptrendx opened a new pull request #18113:
URL: https://github.com/apache/incubator-mxnet/pull/18113


   * Vectorized loads for binary elemwise kernel
   
   * More generalization
   
   * Add backwardusenone
   
   * Remove the unused _backward_add op
   
   * Add vectorized backwardusein
   
   * Extending vectorization to more binary ops, binary ops with scalar and
   unary ops
   
   * Handling ElementwiseSum
   
   * Get rid of half2 in mshadow
   
   * Remove backward_elemwiseaddex
   
   * Revert "Remove the unused _backward_add op"
   
   This reverts commit f86da86f809c8cbad07db76a3554f23890fe05a3.
   
   * Revert "Remove backward_elemwiseaddex"
   
   This reverts commit 7729114caf6a1718c08ce1f35529d2267057d515.
   
   * Add back the backward_add since C++ test relies on it
   
   * Test bcast implementations
   
   * First version of vectorized bcast
   
   * Adding single side vectorized bcast kernel
   
   * Removing debug prints
   
   * Actually run the single side kernel
   
   * Move the default implementation of bcast to the vectorized one
   
   * Limit the new implementation to GPU only
   
   * Enabling vectorization when broadcast does not actually do broadcast
   
   * Cleaning
   
   * Cleaning part 2
   
   * Fix for numpy ops using stuff from broadcast
   
   * Fix
   
   * Fix lint
   
   * Try to debug pinv numpy test
   
   * Fix
   
   * Fix the vectorized broadcast implementation for misaligned input
   pointers
   
   * Added tests
   
   * Added docs to cuda_vectorization.cuh
   
   * Another fix for broadcast and fix INT64 compilation
   
   * Optimize for aligned=true
   
   * 1 more addition to test
   
   * Reverting the change to Numpy op test
   
   * Trying mcmodel=medium to fix the failure in CMake static build
   
   * Revert "Trying mcmodel=medium to fix the failure in CMake static build"
   
   This reverts commit 1af684c507dd5b2c7ab7ffe89d21799320e3d9c6.
   
   * Limiting the PR to just elementwise ops
   
   @ciyongch 



