marcoabreu commented on a change in pull request #18921:
URL: https://github.com/apache/incubator-mxnet/pull/18921#discussion_r471827712



##########
File path: 3rdparty/mshadow/CMakeLists.txt
##########
@@ -59,6 +59,9 @@ endif()
 if(USE_CUDNN)
   target_compile_definitions(mshadow INTERFACE MSHADOW_USE_CUDNN)
 endif()
+if(USE_CUTENSOR)
+  target_compile_definitions(mshadow INTERFACE MSHADOW_USE_CUTENSOR)
+endif()

Review comment:
       No, that doesn't really make a difference; we have to consider them 
either way.
   
    We now again have competing operator implementations chosen at compile 
time: the numpy GPU kernels vs. cutensor. From a user perspective, this adds 
yet another magic knob that improves performance but is well hidden. As a 
product, we should make mxnet smart enough to figure these things out at 
runtime: run with cutensor if it is available and fall back to our own numpy 
implementation if it is not. Adding even more compile-time branching sounds 
thoroughly user-unfriendly in my opinion.
   
    In the end, this might work fine for a benchmark scenario, or for somebody 
who knows exactly what is happening in every detail of mxnet, but the standard 
user should get good performance without having to know every single detail of 
every single operator and which build flags to choose. This is a more 
fundamental concern.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
