marcoabreu commented on a change in pull request #18921:
URL: https://github.com/apache/incubator-mxnet/pull/18921#discussion_r471827712
##########
File path: 3rdparty/mshadow/CMakeLists.txt
##########
@@ -59,6 +59,9 @@ endif()
if(USE_CUDNN)
target_compile_definitions(mshadow INTERFACE MSHADOW_USE_CUDNN)
endif()
+if(USE_CUTENSOR)
+ target_compile_definitions(mshadow INTERFACE MSHADOW_USE_CUTENSOR)
+endif()
Review comment:
No, that doesn't really make a difference; we have to consider them
either way.
We once again have competing operator implementations that are chosen at
compile time: numpyGpu vs. cutensor. From a user's perspective, this creates
another situation where we are adding yet another magic knob that improves
performance but is well hidden. As a product, we should make MXNet smart
enough to figure these things out at runtime: run with cuTENSOR if it is
available and fall back to the numpy implementation (our own) if it is not.
Adding even more compile-time branching is thoroughly user-unfriendly, in my
opinion.
In the end, this might work fine for a benchmark scenario, or for somebody
who knows exactly what is happening in every detail of MXNet, but the
average user should get good performance without having to know every single
detail of every single operator and which build flags to choose. This is a
more fundamental concern.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]