DickJC123 commented on a change in pull request #7347: Tensorcore conv deconv 
URL: https://github.com/apache/incubator-mxnet/pull/7347#discussion_r132093608

 File path: src/operator/cudnn_algoreg-inl.h
 @@ -40,12 +65,17 @@ class CuDNNAlgoReg {
     oss << "cudnn_data_type=" << cudnn_data_type << ";";
     oss << "cudnn_forward_compute_type=" << cudnn_forward_compute_type << ";";
    oss << "cudnn_backward_compute_type=" << cudnn_backward_compute_type << ";";
+    // A system could be heterogeneous and thus have different algo choices for
+    // different device ids.  'device_id' could possibly be replaced with gpu compute
+    // capability, but identical GPUs could technically have different clock settings.
+    oss << "device_id=" << device_id << ";";
 Review comment:
   I'll update the PR tomorrow by substituting compute capability.  This will 
ensure proper operation for a workstation with both a PASCAL and a VOLTA brick, 
yet will improve the algo selection speed for an 8-way homogeneous system.  
Regarding key compaction, I'll point out that all key look-ups are performed 
during the graph construction phase, never during inference or training.  Also, 
the key look-up times are probably dwarfed by any Find() calls that are run 
during this same phase.  Not sure how the times compare to the Get() calls 
performed when auto-tuning is turned off.
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:

With regards,
Apache Git Services
