I have had the same experience that Patric describes: I tried to use a model
whose operators had hardware-specific attributes (cudnn_off in my case) and was
unable to use the model more generally. However, I also appreciate what Dick is
proposing, and I too see a need for hardware-specific “customization” of
operators that is scalable.

Instead, I propose that we have a way of providing hardware-specific operator 
attributes that is orthogonal to the MXNet (NNVM) operator registration 
abstraction. When we need to change an operator for a specific 
hardware/processor/accelerator we shouldn’t need to modify MXNet’s source code.

One possibility is to use the Accelerator API proposal
(https://cwiki.apache.org/confluence/display/MXNET/Bring+your+own+Accelerator)
and treat each processor (CPU, GPU:CUDA, GPU:CUDNN, GPU:TensorRT, etc.) as a
separate hardware backend. That would clearly demarcate what is a “pure”
operator definition and what are hardware-specific attributes for operators.
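To make that concrete, here is a rough sketch of what an orthogonal,
per-backend attribute registry could look like on the Python side. Everything
below (register_backend_attrs, attrs_for) is hypothetical, not an existing
MXNet API:

# Hypothetical side table: hardware-specific operator attributes keyed by
# (backend, op_name), kept outside the core NNVM operator registration.
_backend_op_attrs = {}

def register_backend_attrs(backend, op_name, **attrs):
    """Attach hardware-specific attributes to an operator for one backend."""
    _backend_op_attrs.setdefault((backend, op_name), {}).update(attrs)

def attrs_for(backend, op_name):
    """Return the attributes a given backend applies to an operator."""
    return _backend_op_attrs.get((backend, op_name), {})

# The "pure" operator definition stays untouched; each backend registers
# its own knobs without touching MXNet's source.
register_backend_attrs('GPU:CUDNN', 'Convolution', cudnn_tune=2, cudnn_off=False)
register_backend_attrs('GPU:TensorRT', 'Convolution', precision='fp16')

A model saved without these attributes would then load on any backend, each
backend consulting only its own entries.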

Sam

On Jun 4, 2019, at 7:29 PM, Zhao, Patric <[email protected]> wrote:

Thanks for the new proposal.

My concern with the current proposal is that embedding such backend-specific
info in the operator makes the script/code neither portable nor backward
compatible, and it also increases the complexity of usage.
Say a user sets backend-specific parameters in their script, such as conv
algo=Winograd, precision=fp16, layout=NHWC, etc.
That group of parameters may give the best performance on the hardware it was
tested on, but on different hardware it may cause performance degradation or
even fail to execute.
One example is the GitHub issue where the `layout` parameter caused an error:
https://github.com/apache/incubator-mxnet/issues/15079
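For illustration, the pattern behind that issue looks roughly like this
(a minimal sketch, not the exact reproducer from the issue):

import mxnet as mx

data = mx.sym.Variable('data')
# layout='NHWC' bakes a backend-specific choice into the model itself; it
# may run with cuDNN on a GPU but error out on a backend that only
# supports the default 'NCHW' layout.
conv = mx.sym.Convolution(data=data, num_filter=8, kernel=(3, 3),
                          layout='NHWC', name='conv')

Once such a symbol is serialized, the backend assumption travels with the
model to every machine it is deployed on.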

Thus, I think we need to remove this kind of context-specific operator
parameter, such as `cudnn_tune`, `cudnn_off` and `layout`, rather than adding
more parameters to operators.
I suggest hiding this kind of optimization and selection in the backend,
perhaps using subgraphs.
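For the flavor of what that could look like from the user's side, here is a
sketch assuming MXNet's existing subgraph hooks (e.g. get_backend_symbol in
the 1.x Symbol API):

import mxnet as mx

data = mx.sym.Variable('data')
# The operator itself carries no backend knobs; algo/layout/precision
# choices are made by a backend-owned subgraph pass at partition time.
conv = mx.sym.Convolution(data=data, num_filter=8, kernel=(3, 3))
partitioned = conv.get_backend_symbol('MKLDNN')

The same script stays portable: a different backend simply runs its own
partitioning pass, or none at all.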

Thanks,

--Patric


-----Original Message-----
From: Dick Carter <[email protected]>
Sent: Tuesday, June 4, 2019 8:21 AM
To: [email protected]<mailto:[email protected]>
Subject: Context-specific operator parameters

MXNet has a number of context-specific operator parameters: 'cudnn_tune',
'cudnn_off' and 'workspace' are parameters that control the behavior of
Convolution on gpu contexts with NVIDIA gpus. Even with these, there would be
benefits to having additional parameters, e.g. to set Convolution algos by
number, or to force the compute precision to float16. With the desire to
support multiple backends and a growing number of operators, it's time to ask
the question, "Is this scalable?"

I propose that, rather than adding a new parameter at the Python level for
each new backend-specific parameter 'knob', all context-specific parameters
be swept into a single dictionary, called e.g. 'ctx_params':

Convolution(..., ctx_params={'cudnn_tune': 2, 'cudnn_off': False,
'workspace': 2000}, ...)

I'll stop short of working out all the details to hopefully generate more
discussion.  Some open questions:

Do all backends share the same namespace, or do we have separate
'gpu_ctx_params', 'cpu_ctx_params', etc.?

Is there a clean extension to the general parameter parsing facility of dmlc
to handle this dictionary, and what form would these extension params take in
the backend, e.g. a Map<string,string>?
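One possible shape for this, sketched at the Python level (pack_ctx_params is
hypothetical, not an existing helper): the frontend flattens the dictionary to
strings before handing it to the backend, which then sees a uniform
Map<string,string> and parses only the keys it understands:

import json

def pack_ctx_params(ctx_params):
    # Hypothetical frontend-side packing: stringify every value so the
    # backend receives plain string-to-string pairs.
    return {k: json.dumps(v) for k, v in ctx_params.items()}

pack_ctx_params({'cudnn_tune': 2, 'cudnn_off': False, 'workspace': 2000})
# -> {'cudnn_tune': '2', 'cudnn_off': 'false', 'workspace': '2000'}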

And while this proposes to organize and consolidate these context-specific
parameters at the Python level, we'd still need to tolerate (and auto-generate)
documentation for these new parameters.

Other approaches welcome.
