Yes, currently the structure we have is:
- `operator_name.cc`, which contains the operator definition (+ all the infershape/type etc.) and `FCompute<cpu>`
- `operator_name.cu`, which contains just `FCompute<gpu>`

We should change that to something like:
- `src/operator/operator_name.cc`, which contains the device-independent operator definition
- `src/operator_impl/cpu/operator_name.cc`, which contains just `FCompute<cpu>`
- `src/operator_impl/cuda/operator_name.cu`, which contains just `FCompute<gpu>`

This would make it possible for a subgraph backend to replace whatever it needs, as all the operator definitions would still exist. And I agree, together with the external ops functionality we could make it so that `libmxnet.so` contains just the operator definitions, while separate `.so` files would contain the actual implementations for different platforms.
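
To make the proposed split concrete, here is a rough sketch of how a definition and its device kernels could be registered separately. Everything in it is hypothetical and for illustration only: `Registry`, `OpDef`, `Device`, `Tensor`, the `relu` example and the `Register*` functions are stand-ins I made up, not MXNet's or NNVM's actual registration API. The comments only indicate which file each piece might live in under the layout above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins, not MXNet's real types.
enum class Device { kCPU, kCUDA };
using Tensor = std::vector<float>;
using FCompute = std::function<void(const Tensor&, Tensor*)>;

// Device-independent part of an operator plus whatever kernels got attached.
struct OpDef {
  std::string name;
  int num_inputs = 0;
  std::map<Device, FCompute> kernels;
};

class Registry {
 public:
  static Registry& Get() {
    static Registry instance;
    return instance;
  }
  // Registers (or updates) the device-independent definition.
  OpDef& Define(const std::string& name, int num_inputs) {
    OpDef& op = ops_[name];
    op.name = name;
    op.num_inputs = num_inputs;
    return op;
  }
  // Attaches a kernel to an already defined operator; throws
  // std::out_of_range if the definition does not exist.
  void AttachKernel(const std::string& name, Device dev, FCompute fn) {
    ops_.at(name).kernels[dev] = std::move(fn);
  }
  const OpDef& Find(const std::string& name) const { return ops_.at(name); }

 private:
  std::map<std::string, OpDef> ops_;
};

// Would live in src/operator/relu.cc: definition only, no kernels.
void RegisterReluDefinition() { Registry::Get().Define("relu", 1); }

// Would live in src/operator_impl/cpu/relu.cc: just the CPU kernel.
void RegisterReluCpu() {
  Registry::Get().AttachKernel(
      "relu", Device::kCPU, [](const Tensor& in, Tensor* out) {
        assert(out != nullptr);
        out->resize(in.size());
        for (std::size_t i = 0; i < in.size(); ++i) {
          (*out)[i] = in[i] > 0.0f ? in[i] : 0.0f;
        }
      });
}

// A backend can check whether an implementation exists for its device.
bool HasKernel(const std::string& name, Device dev) {
  return Registry::Get().Find(name).kernels.count(dev) > 0;
}
```

In this sketch the definition is usable on its own (a subgraph backend could look up `relu` and supply its own implementation), and the CPU kernel only shows up once the code from the `operator_impl/cpu` side has been loaded. A CUDA kernel would be attached the same way from a `.cu` file.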