I think an alternative is to define xpu at compile time and use xpu wherever we declare Tensor or Blob objects. For layers whose data must be stored in cpu memory, e.g., DataLayer, which loads data from disk into memory, we hard-code the declarations as Blob<cpu>.
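
To make the idea concrete, here is a minimal sketch. It is not our actual code: the cpu/gpu tag structs, the USE_GPU macro, the simplified Blob (including its template parameter order), and the two layer classes are all assumptions made up for illustration, and the Blob keeps its data in host memory for both devices.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Device tags (hypothetical stand-ins for the real device types).
struct cpu { static constexpr bool kDevCPU = true; };
struct gpu { static constexpr bool kDevCPU = false; };

// xpu is fixed once, at compile time. USE_GPU is an assumed macro name,
// e.g. passed as -DUSE_GPU on the compiler command line.
#ifdef USE_GPU
typedef gpu xpu;
#else
typedef cpu xpu;
#endif

// Simplified Blob with a device template parameter. A real version would
// allocate device memory when Device is gpu; this sketch uses host memory.
template <typename Device, typename Dtype = float>
class Blob {
 public:
  explicit Blob(std::size_t count) : data_(count, Dtype(0)) {
    assert(count > 0);
  }
  std::size_t count() const { return data_.size(); }
  static constexpr bool on_cpu() { return Device::kDevCPU; }

 private:
  std::vector<Dtype> data_;
};

// A compute layer follows whatever xpu was chosen at compile time.
class InnerProductLayer {
 public:
  explicit InnerProductLayer(std::size_t n) : data_(n), grad_(n) {}
  Blob<xpu> data_, grad_;
};

// A data-loading layer is hard-coded to cpu memory.
class DataLayer {
 public:
  explicit DataLayer(std::size_t n) : data_(n) {}
  Blob<cpu> data_;
};
```

With this layout the layer classes themselves are not templates, so only the declarations that name a device need to change.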
In this way, we only need to replace cpu with xpu in most of the code under the neuralnet folder (am I right?).

Issue 1: The major drawback is that users have to choose the running mode at compile time.

Issue 2: The minor drawback is that we cannot use both cpu and gpu for the computation in ComputeFeature and ComputeGradient. Actually, this is also a problem in other systems such as Caffe and CXXNET. They cannot run a workload on both cpu and gpu either; users have to choose a running mode before they start the program. "Caffe con Troll: Shallow Ideas to Speed Up Deep Learning" extends Caffe to support this mixed running mode. In some situations it may be useful, e.g., when you have only one gpu card but many cpus. Currently, I think it is not our major concern.

Hence, my suggestion is to simply replace cpu with xpu and define xpu at compile time. This effort will not be wasted even if we decide to solve issue 1 later.

If you have any other suggestions or concerns, please let us know.

regards,
wang wei

On Sat, Jun 27, 2015 at 9:03 AM, 陈海波 <[email protected]> wrote:

> Hi, wang wei~
> While trying to modify the GPU interfaces, I found some questions
> that should be discussed.
>
> My solution:
>
> xxx layer:
>
> ComputeFeature() {
>   Tensor<xpu, 2> data      // some places use the tensor class
>   blob<xpu,float> data_    // some places use the blob class; suppose I
>                            // have added a device template parameter
>                            // to the blob class.
> }
>
> ComputeGradient() {
>   // the same as the ComputeFeature function
> }
>
>
> Some disadvantages:
> All layer classes must be changed, and changing the code costs too
> much effort.
>
> e.g.
>
> template<typename xpu>
> class Param {
> }
>
> template<typename xpu>
> class XXXLayer {
>
>  protected:
>   Blob<float,xpu> data_;
>   shared_ptr<Param<xpu>> weight_, bias_;
>   ........
>
> }
>
> What do you think? If you have a good idea, please let me know.
> thanks~
>
