[1/2] incubator-singa git commit: SINGA-131 Implement and optimize hybrid training using both CPU and GPU

wangsh Wed, 06 Apr 2016 05:29:09 -0700

Repository: incubator-singa
Updated Branches:
  refs/heads/master c97b970dc -> 040cbb2e1



SINGA-131 Implement and optimize hybrid training using both CPU and GPU

Allow users to set the batchsize of each StoreInputLayer instance manually.
We can then assign different workloads to GPU and CPU workers that have
different input layers, e.g.,
```
 batchsize: 128
 batchsize: 16
 ```
 If the first worker is GPU and the second worker is CPU, then the
 above setting would assign 128 images per mini-batch to the GPU worker
 and assign 16 images to the CPU worker.

 Internally, StoreInputLayer gets its batchsize based on its partition
 ID.
 Currently, this works for the MNIST example, which has no Conv layers.
 For Conv layers, the GPU and CPU implementations have different
 layer names, so a single net config cannot be used.
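
The selection rule described above can be sketched as a standalone C++ function. This is a minimal illustration only: `SelectBatchsize` and its parameters are hypothetical names, not part of SINGA, and the explicit range checks are additions for this sketch (the commit itself only checks that the batchsize list is non-empty).

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not SINGA API) that mirrors how a partitioned input
// layer could pick its per-worker batchsize from a repeated config field.
int SelectBatchsize(const std::vector<int>& batchsize, int partition_dim,
                    int num_partitions, int partition_id) {
  if (batchsize.empty()) {
    throw std::invalid_argument("at least one batchsize value is required");
  }
  // Not partitioned along the batch dimension: use the first value as-is.
  if (partition_dim != 0) {
    return batchsize[0];
  }
  // A single value means equal partition across all workers.
  if (batchsize.size() == 1) {
    if (num_partitions <= 0) {
      throw std::invalid_argument("num_partitions must be positive");
    }
    return batchsize[0] / num_partitions;
  }
  // Several values mean manual partition: one entry per partition ID.
  // Assumption for this sketch: an out-of-range ID is reported as an error.
  if (partition_id < 0 ||
      static_cast<std::size_t>(partition_id) >= batchsize.size()) {
    throw std::out_of_range("no batchsize configured for this partition ID");
  }
  return batchsize[partition_id];
}
```

With the example setting, `SelectBatchsize({128, 16}, 0, 2, 0)` gives the first worker 128 instances and `SelectBatchsize({128, 16}, 0, 2, 1)` gives the second worker 16, while a single value such as `{64}` is split evenly into 32 per worker.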


Project: http://git-wip-us.apache.org/repos/asf/incubator-singa/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-singa/commit/a91e82f3
Tree: http://git-wip-us.apache.org/repos/asf/incubator-singa/tree/a91e82f3
Diff: http://git-wip-us.apache.org/repos/asf/incubator-singa/diff/a91e82f3

Branch: refs/heads/master
Commit: a91e82f3c8771980bff916511a7c750f5f6d039d
Parents: c97b970
Author: Wei Wang <[email protected]>
Authored: Mon Apr 4 13:39:19 2016 +0800
Committer: Wei Wang <[email protected]>
Committed: Wed Apr 6 17:51:14 2016 +0800

----------------------------------------------------------------------
 src/neuralnet/input_layer/store.cc | 11 ++++++++---
 src/proto/job.proto                |  2 +-
 2 files changed, 9 insertions(+), 4 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-singa/blob/a91e82f3/src/neuralnet/input_layer/store.cc
----------------------------------------------------------------------
diff --git a/src/neuralnet/input_layer/store.cc b/src/neuralnet/input_layer/store.cc
index 283b0c7..f213227 100644
--- a/src/neuralnet/input_layer/store.cc
+++ b/src/neuralnet/input_layer/store.cc
@@ -34,10 +34,15 @@ StoreInputLayer::~StoreInputLayer() {
 void StoreInputLayer::Setup(const LayerProto& conf,
     const vector<Layer*>& srclayers) {
   InputLayer::Setup(conf, srclayers);
-  batchsize_ = conf.store_conf().batchsize();
+  const auto& batchsize = conf.store_conf().batchsize();
+  CHECK(batchsize.size());
   if (conf.partition_dim() == 0) {
-    batchsize_ /= conf.num_partitions();
-  }
+    if (batchsize.size() == 1)  // equal partition
+      batchsize_ = batchsize.Get(0) / conf.num_partitions();
+    else  // manual partition
+      batchsize_ = batchsize.Get(conf.partition_id());
+  } else
+    batchsize_ = conf.store_conf().batchsize(0);
 }
 
 void StoreInputLayer::ComputeFeature(int flag,

http://git-wip-us.apache.org/repos/asf/incubator-singa/blob/a91e82f3/src/proto/job.proto
----------------------------------------------------------------------
diff --git a/src/proto/job.proto b/src/proto/job.proto
index 6afc599..1a6ab42 100644
--- a/src/proto/job.proto
+++ b/src/proto/job.proto
@@ -374,7 +374,7 @@ message StoreProto {
   optional string std_file = 5;
   optional float mean_value = 6;
   optional float std_value = 7;
-  optional int32 batchsize = 8 [default = 1];
+  repeated int32 batchsize = 8;
   repeated int32 shape = 9;
   optional bool encoded = 10 [default = false];
   optional int32 random_skip = 11 [default = 0];
