benqua commented on issue #8129: [scala] Module api: label (1,1,868,868) and prediction (1,868,868)should have the same length
URL: https://github.com/apache/incubator-mxnet/issues/8129#issuecomment-333992969
 
 
   Ok, I checked and logged the shapes as suggested, and realized that it is not the batch size dimension that is lost but the channel one (I had 1 for both, so I didn't notice it at first).
   
   The network is a U-Net, very similar to the one described in the original U-Net paper.
   Each pixel can be in one of two classes, as in the original paper.
   So, the last layers are:
   ```scala
       // output: 1x1 convolution giving one channel per class (2 classes)
       val conv10 = Symbol.Convolution()()(Map("data" -> conv9, "num_filter" -> 2, "kernel" -> "(1,1)"))
       val label  = Symbol.Variable("softmax_label")
       // per-pixel softmax over the 2 class channels
       val so     = Symbol.SoftmaxOutput()()(Map("data" -> conv10, "label" -> label, "multi_output" -> true))
   ```
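   
   (For what it's worth, here is how I plan to double-check what label shape the symbol itself infers for `softmax_label`. This is an untested sketch; I'm assuming the Scala `Symbol` API exposes `inferShape` and `listArguments` the way I remember them.)
   ```scala
       import ml.dmlc.mxnet._
       // Untested sketch: infer all argument shapes from the data shape alone
       // and print them, to see what shape SoftmaxOutput expects for the label.
       val (argShapes, outShapes, _) =
         so.inferShape(Map("data" -> Shape(1, 1, 1052, 1052)))
       so.listArguments().zip(argShapes).foreach { case (name, shape) =>
         println(s"$name -> $shape")
       }
   ```
   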
   Now, when I run the code to train the network (posted above) with more 
logging, I get the following:
   ```
   2017-10-03 23:40:37,808 [run-main-0] [UNet] [INFO] - so - Shape: Vector((1,2,868,868))
   2017-10-03 23:40:37,897 [run-main-0] [TrainModuleUNet] [INFO] - symbol shape: Vector((1,2,868,868))
   2017-10-03 23:40:37,899 [run-main-0] [TrainModuleUNet] [INFO] - providedData: data -> (1,1,1052,1052)
   2017-10-03 23:40:37,900 [run-main-0] [TrainModuleUNet] [INFO] - providedLabel: softmax_label -> (1,1,868,868)
   2017-10-03 23:40:38,038 [run-main-0] [TrainModuleUNet] [INFO] - bound!
   2017-10-03 23:40:38,088 [run-main-0] [TrainModuleUNet] [INFO] - initialized!
   2017-10-03 23:40:38,089 [run-main-0] [ml.dmlc.mxnet.module.Module] [WARN] - Already binded, ignoring bind()
   MKL Build:20170720
   [error] (run-main-0) java.lang.IllegalArgumentException: requirement failed: label (1,1,868,868) and prediction (1,868,868)should have the same length.
   java.lang.IllegalArgumentException: requirement failed: label (1,1,868,868) and prediction (1,868,868)should have the same length.
        at scala.Predef$.require(Predef.scala:224)
        at ml.dmlc.mxnet.Accuracy$$anonfun$update$4.apply(EvalMetric.scala:111)
   (...)
   ```
   The output of my network has a shape of (1, 2, 868, 868). However, the error message says that the prediction shape is (1, 868, 868). How can this be?
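   
   My current guess (I haven't looked at EvalMetric.scala:111 yet, so this may be wrong) is that the default `Accuracy` metric collapses the channel axis of the prediction with an argmax before comparing it to the label, roughly like this:
   ```scala
       // Rough guess at what the metric does internally (names/shapes assumed by me):
       // reduce the (1, 2, 868, 868) prediction to per-pixel class ids of shape
       // (1, 868, 868), then require the label to have that exact shape.
       val pred      = NDArray.zeros(1, 2, 868, 868)  // stand-in for the network output
       val predLabel = NDArray.argmax_channel(pred)   // -> (1, 868, 868)
       // require(label.shape == predLabel.shape, ...) <- this would be the failing
       // check, since my label arrives as (1, 1, 868, 868)
   ```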
   
   I also see that my label is likely not in the right shape (one channel with values of either 0 or 1, instead of two channels with the probabilities of class 0 and class 1). However, the bind call seems OK, which makes me think that there is possibly an implicit conversion done somewhere.
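   
   Assuming the metric really wants the label without the channel axis, what I plan to try is to feed it as per-pixel class indices of shape (1, 868, 868) instead of (1, 1, 868, 868), e.g. by dropping that axis in my data iterator. (Untested, and the exact reshape signature is from memory.)
   ```scala
       // Untested idea: squeeze the channel axis out of the label before it reaches
       // the module, so providedLabel becomes softmax_label -> (1,868,868).
       val rawLabel   = NDArray.zeros(1, 1, 868, 868)         // my current label layout
       val fixedLabel = rawLabel.reshape(Array(1, 868, 868))  // per-pixel class ids in {0, 1}
   ```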
   
   Another very strange thing is that the program doesn't really stop after this exception. Memory and CPU usage keep growing until I kill sbt. Despite the failed require, the C++ backend seems to keep working...
   
   Any hint about how to correctly use SoftmaxOutput with multi_output would be greatly appreciated. :)
   
   
 