[GitHub] [incubator-mxnet] Crunchy9 edited a comment on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array
Crunchy9 edited a comment on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array
URL: https://github.com/apache/incubator-mxnet/issues/15420#issuecomment-507582770

Reproducible case (sort of):

`data <- mx.symbol.Variable("data")`
`conv <- mx.symbol.Convolution(data = data, kernel = c(7, 7), stride = c(2, 2), pad = c(0, 0), num.filter = 24, name = paste0("test", "_conv1"))`
`act <- mx.symbol.LeakyReLU(data = conv, act.type = "elu", name = paste0("test", "_act1"))`
`softmax <- FC(data = act, name = "out")`

Error (for batch size = 8):

> Start training with 1 devices
> Error in mx.nd.internal.dispatch.Ops(.Generic, e1, e2) :
> [10:40:29] c:\build_mxnet\with_mkldnn\incubator-mxnet\src\operator\tensor\../elemwise_op_common.h:135: Check failed: assign(, vec.at(i)): Incompatible attr in node at 1-th input: expected [8], got [16]
> Calls: mx.model.FeedForward.create ... Ops.MXNDArray -> mx.nd.internal.dispatch.Ops -> .External
> Execution halted

Error (for batch size = 16):

> Start training with 1 devices
> Error in mx.nd.internal.dispatch.Ops(.Generic, e1, e2) :
> [10:52:19] c:\build_mxnet\with_mkldnn\incubator-mxnet\src\operator\tensor\../elemwise_op_common.h:135: Check failed: assign(, vec.at(i)): Incompatible attr in node at 1-th input: expected [16], got [32]
> Calls: mx.model.FeedForward.create ... Ops.MXNDArray -> mx.nd.internal.dispatch.Ops -> .External
> Execution halted

Output from R CMD looks normal.

> ** building package indices
> ** installing vignettes
> ** testing if installed package can be loaded from temporary location
> ** testing if installed package can be loaded from final location
> ** testing if installed package keeps a record of temporary installation path
> * MD5 sums
> zip I/O error: No such file or directory
> zip error: Temporary file failure (D:/zih8Nowg)
> running 'zip' failed
> * DONE (mxnet)

But the interaction with MKL-DNN looks weird. Somehow the **batch size is doubled**. Or is it a dimension mismatch?
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
Crunchy9 edited a comment on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array
URL: https://github.com/apache/incubator-mxnet/issues/15420#issuecomment-50720

I'm lost too, because something new pops up every time. It seems to me it's an OpenCV problem (?).

1. Build the R package with MKL-DNN, OpenCV 4.1.0, Visual Studio 15.
2. Reproducible example (at last), data from [Kaggle](https://www.kaggle.com/jessicali9530/caltech256).

`python im2rec.py --resize 224 --encoding .jpg --quality 60`

`library(mxnet)`
`data <- mx.io.ImageRecordIter(path.imgrec = "D:/256_ObjectCategories/caltech256_train.rec", path.imglist = "D:/256_ObjectCategories/caltech256_train.lst", batch.size = 8, data_shape = c(224, 224, 3))`
`data <- mx.symbol.Variable("data")`
`conv <- mx.symbol.Convolution(data, kernel = c(7, 7), stride = c(2, 2), pad = c(0, 0), num.filter = 24, name = paste0("test", "_conv1"))`
`act <- mx.symbol.LeakyReLU(conv, act.type = "leaky", name = paste0("test", "_act1"))`
`fc <- mx.symbol.FullyConnected(act, num_hidden = 256, name = paste0("test", "_FC"))`
`softmax <- mx.symbol.SoftmaxOutput(fc, name = "softmax")`
`devices <- mx.cpu()`
`model <- mx.model.FeedForward.create(softmax, initializer = mx.init.Xavier(factor_type = "in", magnitude = 2), X = data, ctx = devices, num.round = 2, begin.round = epoch + 1, eval.data = NULL, optimizer = mx.opt.create("sgd", learning.rate = 0.005, momentum = 0.9, wd = 0, lr_scheduler = NULL), eval.metric = mx.metric.accuracy)`

Now I get:

> Error in mx.model.init.iter(X, y, batch.size = array.batch.size, is.train = TRUE) :
> Need to provide parameter y for training with R arrays.

With another network:

> [20:53:34] c:\build_mxnet\with_mkldnn\incubator-mxnet\src\io\./image_iter_common.h:77: Loaded ImageList from D:/256_ObjectCategories/caltech256_train.lst 30607 Image records
> [20:53:34] C:\build_mxnet\with_mkldnn\incubator-mxnet\src\io\iter_image_recordio_2.cc:172: ImageRecordIOParser2: D:/256_ObjectCategories/caltech256_train.rec, use 4 threads for decoding..
> Start training with 1 devices
> Error in mx.nd.internal.dispatch.Ops(.Generic, e1, e2) :
> [20:53:34] c:\build_mxnet\with_mkldnn\incubator-mxnet\src\operator\tensor\../elemwise_op_common.h:135: Check failed: assign(, vec.at(i)): Incompatible attr in node at 1-th input: expected [8], got [16]
> Calls: mx.model.FeedForward.create ... Ops.MXNDArray -> mx.nd.internal.dispatch.Ops -> .External
> Execution halted
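The "expected [8], got [16]" message above says that two operands of an element-wise operation disagree on their length. A minimal sketch of how that could be narrowed down in plain R, before any metric is computed, is given here. It is only an illustration: `report_batch_shapes` is a hypothetical helper, not part of the mxnet package, and it assumes that predictions arrive as a (classes x batch) matrix with the batch on the last axis.

```r
# Hypothetical diagnostic helper; not part of the mxnet package.
# It compares the number of labels with the number of predictions in one
# batch and returns a small report instead of failing inside an
# element-wise operation.
report_batch_shapes <- function(label, pred) {
  n_label <- length(label)
  # Assumption: the batch is the last axis of the prediction array,
  # or pred is a plain vector with one value per sample.
  n_pred <- if (is.null(dim(pred))) length(pred) else dim(pred)[length(dim(pred))]
  list(
    n_label = n_label,
    n_pred = n_pred,
    ok = n_label == n_pred,
    ratio = n_pred / n_label
  )
}

# Toy data mimicking the report: 8 labels against 16 prediction columns.
label <- rep(0, 8)
pred <- matrix(0, nrow = 2, ncol = 16)
res <- report_batch_shapes(label, pred)
print(res)
```

Inside a custom metric the same comparison could be applied to the label and prediction objects before they are compared; whether `dim()` and `length()` behave this way on the objects passed to the metric in this particular build is an assumption to verify.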
Crunchy9 edited a comment on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array
URL: https://github.com/apache/incubator-mxnet/issues/15420#issuecomment-507528863

Yes, and it's strange. Installed side by side: a build from source for R 3.6.0 and the download from AWS for R 3.5.3.

> Error in mx.varg.io.ImageRecordIter(list(...)) :
> [07:47:26] c:\incubator-mxnet\src\io\iter_image_recordio_2.cc:254: ImageRec need opencv to process

Previously, with the same network on R 3.5.3 and no custom MKL-DNN build, there were no problems, except speed. ;-)

> use 4 threads for decoding..
> Start training with 1 devices
> Batch [17] Speed: 0.623502075662353 samples/sec Train-accuracy=0.988970588235294
> Batch [34] Speed: 0.608181449057359 samples/sec Train-accuracy=0.988970588235294
> Batch [51] Speed: 0.671458334887314 samples/sec Train-accuracy=0.990196078431373
> Batch [68] Speed: 0.649002161683901 samples/sec Train-accuracy=0.990349264705882
> Batch [85] Speed: 0.701097383006278 samples/sec Train-accuracy=0.991176470588235
> Batch [102] Speed: 0.673339123131478 samples/sec Train-accuracy=0.991115196078431
> Batch [119] Speed: 0.673430530321397 samples/sec Train-accuracy=0.990808823529412
> Batch [136] Speed: 0.652799241039701 samples/sec Train-accuracy=0.991038602941177
> [16] Train-accuracy=0.990875912408759
Crunchy9 edited a comment on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array
URL: https://github.com/apache/incubator-mxnet/issues/15420#issuecomment-507500789

This will be hard, but...

`traceback()`

> 9: stop(list(message = "Unknown exception", call = mx.nd.internal.as.array(nd),
> cppstack = list(file = "", line = -1L, stack = "C++ stack not available on this system")))
> 8: .External(list(name = "InternalFunction_invoke", address = ,
> dll = list(name = "Rcpp", path = ".../R/win-library/3.6/Rcpp/libs/x64/Rcpp.dll",
> dynamicLookup = TRUE, handle = ,
> info = ), numParameters = -1L),
> , ...)
> 7: mx.nd.internal.as.array(nd)
> 6: as.array.MXNDArray(res)
> 5: as.array(res)
> 4: feval(label, pred)
> 3: metric$update(label = labels[[i]], pred = preds[[i]], state = train.metric)
> 2: mx.model.train(symbol, ctx, input.shape, output.shape, params$arg.params,
> params$aux.params, begin.round, num.round, optimizer = optimizer,
> train.data = X, eval.data = eval.data, metric = eval.metric,
> epoch.end.callback = epoch.end.callback, batch.end.callback = batch.end.callback,
> kvstore = kvstore, fixed.param = fixed.param, verbose = verbose,
> metric_cpu = metric_cpu)
> 1: mx.model.FeedForward.create(softmax, initializer = mx.init.Xavier(factor_type = "in",
> magnitude = 2), X = dane, ctx = devices, num.round = 300,
> begin.round = epoch + 1, eval.data = NULL, optimizer = mx.opt.create("sgd",
> learning.rate = 0.005, momentum = 0.9, wd = 0, lr_scheduler = fs),
> eval.metric = mx.metric.accuracy, batch.end.callback = mx.callback.log.speedometer(...)), epoch.end.callback = mx.callback.save.checkpoint(paste0("...",
> ...), 1))

Surprisingly, the examples from the [tutorial](https://mxnet.incubator.apache.org/versions/master/tutorials/r/fiveMinutesNeuralNetwork.html) work fine. The problem occurs when I pass `mx.metric.accuracy` as `eval.metric`.

If it's changed to `mx.metric.logloss`:

> Start training with 1 devices
> Error in mx.nd.internal.dispatch.Ops(.Generic, e1, e2) :
> [06:03:51] c:\build_mxnet\with_mkldnn\incubator-mxnet\src\operator\tensor\../elemwise_op_common.h:135: Check failed: assign(, vec.at(i)): Incompatible attr in node at 1-th input: expected [8], got [16]
> Calls: mx.model.FeedForward.create ... Ops.MXNDArray -> mx.nd.internal.dispatch.Ops -> .External
> Execution halted

If I set it to NULL, the network learns, but saving the results is impossible:

> Start training with 1 devices
> Error in mx.nd.internal.save(ndarray, filename) : Unknown exception

The only thing that may differ in my network:

`InstanceBatchNorm <- function(data, name, eps = 2e-5) {`
`  data_split <- mx.symbol.split(data = data, num_outputs = 2, axis = 1, name = paste0(name, "_split"))`
`  data_split_in1 <- mx.symbol.InstanceNorm(data = data_split[[1]], eps = eps, name = paste0(name, "_split_in1"))`
`  data_split_bn2 <- mx.symbol.BatchNorm(data = data_split[[2]], eps = eps, fix.gamma = FALSE, name = paste0(name, "_split_bn2"))`
`  con <- mx.symbol.concat(list(data_split_in1, data_split_bn2), num.args = 2, dim = 1, name = paste0(name, "_ibn"))`
`  return(con)`
`}`
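For reference, the split / normalise / concat idea behind `InstanceBatchNorm` can be sketched on a plain R array without mxnet. This is a hedged illustration only: `normalise` and `ibn_sketch` are hypothetical names, the layout (width, height, channels, batch) is an assumption, learned scale and shift parameters are left out, and nothing here reflects how the MKL-DNN build executes the real operators. The point is that, in this idealised form, the output has the same shape as the input, so the batch dimension is not expected to change.

```r
# Plain-R illustration; it does not call mxnet and is not the operators'
# actual implementation.
# Assumed layout: (width, height, channels, batch).

# Zero-mean, unit-variance scaling of all values in v (no gamma/beta).
normalise <- function(v, eps) {
  centred <- v - mean(v)
  centred / sqrt(mean(centred^2) + eps)
}

ibn_sketch <- function(x, eps = 2e-5) {
  d <- dim(x)
  half <- d[3] %/% 2
  out <- x
  for (ch in seq_len(d[3])) {
    if (ch <= half) {
      # First half of the channels: statistics per sample (instance norm).
      for (n in seq_len(d[4])) {
        out[, , ch, n] <- normalise(x[, , ch, n], eps)
      }
    } else {
      # Second half: statistics shared across the whole batch (batch norm).
      out[, , ch, ] <- normalise(x[, , ch, ], eps)
    }
  }
  out
}

set.seed(1)
x <- array(rnorm(4 * 4 * 6 * 8), dim = c(4, 4, 6, 8))
y <- ibn_sketch(x)
print(dim(y))
```

Comparing `dim(x)` and `dim(y)` on a toy input like this gives a baseline to hold against the shapes reported in the error messages.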