[GitHub] [incubator-mxnet] Crunchy9 commented on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array
Crunchy9 commented on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array URL: https://github.com/apache/incubator-mxnet/issues/15420#issuecomment-510034165 @RuilinZhuIntel, @TaoLv Thanks guys! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] Crunchy9 commented on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array
Crunchy9 commented on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array
URL: https://github.com/apache/incubator-mxnet/issues/15420#issuecomment-50720

I'm lost too, because something new pops up every time. It seems to me that it's an OpenCV problem (?).

1. Build the R package with MKL-DNN, OpenCV 4.1.0, Visual Studio 15.
2. Reproducible example (at last), data from [Kaggle](https://www.kaggle.com/jessicali9530/caltech256).

`python im2rec.py --resize 224 --encoding .jpg --quality 60`

`library(mxnet)`
`data <- mx.io.ImageRecordIter(path.imgrec = "D:/256_ObjectCategories/caltech256_train.rec", path.imglist = "D:/256_ObjectCategories/caltech256_train.lst", batch.size = 8, data_shape = c(224, 224, 3))`
`data <- mx.symbol.Variable("data")`
`conv <- mx.symbol.Convolution(data, kernel = c(7, 7), stride = c(2, 2), pad = c(0, 0), num.filter = 24, name = paste0("test", "_conv1"))`
`act <- mx.symbol.LeakyReLU(conv, act.type = "leaky", name = paste0("test", "_act1"))`
`fc <- mx.symbol.FullyConnected(act, num_hidden = 256, name = paste0("test", "_FC"))`
`softmax <- mx.symbol.SoftmaxOutput(fc, name = "softmax")`
`devices <- mx.cpu()`
`model <- mx.model.FeedForward.create(softmax, initializer = mx.init.Xavier(factor_type = "in", magnitude = 2), X = data, ctx = devices, num.round = 2, begin.round = epoch + 1, eval.data = NULL, optimizer = mx.opt.create("sgd", learning.rate = 0.005, momentum = 0.9, wd = 0, lr_scheduler = NULL), eval.metric = mx.metric.accuracy)`

Now I get:

> Error in mx.model.init.iter(X, y, batch.size = array.batch.size, is.train = TRUE) : Need to provide parameter y for training with R arrays.
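A likely cause of the `Need to provide parameter y` error in this example: the second assignment to `data` replaces the `ImageRecordIter` with a symbol, so `X = data` hands `mx.model.FeedForward.create` a symbol rather than an iterator. A minimal sketch, assuming that is the cause, keeps the two objects under separate names. The names `rec_path`, `train_iter` and `data_sym` are illustrative and not from the report, and the mxnet part only runs if the package and the record file are available.

```r
# Hypothetical names: rec_path, train_iter and data_sym are not from the report.
rec_path <- "D:/256_ObjectCategories/caltech256_train.rec"

# Only touch mxnet when the package and the record file are actually available.
can_run <- requireNamespace("mxnet", quietly = TRUE) && file.exists(rec_path)

if (can_run) {
  library(mxnet)

  # The iterator keeps its own name ...
  train_iter <- mx.io.ImageRecordIter(
    path.imgrec = rec_path,
    batch.size  = 8,
    data.shape  = c(224, 224, 3)
  )

  # ... and the input symbol gets a different one, so nothing is overwritten.
  data_sym <- mx.symbol.Variable("data")
  conv <- mx.symbol.Convolution(data = data_sym, kernel = c(7, 7),
                                stride = c(2, 2), pad = c(0, 0),
                                num.filter = 24, name = "test_conv1")

  # Sanity check before training: X must be the iterator, not the symbol.
  print(class(train_iter))
  print(class(data_sym))
}
```

With the names separated, `X = train_iter` is what would be passed to `mx.model.FeedForward.create`; whether training then succeeds on this build is a separate question.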
[GitHub] [incubator-mxnet] Crunchy9 commented on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array
Crunchy9 commented on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array
URL: https://github.com/apache/incubator-mxnet/issues/15420#issuecomment-507582770

Reproducible case (sort of):

`data <- mx.symbol.Variable("data")`
`conv <- mx.symbol.Convolution(data = data, kernel = c(7, 7), stride = c(2, 2), pad = c(0, 0), num.filter = 24, name = paste0("test", "_conv1"))`
`act <- mx.symbol.LeakyReLU(data = conv, act.type = "elu", name = paste0("test", "_act1"))`
`softmax <- FC(data = act, name = "out")`

Error:

> Start training with 1 devices
> Error in mx.nd.internal.dispatch.Ops(.Generic, e1, e2) : [10:40:29] c:\build_mxnet\with_mkldnn\incubator-mxnet\src\operator\tensor\../elemwise_op_common.h:135: Check failed: assign(, vec.at(i)): Incompatible attr in node at 1-th input: expected [8], got [16]
> Calls: mx.model.FeedForward.create ... Ops.MXNDArray -> mx.nd.internal.dispatch.Ops -> .External
> Execution halted

The output from R CMD looks normal:

> ** building package indices
> ** installing vignettes
> ** testing if installed package can be loaded from temporary location
> ** testing if installed package can be loaded from final location
> ** testing if installed package keeps a record of temporary installation path
> * MD5 sums
> zip I/O error: No such file or directory
> zip error: Temporary file failure (D:/zih8Nowg)
> running 'zip' failed
> * DONE (mxnet)

But the interaction with MKL-DNN looks weird.
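One way to narrow down an `expected [8], got [16]` mismatch is to work out the shapes the network should produce and compare them with what MXNet infers. The sketch below is only an aid, not a diagnosis of this issue. `conv_out_size` is a hypothetical helper implementing the standard convolution output-size formula, and the 224x224x3 input with a batch size of 8 is an assumption taken from the `ImageRecordIter` example in this thread.

```r
# Hypothetical helper: output length of a convolution along one spatial axis,
# floor((n + 2 * pad - kernel) / stride) + 1.
conv_out_size <- function(n, kernel, stride = 1, pad = 0) {
  floor((n + 2 * pad - kernel) / stride) + 1
}

# 7x7 kernel, stride 2, no padding, on an assumed 224x224 input.
side <- conv_out_size(224, kernel = 7, stride = 2, pad = 0)

# R-side shape order: width, height, filters, batch.
expected_shape <- c(side, side, 24, 8)
print(expected_shape)

# Compare with MXNet's own shape inference when the package is installed.
if (requireNamespace("mxnet", quietly = TRUE)) {
  library(mxnet)
  data <- mx.symbol.Variable("data")
  conv <- mx.symbol.Convolution(data = data, kernel = c(7, 7),
                                stride = c(2, 2), pad = c(0, 0),
                                num.filter = 24, name = "test_conv1")
  shapes <- mx.symbol.infer.shape(conv, data = c(224, 224, 3, 8))
  str(shapes)
}
```

If the inferred batch dimension differs from the iterator's `batch.size`, that points at the data pipeline rather than at the operator itself.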
[GitHub] [incubator-mxnet] Crunchy9 commented on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array
Crunchy9 commented on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array
URL: https://github.com/apache/incubator-mxnet/issues/15420#issuecomment-507537682

Yes, the error is the same regardless of the batch norm type.

For R 3.6.0:

`install.packages("https://s3.ca-central-1.amazonaws.com/jeremiedb/share/mxnet/CPU/3.6/mxnet.zip", repos = NULL)`

I used this version with no OpenCV errors.
[GitHub] [incubator-mxnet] Crunchy9 commented on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array
Crunchy9 commented on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array
URL: https://github.com/apache/incubator-mxnet/issues/15420#issuecomment-507532542

Of course. But why does it fail with the official release now?

`c:\build_mxnet\with_mkldnn\incubator-mxnet\build>cmake -G "Visual Studio 15 Win64" .. -DUSE_CUDA=0 -DUSE_CUDNN=0 -DUSE_NVRTC=0 -DUSE_OPENCV=1 -DUSE_OPENMP=1 -DUSE_PROFILER=1 -DUSE_BLAS=mkl -DUSE_LAPACK=1 -DUSE_DIST_KVSTORE=0 -DCUDA_ARCH_NAME=All -DUSE_MKLDNN=1 -DCMAKE_BUILD_TYPE=Release -DMKL_ROOT="C:\Program Files (x86)\IntelSWTools\compilers_and_libraries\windows\mkl"`

I've checked with `InstanceBatchNorm` and `mx.symbol.BatchNorm`; nothing changes.
[GitHub] [incubator-mxnet] Crunchy9 commented on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array
Crunchy9 commented on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array
URL: https://github.com/apache/incubator-mxnet/issues/15420#issuecomment-507528863

Yes, and it's strange. I installed, side by side, the build from source for R 3.6.0 and the downloaded AWS build for R 3.5.3.

> Error in mx.varg.io.ImageRecordIter(list(...)) : [07:47:26] c:\incubator-mxnet\src\io\iter_image_recordio_2.cc:254: ImageRec need opencv to process
[GitHub] [incubator-mxnet] Crunchy9 commented on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array
Crunchy9 commented on issue #15420: [R] MKL-DNN support: "Unknown exception" in mx.nd.internal.as.array
URL: https://github.com/apache/incubator-mxnet/issues/15420#issuecomment-507500789

This will be hard, but...

`traceback()`

> 9: stop(list(message = "Unknown exception", call = mx.nd.internal.as.array(nd), cppstack = list(file = "", line = -1L, stack = "C++ stack not available on this system")))
> 8: .External(list(name = "InternalFunction_invoke", address = , dll = list(name = "Rcpp", path = ".../R/win-library/3.6/Rcpp/libs/x64/Rcpp.dll", dynamicLookup = TRUE, handle = , info = ), numParameters = -1L), , ...)
> 7: mx.nd.internal.as.array(nd)
> 6: as.array.MXNDArray(res)
> 5: as.array(res)
> 4: feval(label, pred)
> 3: metric$update(label = labels[[i]], pred = preds[[i]], state = train.metric)
> 2: mx.model.train(symbol, ctx, input.shape, output.shape, params$arg.params, params$aux.params, begin.round, num.round, optimizer = optimizer, train.data = X, eval.data = eval.data, metric = eval.metric, epoch.end.callback = epoch.end.callback, batch.end.callback = batch.end.callback, kvstore = kvstore, fixed.param = fixed.param, verbose = verbose, metric_cpu = metric_cpu)
> 1: mx.model.FeedForward.create(softmax, initializer = mx.init.Xavier(factor_type = "in", magnitude = 2), X = dane, ctx = devices, num.round = 300, begin.round = epoch + 1, eval.data = NULL, optimizer = mx.opt.create("sgd", learning.rate = 0.005, momentum = 0.9, wd = 0, lr_scheduler = fs), eval.metric = mx.metric.accuracy, batch.end.callback = mx.callback.log.speedometer(...)), epoch.end.callback = mx.callback.save.checkpoint(paste0("...", ...), 1))

Surprisingly, the examples from the [tutorial](https://mxnet.incubator.apache.org/versions/master/tutorials/r/fiveMinutesNeuralNetwork.html) work fine.
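Because MXNet executes operators asynchronously, a failure inside an operator often surfaces only at the next synchronization point. That would explain why the traceback ends in `as.array` inside the metric update rather than in the operator that actually failed. A debugging sketch, assuming this build honours the MXNet environment variables `MXNET_ENGINE_TYPE` and `MXNET_MKLDNN_ENABLED` and the MKL-DNN variable `MKLDNN_VERBOSE`: set them before loading the package, then rerun the failing script.

```r
# Set these before library(mxnet) so they are in place when the engine starts.
Sys.setenv(
  MXNET_ENGINE_TYPE = "NaiveEngine",  # synchronous execution: errors surface at the failing op
  MKLDNN_VERBOSE    = "1"             # ask MKL-DNN to log the primitives it executes
)

# Optional cross-check (assumption: this build honours the switch):
# Sys.setenv(MXNET_MKLDNN_ENABLED = "0")  # fall back to the non-MKL-DNN operators

print(Sys.getenv(c("MXNET_ENGINE_TYPE", "MKLDNN_VERBOSE")))

if (requireNamespace("mxnet", quietly = TRUE)) {
  library(mxnet)
  # Rerun the failing mx.model.FeedForward.create(...) call here.
}
```

If the error disappears with MKL-DNN switched off, that would support the suspicion that the MKL-DNN code path is involved; if it persists, the data pipeline or OpenCV build is the more likely place to look.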