alexmosc opened a new issue #9358: Why do running 1 round of an MXNET model training produce Train-mse=NaN?
URL: https://github.com/apache/incubator-mxnet/issues/9358

If I run just one round of MXNet model training with `mx.model.FeedForward.create`, I get NaN as the training error. Is this intentional?

```
> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
 [1] mxnet_0.10.1 pryr_0.1.3 quantregForest_1.3-6 RColorBrewer_1.1-2 randomForest_4.6-12 ggjoy_0.4.0 ggridges_0.4.1
 [8] DT_0.2 caret_6.0-77 lattice_0.20-35 FSelector_0.21 scales_0.5.0 nnet_7.3-12 infotheo_1.2.0
[15] cluster_2.0.6 forecast_8.2 gridExtra_2.3 kableExtra_0.6.1 knitr_1.17 rmarkdown_1.8 markdown_0.8
[22] TTR_0.23-2 tseries_0.10-42 ggplot2_2.2.1 magrittr_1.5 data.table_1.10.4-3

loaded via a namespace (and not attached):
 [1] colorspace_1.3-2 class_7.3-14 rprojroot_1.2 rstudioapi_0.7 DRR_0.0.2 prodlim_1.6.1 lubridate_1.7.1 xml2_1.1.1
 [9] codetools_0.2-15 splines_3.4.0 mnormt_1.5-5 robustbase_0.92-8 RcppRoll_0.2.2 jsonlite_1.5 entropy_1.2.1 rJava_0.9-9
[17] broom_0.4.3 ddalpha_1.3.1 kernlab_0.9-25 sfsmisc_1.1-1 DiagrammeR_0.9.2 readr_1.1.1 compiler_3.4.0 httr_1.3.1
[25] backports_1.1.1 assertthat_0.2.0 Matrix_1.2-9 lazyeval_0.2.1 visNetwork_2.0.1 htmltools_0.3.6 tools_3.4.0 bindrcpp_0.2
[33] igraph_1.1.2 gtable_0.2.0 glue_1.2.0 reshape2_1.4.2 dplyr_0.7.4 Rcpp_0.12.14 rgexf_0.15.3 fracdiff_1.4-2
[41] nlme_3.1-131 iterators_1.0.8 psych_1.7.8 lmtest_0.9-35 timeDate_3042.101 gower_0.1.2 stringr_1.2.0 rvest_0.3.2
[49] RWekajars_3.9.1-5 XML_3.98-1.9 DEoptimR_1.0-8 MASS_7.3-47 zoo_1.8-0 ipred_0.9-6 hms_0.4.0 parallel_3.4.0
[57] quantmod_0.4-11 curl_3.0 downloader_0.4 rpart_4.1-11 stringi_1.1.6 Rook_1.1-1 foreach_1.4.3 RWeka_0.4-36
[65] lava_1.5.1 rlang_0.1.4 pkgconfig_2.0.1 evaluate_0.10.1 purrr_0.2.4 bindr_0.1 recipes_0.1.1 htmlwidgets_0.9
[73] CVST_0.2-1 tidyselect_0.2.3 plyr_1.8.4 R6_2.2.2 dimRed_0.1.0 foreign_0.8-67 withr_2.1.0 xts_0.10-0
[81] survival_2.41-3 tibble_1.3.4 viridis_0.4.0 grid_3.4.0 influenceR_0.1.0 ModelMetrics_1.1.0 digest_0.6.12 tidyr_0.7.2
[89] brew_1.0-6 stats4_3.4.0 munsell_0.4.3 viridisLite_0.2.0 quadprog_1.5-5
```

Console:

```
Start training with 1 devices
[1] Train-mse=NaN
```

```
library(mxnet)

hidden_u_1 <- 10
activ_hidden_1 <- 'tanh'
hidden_u_2 <- 1
learn_rate <- 0.001
initializer <- mx.init.uniform(1)
optimizer <- 'rmsprop' #sgd
loss <- mx.metric.mse
device.cpu <- mx.cpu()
mini_batch <- 64 #8
rounds <- 1 #2

## data symbols
nn_data <- mx.symbol.Variable('data')
nn_label <- mx.symbol.Variable('label')

## first fully connected layer
fc1 <- mx.symbol.FullyConnected(data = nn_data
                                , num_hidden = hidden_u_1)
activ1 <- mx.symbol.Activation(data = fc1, act.type = activ_hidden_1)

## second fully connected layer
fc2 <- mx.symbol.FullyConnected(data = activ1, num_hidden = hidden_u_2)
q_func <- mx.symbol.LinearRegressionOutput(data = fc2, label = nn_label, name = 'regr')

# initialize NN
train.x <- matrix(rnorm(mini_batch * 10, 0, 1), ncol = 10)
train.y = rnorm(64, 0, 1)

nn_model <- mx.model.FeedForward.create(
    symbol = q_func,
    X = train.x,
    y = train.y,
    ctx = device.cpu,
    num.round = rounds,
    array.batch.size = mini_batch, #60
    optimizer = optimizer,
    eval.metric = loss,
    learning.rate = learn_rate,
    initializer = initializer
)
```

## What have you tried to solve it?

If I use 2 or more rounds, or a mini-batch size smaller than the number of samples in my dataset, I get a numeric value for the training error.
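
For anyone trying to reproduce this, the three configurations described above can be compared side by side with a sketch along the following lines. It reuses the network and hyperparameters from the report; `train_once` and `configs` are hypothetical names introduced here for illustration, and the batch size of 8 is only an assumption taken from the inline `#8` comment in the original script. The training loop is skipped when the `mxnet` package is not installed.

```r
# Sketch: rerun the reported network under the three settings mentioned above.
# train_once() and configs are hypothetical helpers, not part of mxnet.
train_once <- function(rounds, batch_size, n_samples = 64) {
  # Same two-layer regression network as in the report
  nn_data  <- mx.symbol.Variable('data')
  nn_label <- mx.symbol.Variable('label')
  fc1      <- mx.symbol.FullyConnected(data = nn_data, num_hidden = 10)
  activ1   <- mx.symbol.Activation(data = fc1, act.type = 'tanh')
  fc2      <- mx.symbol.FullyConnected(data = activ1, num_hidden = 1)
  q_func   <- mx.symbol.LinearRegressionOutput(data = fc2, label = nn_label,
                                               name = 'regr')

  # Random inputs and targets, as in the report
  train.x <- matrix(rnorm(n_samples * 10, 0, 1), ncol = 10)
  train.y <- rnorm(n_samples, 0, 1)

  mx.model.FeedForward.create(
    symbol = q_func,
    X = train.x,
    y = train.y,
    ctx = mx.cpu(),
    num.round = rounds,
    array.batch.size = batch_size,
    optimizer = 'rmsprop',
    eval.metric = mx.metric.mse,
    learning.rate = 0.001,
    initializer = mx.init.uniform(1)
  )
}

# The reported setting plus the two variations described in the text
configs <- list(
  reported    = list(rounds = 1, batch_size = 64),
  two_rounds  = list(rounds = 2, batch_size = 64),
  small_batch = list(rounds = 1, batch_size = 8)  # assumed value, see lead-in
)

# Only train when mxnet is available; otherwise this script is a no-op
if (requireNamespace("mxnet", quietly = TRUE)) {
  library(mxnet)
  for (cfg_name in names(configs)) {
    cat("Config:", cfg_name, "\n")
    cfg <- configs[[cfg_name]]
    train_once(cfg$rounds, cfg$batch_size)
  }
}
```

The Train-mse lines printed under each `Config:` heading can then be compared directly against the console output shown earlier.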
