[GitHub] alexmosc opened a new issue #9358: Why do running 1 round of an MXNET model training produce Train-mse=NaN?

GitBox Tue, 09 Jan 2018 02:44:18 -0800

alexmosc opened a new issue #9358: Why do running 1 round of an MXNET model 
training produce Train-mse=NaN?
URL: https://github.com/apache/incubator-mxnet/issues/9358
 
 
   If I run just 1 round of MXNet model training with 
`mx.model.FeedForward.create`, I get NaN as the training error. Is this 
intentional?
   
   
   ```
   > sessionInfo()
   R version 3.4.0 (2017-04-21)
   Platform: x86_64-w64-mingw32/x64 (64-bit)
   Running under: Windows >= 8 x64 (build 9200)
   
   Matrix products: default
   
   locale:
   [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C
   [5] LC_TIME=English_United States.1252
   
   attached base packages:
   [1] stats     graphics  grDevices utils     datasets  methods   base     
   
   other attached packages:
    [1] mxnet_0.10.1         pryr_0.1.3           quantregForest_1.3-6 RColorBrewer_1.1-2   randomForest_4.6-12  ggjoy_0.4.0          ggridges_0.4.1
    [8] DT_0.2               caret_6.0-77         lattice_0.20-35      FSelector_0.21       scales_0.5.0         nnet_7.3-12          infotheo_1.2.0
   [15] cluster_2.0.6        forecast_8.2         gridExtra_2.3        kableExtra_0.6.1     knitr_1.17           rmarkdown_1.8        markdown_0.8
   [22] TTR_0.23-2           tseries_0.10-42      ggplot2_2.2.1        magrittr_1.5         data.table_1.10.4-3
   
   loaded via a namespace (and not attached):
    [1] colorspace_1.3-2   class_7.3-14       rprojroot_1.2      rstudioapi_0.7     DRR_0.0.2          prodlim_1.6.1      lubridate_1.7.1    xml2_1.1.1
    [9] codetools_0.2-15   splines_3.4.0      mnormt_1.5-5       robustbase_0.92-8  RcppRoll_0.2.2     jsonlite_1.5       entropy_1.2.1      rJava_0.9-9
   [17] broom_0.4.3        ddalpha_1.3.1      kernlab_0.9-25     sfsmisc_1.1-1      DiagrammeR_0.9.2   readr_1.1.1        compiler_3.4.0     httr_1.3.1
   [25] backports_1.1.1    assertthat_0.2.0   Matrix_1.2-9       lazyeval_0.2.1     visNetwork_2.0.1   htmltools_0.3.6    tools_3.4.0        bindrcpp_0.2
   [33] igraph_1.1.2       gtable_0.2.0       glue_1.2.0         reshape2_1.4.2     dplyr_0.7.4        Rcpp_0.12.14       rgexf_0.15.3       fracdiff_1.4-2
   [41] nlme_3.1-131       iterators_1.0.8    psych_1.7.8        lmtest_0.9-35      timeDate_3042.101  gower_0.1.2        stringr_1.2.0      rvest_0.3.2
   [49] RWekajars_3.9.1-5  XML_3.98-1.9       DEoptimR_1.0-8     MASS_7.3-47        zoo_1.8-0          ipred_0.9-6        hms_0.4.0          parallel_3.4.0
   [57] quantmod_0.4-11    curl_3.0           downloader_0.4     rpart_4.1-11       stringi_1.1.6      Rook_1.1-1         foreach_1.4.3      RWeka_0.4-36
   [65] lava_1.5.1         rlang_0.1.4        pkgconfig_2.0.1    evaluate_0.10.1    purrr_0.2.4        bindr_0.1          recipes_0.1.1      htmlwidgets_0.9
   [73] CVST_0.2-1         tidyselect_0.2.3   plyr_1.8.4         R6_2.2.2           dimRed_0.1.0       foreign_0.8-67     withr_2.1.0        xts_0.10-0
   [81] survival_2.41-3    tibble_1.3.4       viridis_0.4.0      grid_3.4.0         influenceR_0.1.0   ModelMetrics_1.1.0 digest_0.6.12      tidyr_0.7.2
   [89] brew_1.0-6         stats4_3.4.0       munsell_0.4.3      viridisLite_0.2.0  quadprog_1.5-5
   ```
   
   Console:
   ```
   Start training with 1 devices
   [1] Train-mse=NaN
   ```
   
   ```
   library(mxnet)
   
   hidden_u_1 <- 10
   activ_hidden_1 <- 'tanh'
   
   hidden_u_2 <- 1
   
   learn_rate <- 0.001
   
   initializer <- mx.init.uniform(1)
   
   optimizer <- 'rmsprop' #sgd
   
   loss <- mx.metric.mse
   
   device.cpu <- mx.cpu()
   
   mini_batch <- 64 #8
   
   rounds <- 1 #2
   
   
   ## data symbols
   
   nn_data <- mx.symbol.Variable('data')
   nn_label <- mx.symbol.Variable('label')
   
   
   ## first fully connected layer
   
   fc1 <- mx.symbol.FullyConnected(data = nn_data
                                   , num_hidden = hidden_u_1)
   
   activ1 <- mx.symbol.Activation(data = fc1, act.type = activ_hidden_1)
   
   ## second fully connected layer
   
   fc2 <- mx.symbol.FullyConnected(data = activ1, num_hidden = hidden_u_2)
   
   q_func <- mx.symbol.LinearRegressionOutput(data = fc2, label = nn_label,
                                              name = 'regr')
   
   
   # initialize NN
   
   train.x <- matrix(rnorm(mini_batch * 10, 0, 1), ncol = 10)
   
   train.y <- rnorm(mini_batch, 0, 1)
   
   nn_model <- mx.model.FeedForward.create(
        symbol = q_func,
        X = train.x,
        y = train.y,
        ctx = device.cpu,
        num.round = rounds,
        array.batch.size = mini_batch, #60
        optimizer = optimizer,
        eval.metric = loss,
        learning.rate = learn_rate,
        initializer = initializer
   )
   ```
   
   ## What have you tried to solve it?
   
   If I use 2 or more rounds, or a mini-batch size smaller than the number 
of samples in my dataset, I get a numeric value for the training error.
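
One way to check whether the NaN reflects the model itself or only the reported metric is to compute the training MSE by hand. The sketch below uses base R only; `manual_mse` is a hypothetical helper, not part of mxnet, and the numbers in the toy call are made up for illustration.

```r
# Hypothetical helper (not part of mxnet): mean squared error in base R.
manual_mse <- function(pred, label) {
  pred <- as.numeric(pred)    # flattens a 1 x n prediction matrix to a vector
  label <- as.numeric(label)
  stopifnot(length(pred) == length(label))
  mean((pred - label)^2)
}

# Toy check with made-up numbers: the errors are 0, 0, 2, so the MSE is 4 / 3.
manual_mse(c(1, 2, 3), c(1, 2, 5))

# Averaging over zero observations gives NaN in R.
manual_mse(numeric(0), numeric(0))
```

With the model from the script above, `manual_mse(predict(nn_model, train.x), train.y)` could be used the same way, assuming `predict` handles the matrix layout as the training call does. A finite result there would suggest the NaN comes from how the metric is reported rather than from the trained weights. The empty-input case only shows one way an averaged metric can become NaN; this report does not confirm that it is what happens inside mxnet.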


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
