kurt-o-sys opened a new issue #7472: continuously train rnn - training data stream?
URL: https://github.com/apache/incubator-mxnet/issues/7472

## Question

Usually, a neural network is trained using a training, validation and test set. Given a continuous series of data, with an event (new training data) occurring every 1-5 seconds, is it possible to continuously train (update) a recurrent neural network using mxnet? I don't need to reuse previous (training) data points: I just want to update the weights slightly(!) on each new event.

It's for a behaviour/game-like system: depending on the (expressed/intentional) behaviour of the players (the features), the output of the system should be estimated and continuously adapted (for further processing). The system has to learn along the way and be able to cope, to a certain extent, with changing player behaviour, and it needs to remember certain patterns from weeks and, if possible, months ago. (It'd probably be mainly an LSTM.)

Storing all data and retraining the system on that data is close to impossible because:

1. I estimate there's about 10-100GB of data per day (it will vary)
2. retraining every, let's say, 10 seconds on all existing data would take too long.

I want a system that continuously trains itself on the real data, without splitting it into training/testing/validation sets:

1. The training set is the real data, comparing the actual state of the system with the prediction previously made
2. There's no validation, besides the fact that the system validates itself
3. Testing is done on every new event. The predictive power will be continuously determined.

Can this be done with mxnet, using a training data stream?

## Environment info

This is not really relevant, but well, I don't mind providing it :)

Operating System:
```
$ uname -ar
Linux flipflap 4.4.0-57-generic #78-Ubuntu SMP Fri Dec 9 23:50:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
```

Compiler: ?
Package used (Python/R/Scala/Julia): R

MXNet version:
```
> packageVersion("mxnet")
[1] ‘0.10.1’
> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 18

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=de_BE.UTF-8
 [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=de_BE.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_BE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] mxnet_0.10.1 httr_1.2.1 jsonlite_1.5

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.12 compiler_3.4.1 RColorBrewer_1.1-2 influenceR_0.1.0 plyr_1.8.4 bindr_0.1 viridis_0.4.0
 [8] tools_3.4.1 digest_0.6.12 tibble_1.3.3 gtable_0.2.0 viridisLite_0.2.0 rgexf_0.15.3 pkgconfig_2.0.1
[15] rlang_0.1.1 igraph_1.1.2 rstudioapi_0.6 curl_2.4 bindrcpp_0.2 gridExtra_2.2.1 stringr_1.2.0
[22] DiagrammeR_0.9.0 dplyr_0.7.2 htmlwidgets_0.9 grid_3.4.1 glue_1.1.1 R6_2.2.2 Rook_1.1-1
[29] XML_3.98-1.9 ggplot2_2.2.1 magrittr_1.5 codetools_0.2-15 scales_0.4.1 htmltools_0.3.6 assertthat_0.1
[36] colorspace_1.3-2 brew_1.0-6 stringi_1.1.5 visNetwork_2.0.1 lazyeval_0.2.0 munsell_0.4.3
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
With regards, Apache Git Services
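
To make the scheme in the question concrete, here is a minimal sketch of the predict-then-update loop it describes: each new event is first scored with the current weights (the "testing on every new event" step), and only then used for one small weight update. It is plain standard-library Python, not mxnet code, and it says nothing about what mxnet itself supports. A tiny linear model stands in for the LSTM, and `OnlineLinearModel`, `event_stream` and `run_prequential` are hypothetical names with synthetic data, chosen only for illustration.

```python
import random


class OnlineLinearModel:
    """Stand-in for the network (assumption): a linear model, one small SGD step per event."""

    def __init__(self, n_features, learning_rate=0.01):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features):
        return self.bias + sum(w * x for w, x in zip(self.weights, features))

    def update(self, features, target):
        # One gradient step on the squared error of this single event.
        error = self.predict(features) - target
        for i, x in enumerate(features):
            self.weights[i] -= self.learning_rate * error * x
        self.bias -= self.learning_rate * error
        return error


def event_stream(n_events, seed=0):
    """Hypothetical event source: yields (features, outcome) pairs one at a time."""
    rng = random.Random(seed)
    for _ in range(n_events):
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        y = 2.0 * x[0] - 1.0 * x[1] + 0.5 + rng.gauss(0.0, 0.01)
        yield x, y


def run_prequential(model, stream):
    """Test-then-train: score each event with the current weights, then learn from it."""
    abs_errors = []
    for features, outcome in stream:
        prediction = model.predict(features)      # test on the new event first
        abs_errors.append(abs(prediction - outcome))
        model.update(features, outcome)           # then update the weights slightly
    return abs_errors


model = OnlineLinearModel(n_features=2, learning_rate=0.05)
errors = run_prequential(model, event_stream(5000))
early = sum(errors[:100]) / 100
late = sum(errors[-100:]) / 100
print(f"mean abs error, first 100 events: {early:.3f}")
print(f"mean abs error, last 100 events:  {late:.3f}")
```

No event is stored or revisited here, so memory use stays constant regardless of how much data arrives per day. The running error list plays the role of the continuously determined predictive power.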