guliashvili opened a new issue #11575: inconsistent results of mae acc rmse
URL: https://github.com/apache/incubator-mxnet/issues/11575

Hi,

I'm trying to train my model with the following code. It has only two classes to predict: 0 and 1.

```
def fit(symbol, arg_params, aux_params, train, val, test, batch_size, num_gpus):
    devs = [mx.gpu(i) for i in range(num_gpus)]
    mod = mx.mod.Module(symbol=symbol, context=devs)
    metrics = [mx.metric.Accuracy(), mx.metric.RMSE(), mx.metric.MAE()]
    mod.fit(train, val,
        num_epoch=args.epoch,
        arg_params=arg_params,
        aux_params=aux_params,
        allow_missing=True,
        batch_end_callback = mx.callback.Speedometer(batch_size, 100),
        epoch_end_callback = mx.callback.do_checkpoint(args.prefix, 1),
        kvstore='KVStore',
        optimizer='sgd',
        optimizer_params={'learning_rate':0.01},
        initializer=mx.init.Xavier(rnd_type='gaussian', factor_type="in", magnitude=2),
        eval_metric=metrics)
    return mod.score(test, metrics)
```

Here are the results:

> 2018-07-06 00:54:38,926 Epoch[0] Batch [100] Speed: 20.37 samples/sec accuracy=0.544554 rmse=0.591505 mae=0.500000
> 2018-07-06 00:54:43,694 Epoch[0] Batch [200] Speed: 20.97 samples/sec accuracy=0.470000 rmse=0.586368 mae=0.500000
> 2018-07-06 00:54:48,509 Epoch[0] Batch [300] Speed: 20.77 samples/sec accuracy=0.520000 rmse=0.587208 mae=0.500000
> 2018-07-06 00:54:53,286 Epoch[0] Batch [400] Speed: 20.93 samples/sec accuracy=0.560000 rmse=0.602946 mae=0.500000
> 2018-07-06 00:54:58,057 Epoch[0] Batch [500] Speed: 20.96 samples/sec accuracy=0.460000 rmse=0.576723 mae=0.500000
> 2018-07-06 00:55:02,886 Epoch[0] Batch [600] Speed: 20.71 samples/sec accuracy=0.490000 rmse=0.577645 mae=0.500000
> 2018-07-06 00:55:07,703 Epoch[0] Batch [700] Speed: 20.76 samples/sec accuracy=0.600000 rmse=0.585552 mae=0.500000
> 2018-07-06 00:55:12,453 Epoch[0] Batch [800] Speed: 21.05 samples/sec accuracy=0.560000 rmse=0.585788 mae=0.500000
> 2018-07-06 00:55:17,236 Epoch[0] Batch [900] Speed: 20.91 samples/sec accuracy=0.500000 rmse=0.567332 mae=0.500000
> 2018-07-06 00:55:21,993 Epoch[0] Batch [1000] Speed: 21.02 samples/sec accuracy=0.590000 rmse=0.580251 mae=0.500000
> 2018-07-06 00:55:26,776 Epoch[0] Batch [1100] Speed: 20.91 samples/sec accuracy=0.550000 rmse=0.564997 mae=0.500000
> 2018-07-06 00:55:31,532 Epoch[0] Batch [1200] Speed: 21.02 samples/sec accuracy=0.620000 rmse=0.564062 mae=0.500000
> 2018-07-06 00:55:36,281 Epoch[0] Batch [1300] Speed: 21.06 samples/sec accuracy=0.650000 rmse=0.566788 mae=0.500000
> 2018-07-06 00:55:41,062 Epoch[0] Batch [1400] Speed: 20.92 samples/sec accuracy=0.590000 rmse=0.574353 mae=0.500000
> 2018-07-06 00:55:45,845 Epoch[0] Batch [1500] Speed: 20.91 samples/sec accuracy=0.690000 rmse=0.569736 mae=0.500000
> 2018-07-06 00:55:50,623 Epoch[0] Batch [1600] Speed: 20.93 samples/sec accuracy=0.700000 rmse=0.579918 mae=0.500000
> 2018-07-06 00:55:55,407 Epoch[0] Batch [1700] Speed: 20.90 samples/sec accuracy=0.730000 rmse=0.585157 mae=0.500000
> 2018-07-06 00:56:00,170 Epoch[0] Batch [1800] Speed: 20.99 samples/sec accuracy=0.810000 rmse=0.590722 mae=0.500000
> 2018-07-06 00:56:04,944 Epoch[0] Batch [1900] Speed: 20.95 samples/sec accuracy=0.860000 rmse=0.601612 mae=0.500000
> 2018-07-06 00:56:09,704 Epoch[0] Batch [2000] Speed: 21.01 samples/sec accuracy=0.800000 rmse=0.593426 mae=0.500000
> 2018-07-06 00:56:14,460 Epoch[0] Batch [2100] Speed: 21.02 samples/sec accuracy=0.860000 rmse=0.618266 mae=0.500000
> 2018-07-06 00:56:19,215 Epoch[0] Batch [2200] Speed: 21.03 samples/sec accuracy=0.760000 rmse=0.605838 mae=0.500000
> 2018-07-06 00:56:23,997 Epoch[0] Batch [2300] Speed: 20.92 samples/sec accuracy=0.840000 rmse=0.616089 mae=0.500000
> 2018-07-06 00:56:28,773 Epoch[0] Batch [2400] Speed: 20.94 samples/sec accuracy=0.850000 rmse=0.620063 mae=0.500000
> 2018-07-06 00:56:33,562 Epoch[0] Batch [2500] Speed: 20.88 samples/sec accuracy=0.820000 rmse=0.608637 mae=0.500000
> 2018-07-06 00:56:38,401 Epoch[0] Batch [2600] Speed: 20.67 samples/sec accuracy=0.880000 rmse=0.631159 mae=0.500000
> 2018-07-06 00:56:43,179 Epoch[0] Batch [2700] Speed: 20.93 samples/sec accuracy=0.850000 rmse=0.624896 mae=0.500000
> 2018-07-06 00:56:47,975 Epoch[0] Batch [2800] Speed: 20.85 samples/sec accuracy=0.830000 rmse=0.631906 mae=0.500000
> 2018-07-06 00:56:52,744 Epoch[0] Batch [2900] Speed: 20.97 samples/sec accuracy=0.890000 rmse=0.634412 mae=0.500000
> 2018-07-06 00:56:57,510 Epoch[0] Batch [3000] Speed: 20.98 samples/sec accuracy=0.810000 rmse=0.628204 mae=0.500000
> 2018-07-06 00:57:02,293 Epoch[0] Batch [3100] Speed: 20.91 samples/sec accuracy=0.900000 rmse=0.648476 mae=0.500000
> 2018-07-06 00:57:07,066 Epoch[0] Batch [3200] Speed: 20.95 samples/sec accuracy=0.920000 rmse=0.642261 mae=0.500000

This result looks very strange to me.

1) mae is always the same.
2) accuracy is almost optimal (close to 1).
3) rmse is worse than random guessing.

How can these 3 things happen at the same time?
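
One possible explanation, offered as a sketch rather than a confirmed diagnosis: if the regression metrics compare the 1-D class label of shape `(N,)` against the raw two-column softmax output of shape `(N, 2)`, the label is broadcast across both columns. Because the two probabilities sum to 1, the mean absolute error is then 0.5 for every sample, and the RMSE grows as predictions become more confident. The NumPy code below only mimics that assumed behaviour; the helper names `mae_broadcast`, `rmse_broadcast`, and `accuracy` are hypothetical and are not MXNet APIs, and the sample data is made up for illustration.

```python
import numpy as np

def mae_broadcast(labels, probs):
    # Assumption: the (N,) label is reshaped to (N, 1) and broadcast against (N, 2).
    return np.abs(labels.reshape(-1, 1) - probs).mean()

def rmse_broadcast(labels, probs):
    # Same broadcasting assumption as above, with squared error.
    return np.sqrt(((labels.reshape(-1, 1) - probs) ** 2).mean())

def accuracy(labels, probs):
    # Fraction of rows whose most probable class matches the label.
    return (probs.argmax(axis=1) == labels).mean()

# Made-up data: four samples, two classes.
labels = np.array([0, 1, 1, 0])
unsure = np.full((4, 2), 0.5)              # uninformative predictions
confident = np.array([[0.90, 0.10],
                      [0.10, 0.90],
                      [0.20, 0.80],
                      [0.95, 0.05]])       # every sample classified correctly

for name, probs in [("unsure", unsure), ("confident", confident)]:
    print(name,
          "accuracy=%.3f" % accuracy(labels, probs),
          "rmse=%.3f" % rmse_broadcast(labels, probs),
          "mae=%.3f" % mae_broadcast(labels, probs))
```

Under this assumption, comparing the label against only the probability of class 1 (for example `probs[:, 1]`) would give regression metrics that move in the same direction as accuracy.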