absalama opened a new issue #11393: Validation Accuracy is higher than training accuracy.
URL: https://github.com/apache/incubator-mxnet/issues/11393

I am training AlexNet on ImageNet (1k). I used the im2rec tool to split off 5% of the training data for use in the validation phase. The result is two sets of record files (I use chunks): one set for training and one set for validation. The log shows the following:

```
top_k_accuracy_5=0.159258 cross-entropy=5.465672
INFO:root:Epoch[0] Batch [9400] Speed: 395.94 samples/sec accuracy=0.052461 top_k_accuracy_5=0.160195 cross-entropy=5.454822
INFO:root:Epoch[0] Train-accuracy=0.056324
INFO:root:Epoch[0] Train-top_k_accuracy_5=0.165848
INFO:root:Epoch[0] Train-cross-entropy=5.416635
INFO:root:Epoch[0] Time cost=3079.019
INFO:root:Saved checkpoint to "mxnet_alexnet_single_gpu_all_data_set_256-0001.params"
INFO:root:Epoch[0] Validation-accuracy=0.078869
INFO:root:Epoch[0] Validation-top_k_accuracy_5=0.216859
INFO:root:Epoch[0] Validation-cross-entropy=5.142231
```

The validation accuracy here is higher than the training accuracy, and the gap increases with further epochs (at the time of writing this issue, at epoch 10, the validation accuracy is higher than the training accuracy by around 7%).

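
To follow the gap from epoch to epoch, the per-epoch values can be pulled out of a log in the format shown above. This is only a minimal sketch: the function name `accuracy_gap` and the regular expression are illustrative assumptions based on the excerpt, not part of MXNet.

```python
import re

# Matches lines such as "INFO:root:Epoch[0] Train-accuracy=0.056324".
# The pattern is an assumption derived from the log excerpt; \W+ tolerates
# whitespace or stray markup characters between the epoch tag and the metric.
_PATTERN = re.compile(r"Epoch\[(\d+)\]\W+(Train|Validation)-accuracy=([0-9.]+)")


def accuracy_gap(log_text):
    """Return {epoch: validation_accuracy - train_accuracy} for epochs that have both."""
    train, val = {}, {}
    for epoch, phase, value in _PATTERN.findall(log_text):
        target = train if phase == "Train" else val
        target[int(epoch)] = float(value)
    return {e: val[e] - train[e] for e in sorted(train) if e in val}


# Values copied from the epoch 0 lines of the log above.
sample = (
    "INFO:root:Epoch[0] Train-accuracy=0.056324\n"
    "INFO:root:Epoch[0] Train-top_k_accuracy_5=0.165848\n"
    "INFO:root:Epoch[0] Validation-accuracy=0.078869\n"
)
gaps = accuracy_gap(sample)
for epoch, gap in gaps.items():
    print("epoch %d: validation - train = %.6f" % (epoch, gap))
```

The top-5 lines are skipped on purpose, because the pattern only accepts `-accuracy=` directly after `Train` or `Validation`.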
**The commands used for data preprocessing:**

`python3 im2rec.py --list --recursive --chunks 1024 --train-ratio 0.95 ${IMAGENET_ROOT}/record_io_all_raw_data/metadata-train256/imagenet1k ${IMAGENET_EXTRACTED}/train`

`python3 im2rec.py --resize 256 --quality 95 --num-thread 16 ${IMAGENET_ROOT}/record_io_all_raw_data/metadata-train256/imagenet1k ${IMAGENET_EXTRACTED}/train`

`python3 im2rec.py --resize 256 --quality 95 --num-thread 16 ${IMAGENET_ROOT}/record_io_all_raw_data/metadata-val256/imagenet1k ${IMAGENET_EXTRACTED}/train`

**The arguments used for training:**

```
Namespace(batch_size=128, benchmark=0, data_nthreads=4, data_train='/work/projects/Project00755/datasets/imagenet/record_io_all_raw_data/train256/', data_train_idx='', data_val='/work/projects/Project00755/datasets/imagenet/record_io_all_raw_data/val256/', data_val_idx='', disp_batches=200, dtype='float32', gc_threshold=0.5, gc_type='none', gpus='0', image_shape='3,227,227', initializer='default', kv_store='device', load_epoch=None, loss='ce', lr=0.01, lr_factor=0.1, lr_step_epochs='30,60', macrobatch_size=0, max_random_aspect_ratio=0.25, max_random_h=36, max_random_l=50, max_random_rotate_angle=10, max_random_s=50, max_random_scale=1, max_random_shear_ratio=0.1, min_random_scale=1, model_prefix='mxnet_alexnet_single_gpu_all_data_set_256', mom=0.9, monitor=0, network='alexnet', num_classes=1000, num_epochs=80, num_examples=1216718, num_layers=8, optimizer='sgd', pad_size=0, random_crop=1, random_mirror=1, rgb_mean='123.68,116.779,103.939', save_period=1, test_io=0, top_k=5, warmup_epochs=5, warmup_strategy='linear', wd=0.0005)
```

Any help will be appreciated. Thanks
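
Because the validation records are carved out of the training images, it may be worth confirming that the two lists do not overlap and that the split is close to 5%. The sketch below assumes that the `.lst` files written by `im2rec.py --list` are tab-separated with the image path in the last column, and that the chunked lists end in `_train.lst` and `_val.lst`. The helper names, glob patterns, and demo file contents are hypothetical.

```python
import glob
import os
import tempfile


def read_paths(pattern):
    """Collect image paths (last tab-separated column) from every list file matching pattern."""
    paths = set()
    for name in glob.glob(pattern):
        with open(name) as f:
            for line in f:
                fields = line.rstrip("\n").split("\t")
                if len(fields) >= 3:  # assumed layout: index, label(s), path
                    paths.add(fields[-1])
    return paths


def split_report(train_pattern, val_pattern):
    """Summarize list sizes, overlap, and the validation fraction."""
    train = read_paths(train_pattern)
    val = read_paths(val_pattern)
    total = len(train) + len(val)
    return {
        "train": len(train),
        "val": len(val),
        "overlap": len(train & val),
        "val_fraction": len(val) / total if total else 0.0,
    }


# Demo on tiny made-up lists; real usage would point the patterns at the
# directory holding the generated list files.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "demo_0_train.lst"), "w") as f:
        f.write("0\t0.000000\tn01/a.JPEG\n1\t1.000000\tn02/b.JPEG\n2\t1.000000\tn02/c.JPEG\n")
    with open(os.path.join(d, "demo_0_val.lst"), "w") as f:
        f.write("3\t0.000000\tn01/d.JPEG\n")
    report = split_report(os.path.join(d, "*_train.lst"), os.path.join(d, "*_val.lst"))

print(report)
```

A nonzero `overlap` would mean some images appear in both sets, which would make the validation numbers unreliable. An overlap of zero does not by itself explain the gap.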
