[theano-users] why the validation error does not converge to zero while training error is almost reduced to zero?

luca . wagner . 0812 Mon, 01 Aug 2016 02:54:47 -0700

Hi all,
I'm trying to train a 3D convnet using only half of the images, because the
graphics card (an Nvidia Tesla K40) does not have enough memory for all of them.
The convnet has to classify two different types of images: I use 574
images for training and 102 images for validation.


The training cost starts at 0.70964 and after 500 epochs (about 3.3
days) has converged almost to zero, with a value of 0.06942, while the
validation error starts at 45.098 % and after 500 epochs has only
decreased asymptotically to 25.225 %.
If I test the convnet on simple, small 3D objects, the training cost and
validation error both converge to zero after a while.

Training cost is defined as: 
-T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])
Validation error is defined as: T.mean(T.neq(self.y_pred, y))
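
For readers who want to check the two definitions outside Theano, here is a minimal NumPy sketch of the same quantities. It assumes that p_y_given_x is an (n_samples, n_classes) array of softmax probabilities and that y_pred is its row-wise argmax; the function names and the sample values are hypothetical and only for illustration, they are not taken from run_multi_conv.

```python
import numpy as np


def negative_log_likelihood(p_y_given_x, y):
    """Mean negative log-probability assigned to the correct class."""
    # Pick, for every row i, the log-probability of its true label y[i].
    return -np.mean(np.log(p_y_given_x)[np.arange(y.shape[0]), y])


def error_rate(p_y_given_x, y):
    """Fraction of samples whose most probable class differs from the label."""
    y_pred = np.argmax(p_y_given_x, axis=1)  # assumed definition of y_pred
    return np.mean(y_pred != y)


# Hypothetical softmax outputs for 4 samples and 2 classes (made-up values).
p = np.array([[0.9, 0.1],
              [0.2, 0.8],
              [0.6, 0.4],
              [0.3, 0.7]])
y = np.array([0, 1, 1, 1])

cost = negative_log_likelihood(p, y)
err = error_rate(p, y)
print("cost = %.5f, error = %.3f %%" % (cost, 100.0 * err))
```

The cost is a continuous average of log-probabilities, while the error only counts wrong argmax decisions, so the two are measured on different scales. With two classes, a uniform prediction of 0.5 per class gives a cost of ln 2 (about 0.693), which is close to the starting cost of 0.70964 in the log below.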

I thank you very much for your help.

Python 2.7.11 |Anaconda 4.0.0 (64-bit)| (default, Dec  6 2015, 18:08:32) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> import run_multi_conv
Using gpu device 0: Tesla K40c
>>> run_multi_conv.run_experiments()


start time:
28/07/2016
14:18:42


images for training: 574
images for validation: 102
epochs: 500


... training neural network 27


training @ iter =  0
training @ iter =  200
training @ iter =  400


training cost 0.70964
epoch 1, training batch 574/574,validation error 45.098 %
training @ iter =  600
training @ iter =  800
training @ iter =  1000


training cost 0.70255
epoch 2, training batch 574/574,validation error 45.098 %
training @ iter =  1200
training @ iter =  1400
training @ iter =  1600


-----------------

... training neural network 27


training cost 0.06980
epoch 496, training batch 574/574,validation error 25.237 %
training @ iter =  284800
training @ iter =  285000
training @ iter =  285200


training cost 0.06968
epoch 497, training batch 574/574,validation error 25.234 %
training @ iter =  285400
training @ iter =  285600
training @ iter =  285800


training cost 0.06955
epoch 498, training batch 574/574,validation error 25.232 %
training @ iter =  286000
training @ iter =  286200
training @ iter =  286400


training cost 0.06942
epoch 499, training batch 574/574,validation error 25.231 %
training @ iter =  286600
training @ iter =  286800


... training neural network 27


training cost 0.06930
epoch 500, training batch 574/574,validation error 25.225 %


Best validation error of 25.23 % obtained at iteration 287000,


The neural network for file mpr_convnet_class.so ran for 4859.41m
----------






Graphics card used: Tesla K40

+------------------------------------------------------+
| NVIDIA-SMI 352.93     Driver Version: 352.93         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K40c          Off  | 0000:04:00.0     Off |                    0 |
| 26%   53C    P0    65W / 235W |   7326MiB / 11519MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      4526    C   python                                        7301MiB |
+-----------------------------------------------------------------------------+



