Dear all,
I have come across some surprising issues when using Theano under Keras and
scikit-learn to train an MLP network on the Pima Indians dataset. They are
related to the reproducibility of the performance. I will try to explain the
issue and my guesses about it through two hypotheses: (1) training the MLP
with a fixed configuration and (2) training the MLP with a grid search. I
attach a runnable file for each case.
*Hypothesis 1: Training the MLP with Manual and Automatic Cross-Validation*
In Hyp1.py, two MLPs are trained on the same dataset and the same train/test
splits, following a 10-fold cross-validation strategy. I don't know why, but
it seems that numpy's reproducibility configuration does not remain constant
from one training to the next. If you execute the file just as it is, you
will observe different scores when they should be identical: the first MLP
scores an average of 75.25% while the second scores 74.75%. However, if you
uncomment the numpy.random.seed(seed) command (line 55), both trainings give
the same score: 75.25%.
So, does this mean that the seed for numpy's random generation, or the train
and test splits, do not remain constant after the first training?
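To illustrate what I suspect is happening (just my guess, with
draw_init_weights as a hypothetical stand-in for Keras drawing its initial
weights from numpy's global generator):

import numpy

def draw_init_weights():
    # hypothetical stand-in for Keras drawing initial weights from numpy's global RNG
    return numpy.random.uniform(-0.05, 0.05, size=5)

numpy.random.seed(7)
first = draw_init_weights()     # "first training"
second = draw_init_weights()    # "second training" starts from a shifted RNG state
print(numpy.allclose(first, second))   # False -> different scores

numpy.random.seed(7)
first = draw_init_weights()
numpy.random.seed(7)            # re-seed between the trainings (line 55 in Hyp1.py)
second = draw_init_weights()
print(numpy.allclose(first, second))   # True -> identical scores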
*Hypothesis 2: Training the MLP with Automatic Cross-Validation and Grid
Search*
If you open Hyp2.py you will see a grid search over an MLP architecture on
the same train/test splits, again following a 10-fold cross-validation
strategy. The code is prepared to sweep over more hyperparameters but, for
speed, I only sweep the batch size. Again, the same reproducibility problem
occurs. If you execute the file just as it is, a single model is trained
with the same configuration as in hypothesis 1, so the same result is
expected and obtained: 75.25%.
Best: 75.26% using {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch':
150, 'batch_size': 10}
---------------------------
75.25% (3.40%) with: {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch':
150, 'batch_size': 10}
But if we now add another batch value to the grid (5, for instance), a
different score is obtained for the previous configuration, which makes no
sense to me.
Best: 74.74% using {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch':
150, 'batch_size': 10}
---------------------------
74.34% (3.68%) with: {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch':
150, 'batch_size': 5}
74.75% (3.95%) with: {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch':
150, 'batch_size': 10}
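My guess is that all the grid points share numpy's single global random
stream, so the only change between the two runs, sketched here from Hyp2.py,
is enough to shift the state from which the batch_size=10 models are built:

# first run: a single grid point
batches = numpy.array([10])

# second run: the extra batch_size=5 configurations are evaluated first and
# (in my guess) consume draws from numpy's global RNG, so the batch_size=10
# models start from different initial weights and score 74.75% instead of 75.25%
batches = numpy.array([5, 10])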
I have found one workaround. Although the score for a given configuration is
not the same as before, at least the scores are now reproducible. The fix is
to add a numpy.random.seed(seed) command to the create_model function
(line 18), so that numpy's random generation is reset for each iteration of
the grid search (sketched below).
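Concretely, the change amounts to enabling the seed call at the top of
create_model in Hyp2.py (the line is left commented out in the attached
file):

def create_model(optimizer='rmsprop', init='glorot_uniform'):
    # reset numpy's global RNG so that every model built during the grid
    # search starts from exactly the same state
    numpy.random.seed(7)
    model = Sequential()
    model.add(Dense(12, input_dim=8, init=init, activation='relu'))
    model.add(Dense(8, init=init, activation='relu'))
    model.add(Dense(1, init=init, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model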
If you execute it with just one batch value (10), the result is 74.22%.
Best: 74.22% using {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch':
150, 'batch_size': 10}
---------------------------
74.22% (3.13%) with: {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch':
150, 'batch_size': 10}
If we now add another batch value to the grid (5, as before), the same score
is obtained for the previous configuration, which does make sense to me.
Best: 77.21% using {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch':
150, 'batch_size': 5}
---------------------------
77.21% (3.97%) with: {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch':
150, 'batch_size': 5}
74.22% (3.13%) with: {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch':
150, 'batch_size': 10}
This description is perhaps quite long, but I preferred to detail the
results and my concerns. Please feel free to share any doubts, suggestions
or comments you may have.
Best regards,
Hyp1.py:
'''
Created on 1 Aug 2016
@author: bheriz
'''
import pandas
import numpy
from sklearn.cross_validation import StratifiedKFold
from sklearn.cross_validation import cross_val_score
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
# Function to create model, required for KerasClassifier.
# Fully Connected MLP with 8-12-8-1 architecture. Hidden Layers with Rectifier activation function
def create_model():
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=8, init='uniform', activation='relu'))
    model.add(Dense(8, init='uniform', activation='relu'))
    model.add(Dense(1, init='uniform', activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
# Load data (Pima Indians Onset of Diabetes dataset) and set pandas display precision
url = "https://goo.gl/vhm1eU"
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
dataframe = pandas.read_csv(url, names=names)
pandas.set_option('display.width', 100)
pandas.set_option('precision', 3)
# Split array into input (X) and output (Y) variables
array = dataframe.values
X = array[:,0:8]
Y = array[:,8]
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# define 10-fold cross validation test harness
kfold = StratifiedKFold(y=Y, n_folds=10, shuffle=True, random_state=seed)
# 1. Automatic using KerasClassifier
#-----------------------------------------------
model1 = KerasClassifier(build_fn=create_model, nb_epoch=150, batch_size=10, verbose=0) # create model
results = cross_val_score(model1, X, Y, cv=kfold)
print("model1 %.2f%%" %(results.mean()*100))
# 2. Manual K-fold Cross Validation
# Note: If the same seed from Test 1 is used, different performance metrics are obtained
#------------------------------------------------------------------------
# fix random seed for reproducibility
#numpy.random.seed(seed)
cvscores = []
for i, (train, test) in enumerate(kfold):
    # create model
    model2 = create_model()
    # Fit/Train the model
    model2.fit(X[train], Y[train], nb_epoch=150, batch_size=10, verbose=0)
    # evaluate the model
    scores2 = model2.evaluate(X[test], Y[test], verbose=0)
    #print(" model2 %s: %.2f%%" % (model2.metrics_names[1], scores2[1]*100))
    cvscores.append(scores2[1] * 100)
print(" model2 (same result expected) %.2f%% (+/- %.2f%%)" % (numpy.mean(cvscores), numpy.std(cvscores)))
Hyp2.py:
'''
Created on 1 Aug 2016
@author: bheriz
'''
import pandas
import numpy
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.cross_validation import StratifiedKFold
from sklearn.grid_search import GridSearchCV
# Function to create model, required for KerasClassifier.
# Fully Connected MLP with 8-12-8-1 architecture. Hidden Layers with Rectifier activation function
def create_model(optimizer='rmsprop', init='glorot_uniform'):
    #numpy.random.seed(7)
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=8, init=init, activation='relu'))
    model.add(Dense(8, init=init, activation='relu'))
    model.add(Dense(1, init=init, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model
# Load data (Pima Indians Onset of Diabetes dataset) and set pandas display precision
url = "https://goo.gl/vhm1eU"
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
dataframe = pandas.read_csv(url, names=names)
pandas.set_option('display.width', 100)
pandas.set_option('precision', 3)
# Split array into input (X) and output (Y) variables
array = dataframe.values
X = array[:,0:8]
Y = array[:,8]
# grid search epochs, batch size and optimizer
"""
optimizers = ['rmsprop', 'adam']
init = ['glorot_uniform', 'normal', 'uniform']
epochs = numpy.array([50, 100, 150])
batches = numpy.array([5, 10, 20])
"""
optimizers = ['adam']
init = ['uniform']
epochs = numpy.array([150])
batches = numpy.array([10])
param_grid = dict(optimizer=optimizers, nb_epoch=epochs, batch_size=batches, init=init)
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# define 10-fold cross validation test harness
kfold = StratifiedKFold(y=Y, n_folds=10, shuffle=True, random_state=seed)
# create and train model
model = KerasClassifier(build_fn=create_model, verbose=0)
grid = GridSearchCV(estimator=model, param_grid=param_grid,cv=kfold)
grid_result = grid.fit(X, Y)
# summarize results
print("Best: %.2f%% using %s" % (100*grid_result.best_score_, grid_result.best_params_))
print("---------------------------")
for params, mean_score, scores in grid_result.grid_scores_:
    print("%.2f%% (%.2f%%) with: %r" % (100*scores.mean(), 100*scores.std(), params))