Dear all,

I have come across some surprising issues when using Theano under Keras and 
scikit-learn to train an MLP on the Pima Indians dataset, related to the 
reproducibility of the performance. I will try to explain the issue and my 
guesses about it in two hypotheses: (1) training the MLP with a fixed 
configuration and (2) training the MLP with a grid search. I attach a 
runnable file for each case.

*Hypothesis 1: Training of the MLP with Manual and Automatic Cross 
Validation*

Hyp1.py compares two MLPs trained on the same dataset and the same 
train/test splits, following a 10-fold cross-validation strategy. I don't 
know why, but it seems that numpy's reproducibility configuration does not 
remain constant from one training to the next. If you execute the file just 
as it is, you will observe different scores when they should be the same: 
the first MLP scores an average of 75.25% while the second scores 74.75%. 
However, if you uncomment the numpy.random.seed(seed) command (line 55), 
you will observe the same score both times: 75.25%.

So, does this mean that the seed for numpy's random generation, or the 
train and test splits, does not remain constant after the first training?
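
My suspicion can be illustrated with plain numpy, leaving Keras aside (the 
uniform draws below are just a stand-in for weight initialization, not the 
actual code from Hyp1.py):

```python
import numpy

# Seed once at the top of the script, as in Hyp1.py.
numpy.random.seed(7)

# First "training": weight initialization consumes draws from the global RNG.
first_init = numpy.random.uniform(size=3)

# Second "training": the global state has advanced, so the draws differ.
second_init = numpy.random.uniform(size=3)

print(numpy.allclose(first_init, second_init))   # False: state was consumed

# Reseeding before the second run (uncommenting line 55) restores the state.
numpy.random.seed(7)
reseeded_init = numpy.random.uniform(size=3)
print(numpy.allclose(first_init, reseeded_init)) # True: identical draws
```

If this stand-in matches what Keras does internally, the different scores 
are expected: the second training simply starts from a different generator 
state.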

*Hypothesis 2: Training of the MLP with Automatic Cross Validation and Grid 
Search*

If you open the Hyp2.py file you will see a grid search over an MLP 
architecture on the same train/test splits, again following a 10-fold 
cross-validation strategy. You can also see that the code is written to 
sweep over more hyperparameters but, for speed, I only sweep the batch 
size. Again, the same reproducibility problem occurs. If you execute the 
file just as it is, only one model is trained, with the same configuration 
as in hypothesis 1, so the same result is expected and obtained: 75.25%.

Best: 75.26% using {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch': 
150, 'batch_size': 10}
---------------------------
75.25% (3.40%) with: {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch': 
150, 'batch_size': 10}

But if we now add another batch size to the grid (5, for instance), a 
different score is obtained for the previous configuration. That makes 
absolutely no sense to me.

Best: 74.74% using {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch': 
150, 'batch_size': 10}
---------------------------
74.34% (3.68%) with: {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch': 
150, 'batch_size': 5}

74.75% (3.95%) with: {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch': 
150, 'batch_size': 10}  
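
If my guess above is right, the score of a configuration depends on its 
position in the grid, because earlier fits consume draws from the global 
generator. A minimal numpy-only sketch (fake_score is a hypothetical 
stand-in for one grid-search fit, not the Keras code):

```python
import numpy

def fake_score(batch_size):
    # Stand-in for one grid-search fit: "weight init" consumes global RNG draws.
    weights = numpy.random.uniform(size=4)
    return weights.sum() / batch_size  # hypothetical "score"

# Grid containing batch_size=10 only.
numpy.random.seed(7)
score_alone = fake_score(10)

# Grid where batch_size=5 is evaluated first: the generator state seen by
# the batch_size=10 fit has advanced, so its score changes.
numpy.random.seed(7)
fake_score(5)
score_after_5 = fake_score(10)

print(score_alone == score_after_5)  # False: same config, different RNG state
```

This would explain why 75.25% becomes 74.75% for the identical 
configuration once batch_size=5 is added to the grid.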

I have found one solution. Although the score for a given configuration is 
not the same as before, at least the scores are now reproducible. The 
solution is to add a numpy.random.seed(seed) command to the create_model 
function (line 18), so that the random generation of numpy is reset for 
each iteration of the grid search.
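
The effect of this fix can again be sketched with plain numpy: reseeding 
inside the factory means every build starts from the same generator state, 
regardless of what ran before it (create_weights is a hypothetical 
stand-in for create_model):

```python
import numpy

def create_weights():
    # Reseeding inside the factory (as in create_model, line 18) resets the
    # global RNG before every build, so each model starts from the same state.
    numpy.random.seed(7)
    return numpy.random.uniform(size=4)

w1 = create_weights()
numpy.random.uniform(size=100)  # arbitrary intervening RNG consumption
w2 = create_weights()
print(numpy.allclose(w1, w2))   # True: every build is identical
```

This also explains why the score itself changes (74.22% instead of 
75.25%): the state at initialization time is now different from before, 
but it is the same on every run.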

If you execute it with just one batch size (10), the result is 74.22%.

Best: 74.22% using {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch': 
150, 'batch_size': 10}
---------------------------
74.22% (3.13%) with: {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch': 
150, 'batch_size': 10}

If we now add another batch size to the grid (5, as before), the same score 
is obtained for the previous configuration. That makes sense to me.

Best: 77.21% using {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch': 
150, 'batch_size': 5}
---------------------------
77.21% (3.97%) with: {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch': 
150, 'batch_size': 5}
74.22% (3.13%) with: {'init': 'uniform', 'optimizer': 'adam', 'nb_epoch': 
150, 'batch_size': 10}

The topic description is perhaps quite long, but I preferred to detail the 
results and my concerns about them. Please feel free to share any doubt, 
suggestion or comment you may have.

Best regards, 

You received this message because you are subscribed to the Google Groups 
"theano-users" group.
'''
Created on 1 de ago. de 2016

@author: bheriz
'''
import pandas
import numpy
from sklearn.cross_validation import StratifiedKFold
from sklearn.cross_validation import cross_val_score
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier

# Function to create model, required for KerasClassifier.
# Fully Connected MLP with 8-12-8-1 architecture. Hidden Layers with Rectifier activation function 
def create_model():
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=8, init='uniform', activation='relu'))
    model.add(Dense(8, init='uniform', activation='relu'))
    model.add(Dense(1, init='uniform', activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# Load data (Pima Indians Onset of Diabetes dataset) and set display precision
url = "https://goo.gl/vhm1eU"
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']

# Split array into input (X) and output (Y) variables
dataframe = pandas.read_csv(url, names=names)
pandas.set_option('display.width', 100)
pandas.set_option('precision', 3)
array = dataframe.values
X = array[:,0:8]
Y = array[:,8]

# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)

# define 10-fold cross validation test harness
kfold = StratifiedKFold(y=Y, n_folds=10, shuffle=True, random_state=seed)

# 1. Automatic using KerasClassifier
#-----------------------------------------------
model1 = KerasClassifier(build_fn=create_model, nb_epoch=150, batch_size=10, verbose=0) # create model
results = cross_val_score(model1, X, Y, cv=kfold)
print("model1 %.2f%%" %(results.mean()*100))

# 2. Manual K-fold Cross Validation
# Note: unless the seed is reset here (uncomment below), different performance metrics are obtained
#------------------------------------------------------------------------
# fix random seed for reproducibility
#numpy.random.seed(seed)

cvscores = []
for train, test in kfold:
    # create model
    model2 = create_model()
    # Fit/Train the model
    model2.fit(X[train], Y[train], nb_epoch=150, batch_size=10, verbose=0)
    # evaluate the model
    scores2 = model2.evaluate(X[test], Y[test], verbose=0)
    #print(" model2 %s: %.2f%%" % (model2.metrics_names[1], scores2[1]*100))
    cvscores.append(scores2[1] * 100)
print(" model2 (same result expected) %.2f%% (+/- %.2f%%)" % (numpy.mean(cvscores), numpy.std(cvscores)))
'''
Created on 1 de ago. de 2016

@author: bheriz
'''
import pandas
import numpy
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.cross_validation import StratifiedKFold
from sklearn.grid_search import GridSearchCV

# Function to create model, required for KerasClassifier.
# Fully Connected MLP with 8-12-8-1 architecture. Hidden Layers with Rectifier activation function 
def create_model(optimizer='rmsprop', init='glorot_uniform'):
    #numpy.random.seed(7)
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=8, init=init, activation='relu'))
    model.add(Dense(8, init=init, activation='relu'))
    model.add(Dense(1, init=init, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

# Load data (Pima Indians Onset of Diabetes dataset) and set display precision
url = "https://goo.gl/vhm1eU"
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']

# Split array into input (X) and output (Y) variables
dataframe = pandas.read_csv(url, names=names)
pandas.set_option('display.width', 100)
pandas.set_option('precision', 3)
array = dataframe.values
X = array[:,0:8]
Y = array[:,8]

# grid search epochs, batch size and optimizer
"""
optimizers = ['rmsprop', 'adam']
init = ['glorot_uniform', 'normal', 'uniform']
epochs = numpy.array([50, 100, 150])
batches = numpy.array([5, 10, 20])
"""
optimizers = ['adam']
init = ['uniform']
epochs = numpy.array([150])
batches = numpy.array([10])

param_grid = dict(optimizer=optimizers, nb_epoch=epochs, batch_size=batches, init=init)

# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# define 10-fold cross validation test harness
kfold = StratifiedKFold(y=Y, n_folds=10, shuffle=True, random_state=seed)

# create and train model
model = KerasClassifier(build_fn=create_model, verbose=0)
grid = GridSearchCV(estimator=model, param_grid=param_grid, cv=kfold)
grid_result = grid.fit(X, Y)

# summarize results
print("Best: %.2f%% using %s" % (100*grid_result.best_score_, grid_result.best_params_))
print("---------------------------")
for params, mean_score, scores in grid_result.grid_scores_:
    print("%.2f%% (%.2f%%) with: %r" % (100*scores.mean(), 100*scores.std(), params))
