Hi Thomas,
besides the information from Sebastian, your dataset seems to be quite imbalanced (48 positive and 1230 negative observations). You could try rebalancing your data using https://github.com/scikit-learn-contrib/imbalanced-learn . This package offers several methods for resampling your data (under-sampling the majority class, over-sampling the minority class, etc.).

Greets,
Piotr

On 08.12.2016 01:19, Sebastian Raschka wrote:

Hi, Thomas,

we had a related thread on the email list some time ago; let me post it for reference further below. Regarding your question, I think you may want to make sure that you standardized the features (which generally makes the learning less sensitive to the learning rate and the random weight initialization). However, even then, I would try at least 1-3 different random seeds and look at the cost vs. time: what can happen is that you land in different minima depending on the weight initialization, as demonstrated in the example below (in MLPs you have the problem of a complex cost surface).

Best,
Sebastian

The default is set to 100 units in the hidden layer, but theoretically it should work with 2 hidden logistic units (I think that's the typical textbook/class example). I think what happens is that it gets stuck in a local minimum depending on the random weight initialization.
E.g., the following works just fine:

    from sklearn.neural_network import MLPClassifier

    X = [[0, 0], [0, 1], [1, 0], [1, 1]]
    y = [0, 1, 1, 0]

    clf = MLPClassifier(solver='lbfgs', activation='logistic',
                        alpha=0.0, hidden_layer_sizes=(2,),
                        learning_rate_init=0.1, max_iter=1000,
                        random_state=20)
    clf.fit(X, y)
    res = clf.predict([[0, 0], [0, 1], [1, 0], [1, 1]])
    print(res)
    print(clf.loss_)

but changing the random seed to 1 leads to:

    [0 1 1 1]
    0.34660921283

For comparison, I used a more vanilla MLP (1 hidden layer with 2 units and logistic activation as well; https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch12/ch12.ipynb), essentially resulting in the same problem:

[two inline images omitted]

On Dec 7, 2016, at 6:45 PM, Thomas Evangelidis <[email protected]> wrote:

I tried the sklearn.neural_network.MLPClassifier with the default parameters, using the input data I quoted in my previous post about the Nu-Support Vector Classifier. The predictions are great, but the problem is that sometimes when I rerun the MLPClassifier it predicts no positive observations (class 1). I have noticed that this can be controlled by the random_state parameter; e.g. MLPClassifier(random_state=0) always gives no positive predictions. My question is: how can I choose the right random_state value in a real blind test case?
thanks in advance

Thomas

--
======================================================================
Thomas Evangelidis
Research Specialist
CEITEC - Central European Institute of Technology
Masaryk University
Kamenice 5/A35/1S081, 62500 Brno, Czech Republic

email: [email protected]
       [email protected]

website: https://sites.google.com/site/thomasevangelidishomepage/

_______________________________________________
scikit-learn mailing list
[email protected]
https://mail.python.org/mailman/listinfo/scikit-learn
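Piotr's resampling suggestion can also be sketched without the package. In recent versions, imbalanced-learn exposes samplers such as RandomUnderSampler with a fit_resample method; the underlying idea of random under-sampling is just this (a minimal NumPy sketch on toy data with the same 48/1230 class counts as Thomas's set, not his actual features):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data with the same imbalance as in the thread:
# 48 positive and 1230 negative observations, 2 features each.
X = rng.normal(size=(1278, 2))
y = np.array([1] * 48 + [0] * 1230)

# Random under-sampling of the majority class: keep all positives,
# and draw (without replacement) as many negatives as there are positives.
pos_idx = np.flatnonzero(y == 1)
neg_idx = np.flatnonzero(y == 0)
neg_keep = rng.choice(neg_idx, size=len(pos_idx), replace=False)
keep = np.concatenate([pos_idx, neg_keep])

X_bal, y_bal = X[keep], y[keep]
print(X_bal.shape, np.bincount(y_bal))  # balanced: 48 of each class
```

Over-sampling the minority class is the mirror image (draw positives with replacement until the counts match); the imbalanced-learn package adds smarter variants such as SMOTE on top of these basics.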
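Sebastian's suggestion above (re-fit with several different random seeds and compare the final cost) can be sketched as follows, reusing his XOR toy problem. Note that picking the seed with the lowest training loss is only a heuristic; for Thomas's blind-test question, the comparison should really be made on a held-out validation set rather than on training loss:

```python
from sklearn.neural_network import MLPClassifier

# XOR toy problem from the example above.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

best_clf, best_loss = None, float("inf")
for seed in range(5):  # a handful of different weight initializations
    clf = MLPClassifier(solver="lbfgs", activation="logistic",
                        alpha=0.0, hidden_layer_sizes=(2,),
                        max_iter=1000, random_state=seed)
    clf.fit(X, y)
    # clf.loss_ is the final training loss; keep the best run.
    if clf.loss_ < best_loss:
        best_clf, best_loss = clf, clf.loss_

print("best loss:", best_loss)
print("predictions:", best_clf.predict(X))
```

Runs that land in a bad local minimum (like random_state=1 above) end with a clearly higher loss, so comparing clf.loss_ across seeds makes the initialization sensitivity visible rather than hiding it behind a single arbitrary random_state.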
