> many of them need the number of outliers and a distance as input parameters in advance; is there a more intelligent algorithm?
With ‘intelligently’ do you mean ‘more automatic’ (fewer hyperparameters to define manually)? In my opinion, “outlier” is a highly context-specific definition, so it is really up to you to decide what to count as an outlier for your application. E.g., a simple non-parametric approach would be to say that a point P is an outlier if P > Q3 + 1.5 * IQR or P < Q1 - 1.5 * IQR, where Q1 and Q3 are the first and third quartiles of the dataset, respectively, and IQR is the interquartile range (Q3 - Q1). Similarly, you could use thresholds based on the variance or standard deviation, etc., so that you don't need to specify the number of outliers if that's not what you want.
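Something along those lines should do it — a rough, untested sketch (assuming NumPy is available), applied to the example list from the quoted question below:

```py
import numpy as np

# Example data from the quoted question below; 999 is the obvious outlier.
x = np.array([2, 3, 2, 4, 2, 3, 1, 2, 3, 1, 2, 999, 2, 3, 2, 1, 2, 3])

# First and third quartiles and the interquartile range.
q1, q3 = np.percentile(x, [25, 75])
iqr = q3 - q1

# Flag anything outside the [Q1 - 1.5*IQR, Q3 + 1.5*IQR] fence.
outliers = x[(x < q1 - 1.5 * iqr) | (x > q3 + 1.5 * iqr)]
print(outliers)  # [999]
```

No outlier count or contamination fraction has to be given up front; only the conventional 1.5 factor is a choice you make.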
> On Nov 25, 2016, at 6:38 AM, lin...@ruijie.com.cn wrote:
>
> Hello everyone,
> I used IsolationForest to pick out outliers today, and I noticed that there is a
> 'contamination' parameter in the IsolationForest function whose default value is 0.1 = 10%.
> So, is there a way to pick the outliers without specifying the proportion of
> outliers in the data set?
> For example, in the dataset [2,3,2,4,2,3,1,2,3,1,2, 999, 2,3,2,1,2,3], we can
> easily pick '999' as an outlier entry just by looking at the set.
> I also read some papers about outlier detection recently; many of them need the
> number of outliers and a distance as input parameters in advance. Is there a more
> intelligent algorithm?
>
>
> -----Original Message-----
> From: scikit-learn
> [mailto:scikit-learn-bounces+linjia=ruijie.com...@python.org] on behalf of Sebastian Raschka
> Sent: November 25, 2016 10:51
> To: Scikit-learn user and developer mailing list
> Subject: Re: [scikit-learn] Re: question about using sklearn.neural_network.MLPClassifier?
>
>> here is another question: when I use the neural network library routine, can I
>> save the trained network for use the next time?
>
>
> Maybe have a look at the model persistence section at
> http://scikit-learn.org/stable/modules/model_persistence.html or
> http://cmry.github.io/notes/serialize
>
> Cheers,
> Sebastian
>
>
>> On Nov 24, 2016, at 8:08 PM, lin...@ruijie.com.cn wrote:
>>
>> @ Sebastian Raschka
>> thanks for your analysis,
>> here is another question: when I use the neural network library routine, can I
>> save the trained network for use the next time?
>> Just like the following:
>>
>> Foo1.py
>> …
>> clf.fit(x, y)
>> result_network = clf.save()
>> …
>>
>> Foo2.py
>> …
>> clf = load(result_network)
>> res = clf.predict(newsample)
>> …
>>
>> So I wouldn't need to fit the training set every time.
>>
>> From: scikit-learn
>> [mailto:scikit-learn-bounces+linjia=ruijie.com...@python.org] on behalf of
>> Sebastian Raschka
>> Sent: November 24, 2016 3:06
>> To: Scikit-learn user and developer mailing list
>> Subject: Re: [scikit-learn] question about using sklearn.neural_network.MLPClassifier?
>>
>> If you keep everything at their default values, it seems to work:
>>
>> ```py
>> from sklearn.neural_network import MLPClassifier
>>
>> X = [[0, 0], [0, 1], [1, 0], [1, 1]]
>> y = [0, 1, 1, 0]
>> clf = MLPClassifier(max_iter=1000)
>> clf.fit(X, y)
>> res = clf.predict([[0, 0], [0, 1], [1, 0], [1, 1]])
>> print(res)
>> ```
>>
>> The default is 100 units in the hidden layer, but theoretically it should work
>> with 2 hidden logistic units (I think that's the typical textbook/class example).
>> I think what happens is that it gets stuck in a local minimum depending on the
>> random weight initialization. E.g., the following works just fine:
>>
>> ```py
>> from sklearn.neural_network import MLPClassifier
>>
>> X = [[0, 0], [0, 1], [1, 0], [1, 1]]
>> y = [0, 1, 1, 0]
>> clf = MLPClassifier(solver='lbfgs',
>>                     activation='logistic',
>>                     alpha=0.0,
>>                     hidden_layer_sizes=(2,),
>>                     learning_rate_init=0.1,
>>                     max_iter=1000,
>>                     random_state=20)
>> clf.fit(X, y)
>> res = clf.predict([[0, 0], [0, 1], [1, 0], [1, 1]])
>> print(res)
>> print(clf.loss_)
>> ```
>>
>> but changing the random seed to 1 leads to:
>>
>> [0 1 1 1]
>> 0.34660921283
>>
>> For comparison, I used a more vanilla MLP (1 hidden layer with 2 units and
>> logistic activation as well;
>> https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch12/ch12.ipynb),
>> essentially resulting in the same problem:
>>
>> [attached images: image001.png, image002.png]
>>
>> On Nov 23, 2016, at 6:26 AM, lin...@ruijie.com.cn wrote:
>>
>> Yes, you are right @ Raghav R V, thanks!
>>
>> However, I found the key parameter is 'hidden_layer_sizes=[2]'; I wonder if I
>> misunderstand the meaning of the hidden_layer_sizes parameter?
>>
>> Is it related to this topic:
>> http://stackoverflow.com/questions/36819287/mlp-classifier-of-scikit-neuralnetwork-not-working-for-xor
>>
>>
>> From: scikit-learn
>> [mailto:scikit-learn-bounces+linjia=ruijie.com...@python.org] on behalf of
>> Raghav R V
>> Sent: November 23, 2016 19:04
>> To: Scikit-learn user and developer mailing list
>> Subject: Re: [scikit-learn] question about using sklearn.neural_network.MLPClassifier?
>>
>> Hi,
>>
>> If you keep everything at their default values, it seems to work:
>>
>> ```py
>> from sklearn.neural_network import MLPClassifier
>>
>> X = [[0, 0], [0, 1], [1, 0], [1, 1]]
>> y = [0, 1, 1, 0]
>> clf = MLPClassifier(max_iter=1000)
>> clf.fit(X, y)
>> res = clf.predict([[0, 0], [0, 1], [1, 0], [1, 1]])
>> print(res)
>> ```
>>
>> On Wed, Nov 23, 2016 at 10:27 AM, <lin...@ruijie.com.cn> wrote:
>> Hi everyone,
>>
>> I tried to use sklearn.neural_network.MLPClassifier to learn the XOR operation,
>> but I found the result unsatisfactory. The code is below; can you tell me if I am
>> using the library incorrectly?
>>
>> ```py
>> from sklearn.neural_network import MLPClassifier
>>
>> X = [[0, 0], [0, 1], [1, 0], [1, 1]]
>> y = [0, 1, 1, 0]
>> clf = MLPClassifier(solver='adam',
>>                     activation='logistic',
>>                     alpha=1e-3,
>>                     hidden_layer_sizes=(2,),
>>                     max_iter=1000)
>> clf.fit(X, y)
>> res = clf.predict([[0, 0], [0, 1], [1, 0], [1, 1]])
>> print(res)
>>
>> # result is [0 0 0 0], score is 0.5
>> ```
>>
>> --
>> Raghav RV
>> https://github.com/raghavrv
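As for the model-persistence question quoted above: what the linked model_persistence page describes boils down to serializing the fitted estimator, for example with pickle (the page also suggests joblib for large models). A minimal sketch follows; the file name 'xor_mlp.pkl' and the two-script split are only illustrative, mirroring the Foo1.py/Foo2.py idea from the thread:

```py
import pickle

from sklearn.neural_network import MLPClassifier

# "Foo1.py": fit once and save the trained estimator to disk.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]
clf = MLPClassifier(max_iter=1000)
clf.fit(X, y)
with open('xor_mlp.pkl', 'wb') as f:  # placeholder file name
    pickle.dump(clf, f)

# "Foo2.py": reload the estimator later and predict without refitting.
with open('xor_mlp.pkl', 'rb') as f:
    clf = pickle.load(f)
print(clf.predict([[0, 0], [0, 1], [1, 0], [1, 1]]))
```

One caveat from the docs: a serialized model should be reloaded with the same scikit-learn version that produced it.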