> many of them need the number of outliers and a distance as input parameters 
> in advance. Is there a more intelligent algorithm?

By ‘intelligently’ do you mean ‘more automatic’ (fewer hyperparameters to 
define manually)? In my opinion, “outlier” is a highly context-specific 
notion, so it’s really up to you to decide what counts as an outlier for 
your application.

E.g., a simple non-parametric approach would be to say that point P is an 
outlier if

P > Q3 + 1.5 * IQR, 
or P < Q1 - 1.5 * IQR

where Q1 and Q3 are the first and third quartiles of the dataset, 
respectively, and IQR is the interquartile range (Q3 - Q1). Similarly, you 
could use thresholds based on the variance or standard deviation, etc., so 
that you don’t need to specify the number of outliers if that’s not what you 
want.
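
For illustration, here is a minimal sketch of this IQR rule (assuming NumPy 
is available; the 1.5 multiplier is just the common convention and can be 
tuned), applied to the example dataset from your mail:

```py
import numpy as np

# example dataset from the original question
data = np.array([2, 3, 2, 4, 2, 3, 1, 2, 3, 1, 2, 999, 2, 3, 2, 1, 2, 3])

# first and third quartile, and the interquartile range
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1

# flag everything outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
outliers = data[(data < q1 - 1.5 * iqr) | (data > q3 + 1.5 * iqr)]
print(outliers)  # [999]
```

This way, neither the number of outliers nor a contamination proportion has 
to be specified in advance; only the multiplier acts as a threshold.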

> On Nov 25, 2016, at 6:38 AM, lin...@ruijie.com.cn wrote:
> 
> Hello everyone, 
>      I used IsolationForest to pick out outlier data today, and I noticed 
> there is a ‘contamination’ parameter in the IsolationForest function whose 
> default value is 0.1 (10%).
>      So is there a way to pick outliers without specifying the proportion 
> of outliers in the dataset?
>      For example, in the dataset [2,3,2,4,2,3,1,2,3,1,2,999,2,3,2,1,2,3], 
> we can intuitively pick 999 as the outlier entry.
>      And I read some papers about outlier detection recently; many of them 
> need the number of outliers and a distance as input parameters in advance. 
> Is there a more intelligent algorithm?
> 
> -----Original Message-----
> From: scikit-learn 
> [mailto:scikit-learn-bounces+linjia=ruijie.com...@python.org] On Behalf Of 
> Sebastian Raschka
> Sent: November 25, 2016 10:51
> To: Scikit-learn user and developer mailing list
> Subject: Re: [scikit-learn] Reply: question about using 
> sklearn.neural_network.MLPClassifier?
> 
>> here is another question: when I use the neural network library routines, 
>> can I save the trained network for use next time?
> 
> 
> Maybe have a look at the model persistence section at 
> http://scikit-learn.org/stable/modules/model_persistence.html or 
> http://cmry.github.io/notes/serialize
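> 
> For example, a minimal sketch using joblib, along the lines of the model 
> persistence page (the file name ‘mlp_model.pkl’ is just a placeholder):
> 
> ```py
> from sklearn.externals import joblib  # in recent versions: import joblib
> 
> # in the first script, after clf.fit(X, y):
> joblib.dump(clf, 'mlp_model.pkl')
> 
> # in the second script, restore the trained model without refitting:
> clf = joblib.load('mlp_model.pkl')
> res = clf.predict([[0, 0], [0, 1], [1, 0], [1, 1]])
> ```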
> 
> Cheers,
> Sebastian
> 
> 
>> On Nov 24, 2016, at 8:08 PM, lin...@ruijie.com.cn wrote:
>> 
>> @Sebastian Raschka
>> thanks for your analysis,
>> here is another question: when I use the neural network library routines, 
>> can I save the trained network for use next time?
>> Just like the following:
>> 
>> Foo1.py
>> …
>> clf.fit(X, y)
>> result_network = clf.save()
>> …
>> 
>> Foo2.py
>> …
>> clf = load(result_network)
>> res = clf.predict(newsample)
>> …
>> 
>> So I needn’t fit the training set every time.
>> From: scikit-learn 
>> [mailto:scikit-learn-bounces+linjia=ruijie.com...@python.org] On Behalf Of 
>> Sebastian Raschka
>> Sent: November 24, 2016 3:06
>> To: Scikit-learn user and developer mailing list
>> Subject: Re: [scikit-learn] question about using 
>> sklearn.neural_network.MLPClassifier?
>> 
>> If you keep everything at their default values, it seems to work -
>> 
>> ```py
>> from sklearn.neural_network import MLPClassifier
>> 
>> X = [[0, 0], [0, 1], [1, 0], [1, 1]]
>> y = [0, 1, 1, 0]
>> clf = MLPClassifier(max_iter=1000)
>> clf.fit(X, y)
>> res = clf.predict([[0, 0], [0, 1], [1, 0], [1, 1]])
>> print(res)
>> ```
>> 
>> The default is 100 units in the hidden layer, but theoretically, it should 
>> work with 2 hidden logistic units (I think that’s the typical 
>> textbook/class example). I think what happens is that it gets stuck in a 
>> local minimum depending on the random weight initialization. E.g., the 
>> following works just fine:
>> 
>> from sklearn.neural_network import MLPClassifier
>> 
>> X = [[0, 0], [0, 1], [1, 0], [1, 1]]
>> y = [0, 1, 1, 0]
>> clf = MLPClassifier(solver='lbfgs',
>>                     activation='logistic',
>>                     alpha=0.0,
>>                     hidden_layer_sizes=(2,),
>>                     learning_rate_init=0.1,
>>                     max_iter=1000,
>>                     random_state=20)
>> clf.fit(X, y)
>> res = clf.predict([[0, 0], [0, 1], [1, 0], [1, 1]])
>> print(res)
>> print(clf.loss_)
>> 
>> 
>> but changing the random seed to 1 leads to:
>> 
>> [0 1 1 1]
>> 0.34660921283
>> 
>> For comparison, I used a more vanilla MLP (1 hidden layer with 2 units and 
>> logistic activation as well; 
>> https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch12/ch12.ipynb),
>>  essentially resulting in the same problem:
>> 
>> 
>> [two image attachments omitted]
>> 
>> On Nov 23, 2016, at 6:26 AM, lin...@ruijie.com.cn wrote:
>> 
>> Yes, you are right @Raghav R V, thanks!
>> 
>> However, I found the key parameter is ‘hidden_layer_sizes=[2]’; I wonder 
>> if I misunderstand the meaning of the hidden_layer_sizes parameter?
>> 
>> Is it related to this topic: 
>> http://stackoverflow.com/questions/36819287/mlp-classifier-of-scikit-neuralnetwork-not-working-for-xor
>> 
>> 
>> From: scikit-learn 
>> [mailto:scikit-learn-bounces+linjia=ruijie.com...@python.org] On Behalf Of 
>> Raghav R V
>> Sent: November 23, 2016 19:04
>> To: Scikit-learn user and developer mailing list
>> Subject: Re: [scikit-learn] question about using 
>> sklearn.neural_network.MLPClassifier?
>> 
>> Hi,
>> 
>> If you keep everything at their default values, it seems to work -
>> 
>> ```py
>> from sklearn.neural_network import MLPClassifier
>> 
>> X = [[0, 0], [0, 1], [1, 0], [1, 1]]
>> y = [0, 1, 1, 0]
>> clf = MLPClassifier(max_iter=1000)
>> clf.fit(X, y)
>> res = clf.predict([[0, 0], [0, 1], [1, 0], [1, 1]])
>> print(res)
>> ```
>> 
>> On Wed, Nov 23, 2016 at 10:27 AM, <lin...@ruijie.com.cn> wrote:
>> Hi everyone
>> 
>>      I tried to use sklearn.neural_network.MLPClassifier to test the XOR 
>> operation, but I found the result is not satisfactory. The following is 
>> the code; can you tell me if I am using the library incorrectly?
>> 
>> from sklearn.neural_network import MLPClassifier
>> 
>> X = [[0, 0], [0, 1], [1, 0], [1, 1]]
>> y = [0, 1, 1, 0]
>> clf = MLPClassifier(solver='adam',
>>                     activation='logistic',
>>                     alpha=1e-3,
>>                     hidden_layer_sizes=(2,),
>>                     max_iter=1000)
>> clf.fit(X, y)
>> res = clf.predict([[0, 0], [0, 1], [1, 0], [1, 1]])
>> print(res)
>> 
>> 
>> #result is [0 0 0 0], score is 0.5
>> 
>> 
>> --
>> Raghav RV
>> https://github.com/raghavrv
>> 
> 

_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
