You should not use your method in any common setting. The difference when using the scaler is that it will remember the mean and variance of the training set and reuse them for the test set.
Gianni Iannelli wrote:
>what I usually do is scale the training set and the dataset separately,
>but I'm d
No, just replace () with [].
Jaques Grobler wrote:
>Hey Andy, sorry been busy all day. You mean something like this to make
>it
>more clear ?
>
>>>> kernel_param = {'kernel':('linear', 'rbf')}
>>>> C_param = {'C':[1,10]}
>>>> parameters = (kernel_param, C_param)  # list of parameter dicts
How do we represent missing values here?
Mathieu Blondel wrote:
>On Tue, Mar 26, 2013 at 9:25 PM, Lee Zamparo wrote:
>> AFAIK, you might not want all the missing values to be imputed at
>once,
>> especially if the dimensions of X are large. Maybe something like:
>>
>>
>> X_transformed = es
I thought OneHotEncoder solves that.
Lars Buitinck wrote:
>2013/3/27 Anne Dwyer :
>> Just to clarify, you are saying that there is no procedure in scikit
>that
>> will transform categorical feature values into numerical values like
>I was
>> trying to do here. Correct?
>
>Not that I know of.
Did you see my earlier reply?
Roman Sinayev wrote:
>min_df=2 in the second and min_df=1 in the first.
>
>On Thu, Mar 14, 2013 at 7:19 PM, Ark <[email protected]> wrote:
>>
>>>
>>> This is unexpected. Can you inspect the vocabulary_ on both
>>> vectorizers? Try computing their set.intersectio
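The min_df difference above is easy to see on toy documents: min_df=2 prunes every term that appears in fewer than two documents, so the vocabularies diverge.

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ["apple banana", "apple cherry"]
v1 = CountVectorizer(min_df=1).fit(docs)  # keeps every term
v2 = CountVectorizer(min_df=2).fit(docs)  # keeps terms in >= 2 documents
dropped = set(v1.vocabulary_) - set(v2.vocabulary_)  # {'banana', 'cherry'}
```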
I want to have a non-empty menu for the user guide. The template just uses the built-in toc variable. There is also a toctree function, but that gives the whole toc tree, not just the part below the current page. I think I know how to get what I want in reST, but I have no idea how to tell Sphinx to render it.
Exactly. Not only would you need Cython, it would also need to be a recent version. People with older versions would get cryptic error messages, leading to frustrated users and a busy mailing list.
Matthieu Brucher wrote:
>Hi,
>
>If I remember correctly, this is done to avoid an explicit Cython
The classifiers have a 'classes_' attribute that contains the original class
labels.
ShNaYkHs ShNaYkHs wrote:
>Let x an example to classify:
>probas = model_svm.predict_proba([x])[0]
>how can I know what is the label (a string) corresponding to each
>predicted
>probability ? That is, probas
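The mapping is positional: column i of predict_proba corresponds to classes_[i]. A toy sketch with string labels:

```python
import numpy as np
from sklearn.svm import SVC

X = np.arange(20, dtype=float).reshape(-1, 1)
y = np.array(['cat'] * 10 + ['dog'] * 10)  # string labels
model = SVC(probability=True).fit(X, y)
probas = model.predict_proba(np.array([[2.0]]))[0]
# pair each probability with its label via classes_
label_probas = dict(zip(model.classes_, probas))
```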
SGDClassifier, using partial_fit. I want to do naive Bayes soon.
ShNaYkHs ShNaYkHs wrote:
>Is there any incremental classifier in sklearn, that can be trained
>incrementally considering one data-point at a time ? An existing one or
>under
>development one ..
>
>
>-
Not all estimators, but those that are needed for the kind of estimator it represents, maybe? There are only four kinds of estimators, right? We should really write those API docs ;)
Lars Buitinck wrote:
>2013/2/26 :
>> actually i think i share tadej's view on being able to exchange
>Pipe
Actually, I think I share Tadej's view on being able to exchange Pipelines and classifiers. Since 0.13, classes_ is basically part of the public classifier API, so a Pipeline should also have it, I guess.
"Tadej Janež" wrote:
>On Tue, 2013-02-26 at 14:39 +0100, Lars Buitinck wrote:
>>
>>
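In current scikit-learn, Pipeline does delegate classes_ to its final step; a quick sketch:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X = np.arange(10, dtype=float).reshape(-1, 1)
y = np.array(['neg'] * 5 + ['pos'] * 5)
pipe = make_pipeline(StandardScaler(), LogisticRegression()).fit(X, y)
# classes_ is forwarded from the final estimator
labels = pipe.classes_
```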
The missing 2 when tokenizing 2.50 is indeed a bit weird, though.
Tom Fawcett wrote:
>First, thanks for all your great work on scikits.learn! It’s making my
>life easier.
>
>Second, I found surprising behavior in sklearn.feature_extraction.text.
>I’m using TfidfVectorizer and CountVectorizer
As for the missing 'r' in the docs: it looks like a Sphinx glitch to me and I have not found a way to fix it. As for the tokenization: the sklearn regexp seems like a sensible default to me. What would you change it to so as to still be robust?
Tom Fawcett wrote:
>First, thanks for all your great w
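The default token pattern only keeps tokens of two or more word characters, which is why the lone '2' in '2.50' disappears; a quick check:

```python
from sklearn.feature_extraction.text import CountVectorizer

v = CountVectorizer()  # default token_pattern is r"(?u)\b\w\w+\b"
analyze = v.build_analyzer()
tokens = analyze("the price is 2.50")  # the single-character '2' is dropped
```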
By the way, you could also use a different multiclass strategy like error-correcting output codes (exists in sklearn) or a binary tree of classifiers (you would have to implement that yourself).
Ark <[email protected]> wrote:
>>
>> The size is dominated by the n_features * n_classes coef_ matrix,
>> which y
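Error-correcting output codes live in sklearn.multiclass; a sketch on iris (code_size here is the default, chosen for illustration; smaller values mean fewer binary classifiers and a smaller model):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OutputCodeClassifier

X, y = load_iris(return_X_y=True)
ecoc = OutputCodeClassifier(LogisticRegression(max_iter=1000),
                            code_size=1.5, random_state=0)
ecoc.fit(X, y)  # trains one binary classifier per code bit
pred = ecoc.predict(X[:5])
```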
You could try some backward feature selection like recursive feature elimination, or just drop features with negligible coefficients. A group L1 penalty on the weights would probably be the way to go, but we don't have that...
Ark <[email protected]> wrote:
>>
>> The size is dominated by
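Recursive feature elimination is available as sklearn.feature_selection.RFE; a minimal sketch:

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
# repeatedly drops the least important features until 2 remain
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=2)
X_reduced = rfe.fit_transform(X, y)
```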
You only need coef_ and intercept_ to make predictions; not much else should be stored. If there is a gain from storing coef_ yourself, it is probably a bug. What are the numbers of features and classes?
Ark <[email protected]> wrote:
>I have been wondering about what makes the size of an S
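The coef_ matrix is what dominates the size: n_classes by n_features. A toy illustration:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

X = np.random.RandomState(0).rand(30, 4)
y = np.array([0, 1, 2] * 10)  # three classes, four features
clf = SGDClassifier().fit(X, y)
coef_shape = clf.coef_.shape            # (n_classes, n_features)
intercept_shape = clf.intercept_.shape  # (n_classes,)
```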
How about softmax?
David Lambert wrote:
>Given the method of determining the class predictions in the extreme
>learning machine classifier:
>
> class_predictions = np.argmax(raw_predictions, axis=1)
>
>where raw_predictions are the (potentially negative) linear regression
>outputs
> (see
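Softmax would turn those raw, possibly negative scores into probabilities while preserving the argmax; a sketch in plain NumPy:

```python
import numpy as np

raw = np.array([[2.0, -1.0, 0.5]])  # possibly negative linear outputs
shifted = raw - raw.max(axis=1, keepdims=True)  # shift for numerical stability
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
# rows now sum to 1 and the argmax is unchanged
```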
The MATLAB docs online say the linear classifier is LDA by default.
Andrew Winterman wrote:
>Logistic regression can be used as a linear classifier. Maybe that's
>matlab's linear classifier?
>
>On Thursday, February 14, 2013, David Reed wrote:
>
>> I was mistaken, R is providing the exact same resu
I have a pull request for randomized search, but I need to update it as it is quite old...
Ronnie Ghose wrote:
>afaik yes. Please tell me if i'm wrong, more experienced scikitters :)
>
>
>On Sun, Feb 10, 2013 at 9:23 PM, Yaser Martinez
>wrote:
>
>> Any further development on this? Is a "brut
Please check out current master; there was a bug in MiniBatchKMeans in the release.
"Vinay B," wrote:
>So I tried your recommendations. The partial fit seems to operate to an
>extent. Then BOOM! It looks very similar to the example in
>http://scikit-learn.org/dev/auto_examples/document_cl
+1. In fact, I think we should merge simple compatible fixes to master ASAP. Maybe you could do a PR with the non-six changes?
Lars Buitinck wrote:
>Regarding Python 3 compat, I just started rebasing Olivier's code. Is
>it ok if I push the result a branch py3 in the master repo? I think
>th
There is a fix for that in current master; check_arrays now has an 'allow_lists' option.
Andy
Robert Layton wrote:
>When using cross_validation.X, all arrays are checked in the normal way
>--
>using check_arrays.
>I am developing code that uses string documents as input, so I have a
>list
>of strings
In general +1, but actually I'd like to release pretty soon.
Jake Vanderplas wrote:
>Hi All,
>Just a quick heads-up: thanks to some good work by Pauli Virtanen,
>SciPy
>is currently in the process of moving to a single code-base which
>supports 2.x and 3.x, and it doesn't look extremely dif
+1
Doug Coleman wrote:
>I guess transforming it would be more in line with other classifiers.
>The
>design decision could be "You should only have to know about
>multi-output
>if you want to use it."
>
>
>On Thu, Nov 29, 2012 at 10:07 AM, Doug Coleman
>wrote:
>
>> Going off of my unit tests
The classes_ attribute is not present in all classifiers and is not consistent, as you noticed.
This is a known issue (see the issue tracker) and it would be great to address it.
I am not sure about the decision trees in particular.
Doug Coleman wrote:
>Decision trees' classes are wrapped i
We should probably improve the docs on the OvR. IIRC the user guide was already very explicit; maybe add something to the docstring?
Abhi: did you read the user guide on the one-vs-rest classifier? How could we improve it to make things clearer?
Mathieu Blondel wrote:
>On Tue, Nov 6, 20
Can you try linking against libatlas manually? That should do it. Then I'll try to fix the setup.py.
Andrew Godbehere wrote:
>Hi Andy,
>
>I found _ATL_drotg defined in /opt/local/lib/libatlas.a.
>_ATL_drotg is listed as an undefined reference in libcblas.a,
>libf77blas.a, and libptcblas.a.
>
>T
I agree, but for this specific issue, I thought the consensus was 'give a warning', so now it should be fairly clear what to do. I tried to avoid tagging issues as easy if high-level knowledge was needed; maybe I didn't succeed.
Andy
--
This message was sent from my Android mobile phone with K-9 Mail.
+1 but we should adhere to the rule of waiting at least two releases. And
deprecation warnings on renamed parameters never produce spurious warnings.
Gael Varoquaux wrote:
On Fri, Aug 31, 2012 at 03:50:15PM +03
We do? Which warnings do you mean? I am not aware of any warnings in the tests
or examples.
Vlad Niculae wrote:
We are all annoyed by warnings; we have a ton of them at the moment. Some of
them are scheduled f
Sure, no problem.
Peter Prettenhofer wrote:
Hi all,
unfortunately, I'm not available on Saturday and Sunday - if possible,
it would be great if we could post-pone the release until Tuesday.
thanks,
Peter
2012
I might be able to give it a try later on.
Alexandre Gramfort wrote:
nobody working with current master on windows with mingw ?
Any help would be greatly appreciated.
Alex
On Fri, Aug 3, 2012 at 9:39 PM, Alexandre Gramfort
wrote:
> hi,
>
> can anybody with a windows machine and no blas a
I just read the post and I was wondering: shouldn't extra trees be faster than random forests? In the blog post they are slower.
Andy
Olivier Grisel wrote:
Here is the link:
http://blog.explainmydata.com/2012/0
Hi David. Very nice blog post. I'm out, so just a short comment for now: the differences in both timing and performance are probably due to the fact that my implementation does batch learning and yours does online learning. For benchmarking Cython, I recommend you look into Fabian's yep tool. Cheers
Hi Sheila.
I think Peter got the right answer: load_svmlight_file yields a sparse matrix that you need to convert to an array first.
Cheers, Andy
Sheila the angel wrote:
Hi Andreas,
there is no difference bet
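A self-contained sketch of the round trip (writing a temp file, then densifying the sparse return value):

```python
import os
import tempfile

import numpy as np
from sklearn.datasets import dump_svmlight_file, load_svmlight_file

X = np.array([[1.0, 0.0], [0.0, 2.0]])
y = np.array([0.0, 1.0])
path = os.path.join(tempfile.mkdtemp(), 'data.svm')
dump_svmlight_file(X, y, path)

X_sparse, y_loaded = load_svmlight_file(path)  # scipy.sparse CSR matrix
X_dense = X_sparse.toarray()  # needed by estimators that require dense input
```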
I would try using a chi-squared kernel. You can start by using the approximation provided in sklearn.
Cheers, Andy
Philipp Singer wrote:
Hey there!
I am currently trying to classify a dataset which has the fol
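That approximation lives in sklearn.kernel_approximation; a sketch chaining AdditiveChi2Sampler with a linear classifier on random non-negative toy data (chi-squared kernels expect non-negative features, e.g. histograms):

```python
import numpy as np
from sklearn.kernel_approximation import AdditiveChi2Sampler
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.RandomState(0)
X = rng.rand(40, 5)            # non-negative features
y = rng.randint(0, 2, 40)
clf = make_pipeline(AdditiveChi2Sampler(sample_steps=2),
                    SGDClassifier(random_state=0))
clf.fit(X, y)                  # linear model on the approximate feature map
pred = clf.predict(X)
```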