Dear Andy & the rest,

By "StandardScaler", do you mean the "Scaler" class of the 
"preprocessing" module?

In my case, I used the "preprocessing.scale" routine:
"
X  = preprocessing.scale(dataDescrs_array)
"
As far as I understand the documentation, this should call the same 
routine.
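
A minimal sketch to check that assumption, assuming a scikit-learn version 
in which the class is named StandardScaler. The random array is only a 
hypothetical stand-in for dataDescrs_array, with value ranges borrowed from 
the feature description quoted further down:

```python
import numpy as np
from sklearn import preprocessing
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)

# Hypothetical stand-in for dataDescrs_array: three features on very
# different scales (not the real data).
dataDescrs_array = np.column_stack([
    rng.uniform(0, 1200, size=100),
    rng.uniform(0, 60, size=100),
    rng.randint(0, 26, size=100).astype(float),
])

# Function form, as used in the mail above.
X_func = preprocessing.scale(dataDescrs_array)

# Class form: fit on the data, then transform the same data.
X_class = StandardScaler().fit_transform(dataDescrs_array)

# Both should give zero mean and unit variance per column.
print(np.allclose(X_func, X_class))
print(X_func.mean(axis=0), X_func.std(axis=0))
```

The class form is mainly useful when the same scaling has to be re-applied 
to new samples later, since the fitted object keeps the per-feature mean 
and standard deviation.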


When I now perform the PCA, the explained variance of the first two 
components is 0.197 and 0.057.
=> My interpretation of this result: for my binary classification problem 
("active" vs. "inactive"), the features make no clear distinction between 
the two classes of my sample set.
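
One caveat that may be worth checking: PCA never sees the class labels, so 
a low explained variance for the first components does not by itself say 
whether the classes can be separated. A rough sketch on synthetic data (all 
names and numbers here are made up for illustration, not taken from my data 
set), where a single feature separates the classes although the first two 
components explain little variance:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
n_samples, n_features = 200, 20

# Hypothetical labels: 0 = "inactive", 1 = "active".
y = rng.randint(0, 2, size=n_samples)

# Twenty independent noise features; only feature 0 depends on the class.
X = rng.normal(size=(n_samples, n_features))
X[:, 0] += 3.0 * y

# Standardize, then look at the variance fraction of the first two PCs.
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=2).fit(X_scaled)
ratios = pca.explained_variance_ratio_
print("explained variance ratio:", ratios)

# A plain threshold on the raw feature 0 already separates the classes well.
acc = np.mean((X[:, 0] > 1.5) == (y == 1))
print("accuracy of a threshold on feature 0:", acc)
```

So the PCA numbers are probably better read as a statement about how the 
variance is spread over the features than as a verdict on the 
classification problem.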

I guess now the real work starts... :=)



Cheers & Thanks,
Paul



> This is a general problem if the features are not in the same units.
> As you saw, PCA assumes that features all have equal importance.
> If you want all to have the same weight, you have to rescale (using 
> StandardScaler for example).
> The problem is: it is not clear whether this is the right thing to do.
> 
> Maybe the one component just was much more important and the rest 
> was just noise.
> So you have to use your own knowledge of the data - you just have to
> be aware which algorithms make which assumptions.
> 
> 
> On 01/10/2013 06:00 PM, [email protected] wrote:
> Sorry for the confusion, guys. 
> 
> But I did not scale my features - they contain a wild mixture of values: 
> 
> - floats ranging from 0 to 1200 
> - floats ranging from 0 to 60 
> - integers between 0 and 25 
> 
> and so on... 
> 
> 
> My fault! 
> 
> BTW, I tried to re-run the IRIS example 
> (http://scikit-learn.org/stable/auto_examples/decomposition/plot_pca_vs_lda.html) 
> on my data without any preprocessing.. 
> 
> 
> Cheers & Thanks, 
> Paul 
> 



_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
