On Wed, Oct 26, 2011 at 03:02:28PM +0200, SK Sn wrote:
> Hi there, I am trying to apply and test several dimension reduction methods
> on 20Newsgroup data. However, I got errors that I do not understand on all
> of them except RandomPCA. Would you please help me get a better
> understanding of the issue?

> X = Vectorizer(max_features=10000).fit_transform(data_set.data)

I think the problem is that X (returned by the Vectorizer) is a
sparse matrix, and the methods other than RandomizedPCA do not
accept sparse matrices as input.

You can make the data dense using 
X = X.todense()

This will consume much more memory, and might not be an option, though.
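
To judge whether the dense copy is affordable before converting, you can
compare the two footprints first. This is only a rough sketch: the random
matrix below is a hypothetical stand-in for the Vectorizer output (its shape
and density are assumptions, not taken from the 20 newsgroups data), and it
uses toarray(), which returns a plain ndarray where todense() returns a
numpy.matrix.

```python
import numpy as np
from scipy import sparse

# Hypothetical stand-in for the Vectorizer output: 200 documents x 10000
# features, with about 1% of the entries non-zero (assumed values).
X = sparse.random(200, 10000, density=0.01, format="csr", random_state=0)

# Memory held by the sparse (CSR) representation, in bytes.
sparse_bytes = X.data.nbytes + X.indices.nbytes + X.indptr.nbytes

# Memory a dense copy needs: one value per cell, zeros included.
dense_bytes = X.shape[0] * X.shape[1] * X.dtype.itemsize

print("sparse: %.2f MB, dense: %.2f MB" % (sparse_bytes / 1e6, dense_bytes / 1e6))

# Convert only if the dense size is acceptable on your machine.
X_dense = X.toarray()
```

The dense size grows with n_samples * n_features regardless of how many
entries are zero, which is why this may not be an option for the full data.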

HTH,

Gaël

_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
