2013/7/11 Gad Abraham <gad.abra...@gmail.com>:
> I'm very much a sklearn beginner, and I'd like to use FeatureHasher to
> reduce the dimensionality of a numeric matrix. Any hints on how to do this?
> I've seen the examples showing how to use it with text.

You mean the input is a NumPy array? There's no special support for
that, but the following should work (though it may be slow). Let X be
your array and d the desired dimensionality, then:

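    from sklearn.feature_extraction import FeatureHasher

    hasher = FeatureHasher(n_features=d, input_type="pair")
    # Feature names must be strings; build a list so it can be reused for every row
    features = [str(j) for j in range(X.shape[1])]
    Xh = hasher.transform(zip(features, row) for row in X).toarray()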

hashes X into Xh of shape (X.shape[0], d).
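
For intuition about what the hashing does to a numeric matrix, here is a rough pure-Python sketch of the signed hashing trick: each column index is hashed to one of d output columns together with a sign, and columns that collide are summed. The helper name hash_columns and the use of MD5 are assumptions made for illustration only; FeatureHasher itself uses a signed MurmurHash3, so the actual bucket assignments will differ.

```python
import hashlib


def hash_columns(X, d):
    """Fold the columns of each row of X into d buckets (illustrative only)."""
    n_cols = len(X[0])
    buckets, signs = [], []
    for j in range(n_cols):
        # Stand-in hash of the column name; NOT the hash FeatureHasher uses.
        h = int.from_bytes(hashlib.md5(str(j).encode()).digest()[:8], "big")
        buckets.append((h >> 1) % d)          # output column for input column j
        signs.append(1.0 if h & 1 else -1.0)  # sign taken from the lowest bit
    Xh = [[0.0] * d for _ in X]
    for out_row, row in zip(Xh, X):
        for j, value in enumerate(row):
            # Colliding columns are added together, with their signs.
            out_row[buckets[j]] += signs[j] * value
    return Xh


X = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
     [0.5, 0.0, 1.5, 0.0, 2.5, 0.0]]
Xh = hash_columns(X, d=3)  # 2 rows, 3 columns
```

The mapping is fixed by the hash alone, so nothing is learned from the data and the same column always lands in the same bucket.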

You might want to look at the random projection module [1], which can
do somewhat similar transforms much more quickly.
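
A minimal sketch of the random projection route, assuming a dense array X and a target dimensionality d. The data and sizes here are made up for illustration:

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection

# Made-up example data: 100 samples with 1000 numeric features.
rng = np.random.RandomState(0)
X = rng.rand(100, 1000)
d = 50  # desired dimensionality (arbitrary choice for this sketch)

# Project onto d random Gaussian directions.
rp = GaussianRandomProjection(n_components=d, random_state=0)
Xp = rp.fit_transform(X)  # shape (X.shape[0], d)
```

SparseRandomProjection from the same module takes the same n_components argument and may be cheaper when the number of input features is large.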

[1] http://scikit-learn.org/stable/modules/random_projection.html#random-projection

-- 
Lars Buitinck
Scientific programmer, ILPS
University of Amsterdam

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
