It looks like a bug, but I cannot reproduce it:

>>> from sklearn.feature_extraction.text import HashingVectorizer
>>> vec = HashingVectorizer(n_features=5, binary=True, norm=None)
>>> vec.transform(['this simple test']).toarray()
array([[ 0.,  1.,  1.,  0.,  1.]])
>>> vec.transform(['this simple fest']).toarray()
array([[ 0.,  1.,  1.,  0.,  0.]])

In this case I have a single collision.

-- Olivier