Re: [Scikit-learn-general] Integrating HashingVectorizer into Pipeline

2015-05-07 Thread Adam Goodkind
Thanks Fred, that was the issue. I had to change TfidfVectorizer to TfidfTransformer. I thought I couldn't use IDF either, but according to this example http://scikit-learn.org/stable/auto_examples/text/document_clustering.html#example-text-document-clustering-py it can be done. - Adam On Thu, Ma

Re: [Scikit-learn-general] Integrating HashingVectorizer into Pipeline

2015-05-07 Thread Fred Mailhot
I think possibly you want the TfidfTransformer *before* the HashingVectorizer... BUT... the documentation for the HashingVectorizer appears to discount the possibility of IDF-weighting: http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.HashingVectorizer.html On 7 Ma
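
For anyone reading this thread later, here is a minimal sketch of the kind of pipeline the linked document-clustering example suggests. It is not Adam's actual code. The toy documents, the step names, and n_features=2**12 are assumptions made up for illustration. TfidfTransformer expects a matrix of term counts as input, so in this sketch it sits after the HashingVectorizer in the pipeline. The sketch also assumes a recent scikit-learn where HashingVectorizer takes alternate_sign=False; releases from around the time of this thread used non_negative=True for the same purpose.

```python
from sklearn.feature_extraction.text import HashingVectorizer, TfidfTransformer
from sklearn.pipeline import Pipeline

# Toy corpus, made up for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
    "hashing maps tokens to column indices",
]

pipeline = Pipeline([
    # Stateless hashing of tokens to columns. norm=None keeps raw counts,
    # and alternate_sign=False keeps them non-negative so IDF weighting
    # is meaningful (assumption: scikit-learn >= 0.19).
    ("hash", HashingVectorizer(n_features=2**12, alternate_sign=False, norm=None)),
    # Learns IDF weights from the hashed counts and rescales them.
    ("tfidf", TfidfTransformer()),
])

# Sparse matrix with one row per document and 2**12 hashed columns.
X = pipeline.fit_transform(docs)
print(X.shape)
```

A classifier or clusterer can be appended as a further step. Note that the IDF step makes the pipeline stateful again, so it has to be fit on a corpus, which gives up part of what makes HashingVectorizer attractive for streaming data.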