[Scikit-learn-general] Not consistent return types of fit_transform methods in Vectorizer classes

Willi Richert Fri, 04 Jan 2013 03:15:08 -0800

Hi,

I noticed that the fit_transform method of TfidfVectorizer returns a CSR
matrix, which supports array indexing, while CountVectorizer's returns a COO
matrix, which doesn't. I have always liked the clean and interchangeable
nature of sklearn, so I wondered whether it would break other pieces if
CountVectorizer returned a CSR matrix as well. Or is performance a concern
here?
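
To illustrate the difference, here is a minimal sketch that uses scipy.sparse
directly rather than the vectorizers themselves. The small count matrix and
the ensure_csr helper are made up for illustration and are not part of
sklearn. Whether indexing a COO matrix raises an error may depend on the
SciPy version, so the sketch only shows the conversion and the indexing that
works on CSR.

```python
import numpy as np
from scipy import sparse

# Hypothetical 3x4 term-count matrix standing in for a vectorizer's output.
rows = np.array([0, 0, 1, 2, 2])
cols = np.array([0, 2, 1, 0, 3])
counts = np.array([1, 2, 1, 3, 1])
X_coo = sparse.coo_matrix((counts, (rows, cols)), shape=(3, 4))


def ensure_csr(X):
    """Hypothetical helper: return X unchanged if it is CSR, else convert it."""
    if X.format == "csr":
        return X
    return X.tocsr()


# After conversion, row indexing works regardless of the original format.
X_csr = ensure_csr(X_coo)
first_row = X_csr[0].toarray().ravel()
print(X_coo.format, "->", X_csr.format)
print(first_row)
```

Calling a helper like this on the output of either vectorizer would give
downstream code one format to rely on, at the cost of a conversion when the
input is not already CSR.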


CountVectorizer's fit_transform:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/feature_extraction/text.py#L530

TfidfVectorizer's fit_transform:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/feature_extraction/text.py#L942

Thanks,
wr


