2013/6/3 Andreas Mueller <amuel...@ais.uni-bonn.de>:
> I named the variable, I think, and it is a bad name :-(
> Should we rename it?
>
> I think giving a count makes more sense than giving a frequency: you want to
> exclude outliers that appear only once or twice for example.
I actually hadn't seen this reply. It's not a bad name: it's a minimum for document frequency, df. And yes, absolute counts are more common than relative frequencies; usually, you just set a cutoff to reduce noisy features.

-- 
Lars Buitinck
Scientific programmer, ILPS
University of Amsterdam
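
To make the count-versus-fraction distinction concrete, here is a minimal sketch in plain Python. It is not scikit-learn's code: the helper names `document_frequency` and `filter_by_min_df` and the toy documents are made up for illustration, and the convention that an int means an absolute document count while a float means a fraction of the documents is an assumption of this sketch.

```python
from collections import Counter


def document_frequency(docs):
    """Count, for each term, the number of documents it occurs in."""
    df = Counter()
    for doc in docs:
        # set() so that a term repeated within one document counts once
        df.update(set(doc.lower().split()))
    return df


def filter_by_min_df(docs, min_df=1):
    """Return the sorted terms whose document frequency is >= min_df.

    Assumption of this sketch: an int is an absolute document count,
    a float is a fraction of len(docs).
    """
    df = document_frequency(docs)
    if isinstance(min_df, float):
        threshold = min_df * len(docs)
    else:
        threshold = min_df
    return sorted(term for term, count in df.items() if count >= threshold)


docs = [
    "the cat sat",
    "the dog sat",
    "the cat ran",
    "a bird flew",
]

# Absolute cutoff: keep terms that occur in at least 2 documents.
kept_by_count = filter_by_min_df(docs, min_df=2)

# Relative cutoff: keep terms that occur in at least half of the documents.
kept_by_fraction = filter_by_min_df(docs, min_df=0.5)
```

With four documents the two cutoffs select the same terms, but the absolute count keeps its meaning ("drop anything seen only once") as the corpus grows, while the fraction does not.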