> Robin, Jeff, I am curious what you think about these ideas relative to
> k-means and other clustering techniques.

This is exciting. Having a generic vectorizer without the dependency on a dictionary is great! It will fit right in with the rest of clustering, since it all uses vectors after all. But won't the similarity metrics need to be different for such a vector?
About the dictionary-based trace: I need to actually see how the trace is useful. Do you keep track of the most important feature among those that go into a particular hashed location? In clustering, we need to show the cluster centroids and the top features in each one for text. I don't know if that is useful for types of data other than text. With these vectors, how would the cluster dumper change?

Robin
