On Fri, Aug 26, 2011 at 8:29 AM, Jeff Hansen <[email protected]> wrote:
> Thanks for the math Ted -- that was very helpful.

NP.

> ... I've been playing with smaller matrices
> mainly for my own learning purposes -- it's much easier to read through 200
> movies (most of which I've heard of) and get a gut feel, than 10,000
> movies.

I think it would still be a good idea to analyze on a larger data set even
if you only analyze a few movies relative to your own ground truth. Let me
know if you need the in-core stochastic projection.

> ... Sometimes
> it's important to realize the real world constraints. Picture a company
> with very physical locations ... So you really end up with a three tier
> market segmentation -- one strategy works best for the head, another for
> the body, and a third for the tail.

This is definitely true.

> As far as clusters go -- I really wasn't finding any clusters at the edges
> of the data, but that could have more to do with not including the tail
> (and not normalizing appropriately for popularity).

Indeed.

> ... -- and if you were simply automating all of
> this and not reviewing it with common sense, you could end up offending
> some of your users... All I'm saying is there may be trends in the real
> world that some people aren't comfortable having pointed out.

This is definitely something we saw at Veoh. Sometimes the right thing to
do is have a list of exceptions.
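
For what it's worth, an exceptions list can be as simple as a post-filter on the ranked output. This is only a rough sketch, not what was actually done at Veoh: the function name `filter_recommendations`, the item ids, and the choice to block a pair in both directions are all assumptions made for illustration.

```python
# Hypothetical sketch: drop recommendations that appear on a hand-curated
# exceptions list. Names and data are illustrative, not from this thread.

def filter_recommendations(source_item, ranked_items, exceptions):
    """Return ranked_items without any item whose pairing with source_item
    is listed in exceptions.

    exceptions is a set of frozensets, so a blocked pair is blocked in
    both directions (a -> x and x -> a). Ranking order is preserved.
    """
    return [
        item for item in ranked_items
        if frozenset((source_item, item)) not in exceptions
    ]


# Assumed example data: never show movie_x next to movie_a.
exceptions = {frozenset(("movie_a", "movie_x"))}
ranked = ["movie_x", "movie_b", "movie_c"]
filtered = filter_recommendations("movie_a", ranked, exceptions)
print(filtered)
```

Filtering after ranking keeps the model itself untouched, so the list can be edited by a human reviewer without recomputing anything.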
