After we delete the hapax legomena (terms that occur only once in the corpus), we may have considerably fewer terms. But the LLR step that Robin implied may have already dealt with that.
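
A rough sketch of what deleting the hapax legomena amounts to, in plain Python rather than anything Mahout-specific. The function name `drop_hapax` and the toy `docs` corpus are made up for illustration, and "hapax" is assumed here to mean a term whose total count over the whole corpus is exactly one:

```python
from collections import Counter


def drop_hapax(docs):
    """Remove terms that occur exactly once in the whole corpus.

    docs is a list of tokenized documents (lists of strings).
    Returns the filtered documents plus the vocabulary size
    before and after filtering.
    """
    # Corpus-wide term counts (hypothetical definition of "hapax": count == 1).
    counts = Counter(term for doc in docs for term in doc)
    kept = [[t for t in doc if counts[t] > 1] for doc in docs]
    vocab_before = len(counts)
    vocab_after = sum(1 for c in counts.values() if c > 1)
    return kept, vocab_before, vocab_after


# Toy corpus, purely illustrative.
docs = [
    ["matrix", "decomposition", "term"],
    ["term", "matrix", "transpose"],
    ["hapax", "term"],
]
kept, before, after = drop_hapax(docs)
print(before, after)  # vocabulary size before and after filtering
```

On this toy input the vocabulary drops from five terms to two, which is the kind of shrinkage that would matter for the term dimension of the matrix. Whether the LLR step already removes most of these terms would need to be checked against real data.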

On Thu, Feb 25, 2010 at 1:43 PM, Jake Mannix <jake.man...@gmail.com> wrote:

> Of course, at this point we've got too many terms to properly do the
> decomposition directly on the input matrix, we'd have to do it on the
> transpose,

--
Ted Dunning, CTO
DeepDyve