On Thu, Feb 25, 2010 at 12:38 PM, Robin Anil <robin.a...@gmail.com> wrote:

> What's the largest dataset available? BixoLabs? Wikipedia (5 million
> articles)...
> I don't know of anything public that is that big.
>

With 5 million articles, if you extract all the 1-, 2-, 3-, 4-, and 5-gram
data from them, you could easily hit more than 4B individual matrix entries.
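
As a rough sanity check on that figure, here is a minimal sketch, assuming
the matrix is article-by-n-gram with one nonzero entry per distinct n-gram
in each article (that layout is an assumption, as are the helper name and
the sample sentence). Under that assumption, 4B entries over 5 million
articles works out to an average of 800 distinct n-grams per article.

```python
def distinct_ngrams(tokens, max_n=5):
    """Return the set of distinct n-grams (as tuples) for n = 1..max_n."""
    grams = set()
    for n in range(1, max_n + 1):
        # Slide a window of width n across the token list.
        for i in range(len(tokens) - n + 1):
            grams.add(tuple(tokens[i:i + n]))
    return grams


NUM_ARTICLES = 5_000_000        # figure quoted in the thread
TARGET_ENTRIES = 4_000_000_000  # "more than 4B" from the thread

# Average number of distinct n-grams each article must contribute.
needed_per_article = TARGET_ENTRIES // NUM_ARTICLES

# Hypothetical toy document, just to show how fast n-grams pile up.
sample = "the quick brown fox jumps over the lazy dog".split()
sample_count = len(distinct_ngrams(sample))

print(needed_per_article, sample_count)
```

Even a nine-word sentence yields dozens of distinct n-grams, so an article
of a few hundred words would plausibly clear the per-article average.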

  -jake
