On Thu, Feb 25, 2010 at 12:38 PM, Robin Anil <robin.a...@gmail.com> wrote:
> What's the largest dataset available? BixoLabs? Wikipedia (5 Mil
> articles)...
>
> I don't know anything public that is that big
>

With 5 million articles, if you take all the 1-, 2-, 3-, 4-, and 5-gram data out of them, you could easily hit more than 4B individual matrix entries.

-jake
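
A rough back-of-envelope check of that figure, as a sketch only: 4B entries spread over 5 million articles works out to 800 n-gram entries per article. The helper `ngram_occurrences` is a hypothetical name introduced here, and the calculation assumes every n-gram in an article is distinct, which makes the token count below a lower bound (real articles repeat n-grams and would need to be somewhat longer).

```python
def ngram_occurrences(num_tokens, max_n=5):
    """Count n-gram occurrences for n = 1..max_n in a document of num_tokens tokens.

    A document with L tokens contains L - n + 1 n-grams of length n.
    Assumption: all of them are distinct, so each one is its own matrix entry.
    """
    return sum(max(num_tokens - n + 1, 0) for n in range(1, max_n + 1))


NUM_ARTICLES = 5_000_000          # figure from the thread
TARGET_ENTRIES = 4_000_000_000    # "more than 4B" from the thread

# Average number of matrix entries each article must contribute.
entries_per_article = TARGET_ENTRIES // NUM_ARTICLES

# Smallest article length (in tokens) that yields that many 1- to 5-grams.
min_tokens = next(
    t for t in range(1, 10_000)
    if ngram_occurrences(t) >= entries_per_article
)

print(entries_per_article, min_tokens)
```

Under those assumptions, an average article length in the low hundreds of tokens is already enough to reach the 4B mark.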