Another suggestion is Google Books Ngrams: http://storage.googleapis.com/books/ngrams/books/datasetsv2.html

On Fri, Dec 7, 2012 at 7:57 AM, Phillip Rhodes <[email protected]> wrote:

> On Fri, Dec 7, 2012 at 10:48 AM, Harsh J <[email protected]> wrote:
> >
> > On Fri, Dec 7, 2012 at 8:31 PM, Yin Steve <[email protected]> wrote:
> >> Hello, I'm Steve who need some raw big data for studying mapreduce
> >> programming. Where can i find them? especially those about weblog, traffic
> >> info etc. My English is not so well, if you can give me a URL which directly
> >> help me download the big file, That'll be great.
> >> Waiting for your reply......
>
> Try some of the links off of this Quora thread:
>
> http://www.quora.com/Data/Where-can-I-find-large-datasets-for-modeling-confidence-during-the-financial-crisis-which-is-open-to-the-public
>
> You might also try googling "Enron corpus". Or check out CommonCrawl.org.
>
> Phil
