Hello Yin,
You may find this interesting:
https://github.com/unitedstates
Regards,
Mohammad Tariq
On Sat, Dec 8, 2012 at 3:25 AM, Chris Nauroth <[email protected]> wrote:
> Another suggestion is Google Books Ngrams:
>
> http://storage.googleapis.com/books/ngrams/books/datasetsv2.html
>
>
> On Fri, Dec 7, 2012 at 7:57 AM, Phillip Rhodes <[email protected]> wrote:
>
>> On Fri, Dec 7, 2012 at 10:48 AM, Harsh J <[email protected]> wrote:
>> >
>> > On Fri, Dec 7, 2012 at 8:31 PM, Yin Steve <[email protected]> wrote:
>> >> Hello, I'm Steve, and I need some raw big data for studying MapReduce
>> >> programming. Where can I find it? I'm especially interested in weblogs,
>> >> traffic info, etc. My English is not so good, so if you can give me a URL
>> >> that lets me download a big file directly, that would be great.
>> >> Waiting for your reply.
>>
>> Try some of the links off of this Quora thread:
>>
>>
>> http://www.quora.com/Data/Where-can-I-find-large-datasets-for-modeling-confidence-during-the-financial-crisis-which-is-open-to-the-public
>>
>> You might also try googling "Enron corpus". Or check out
>> CommonCrawl.org.
>>
>>
>> Phil
>>
>
>