I am a linguist at the University of Sydney currently studying the
language of microblogging. I would like to build a 100 million word
corpus of tweets. I am trying to determine the best way of collecting
such a corpus. Does Twitter make data available directly, or is the
only method to collect tweets through the API? (I am not a programmer
myself, although I do have access to a programmer who is able to use
the API.)

If I were to use the API, would rate limiting mean that it would
take ages to reach 100 million tweets?


