I'd like to get somewhere around 100 GB of tweets. It doesn't matter
where they are from, when they were sent, etc. I'd just like to have a
relatively large collection of data to use as assignment data for a
class I'm teaching that uses Hadoop.

Is such a collection available for download anywhere, or is there an
existing program I could use to simply record Twitter data for some
period of time? (I've heard about both the firehose and the streaming
API, but I can't seem to find anything that is ready to run for this
particular task, though I might not know where to look.)


Ted Pedersen

