I would like to do something deeper than surface-level processing of a user's incoming tweets. For this, I need to build a corpus of the user's friends_timeline covering, say, the past month or whatever period is computationally feasible. In practice, that means a large set of roughly 1-100 million tweets for someone following 100-1000 people. The full download would happen only once; after that, incremental downloads should suffice.
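To make the "one-time download, then incremental downloads" idea concrete, here is a minimal sketch of what I have in mind. It assumes the API offers some way to request only tweets newer than a given tweet ID, and that tweet IDs increase over time; both are assumptions on my part. `fetch_newer` is a hypothetical stand-in for whatever call actually returns a page of the timeline, and `make_fake_fetcher` is an in-memory fake used only so the sketch runs without network access.

```python
from typing import Any, Callable, Dict, List

Tweet = Dict[str, Any]  # illustrative shape, e.g. {"id": 42, "text": "..."}


def sync_corpus(corpus: List[Tweet],
                fetch_newer: Callable[[int], List[Tweet]]) -> int:
    """Append every tweet newer than the newest stored one; return the count added.

    `fetch_newer(since_id)` is hypothetical: it should return one page of
    tweets whose id is greater than `since_id`, or an empty list when
    nothing newer exists.
    """
    # With an empty corpus this starts from 0, so the first run is the
    # full one-time download; later runs only pull what is new.
    last_id = max((t["id"] for t in corpus), default=0)
    added = 0
    while True:
        batch = fetch_newer(last_id)
        if not batch:
            break
        corpus.extend(batch)
        added += len(batch)
        last_id = max(t["id"] for t in batch)
    return added


def make_fake_fetcher(timeline: List[Tweet],
                      page_size: int = 2) -> Callable[[int], List[Tweet]]:
    """Build an in-memory fake of the remote timeline (for illustration only)."""
    def fetch_newer(since_id: int) -> List[Tweet]:
        newer = sorted((t for t in timeline if t["id"] > since_id),
                       key=lambda t: t["id"])
        return newer[:page_size]  # oldest first, one page at a time
    return fetch_newer
```

A real version would also have to respect whatever rate limits and history limits the API imposes, which is part of what I am asking about.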
A corpus of that size would translate into a download of 100 MB-10 GB per user, and less for people who follow fewer or less active accounts. Does the Twitter API support this kind of corpus creation? It would also be very helpful for Natural Language Processing research if Twitter published a sample corpus of the public_timeline or of some selected users' timelines. I look forward to any help with this. Thanks.
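P.S. For reference, the 100 MB-10 GB figure comes from simple arithmetic: number of tweets times an average size per tweet. The sketch below assumes about 100 bytes per tweet, which is my own rough figure for the text alone (any metadata returned with each tweet would add to it), and uses decimal units (1 MB = 1,000,000 bytes).

```python
def corpus_size_bytes(num_tweets: int, bytes_per_tweet: int = 100) -> int:
    """Rough corpus size; the 100-byte default is an assumed average."""
    return num_tweets * bytes_per_tweet


def human(n_bytes: int) -> str:
    """Format a byte count using decimal units (1 KB = 1000 B)."""
    value = float(n_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if value < 1000:
            return f"{value:g} {unit}"
        value /= 1000
    return f"{value:g} TB"


print(human(corpus_size_bytes(1_000_000)))    # 100 MB
print(human(corpus_size_bytes(100_000_000)))  # 10 GB
```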
