David,

You can capture a sample of the statuses via the Streaming API and perform the analysis on that data set. The /gardenhose and /spritzer feeds exist precisely for this type of experimentation.

There's no practical way to get a copy of the full social graph. (Aside: It's hard enough for us to store and serve the SGS for internal purposes. The size and velocity alone make it, cough, cough, unwieldy.)
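As a rough illustration of capturing a sample for offline analysis, here is a minimal Python sketch. It assumes the feed arrives as one JSON-encoded status per line, with occasional blank keep-alive lines; that framing, the function names, and the output file name are assumptions for illustration, not a description of the actual endpoints. Connecting and authenticating are left out: the functions accept any iterable of text lines, such as the lines of an open HTTP response.

```python
import json
from itertools import islice
from typing import Iterable, Iterator


def parse_statuses(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per well-formed JSON object line.

    Assumption: the stream is newline-delimited JSON. Blank keep-alive
    lines and lines that fail to parse are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # keep-alive or empty line
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # truncated or garbled line
        if isinstance(obj, dict):
            yield obj


def capture_sample(lines: Iterable[str], out_path: str, limit: int) -> int:
    """Write up to `limit` statuses to `out_path`, one JSON object per line.

    Returns the number of statuses written.
    """
    saved = 0
    with open(out_path, "w", encoding="utf-8") as out:
        # islice stops pulling from the stream once `limit` items are taken.
        for status in islice(parse_statuses(lines), limit):
            out.write(json.dumps(status) + "\n")
            saved += 1
    return saved
```

Once a file such as `sample.jsonl` has been written, the analysis can run entirely on a laptop by feeding the file back through `parse_statuses(open("sample.jsonl", encoding="utf-8"))`.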
If you are interested in doing this sort of analysis full-time, apply for a job! We're a data-driven shop, and we're always crawling over the numbers.

-John Kalucki
Services, Twitter Inc.

On May 20, 3:36 pm, David W <[email protected]> wrote:
> Hi there,
>
> While working with the Twitter API last night, I found myself thinking
> of some crazy ideas for use of the full public timeline feed. Proving
> these ideas would be pretty simple given a sample of the timeline on
> my laptop, and so I was wondering if such a thing is available?
>
> Basically, I'd like a copy of about 24 hours worth of the equivalent
> of the XMPP feed from some arbitrary moment in time, perhaps with a
> snapshot of the social graph for the people that tweeted during that
> time frame. If something like this isn't already available for
> research purposes, I think it'd be a wonderful contribution on
> Twitter's part, perhaps even if some anonymization was applied
> (although this seems pointless given it *is* the public timeline).
>
> If nothing else, it'd allow people like me (hacker with a laptop and
> 4gb of RAM) to quickly come up with much cooler uses for the Twitter
> data. :)
>
> Thoughts?
>
> David.
