[twitter-dev] Re: Twitter Data Dumps?
Is there a way to filter based on some criteria? In particular, I want the service to return only tweets that are geo-enabled, i.e. geo is not null. Right now I'm opening a connection to http://stream.twitter.com/1/statuses/sample.json and it works fine, but I was hoping I could add a query string like http://stream.twitter.com/1/statuses/sample.json?filter=geo.

Obviously, I can filter on my end after I've received each tweet, but I would imagine I'd get many more results if I asked the service to serve up only geo-enabled tweets.

Thanks,
futureboy

On Oct 25, 5:43 pm, futureboy future...@gmail.com wrote:
Hey everyone, thanks for the feedback, I'm receiving tons of data from the streaming API, very exciting!

On Oct 21, 10:26 pm, futureboy future...@gmail.com wrote:
Hi folks,

I'm interested in doing some of my own twitter data mining and I was curious if Twitter posts any data sets covering 24 hour periods, or perhaps even longer intervals. For instance I'd love to have a full dump of all tweets over the month when Michael Jackson died, but even 24 hour dumps would be great.

The format seems super simple, just a timestamp, a username, and the message. Multiply that by a few million records/day or whatever Twitter now experiences. Obviously this would be a decent amount of data although compression should certainly help.

Does Twitter provide such data sets anywhere? If so, where and how can I access them?
[twitter-dev] Re: Twitter Data Dumps?
We have some feature plans for geotagging in the Streaming API, but they probably won't apply to the sampled streams. Just take the streams and filter on your end. We don't mind.

Note that the geotagging feature is not yet enabled, but it will be soon. For now, I think the only location information you have is in the user's profile.

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter.com

On Oct 25, 7:15 pm, futureboy future...@gmail.com wrote:
Is there a way to filter based on some criteria? In particular, I want the service to only return tweets that are geo-enabled, i.e. geo is not null. Right now I'm opening a connection to http://stream.twitter.com/1/statuses/sample.json and it works fine, but I was hoping I could have some additional query string like http://stream.twitter.com/1/statuses/sample.json?filter=geo.

Obviously, I can filter this out after I've received any tweet, but I would imagine I'd get much more results if I asked the service to only serve up geo-enabled tweets.

Thanks,
futureboy

On Oct 25, 5:43 pm, futureboy future...@gmail.com wrote:
Hey everyone, thanks for the feedback, I'm receiving tons of data from the streaming API, very exciting!

On Oct 21, 10:26 pm, futureboy future...@gmail.com wrote:
Hi folks,

I'm interested in doing some of my own twitter data mining and I was curious if Twitter posts any data sets covering 24 hour periods, or perhaps even longer intervals. For instance I'd love to have a full dump of all tweets over the month when Michael Jackson died, but even 24 hour dumps would be great.

The format seems super simple, just a timestamp, a username, and the message. Multiply that by a few million records/day or whatever Twitter now experiences. Obviously this would be a decent amount of data although compression should certainly help.

Does Twitter provide such data sets anywhere? If so, where and how can I access them?
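
For readers following along, here is a minimal sketch of the "filter on your end" approach suggested in this message. It assumes the stream delivers one JSON object per line and that a geo-enabled status carries a non-null "geo" field, as futureboy describes. The helper name geo_tagged, the sample records, and the shape of the geo value are illustrative assumptions, not taken from the thread or from Twitter's documentation.

```python
import json
from typing import Iterable, Iterator


def geo_tagged(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded statuses whose "geo" field is present and not null.

    `lines` is any iterable of text lines, for example the lines read
    from an open connection to the sample stream (not shown here).
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            status = json.loads(line)
        except ValueError:
            continue  # skip a truncated or malformed line
        # Keep only JSON objects that have a non-null "geo" value.
        if isinstance(status, dict) and status.get("geo") is not None:
            yield status


# Hypothetical records standing in for lines received from the stream.
sample = [
    '{"text": "hello", "geo": null}',
    "",
    '{"text": "at the park", "geo": {"type": "Point", "coordinates": [37.77, -122.42]}}',
    '{"text": "no geo field at all"}',
]

for status in geo_tagged(sample):
    print(status["text"])
```

Because the filter is a generator, it can sit directly on top of a long-lived connection and hand geo-enabled statuses to the rest of the program as they arrive. It cannot increase the number of geo-enabled tweets received; it only discards the rest of the sample.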
[twitter-dev] Re: Twitter Data Dumps?
Hey everyone, thanks for the feedback, I'm receiving tons of data from the streaming API, very exciting!

On Oct 21, 10:26 pm, futureboy future...@gmail.com wrote:
Hi folks,

I'm interested in doing some of my own twitter data mining and I was curious if Twitter posts any data sets covering 24 hour periods, or perhaps even longer intervals. For instance I'd love to have a full dump of all tweets over the month when Michael Jackson died, but even 24 hour dumps would be great.

The format seems super simple, just a timestamp, a username, and the message. Multiply that by a few million records/day or whatever Twitter now experiences. Obviously this would be a decent amount of data although compression should certainly help.

Does Twitter provide such data sets anywhere? If so, where and how can I access them?
[twitter-dev] Re: Twitter Data Dumps?
Historical data is not available. Grab the /1/statuses/sample.format stream from the Streaming API. Wait a few days and you'll have a corpus to play with.

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.

On Oct 21, 7:26 pm, futureboy future...@gmail.com wrote:
Hi folks,

I'm interested in doing some of my own twitter data mining and I was curious if Twitter posts any data sets covering 24 hour periods, or perhaps even longer intervals. For instance I'd love to have a full dump of all tweets over the month when Michael Jackson died, but even 24 hour dumps would be great.

The format seems super simple, just a timestamp, a username, and the message. Multiply that by a few million records/day or whatever Twitter now experiences. Obviously this would be a decent amount of data although compression should certainly help.

Does Twitter provide such data sets anywhere? If so, where and how can I access them?
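
One possible way to accumulate the corpus described in this message is to append each received line to a gzip-compressed file, since futureboy notes that compression should help with the volume. The sketch below assumes one status per line; the function name append_to_corpus, the file name, and the demo records are hypothetical, and the code that actually reads from the stream is left out.

```python
import gzip
import os
import tempfile
from typing import Iterable


def append_to_corpus(lines: Iterable[str], path: str) -> int:
    """Append non-empty lines to a gzip file, one status per line.

    Returns the number of lines written. Opening in append mode adds a
    new gzip member on each call, so collection can stop and resume.
    """
    written = 0
    with gzip.open(path, "at", encoding="utf-8") as corpus:
        for line in lines:
            line = line.strip()
            if line:
                corpus.write(line + "\n")
                written += 1
    return written


# Demo with hypothetical records written in two separate sessions.
demo_path = os.path.join(tempfile.mkdtemp(), "sample-corpus.json.gz")
append_to_corpus(['{"text": "first"}', "", '{"text": "second"}'], demo_path)
append_to_corpus(['{"text": "third"}'], demo_path)

# Reading back sees the lines from both sessions as one stream.
with gzip.open(demo_path, "rt", encoding="utf-8") as corpus:
    print(sum(1 for _ in corpus))
```

Keeping the raw lines rather than a reduced timestamp/username/message record is a deliberate choice in this sketch: the corpus can be re-filtered later on fields that were not of interest at collection time.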
[twitter-dev] Re: Twitter Data Dumps?
Futureboy,

We have historical hashtagged data at TwapperKeeper (#michaeljackson, #iranelection, etc.) and many others based upon hashtags that can be exported for review.

http://twapperkeeper.com

If you have any questions, let me know.

v/r,
John
@jobrieniii

On Oct 21, 2009, at 10:26 PM, futureboy wrote:
Hi folks,

I'm interested in doing some of my own twitter data mining and I was curious if Twitter posts any data sets covering 24 hour periods, or perhaps even longer intervals. For instance I'd love to have a full dump of all tweets over the month when Michael Jackson died, but even 24 hour dumps would be great.

The format seems super simple, just a timestamp, a username, and the message. Multiply that by a few million records/day or whatever Twitter now experiences. Obviously this would be a decent amount of data although compression should certainly help.

Does Twitter provide such data sets anywhere? If so, where and how can I access them?