[twitter-dev] Re: Twitter Data Dumps?

2009-10-26 Thread futureboy

Is there a way to filter based on some criteria? In particular, I want
the service to only return tweets that are geo-enabled, i.e. geo is
not null.

Right now I'm opening a connection to 
http://stream.twitter.com/1/statuses/sample.json
and it works fine, but I was hoping I could have some additional query
string like http://stream.twitter.com/1/statuses/sample.json?filter=geo.
Obviously, I can filter this out after I've received any tweet, but I
would imagine I'd get many more results if I asked the service to only
serve up geo-enabled tweets.

Thanks,
futureboy

On Oct 25, 5:43 pm, futureboy future...@gmail.com wrote:
 Hey everyone, thanks for the feedback, I'm receiving tons of data from
 the streaming API, very exciting!

 On Oct 21, 10:26 pm, futureboy future...@gmail.com wrote:



  Hi folks, I'm interested in doing some of my own twitter data mining
  and I was curious if Twitter posts any data sets covering 24 hour
  periods, or perhaps even longer intervals. For instance I'd love to
  have a full dump of all tweets over the month when Michael Jackson
  died, but even 24 hour dumps would be great. The format seems super
  simple, just a timestamp, a username, and the message. Multiply that
  by a few million records/day or whatever Twitter now experiences.

  Obviously this would be a decent amount of data although compression
  should certainly help. Does Twitter provide such data sets anywhere?
  If so, where and how can I access them?


[twitter-dev] Re: Twitter Data Dumps?

2009-10-26 Thread John Kalucki

We have some feature plans for geotagging in the streaming api but
they probably won't apply to the sampled streams. Just take the
streams and filter on your end. We don't mind.

Note that the geotagging feature is not yet enabled, but it will be
soon. For now, I think the only information you have is in the user's
profile.
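
A minimal sketch of the "filter on your end" approach, in Python. It assumes the stream delivers one JSON status per line and that a geotagged status carries a non-null "geo" field (the field name comes from the question above; the shape of its value in the sample data is only a guess, since the feature is not yet enabled). The function name geo_tagged and the sample lines are hypothetical, and no network connection is made here.

```python
import json
from typing import Iterable, Iterator


def geo_tagged(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded statuses whose "geo" field is present and not null."""
    for line in lines:
        line = line.strip()
        if not line:
            # Skip blank lines (assumed here to be keep-alives).
            continue
        try:
            status = json.loads(line)
        except ValueError:
            # Skip a truncated or malformed line rather than stopping.
            continue
        # Messages without a "geo" key (or with geo: null) are dropped.
        if isinstance(status, dict) and status.get("geo") is not None:
            yield status


# Hypothetical sample lines standing in for data read from the stream.
sample = [
    '{"text": "no location", "geo": null}',
    '',
    '{"text": "with location", "geo": {"type": "Point", "coordinates": [37.78, -122.4]}}',
    '{"delete": {"status": {"id": 1}}}',
]

for status in geo_tagged(sample):
    print(status["text"])
```

In practice the lines would come from whatever HTTP client holds the connection to the sample stream open; the filter itself does not care where they come from.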

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter.com

On Oct 25, 7:15 pm, futureboy future...@gmail.com wrote:
 Is there a way to filter based on some criteria? In particular, I want
 the service to only return tweets that are geo-enabled, i.e. geo is
 not null.

 Right now I'm opening a connection to
 http://stream.twitter.com/1/statuses/sample.json
 and it works fine, but I was hoping I could have some additional query
 string like http://stream.twitter.com/1/statuses/sample.json?filter=geo.
 Obviously, I can filter this out after I've received any tweet, but I
 would imagine I'd get many more results if I asked the service to only
 serve up geo-enabled tweets.

 Thanks,
 futureboy

 On Oct 25, 5:43 pm, futureboy future...@gmail.com wrote:

  Hey everyone, thanks for the feedback, I'm receiving tons of data from
  the streaming API, very exciting!

  On Oct 21, 10:26 pm, futureboy future...@gmail.com wrote:

   Hi folks, I'm interested in doing some of my own twitter data mining
   and I was curious if Twitter posts any data sets covering 24 hour
   periods, or perhaps even longer intervals. For instance I'd love to
   have a full dump of all tweets over the month when Michael Jackson
   died, but even 24 hour dumps would be great. The format seems super
   simple, just a timestamp, a username, and the message. Multiply that
   by a few million records/day or whatever Twitter now experiences.

   Obviously this would be a decent amount of data although compression
   should certainly help. Does Twitter provide such data sets anywhere?
   If so, where and how can I access them?


[twitter-dev] Re: Twitter Data Dumps?

2009-10-25 Thread futureboy

Hey everyone, thanks for the feedback, I'm receiving tons of data from
the streaming API, very exciting!

On Oct 21, 10:26 pm, futureboy future...@gmail.com wrote:
 Hi folks, I'm interested in doing some of my own twitter data mining
 and I was curious if Twitter posts any data sets covering 24 hour
 periods, or perhaps even longer intervals. For instance I'd love to
 have a full dump of all tweets over the month when Michael Jackson
 died, but even 24 hour dumps would be great. The format seems super
 simple, just a timestamp, a username, and the message. Multiply that
 by a few million records/day or whatever Twitter now experiences.

 Obviously this would be a decent amount of data although compression
 should certainly help. Does Twitter provide such data sets anywhere?
 If so, where and how can I access them?


[twitter-dev] Re: Twitter Data Dumps?

2009-10-21 Thread John Kalucki

Historical data is not available. Grab the /1/statuses/sample.format
stream from the Streaming API. Wait a few days and you'll have a
corpus to play with.
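
One way to accumulate such a corpus, sketched in Python under a few assumptions: the stream delivers one JSON status per line, and the lines are appended as-is to a gzip-compressed file. The names append_to_corpus and read_corpus, the file name, and the sample lines are all hypothetical; reading from the stream itself is left out.

```python
import gzip
import json
import os
import tempfile
from typing import Iterable, Iterator


def append_to_corpus(lines: Iterable[str], path: str) -> int:
    """Append each non-empty line to a gzip file; return how many were written."""
    written = 0
    # Mode "at" appends a new gzip member, so repeated runs extend the file.
    with gzip.open(path, "at", encoding="utf-8") as out:
        for line in lines:
            line = line.strip()
            if line:
                out.write(line + "\n")
                written += 1
    return written


def read_corpus(path: str) -> Iterator[dict]:
    """Decode the stored statuses one line at a time."""
    with gzip.open(path, "rt", encoding="utf-8") as src:
        for line in src:
            yield json.loads(line)


# Demo with made-up lines written to a temporary directory.
corpus_path = os.path.join(tempfile.mkdtemp(), "sample-corpus.jsonl.gz")
first = append_to_corpus(['{"text": "a"}', '', '{"text": "b"}'], corpus_path)
second = append_to_corpus(['{"text": "c"}'], corpus_path)
texts = [status["text"] for status in read_corpus(corpus_path)]
print(first, second, texts)
```

Storing the raw lines keeps every field the stream sent, so later analysis is not limited to the timestamp, username, and message.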

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.



On Oct 21, 7:26 pm, futureboy future...@gmail.com wrote:
 Hi folks, I'm interested in doing some of my own twitter data mining
 and I was curious if Twitter posts any data sets covering 24 hour
 periods, or perhaps even longer intervals. For instance I'd love to
 have a full dump of all tweets over the month when Michael Jackson
 died, but even 24 hour dumps would be great. The format seems super
 simple, just a timestamp, a username, and the message. Multiply that
 by a few million records/day or whatever Twitter now experiences.

 Obviously this would be a decent amount of data although compression
 should certainly help. Does Twitter provide such data sets anywhere?
 If so, where and how can I access them?


[twitter-dev] Re: Twitter Data Dumps?

2009-10-21 Thread JOHN OBRIEN

Futureboy,
We have historical hashtagged data at TwapperKeeper (#michaeljackson,
#iranelection, and many others) that can be exported for review.


http://twapperkeeper.com

If you have any questions, let me know.

v/r,
John
@jobrieniii


On Oct 21, 2009, at 10:26 PM, futureboy wrote:



Hi folks, I'm interested in doing some of my own twitter data mining
and I was curious if Twitter posts any data sets covering 24 hour
periods, or perhaps even longer intervals. For instance I'd love to
have a full dump of all tweets over the month when Michael Jackson
died, but even 24 hour dumps would be great. The format seems super
simple, just a timestamp, a username, and the message. Multiply that
by a few million records/day or whatever Twitter now experiences.

Obviously this would be a decent amount of data although compression
should certainly help. Does Twitter provide such data sets anywhere?
If so, where and how can I access them?