> To clarify, does this mean that each (non-protected) user has an equal
> probability of showing up in the stream regardless of how often they
> tweet?
Nope. The stream is a sample of statuses as they are posted. Each
status has an equal probability of being selected. This isn't a user
sampling
> That sample will be biased towards more active posters and may include
> some demographic biases due to seasonal activities during the limited
> time frame of the sample.
That answers my question, and that is what I was afraid of. I think
for my purposes (language detection), a random sample of
> I am doing some research using the Twitter API and I would like to get
> a random sample of Twitter users. Any ideas of how this can be
> accomplished?
Here's a start:
http://en.wikipedia.org/wiki/Sampling_(statistics)
At this point you are asking for a sampling method without providing an
ad
The Streaming API sample method would provide a random sampling of
public users weighted by update rate, not a random sampling of all
users. The default 'spritzer' should be sufficient for most uses.
-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.
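
To make "weighted by update rate" concrete, here is a rough sketch. It assumes each public status is selected independently with the same probability (the 1% rate and the per-user status counts below are made-up illustration values, not figures from the Streaming API). Under that assumption, a user with k statuses in the sampling window shows up with probability 1 - (1 - p)^k, so heavy posters are far more likely to appear than quiet ones.

```python
import random


def inclusion_probability(num_statuses, sample_rate):
    """Chance that a user with num_statuses posts has at least one
    status selected, assuming independent per-status sampling."""
    return 1.0 - (1.0 - sample_rate) ** num_statuses


def sample_users(status_counts, sample_rate, rng):
    """Simulate the status sample and return the set of users seen.

    status_counts maps a (hypothetical) user name to the number of
    statuses that user posted during the sampling window.
    """
    seen = set()
    for user, count in status_counts.items():
        # Each status is kept with probability sample_rate; the user is
        # observed if at least one of their statuses is kept.
        if any(rng.random() < sample_rate for _ in range(count)):
            seen.add(user)
    return seen


if __name__ == "__main__":
    # Hypothetical activity levels and an assumed 1% sample rate.
    counts = {"quiet": 1, "regular": 20, "heavy": 500}
    rate = 0.01
    for user, n in counts.items():
        print(user, round(inclusion_probability(n, rate), 4))
    print(sorted(sample_users(counts, rate, random.Random(0))))
```

If you need something closer to a uniform sample of users, you would have to correct for this, for example by weighting each observed user by the inverse of their estimated inclusion probability, which in turn requires knowing their update rate.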
On Oct 12, 8:01 am, Andrew Bad
Doesn't the Streaming API have a sampling method, "statuses/sample", for
statuses from which you can derive users? And don't the docs describe
this as "random," while specifying gardenhose access is required for
statistically significant samples?
∞ Andy Badera
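
As a rough illustration of deriving users from sampled statuses: the sketch below assumes the sample has already been saved as one JSON status per line, and that each status carries a nested "user" object with "id" and "screen_name" fields. The function name and the input format are assumptions for the example, not part of any Twitter client library. Keep in mind that the resulting user set is still biased towards frequent posters.

```python
import json


def unique_users(status_lines):
    """Collect distinct users from newline-delimited JSON statuses.

    Returns a dict mapping user id to screen name. Lines that are
    blank or carry no "user" object are skipped.
    """
    users = {}
    for line in status_lines:
        line = line.strip()
        if not line:
            continue  # blank line, nothing to parse
        message = json.loads(line)
        user = message.get("user")
        if user is None:
            continue  # not a status (assumed: some other notice type)
        # Keep the first screen name seen for each user id.
        users.setdefault(user["id"], user.get("screen_name"))
    return users


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    lines = [
        '{"text": "hello", "user": {"id": 1, "screen_name": "alice"}}',
        "",
        '{"text": "again", "user": {"id": 1, "screen_name": "alice"}}',
        '{"text": "hi", "user": {"id": 2, "screen_name": "bob"}}',
    ]
    print(unique_users(lines))
```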