[twitter-dev] Re: random sampling of users....do we know anything about user id range?

2009-06-05 Thread TechRavingMad

Yeah, it's definitely not users, just IDs.  Out of the (now) 44.8+ million
user IDs, about half are actual user accounts that haven't been deleted or
banned.

That's accounts, so multiple accounts from the same person are counted as
individual accounts.

Twitter IDs are sequentially assigned.  This is easily verifiable by
watching accounts being created via the API.
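
If IDs really are sequential and roughly half of them resolve to live
accounts, one way to draw a sample is to pick random IDs in the known range
and keep only the ones that exist. The following is a minimal sketch under
those assumptions, not a tested client: `sample_user_ids` and the `exists`
callback are hypothetical names, and `exists` stands in for whatever API
lookup tells you whether an ID belongs to a live account.

```python
import random


def sample_user_ids(max_id, n, exists, rng=None, max_attempts=None):
    """Rejection-sample up to n distinct live IDs from 1..max_id.

    exists(user_id) is a caller-supplied (hypothetical) lookup that returns
    True when the ID belongs to a live account.
    Returns (sorted list of accepted IDs, number of draws made).
    """
    rng = rng or random.Random()
    if max_attempts is None:
        # With about half of all IDs live, 20 draws per wanted ID is generous.
        max_attempts = 20 * n
    chosen = set()
    attempts = 0
    while len(chosen) < n and attempts < max_attempts:
        attempts += 1
        candidate = rng.randint(1, max_id)  # uniform over the whole ID range
        if candidate in chosen:
            continue  # already accepted, draw again
        if exists(candidate):
            chosen.add(candidate)
    return sorted(chosen), attempts


if __name__ == "__main__":
    # Stand-in lookup for the demo: pretend every even ID is a live account.
    ids, draws = sample_user_ids(44_800_000, 5, lambda uid: uid % 2 == 0)
    print(ids, draws)
```

Every live ID is equally likely to be drawn, so the result is a uniform
sample of accounts (not of people), whether or not those accounts ever post.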


[twitter-dev] Re: random sampling of users....do we know anything about user id range?

2009-06-04 Thread Nick Arnett
On Wed, Jun 3, 2009 at 7:13 PM, TechRavingMad techraving...@gmail.com wrote:

> There are a little over 44.5 million twitter IDs as of right now
> (10:10pm cst 6/3/9) with what seems to be about 10 being added every
> second.


However, Twitter has been quite clear about not saying if status IDs
correspond to the actual number of statuses, so I'd guess that they're
equally circumspect about whether or not the number of user IDs corresponds
to the number of users.  In other words, we can be sure there are not more
than 44.5 million users, but we don't know how much lower the actual number
is.  We don't know if all IDs have been used... and even Twitter doesn't
know how many of those IDs belong to the same users.

I would think that if one wants a random sample of users, one would have to
propose a selection method and ask Twitter if there's any reason that it
would introduce a selection bias... and hope that they are willing to reply.

Seems to me that the biggest problem would be to include quiet users,
since only those who post in public become visible.

Nick


[twitter-dev] Re: random sampling of users....do we know anything about user id range?

2009-06-04 Thread David Fisher

I'm hoping that Twitter counts users when reporting their numbers,
and not accounts. The reason being that I've signed up probably... 5
accounts myself (main, API testing, business, etc, etc). I'm not sure
how many the average user signs up, but it's definitely on average
more than 1.

I'm wondering if IDs are sequential for users after all. If I remember
right, not using system-generated primary keys puts more strain on the
system, since it has to check uniqueness and generate a number (not
hard, but still), and most of Twitter is all about scaling and speed
as I see it.

Otherwise, it seems that when creating a new user they take the last
ID and add an arbitrary number to it (1d20?) to get the next user
ID.
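
One way to check the "last ID plus a die roll" idea is to look at how densely
the ID space would be filled. The sketch below assumes each step is uniform
on 1..20 (a literal 1d20); `simulate_gapped_ids` is a hypothetical helper,
not anything Twitter is known to do. The mean step would be 10.5, so only
about 1 ID in 10.5 would ever be assigned.

```python
import random


def simulate_gapped_ids(n_accounts, max_step=20, seed=0):
    """Return the highest ID reached after n_accounts signups.

    Assumption for illustration: each new ID is the previous ID plus a
    uniform random step in 1..max_step.
    """
    rng = random.Random(seed)
    last_id = 0
    for _ in range(n_accounts):
        last_id += rng.randint(1, max_step)
    return last_id


n = 100_000
highest = simulate_gapped_ids(n)
density = n / highest  # fraction of the ID range that is actually assigned
print(f"highest ID: {highest}, fraction of IDs in use: {density:.3f}")
```

If a much larger share of randomly probed IDs resolves to accounts, as with
the "about half" figure reported in this thread, gaps of this size can be
ruled out.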

45M seems like a lot of users, but I could see there being that many
accounts, perhaps.


[twitter-dev] Re: random sampling of users....do we know anything about user id range?

2009-06-03 Thread TechRavingMad

There are a little over 44.5 million twitter IDs as of right now
(10:10pm cst 6/3/9) with what seems to be about 10 being added every
second.
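
For anyone sampling by ID, the growth rate matters because the upper bound
of the range keeps moving. A rough sketch of the arithmetic, taking the
"about 10 per second" figure at face value and assuming the rate stays
constant (`projected_max_id` is a hypothetical helper):

```python
IDS_PER_SECOND = 10          # approximate rate quoted above
SECONDS_PER_DAY = 24 * 60 * 60

# New IDs issued per day at the quoted rate.
ids_per_day = IDS_PER_SECOND * SECONDS_PER_DAY


def projected_max_id(base_id, seconds_elapsed, rate=IDS_PER_SECOND):
    """Estimate the highest issued ID some time after a known reading."""
    return base_id + rate * seconds_elapsed


print(ids_per_day)                              # 864000
print(projected_max_id(44_500_000, 60 * 60))    # 44536000, one hour later
```

At that rate the range grows by roughly 864,000 IDs per day, so a sampler
that fixes its upper bound once will steadily under-represent the newest
accounts.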