[twitter-dev] Re: Bulk id - screen_name resolution.

2009-06-02 Thread Stuart

2009/6/1 Nick Arnett nick.arn...@gmail.com:


 On Sun, May 31, 2009 at 3:57 PM, Stuart stut...@gmail.com wrote:

 Much as I respect Twitter and the great people who work there, I don't
 buy that this would place too much demand on their servers. They
 already use Memcached extensively, and this would be a pretty simple
 addition to that data store.

 For that very reason, I'm not sure it makes sense for third parties to
 collaborate on a single-purpose distributed store.  There are user/account
 properties that Twitter won't implement, at least not until there's a lot of
 demonstrated value.  In other words, the developer community could
 collaborate on problems that have marginal value to Twitter in the short
 run.

I'm not suggesting that it would only be usable as an ID =>
screen_name repository. I'm suggesting that we could build our own
copy of the user data so we can provide API calls that Twitter don't
or won't. Clearly this is not ideal, but if there's no other choice I
definitely believe it's worth the effort.

At the end of the day it comes down to this: would you pay to have
higher API limits? Would Twitter be interested in providing higher
limits to paying developers?

At any rate, based on what Doug has just said it's probably not worth
doing anything until the new TOS are published, just in case it turns
out to be wasted effort.

-Stuart

-- 
http://stut.net/projects/twitter/


[twitter-dev] Re: Bulk id - screen_name resolution.

2009-06-01 Thread Nick Arnett
On Sun, May 31, 2009 at 3:57 PM, Stuart stut...@gmail.com wrote:


 Much as I respect Twitter and the great people who work there, I don't
 buy that this would place too much demand on their servers. They
 already use Memcached extensively, and this would be a pretty simple
 addition to that data store.


For that very reason, I'm not sure it makes sense for third parties to
collaborate on a single-purpose distributed store.  There are user/account
properties that Twitter won't implement, at least not until there's a lot of
demonstrated value.  In other words, the developer community could
collaborate on problems that have marginal value to Twitter in the short
run.

Nick


[twitter-dev] Re: Bulk id - screen_name resolution.

2009-06-01 Thread Doug Williams
There is currently nothing in our TOS preventing this type of project from
developing. However, we are writing our Terms for the API and at this time I
cannot speak of how a service redistributing our data will be classified.
So, proceed as you will, but be warned that we may have to have a discussion
down the road when the API Terms of Service better defines our relationship
with developers.
Thanks,
Doug
--

Doug Williams
Twitter Platform Support
http://twitter.com/dougw




On Mon, Jun 1, 2009 at 8:52 AM, Nick Arnett nick.arn...@gmail.com wrote:



 On Sun, May 31, 2009 at 3:57 PM, Stuart stut...@gmail.com wrote:


 Much as I respect Twitter and the great people who work there, I don't
 buy that this would place too much demand on their servers. They
 already use Memcached extensively, and this would be a pretty simple
 addition to that data store.


 For that very reason, I'm not sure it makes sense for third parties to
 collaborate on a single-purpose distributed store.  There are user/account
 properties that Twitter won't implement, at least not until there's a lot of
 demonstrated value.  In other words, the developer community could
 collaborate on problems that have marginal value to Twitter in the short
 run.

 Nick




[twitter-dev] Re: Bulk id - screen_name resolution.

2009-06-01 Thread Dossy Shiobara


On 6/1/09 6:59 PM, Doug Williams wrote:

There is currently nothing in our TOS preventing this type of project
from developing. However, we are writing our Terms for the API and at
this time I cannot speak of how a service redistributing our data will
be classified.

So, proceed as you will, but be warned that we may have to have a
discussion down the road when the API Terms of Service better defines
our relationship with developers.


Doug, any kind of rough timeline for such an API TOS?  Weeks?  Months? 
Years?


--
Dossy Shiobara  | do...@panoptic.com | http://dossy.org/
Panoptic Computer Network   | http://panoptic.com/
  He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on. (p. 70)


[twitter-dev] Re: Bulk id - screen_name resolution.

2009-05-31 Thread Dan Brickley


On 31/5/09 13:03, Stuart wrote:

Since there's clearly a lot of demand for this feature is it not
possible for it to be added to the official API? I'd hesitate before
building anything on top of Twitter that also relies on a third party
for something so basic.


Related suggestion: have a common REST API for external services that can
provide this information. You can probably get it from the Google Social
Graph API too, for example.


Dan


[twitter-dev] Re: Bulk id - screen_name resolution.

2009-05-31 Thread Philip Plante

I would like to hear a response from Twitter on the sharing of this
data.  My db has about 2 million active users, and I have another db
with 6 million or so I would gladly share.

Previously I think the response from Twitter was that they cannot
provide this as a bulk translation due to the demand it would place on
their servers.  They are able to provide the entire list of follower
IDs simply because that lives in memory and requires no joins.  The
joining to get this data would be too intensive for them.

If this is allowed maybe the community could take this a step further
and provide a common interface to share data like this.  Any thoughts?

On May 31, 1:42 pm, Dan Brickley dan...@danbri.org wrote:
 On 31/5/09 13:03, Stuart wrote:

  Since there's clearly a lot of demand for this feature is it not
  possible for it to be added to the official API? I'd hesitate before
  building anything on top of Twitter that also relies on a third party
  for something so basic.

 Related suggestion: have a common REST API for external services that can
 provide this information. You can probably get it from the Google Social
 Graph API too, for example.

 Dan


[twitter-dev] Re: Bulk id - screen_name resolution.

2009-05-31 Thread Stuart

2009/5/31 Philip Plante pplante@gmail.com:

 I would like to hear a response from Twitter on the sharing of this
 data.  My db has about 2 million active users, and I have another db
 with 6 million or so I would gladly share.

 Previously I think the response from Twitter was that they cannot
 provide this as a bulk translation due to the demand it would place on
 their servers.  They are able to provide the entire list of follower
 IDs simply because that lives in memory and requires no joins.  The
 joining to get this data would be too intensive for them.

 If this is allowed maybe the community could take this a step further
 and provide a common interface to share data like this.  Any thoughts?

Much as I respect Twitter and the great people who work there, I don't
buy that this would place too much demand on their servers. They
already use Memcached extensively, and this would be a pretty simple
addition to that data store.

Size-wise we're talking about no more than 50 bytes per user to store
a user ID to username mapping. Even at 100 million users that's less than 5
gig of memory, which I'm sure is pretty small compared to their
overall Memcached footprint. And as for load on the servers, each call
for up to 100 IDs would count as an API request, so it's unlikely this
method would add a huge amount to the existing usage.
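
As a rough check of the estimate above, the arithmetic can be written out in a few lines of Python. This is only a sketch: the 50-byte and 100-million figures come from the message, the function name is hypothetical, and Memcached's own per-item overhead is not modelled.

```python
def mapping_footprint_bytes(users: int, bytes_per_user: int = 50) -> int:
    """Upper bound on memory for an ID -> screen_name map.

    Both defaults are the figures quoted in the message, not measured values.
    """
    return users * bytes_per_user


total = mapping_footprint_bytes(100_000_000)
print(f"{total / 10**9:.1f} GB")   # decimal gigabytes
print(f"{total / 2**30:.2f} GiB")  # binary gibibytes
```

At exactly 50 bytes per entry this comes to 5 GB in decimal units, or a little under 5 in binary units; any smaller average entry size brings it below that.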

Clearly I don't know much about Twitter's architecture, but this seems
to me to be a pretty simple feature to implement, and relatively
cheap.

If Twitter won't implement it then maybe it's time to consider some of
us getting together to build a user cache. If enough of us get
together I'm sure we can build something that won't cost each of us
too much but will allow us to build the user API methods we need. I'd
hope that Twitter would be ok with this, and most of the useful data
could be kept up to date if they give us single user access to the
firehose. I'd be happy to lead such an effort.

-Stuart

-- 
http://stut.net/projects/twitter/


[twitter-dev] Re: Bulk id - screen_name resolution.

2009-05-31 Thread Philip Plante

Depending on the response from Twitter on this I would like to sponsor
this effort.  I have an architecture that will already support a larger
number of tweets, and have the resources at my disposal to scale it
out as it grows.

We will just have to wait for Twitter to weigh in on the sharing
aspect.  *crosses fingers*

On May 31, 5:57 pm, Stuart stut...@gmail.com wrote:
 2009/5/31 Philip Plante pplante@gmail.com:



  I would like to hear a response from Twitter on the sharing of this
  data.  My db has about 2 million active users, and I have another db
  with 6 million or so I would gladly share.

  Previously I think the response from Twitter was that they cannot
  provide this as a bulk translation due to the demand it would place on
  their servers.  They are able to provide the entire list of follower
  IDs simply because that lives in memory and requires no joins.  The
  joining to get this data would be too intensive for them.

  If this is allowed maybe the community could take this a step further
  and provide a common interface to share data like this.  Any thoughts?

 Much as I respect Twitter and the great people who work there, I don't
 buy that this would place too much demand on their servers. They
 already use Memcached extensively, and this would be a pretty simple
 addition to that data store.

 Size-wise we're talking about no more than 50 bytes per user to store
 a user ID to username. Even at 100 million users that's less than 5
 gig of memory, which I'm sure is pretty small compared to their
 overall Memcached footprint. And as for load on the servers each call
 for up to 100 IDs would count as an API request, so it's unlikely this
 method would add a huge amount to the existing usage.

 Clearly I don't know much about Twitter's architecture, but this seems
 to me to be a pretty simple feature to implement, and relatively
 cheap.

 If Twitter won't implement it then maybe it's time to consider some of
 us getting together to build a user cache. If enough of us get
 together I'm sure we can build something that won't cost each of us
 too much but will allow us to build the user API methods we need. I'd
 hope that Twitter would be ok with this, and most of the useful data
 could be kept up to date if they give us single user access to the
 firehose. I'd be happy to lead such an effort.

 -Stuart

 --http://stut.net/projects/twitter/


[twitter-dev] Re: Bulk id - screen_name resolution.

2009-05-30 Thread Dossy Shiobara


On 5/30/09 3:46 PM, David W wrote:

[... David asks about bulk resolving of Twitter user IDs to screen_name ...]


I don't know what the Twitter TOS says, but I've got a sizable cache of 
(reasonably fresh) Twitter user data thanks to Twitter Karma.


Would it be a Twitter TOS violation for me to publish an API to allow 
bulk resolution of IDs to screen_name?  Is this something that folks 
would use if I made it available?


--
Dossy Shiobara  | do...@panoptic.com | http://dossy.org/
Panoptic Computer Network   | http://panoptic.com/
  He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on. (p. 70)


[twitter-dev] Re: Bulk id - screen_name resolution.

2009-05-30 Thread David M. Wilson



On May 30, 11:28 pm, Dossy Shiobara do...@panoptic.com wrote:
 On 5/30/09 3:46 PM, David W wrote:

  [... David asks about bulk resolving of Twitter user IDs to screen_name ...]

 I don't know what the Twitter TOS says, but I've got a sizable cache of
 (reasonably fresh) Twitter user data thanks to Twitter Karma.

 Would it be a Twitter TOS violation for me to publish an API to allow
 bulk resolution of IDs to screen_name?  Is this something that folks
 would use if I made it available?

In response to your TOS question: Twitter as a company seem a whole lot
more liberal (and realistic) when it comes to their data. I think I
may have even read this somewhere semiofficial in the past. Profile
information itself is also available to the public, and so, keeping a
local cache is probably no more harmful (from Twitter's perspective)
than what happens when a search engine crawls a user's profile page.

Compare and contrast to Facebook's approach. :P


David.

 --
 Dossy Shiobara              | do...@panoptic.com |http://dossy.org/
 Panoptic Computer Network   |http://panoptic.com/
    He realized the fastest way to change is to laugh at your own
      folly -- then you can let go and quickly move on. (p. 70)


[twitter-dev] Re: Bulk id - screen_name resolution.

2009-05-30 Thread David M. Wilson

Hey Dossy,

This sounds awesome, and I'd be very tempted, however I've come up
with a solution which should cater for any size of user account.

Right now when I periodically (~24 hours) note a set of changes to the
social graph, I create an entry in a (persistent) ring buffer
recording the new graph state. Previously, I wanted to use my existing
uid-name cache and a bunch of API calls to resolve all the changed
IDs in this entry. The solution is really simple.

Instead, if there are more id-name calls required than there is quota,
I simply use up half the remaining quota resolving some names, and
reschedule the graph check for 1 hour's time rather than 24. The next
check will pick up where the previous one left off doing resolution,
until eventually the entire entry is resolved (and my cache is bigger
for the future:).

This also neatly breaks down the amount of work done for very large
sets of changes into chunks of at most 35 (quota/2) HTTP requests per
hour per user.
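
The scheme described above could be sketched roughly as follows. This is not David's actual code: `GraphEntry`, `plan_resolution`, `run_check` and the `resolve` callback are hypothetical names, and one API call per ID is assumed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

HOUR = 3600
DAY = 24 * HOUR


@dataclass
class GraphEntry:
    """One ring-buffer entry: IDs from a graph change that still lack a name."""
    unresolved: List[int] = field(default_factory=list)


def plan_resolution(entry: GraphEntry, remaining_quota: int) -> Tuple[List[int], int]:
    """Return (IDs to resolve now, seconds until the next graph check)."""
    if len(entry.unresolved) <= remaining_quota:
        # Everything fits in the quota: resolve it all, keep the normal cycle.
        return list(entry.unresolved), DAY
    # Otherwise spend only half the remaining quota and come back in an hour.
    budget = remaining_quota // 2
    return entry.unresolved[:budget], HOUR


def run_check(entry: GraphEntry, remaining_quota: int,
              resolve: Callable[[int], str], cache: Dict[int, str]) -> int:
    """Resolve one batch into the cache and return the delay before the next check."""
    batch, delay = plan_resolution(entry, remaining_quota)
    for uid in batch:
        cache[uid] = resolve(uid)  # assumed: one API request per ID
    # The next check picks up where this one left off.
    entry.unresolved = entry.unresolved[len(batch):]
    return delay


# Usage: 100 changed IDs and a remaining quota of 70 (a value chosen here only
# so that half of it matches the 35 requests per hour mentioned above).
entry = GraphEntry(unresolved=list(range(100)))
cache: Dict[int, str] = {}
delay = run_check(entry, 70, lambda uid: f"user{uid}", cache)
print(len(cache), len(entry.unresolved), delay == HOUR)
```

Keeping the planning step separate from the API calls makes the quota rule easy to test without touching the network.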


David.

On May 30, 11:28 pm, Dossy Shiobara do...@panoptic.com wrote:
 On 5/30/09 3:46 PM, David W wrote:

  [... David asks about bulk resolving of Twitter user IDs to screen_name ...]

 I don't know what the Twitter TOS says, but I've got a sizable cache of
 (reasonably fresh) Twitter user data thanks to Twitter Karma.

 Would it be a Twitter TOS violation for me to publish an API to allow
 bulk resolution of IDs to screen_name?  Is this something that folks
 would use if I made it available?

 --
 Dossy Shiobara              | do...@panoptic.com |http://dossy.org/
 Panoptic Computer Network   |http://panoptic.com/
    He realized the fastest way to change is to laugh at your own
      folly -- then you can let go and quickly move on. (p. 70)


[twitter-dev] Re: Bulk id - screen_name resolution.

2009-05-30 Thread Dossy Shiobara


On 5/30/09 7:06 PM, David M. Wilson wrote:

In comment to your TOS question: Twitter as a company seem a whole lot
more liberal (and realistic) when it comes to their data. I think I
may have even read this somewhere semiofficial in the past. Profile
information itself is also available to the public, and so, keeping a
local cache is probably no more harmful (from Twitter's perspective)
than what happens when a search engine crawls a user's profile page.

Compare and contrast to Facebook's approach. :P


Yeah, Facebook has very strict guidelines as to what you can cache, how 
long you can cache it, and I'm almost positive they have a 
no-redistribute policy.


I can totally understand Twitter not allowing third-party API consumers 
to redistribute data retrieved by the API - their valuation probably 
relies greatly on the number of requests (hits) they receive - if a 
third-party service adds a layer of indirection in front of Twitter, 
then that traffic is no longer hitting Twitter directly which makes them 
appear less active than they really are.


Can someone either point to the clause in the Twitter TOS that says a 
third-party application can redistribute Twitter data to other services 
directly, or can someone from Twitter issue an official statement to 
this effect?  Or, equally useful would be a statement that clearly 
states that this would be forbidden ... so I know not to waste my time 
even thinking about this.  :-)


--
Dossy Shiobara  | do...@panoptic.com | http://dossy.org/
Panoptic Computer Network   | http://panoptic.com/
  He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on. (p. 70)


[twitter-dev] Re: Bulk id - screen_name resolution.

2009-05-30 Thread jstrellner

We also have a sizable cache of this data (around 4 million users)
that we are already using in an API format internally at Twitturly. If
Twitter approves it, we can add it to our publicly available API.
Currently it allows conversion from both ID to username and username
to ID one at a time, and in bulk (up to 100 at a time).
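
A two-way lookup with a bulk limit, as described above, might look something like this minimal sketch. It is not Twitturly's implementation: the `UserCache` class and its method names are hypothetical, and case-insensitive matching of screen names is an assumption.

```python
from typing import Dict, List, Optional

MAX_BATCH = 100  # bulk limit quoted in the message above


class UserCache:
    """Hypothetical in-memory two-way map between user IDs and screen names."""

    def __init__(self) -> None:
        self._name_by_id: Dict[int, str] = {}
        self._id_by_name: Dict[str, int] = {}

    def put(self, user_id: int, screen_name: str) -> None:
        # Drop the stale reverse entry if this user was stored under another
        # name, but only if that name still points at this user.
        old = self._name_by_id.get(user_id)
        if old is not None and self._id_by_name.get(old.lower()) == user_id:
            del self._id_by_name[old.lower()]
        self._name_by_id[user_id] = screen_name
        # Assumption: screen names are matched case-insensitively.
        self._id_by_name[screen_name.lower()] = user_id

    def names_for(self, user_ids: List[int]) -> Dict[int, Optional[str]]:
        """Bulk ID -> screen name; unknown IDs map to None."""
        self._check(len(user_ids))
        return {uid: self._name_by_id.get(uid) for uid in user_ids}

    def ids_for(self, screen_names: List[str]) -> Dict[str, Optional[int]]:
        """Bulk screen name -> ID; unknown names map to None."""
        self._check(len(screen_names))
        return {n: self._id_by_name.get(n.lower()) for n in screen_names}

    @staticmethod
    def _check(count: int) -> None:
        if count > MAX_BATCH:
            raise ValueError(f"at most {MAX_BATCH} items per call, got {count}")


users = UserCache()
users.put(1, "alice")
users.put(2, "bob")
print(users.names_for([1, 2, 3]))
print(users.ids_for(["Alice", "carol"]))
```

Returning None for misses, rather than raising, lets a caller see in one pass which IDs still need a live API request.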

Doug or Alex, does the TOS allow us to provide this via our API?

-Joel

On May 30, 3:28 pm, Dossy Shiobara do...@panoptic.com wrote:
 On 5/30/09 3:46 PM, David W wrote:

  [... David asks about bulk resolving of Twitter user IDs to screen_name ...]

 I don't know what the Twitter TOS says, but I've got a sizable cache of
 (reasonably fresh) Twitter user data thanks to Twitter Karma.

 Would it be a Twitter TOS violation for me to publish an API to allow
 bulk resolution of IDs to screen_name?  Is this something that folks
 would use if I made it available?

 --
 Dossy Shiobara              | do...@panoptic.com |http://dossy.org/
 Panoptic Computer Network   |http://panoptic.com/
    He realized the fastest way to change is to laugh at your own
      folly -- then you can let go and quickly move on. (p. 70)