Re: [twitter-dev] Twitter Search API - Questions Regarding Scaling Out

2011-04-13 Thread Stuart Dallas
You may want to take a look at http://datasift.net/

-Stuart

-- 
Stuart Dallas
3ft9 Ltd
http://3ft9.com/

-- 
Twitter developer documentation and resources: http://dev.twitter.com/doc
API updates via Twitter: http://twitter.com/twitterapi
Issues/Enhancements Tracker: http://code.google.com/p/twitter-api/issues/list
Change your membership to this group: 
http://groups.google.com/group/twitter-development-talk


[twitter-dev] Twitter Search API - Questions Regarding Scaling Out

2011-04-11 Thread Corey Ballou
I tried speaking with Ryan Sarver directly, but he's forwarding me
here to the community advocates to answer. I believe this answer will
need to come top down from Twitter, as it's your rate limiting that
I'm most worried about.

I have a technical question for all of you regarding the Search
API, as I want to remain fully compliant. Currently, the old Search
API implementation (albeit slower) provides a fuller result set and
allows more flexibility in the types and combinations of searches
permitted. The way I have designed my application allows a
number of daemonized worker instances running on different IP
addresses to call the Search API on behalf of the stored
OAuth credentials to avoid rate limiting issues.

I had a conversation with the Pluggio developer in which he stated
Twitter had threatened to shut down his application if he didn't switch
to a different implementation of the Search API. The problem indicated
was that he was performing searches for multiple Twitter accounts,
which is exactly my use case. Site Streams does not make as much sense
for my application given the search queries I wish to perform and the
need for logical AND operations on geo-location.

Do you foresee any problems with my current method of using different
IP addresses to stay under the rate limit? I'm trying to stay in full
compliance with Twitter's TOS and would love to find the most
suitable and API-friendly solution. I know headway is being made
with Twitter's new search implementation, so I would like to stay ahead
of the curve and not get myself stuck in a box.

I still need a method for polling for new search results (say, every
30 minutes, depending on the pricing plan) for non-logged-in users.

Below is a scaled-down representation of how I'm currently handling
searches to help you decide the best plan of action:

1) Searches are performed on a rolling queue basis, say one search
every thirty minutes. There can be a finite number of searches per
Twitter user (say 5 searches per Twitter account). There can be any
number of Twitter accounts.
2) Search results are stored locally for retrieval by a JavaScript
AJAX long-poller every minute to check for frequent changes.
3) When a user visits the search results page and filters results, no
API calls to Twitter are made; only a local query is required.

Due to this process, the queue is constantly looking for the next
searches and mentions to perform. I foresee rate limiting concerns
cropping up with searches being performed for any number of users.
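
To make the three steps above concrete, here is a minimal sketch of the
rolling queue and the request volume it implies. The class and method
names are hypothetical, and the 30-minute interval and cap of 5 searches
per account are just the example figures from the list; no Twitter API
call is made here.

```python
import heapq
from dataclasses import dataclass, field

SEARCH_INTERVAL = 30 * 60      # seconds between runs of one search (example figure)
MAX_SEARCHES_PER_ACCOUNT = 5   # per-account cap (example figure)


@dataclass(order=True)
class ScheduledSearch:
    due_at: float                       # only field used for heap ordering
    account: str = field(compare=False)
    query: str = field(compare=False)


class RollingSearchQueue:
    """Hypothetical scheduler: each stored search comes due once per interval."""

    def __init__(self):
        self._heap = []
        self._per_account = {}

    def add(self, account, query, now):
        count = self._per_account.get(account, 0)
        if count >= MAX_SEARCHES_PER_ACCOUNT:
            raise ValueError(f"{account} already has {count} searches")
        self._per_account[account] = count + 1
        heapq.heappush(self._heap, ScheduledSearch(now, account, query))

    def pop_due(self, now):
        """Return every search due at `now` and reschedule it one interval later."""
        due = []
        while self._heap and self._heap[0].due_at <= now:
            due.append(heapq.heappop(self._heap))
        for item in due:
            heapq.heappush(
                self._heap,
                ScheduledSearch(now + SEARCH_INTERVAL, item.account, item.query),
            )
        return due

    def requests_per_hour(self):
        """Steady-state Search API calls per hour implied by the stored searches."""
        return len(self._heap) * 3600 / SEARCH_INTERVAL
```

With these figures, 1,000 accounts at 5 searches each would come to
10,000 Search API calls per hour in total, which is where the rate
limiting concern comes from.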

Can you steer me in the right direction to avoid shutdown notices or
access revocation?

Regards,

Corey
@cballou



Re: [twitter-dev] Twitter Search API - Questions Regarding Scaling Out

2011-04-11 Thread M. Edward (Ed) Borasky
I don't see an answer here, but I'll tell you how *I* would go about
implementing this:

1. Switch to the Streaming API. Using Search in an application puts a strain
on Twitter's servers and makes it difficult for Twitter to manage capacity.
That's why it's rate-limited and why the rate limits aren't publicly
disclosed.

2. If your application is a desktop application, use User Streams. If it is
a server, use User Streams on a desktop or the low-frequency free access to
Streaming on a server to prototype and develop. Your target for a server
will be Site Streams, but that's in closed beta at the moment IIRC.

3. *Concurrently with development*, your business development / sales /
marketing / planning people, or yourself, if it's a one-person shop, should
be negotiating with Twitter for access to Site Streams. I'm assuming an
agile development methodology (customer-in-the-loop), and one of the
parties that needs to be in the loop is Twitter for Site Streams. You simply
*can't* build an at-scale Twitter application without direct business
discussions with Twitter!
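
On the need for a logical AND on geo-location raised in the original
message: one option with a stream is to apply the AND locally after the
statuses arrive. A minimal sketch, assuming each status has already been
parsed into a dict with a `text` string and a `coordinates`
(longitude, latitude) pair. That dict shape and the function names are
illustrative assumptions, not Twitter's payload format.

```python
def in_bounding_box(coords, box):
    """coords is (lon, lat); box is (west, south, east, north)."""
    lon, lat = coords
    west, south, east, north = box
    return west <= lon <= east and south <= lat <= north


def matches(status, keywords, box):
    """True only if the status contains every keyword AND lies inside the box."""
    coords = status.get("coordinates")
    if coords is None:
        # Statuses without a location cannot satisfy the geo half of the AND.
        return False
    text = status.get("text", "").lower()
    has_all_keywords = all(k.lower() in text for k in keywords)
    return has_all_keywords and in_bounding_box(coords, box)


def filter_statuses(statuses, keywords, box):
    """Keep only the statuses that pass the combined keyword + geo test."""
    return [s for s in statuses if matches(s, keywords, box)]
```

The filtered results can then be stored locally, so the results page
still needs only a local query.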

-- 
http://twitter.com/znmeb http://borasky-research.net

A mathematician is a device for turning coffee into theorems. -- Paul
Erdős
