[twitter-dev] Re: Getting screen_name from id without gazillion API calls?
The ideal solution is for Twitter to change the system and allow each account to have only one screen name, all the time, forever, with no changes. Then a separate id value is not required, because all account identification would be done by the original screen name. REST and SEARCH would finally be consistent. No extra calls to figure out who the user really is. Users would complain until they got used to the fact that they cannot change their screen names on a whim anymore, but they would learn to deal with it soon enough. Email doesn't let you change your address whenever you feel like it, and I see no reason why Twitter should allow screen name changes either ... except that it takes more work to standardize the system in this way than to continue with what already exists. But with only the screen name as each unique account identifier, things would certainly be much simpler. Many fewer requests to the server. Less data storage. And since Twitter is supposed to be simple, this seems like a goal worth pursuing, at least from my point of view. Owkaye

When I request friends (or followers) from the Twitter API I want to get the screen_names based on the ids. I use users/show for this, inputting the id and getting back the screen_name. This costs a LOT of API calls and I run into the API rate limit fast, especially with many friends. Is there a better way of getting screen_names for friends / followers? (Better, meaning in fewer API calls.)
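[Editor's note: given the one-call-per-id constraint described above, the usual mitigation is a local id-to-screen_name cache, so each id costs at most one users/show call no matter how often it reappears. A minimal sketch in Python; `fetch_screen_name` is a hypothetical stand-in for the real HTTP request, not an actual Twitter client.]

```python
class ScreenNameCache:
    """Cache id -> screen_name pairs locally so repeated lookups
    never hit the API twice for the same id."""

    def __init__(self, fetch_screen_name):
        self._fetch = fetch_screen_name  # callable: user_id -> screen_name (stand-in for users/show)
        self._cache = {}                 # user_id -> screen_name
        self.api_calls = 0               # count of real fetches performed

    def lookup(self, user_id):
        if user_id not in self._cache:
            self._cache[user_id] = self._fetch(user_id)
            self.api_calls += 1
        return self._cache[user_id]

    def lookup_many(self, user_ids):
        return [self.lookup(uid) for uid in user_ids]
```

With a follower list of 5 ids containing only 2 distinct users, this spends 2 API calls instead of 5. It does not remove the first-time cost, which is the thread's real complaint, but it bounds repeat cost.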
[twitter-dev] Re: Followers/screen_names API
You've just made a perfect argument for my suggestion that Twitter use ONLY unchangeable screen names (no more ids) for the whole system. :) Owkaye

I know there have been a ton of requests for a followers/screen_names API, or a friends/screen_names one for that matter. Right now the only way of getting all of a user's followers is with http://twitter.com/followers/ids.xml and that only returns the ids. There's no efficient way of getting the associated screen_names without doing hundreds/thousands/millions of calls or running into API rate limits. Twitter has rejected the creation of a followers/screen_names API due to performance issues/concerns. What if I or you want to present our app users with a human-readable list of their followers/friends? I believe the alternative is a much more performance-heavy approach for Twitter. What's to stop me from creating 1,000 (or more) unique users that my app/service uses to resolve ids into screen_names? That way I would have hundreds of thousands of API calls available each hour and could easily create a locally cached db of id-to-screen_name pairs. And of course I would have to recheck all of them every few days or so to account for screen_name changes, since there isn't an API for that either. All of this would result in millions of API calls a day, just to do something that Twitter could enable with one simple API... Hell, I could register a hundred thousand users, and create a service that maintains an id-to-screen_name pair db for Twitter's entire userbase and make it available to the dev community as a service to work around this issue... What do you think? Wouldn't it be much easier and beneficial to Twitter to enable this simple API that many of us have been asking for for so long now? I look forward to your thoughts... Michael
[twitter-dev] A few simple Search API questions
My app needs to retrieve tweets that match a specific phrase:

1- How many searches are allowed from a single IP address per hour? I'm thinking of doing one per minute; is that too many?

2- I cannot find examples of phrase-based searches in the API docs. Can someone post a working example of a curl search that requires a phrase match?

Owkaye
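[Editor's note: on question 2, the Search API of that era accepted an exact-phrase query by wrapping the phrase in double quotes and percent-encoding the whole value (`%22` for the quote, `%20` for the space). A sketch of building such a URL; the host and path reflect the old search.twitter.com endpoint and should be treated as an assumption of the period.]

```python
from urllib.parse import quote

def phrase_search_url(phrase, rpp=100):
    # Exact-phrase search: wrap the phrase in double quotes, then
    # percent-encode the entire q value (safe="" also encodes spaces
    # as %20 and quotes as %22).
    q = quote('"%s"' % phrase, safe="")
    return "http://search.twitter.com/search.atom?q=%s&rpp=%d" % (q, rpp)
```

The curl equivalent would then be: `curl 'http://search.twitter.com/search.atom?q=%22harry%20potter%22&rpp=100'`.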
[twitter-dev] Search API limits
How can I retrieve the maximum number of tweets in a search? Can rpp be set to more than 100? If I do not send an rpp value, does Twitter default to returning more than 100 per page? Owkaye
[twitter-dev] Re: Search API limits
The key to max search results isn't in paging or rpp, but in max_id.

Hi David, I do not understand how max_id can help me. If I want to get the 10,000 most recent tweets that match the phrase "michael jackson", changing the max_id value doesn't seem like it's going to help at all. In fact, it doesn't even make sense to use it when trying to retrieve the most recent tweets, does it?

Be careful what you ask for. Retrieval of everything available can take a long time (hours).

My understanding is that every request is limited to 100 tweets max, and this forces multiple requests when trying to retrieve more than 100 tweets. Am I wrong about this? Owkaye
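[Editor's note: to clarify David's point, max_id matters because new tweets keep arriving while you page. Anchoring every subsequent request below the lowest id already retrieved keeps the result window stable, where page numbers would drift. A sketch over a simulated, newest-first result set; `search` is a stand-in for the API, not a real client.]

```python
def search(tweets, rpp=100, max_id=None):
    """Stand-in for the Search API: one newest-first page of tweets
    with id <= max_id (all tweets when max_id is None)."""
    window = [t for t in tweets if max_id is None or t["id"] <= max_id]
    return window[:rpp]

def fetch_all(tweets, rpp=100):
    # First page fixes the anchor; each later request walks backwards
    # with max_id = (lowest id seen so far) - 1, so tweets posted
    # mid-crawl can never shift the pages underneath us.
    results, max_id = [], None
    while True:
        page = search(tweets, rpp=rpp, max_id=max_id)
        if not page:
            break
        results.extend(page)
        max_id = page[-1]["id"] - 1
    return results
```

So the 100-per-request limit still forces multiple requests; max_id just makes those requests consistent with each other.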
[twitter-dev] Re: All friends and followers of a Twitter user
Hi Peter, I got it working already, that was easy ... and FAST thanks to your help! Owkaye

friends/ids: http://apiwiki.twitter.com/Twitter-REST-API-Method%3A-friends%C2%A0ids
followers/ids: http://apiwiki.twitter.com/Twitter-REST-API-Method%3A-followers%C2%A0ids
[twitter-dev] Re: FW: Twitter is Suing me!!!
I surely hope people would not judge me based on who is following me. They won't unless they are stupid. After all, Twitter gives you no way to control who follows you, and most people understand this. Followers do no, zero, nada harm. Just let them be. Agreed. Owkaye
[twitter-dev] Re: FW: Twitter is Suing me!!!
Perhaps I'm being daft, but how can someone following you be spam or wrong, regardless of whether it is manual or auto follow? It can be spam if you had your account set to auto-notify your phone or inbox when someone follows you. You're wrong. SPAM only exists when you do NOT ask for commercial messages to be sent to you, and in this case clearly you are asking for it. Owkaye
[twitter-dev] Re: FW: Twitter is Suing me!!!
I surely hope people would not judge me based on who is following me. They won't unless they are stupid. After all, Twitter gives you no way to control who follows you, and most people understand this. sure they do. it's called blocking. every time a pain in the ass porn bot or social media expert following 100x more people than follow them follows me, i block them. then they can't follow me. I guess I care so little about who is following me that I never bothered to learn about this. Now that I know about it I still have no use for it, and I never will ... but it's good to know it's available to others I guess. Owkaye
[twitter-dev] Re: Following Churn: Specific guidance needed
Would be very helpful to know the definition of quick as relates to following churn suspensions. As Cameron pointed out earlier, as soon as they do that, the following churners will adjust their methods to be just inside that definition of OK. This seems like a really short-sighted reason for NOT clarifying what's acceptable and what's not. If it's acceptable then who cares if the churners adjust their methods? At least everyone will know how to avoid problems for a change, right?
[twitter-dev] Re: Following Churn: Specific guidance needed
If users paid due diligence to those they follow and only followed those people who demonstrate some value to them, follower churn would not exist. Period. Obviously they won't so maybe it's time to deal with reality rather than dreaming of a perfect world. Owkaye
[twitter-dev] Re: Following Churn: Specific guidance needed
Owkaye Would be very helpful to know the definition of quick as relates to following churn suspensions. As Cameron pointed out earlier, as soon as they do that, the following churners will adjust their methods to be just inside that definition of OK. This seems like a really short-sighted reason for NOT clarifying what's acceptable and what's not. The alternative is considerably more restrictive limits that globally apply so that any value up to the mythical X has little repercussion ... Well at least it's fair to everyone EQUALLY instead of possibly being prejudiced against certain users.
[twitter-dev] Re: API only shows messages from last 7 days
You're probably correct when you say that throwing more programmers at the problem is not the solution. That's not what I was suggesting ... My thought is that there may be no one at Twitter actually planning or developing historical data access, and if this is true then hiring someone with the skills and the desire to implement this in a practical manner would go a very long way towards providing people like us with a workable solution now.

Having said this, I agree that in the absence of enough people in the company who can be trusted to make wise decisions and accomplish a wide variety of projects all at the same time, it ends up becoming a priority issue. When there are too few people available to actually take charge and make progress on projects like the one we've been discussing in this thread, it all comes down to priorities -- and when those priorities focus on things we do not need, the things we really want are set aside and ignored, with no progress being made. In other companies money is a significant limiting factor, but I tend to question this at Twitter given all the reports of their financial condition, so I really think it's a priority issue in Twitter's case.

Now, if only someone at Twitter could see how important historical data access can be to real businesses, and how these businesses might be willing to pay for this data, then all it would take is to hire the right person to implement it. Twitter simply needs the money, the current ability to recognize the future value of such a project, and the commitment to make it happen ... and then they hire a leader who gets it done. Easier said than done of course, but there are excellent people available who can accomplish such goals when given the chance -- and the support they need from within the company of course. Then again, if these people are already working on it (as you may have suggested) then it's going to happen one of these days anyways ... 
:) Owkaye

I don't think that adding more people to the staff at Twitter is the solution. In one startup I saw a thing posted on the refrigerator that had the adage, Adding more people to a project already behind schedule will only slow it down more. Surely for support and customer service issues having more people on the team to deal with growth is good, but I doubt throwing more programmers at it will help fix most issues. It just never seems to work that way. While many startups do tend toward younger employees (I personally think because being younger normally means that you can work a lot with minimal life impact), I'm sure that someone with a strong background would be able to get a job at Twitter if they were local to the company (or willing to move). A lot of this surely comes down to priorities inside the company. While Doug and Team want to support us developers as much as possible, much of our initial 'value' that we've offered in helping push twitter to the masses has already happened. We aren't the core business strategy, and with a fixed amount of resources and focus they aren't working to push mainly for developer access, but for standard user access. This 100% makes sense. Users are what is going to make twitter happen, not 3rd party developers. They want to provide a stable experience on both fronts, but users come first. In my private discussions with some team members, I've gotten the sense that they have good stuff in the pipeline for us and that they are working hard to make it happen. However we're only a small part of the overall strategy of a quickly growing company that is still dealing with massive growing pains which is no fault of theirs and something they are dealing with as best they can. david

On Jul 28, 1:46 pm, owkaye owk...@gmail.com wrote: I'm sure others feel the same way Dave, but it looks and feels like Twitter is moving in the opposite direction. 
The load on a server to extract a big dataset once a month would be minimal, and both you and I can see the value in this approach. But I'm not sure the folks at Twitter do, or if they do maybe they just don't have the people who can (and will) get things like this implemented. Is a shortage of competent staff the cause of this type of problem? Even though I have the capabilities I do not have the 'resume' to get a job there and help them deal with some of this stuff, nor do I have the contacts within the Twitter organization to put a good word in for me and help me get hired so I could do good things for them. I'm 52 years old too, and my age seems to be a negative to most of the Web 2.x companies hiring these days. This is kind of a shame considering that people like me frequently have broader-based experience and insights that are sometimes lacking in younger people, and because of this we can add a lot more value in the areas of planning and structural development ...
[twitter-dev] Re: API only shows messages from last 7 days
I agree with you Dave. I have several thoughts about new services based on searching Twitter's historical data. Unfortunately my ideas appear to be getting less and less practical. Twitter claims to have all its data stored in disk-based databases from what I understand ... yet without access to this data it is worthless. It seems to me they could allow searches of this historical data via a new History API then let us cache the results on our own servers. Most of the services I've conceived would do this infrequently -- never in real time -- and would not impact their existing cached server data because this historical data would exist on separate data storage servers ... theoretically anyways. Owkaye

I am a bit concerned. I remember at one point it being between 30-45 days. Now it seems to be getting smaller by about 1 day per month. Last month it was closer to 10 days. Is it basically going to keep getting smaller and smaller until we get V2 of the API, or will we be forced to all use only streaming services and then locally cache everything that we'd want to search for any time period? I know there are a LOT of problems inherent in the massive scaling out of Twitter, and this is just a symptom of them -- but at the same time I can only imagine how unusable Google would be if you only had a 7-day window to Search in, and couldn't get any content made prior to that. Very worried about this soon being a 2-3 day window. dave
[twitter-dev] Re: API only shows messages from last 7 days
I'm sure others feel the same way Dave, but it looks and feels like Twitter is moving in the opposite direction. The load on a server to extract a big dataset once a month would be minimal, and both you and I can see the value in this approach. But I'm not sure the folks at Twitter do, or if they do maybe they just don't have the people who can (and will) get things like this implemented. Is a shortage of competent staff the cause of this type of problem? Even though I have the capabilities I do not have the 'resume' to get a job there and help them deal with some of this stuff, nor do I have the contacts within the Twitter organization to put a good word in for me and help me get hired so I could do good things for them. I'm 52 years old too, and my age seems to be a negative to most of the Web 2.x companies hiring these days. This is kind of a shame considering that people like me frequently have broader-based experience and insights that are sometimes lacking in younger people, and because of this we can add a lot more value in the areas of planning and structural development than people half our age. Our coding skills are honed after so many years of experience too, not to mention the thousands of code snippets we have collected over the years to contribute to making us even faster.

But since jobs like this are basically not open to me and many other folks my age, my alternative is to remain self-employed and try to build something on top of their existing available source data and API's ... and then deal with the issues and frustrations created when building a service on top of a 'moving target' that sometimes seems to be moving in funny directions. 
I hear about Twitter having lots of money to work with, and I'm probably wrong here but it almost seems like there's too little of this money being dedicated to paying new talent with long term views of some of these issues, and who will implement wise policies to help support and encourage rapid growth in the areas that are lacking. But once again this might just be due to a shortage of the right staff. Obviously we cannot do anything from the outside except point out these issues and ask questions, or beg and plead for changes, but it sure would be great if a few of us could actually get in there as employees and implement a couple of the new features we really need -- such as a new Historical Search API for example. Then developers like you and I could proceed with some of our plans now, instead of months or years from now ... or possibly never. I would love to lead a team on a project like this, or even be one of its members, but until it happens I'll focus on building my own little space in the Twitter universe and continue to hope for the best. :) Owkaye

I would do anything (including paying good amounts of money) to be able to purchase access to older datasets that I could transfer to my database through non-rest-api methods. I'm envisioning being able to download a CSV or SQL file that I could merge with my database easily, but only have to make a single request to the server to get a month of data. I'd sign agreements and pay money for such. dave

On Jul 28, 12:03 pm, owkaye owk...@gmail.com wrote: I agree with you Dave. I have several thoughts about new services based on searching Twitter's historical data. Unfortunately my ideas appear to be getting less and less practical. Twitter claims to have all its data stored in disk-based databases from what I understand ... yet without access to this data it is worthless. It seems to me they could allow searches of this historical data via a new History API then let us cache the results on our own servers. 
Most of the services I've conceived would do this infrequently -- never in real time -- and would not impact their existing cached server data because this historical data would exist on separate data storage servers ... theoretically anyways. Owkaye I am a bit concerned. I remember at one point it being between 30-45 days. Now it seems to be getting smaller by about 1-day per month. Last month it was closer to 10 days. Is it basically going to keep getting smaller and smaller until we get V2 of the API, or will we be forced to all use only streaming services and then locally cache everything that we'd want to search for any time period? I know there are a LOT of problems inherent in the massive scaling out of Twitter, and this is just a symptom of them- but at the same time I can only imagine how unusable Google would be if you only had a 7-day window to Search in, and couldn't get any content made prior to that. Very worried about this soon being a 2-3 day window. dave
[twitter-dev] Re: Updating the APIs authentication limiting policy
One solution to this problem is to add to each twitter account another private ID. Jim, Wouldn't it make more sense to implement this private id thing on your own server? My thought here is that your service should maintain its own database of users, and issue a unique private id for each of these users. Then when the visitor tries to login, your code can check to see if the private id the visitor has entered is in your own database. If so the person is allowed to login, and if not they get an error. Would this work to solve the problem, or am I missing something here? Owkaye
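[Editor's note: the own-database approach suggested above is simple enough to sketch: the service issues each user an opaque token at signup and checks it at login, with no Twitter call involved. All names here are illustrative, not any real Twitter mechanism.]

```python
import secrets

class PrivateIdStore:
    """Issue and verify per-user private ids entirely on our own server."""

    def __init__(self):
        self._ids = {}  # private_id -> username

    def issue(self, username):
        # 32 hex chars of cryptographic randomness; unguessable in practice.
        private_id = secrets.token_hex(16)
        self._ids[private_id] = username
        return private_id

    def check(self, private_id):
        # Returns the username if the id is known, else None.
        return self._ids.get(private_id)
```

A real deployment would persist the table in a database and hash the stored ids, but the login check itself is just this lookup.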
[twitter-dev] Re: Is it okay to close a connection by opening a new one?
The Streaming API docs say we should avoid opening new connections with the same user:pass when that user already has a connection open. But I'm hoping it is okay to do this every hour or so ... If you're only doing this every hour, that's fine by us. Great, thanks for the confirmation Alex! :)
[twitter-dev] Re: Is it okay to close a connection by opening a new one?
Why can't you do this entirely in your code? Why do you need to close the connection and reconnect? My software keeps the local data file open as long as the connection is open, so the connection must be closed before the file can be moved or deleted. Closing a file, moving it, and then creating a new file should be able to be done extremely fast ... I know, but these cannot be done while the connection is open, thus the need to close it. And since a new connection will need to be opened almost immediately anyways, the natural way for me to close it is to open a new one. JSON is a much better format to use. Not for me it isn't. My software has built-in XML parsing capabilities but it doesn't know how to deal with JSON data so XML is clearly the best way for me to go. Owkaye
[twitter-dev] Re: Safe url shorteners
Just wanted to let you guys know about a free service we're prototyping for shortening URL's that overcomes a few of the limitations of other shorteners. Only one problem with all these URL shorteners: when the companies creating them disappear, all their shortened URLs become orphans and therefore useless. Not a major problem on Twitter because of the typical transience of data, but when you run a company like mine that needs to reference historic data it will definitely create future problems when these companies fail. Just something for folks to consider ... Owkaye
[twitter-dev] How to track a phrase in Streaming API?
How do I track a phrase like harry potter? The docs only show how to track individual words, not phrases ... and this curl command doesn't work properly because it finds tweets with harry and not potter: curl -o /home/ken/twitterStreamJSON.txt http://stream.twitter.com/track.json -u username:password -d track=harry potter, Owkaye
[twitter-dev] Re: How to track a phrase in Streaming API?
How do I track a phrase like harry potter? The docs only show how to track individual words, not phrases ... and this curl command doesn't work properly because it finds tweets with harry and not potter: curl -o /home/ken/twitterStreamJSON.txt http://stream.twitter.com/track.json -u username:password -d track=harry potter, I think the problem is missing quotes and URL encoding. Try curl … -d track=harry+potter Thanks for the suggestion Matt but that doesn't work either. Any other ideas? Owkaye
[twitter-dev] Re: How to track a phrase in Streaming API?
Currently track works only on keywords, not phrases. This answers my question very clearly, thanks John! I'm storing the data in a local database anyways, so I can just do a phrase search of my data and delete the records I don't need. More data than necessary gets transmitted from Twitter this way, but I guess there's no way around it -- and for me the end result is the same anyways -- so it looks like I can proceed successfully now. Thanks again for everyone's help, I'll be back when I have new questions ... :) Owkaye
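[Editor's note: the workaround agreed on above -- track the individual keywords, then filter for the contiguous phrase locally before storing -- can be sketched like this. The tweet dicts are illustrative; a real stream delivers full status objects.]

```python
def matches_phrase(text, phrase):
    # Case-insensitive exact-phrase test, mirroring what a quoted
    # search would match.
    return phrase.lower() in text.lower()

def filter_stream(tweets, phrase):
    # track=harry,potter delivers the superset (any tweet with either
    # keyword); keep only tweets where the words form the phrase.
    return [t for t in tweets if matches_phrase(t["text"], phrase)]
```

As the post notes, more data crosses the wire than necessary, but the locally stored result set ends up identical to a true phrase track.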
[twitter-dev] Is it okay to close a connection by opening a new one?
The Streaming API docs say we should avoid opening new connections with the same user:pass when that user already has a connection open. But I'm hoping it is okay to do this every hour or so, here's why: My plan is to write the streaming XML data to a text file during each connection -- but I don't want this file to get so big that I have trouble processing it on the back end. Therefore I want to rotate these files every hour ... This means I have to stop writing to the file, close it, move it somewhere else, and create a new file so I can use the new file to continue storing new streaming XML data. The obvious way for me to close these files is to close the connection -- by opening a new connection -- because from what I've read it seems that opening a new connection forces the previous connection to close. Can I do this without running into any black listing or denial of service issues? I mean, is this an acceptable way to close a connection ... by opening a new one in order to force the old connection to close? Any info you can provide that will clarify this issue is greatly appreciated, thanks! Owkaye
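[Editor's note: as the earlier reply in this thread hints, the hourly file rotation can also be done entirely in code, without reconnecting: close the current file handle, move the file aside, and open a fresh one while the HTTP connection stays up. A minimal sketch, assuming the stream is written chunk by chunk; names are illustrative.]

```python
import os
import time

class RotatingWriter:
    """Append stream data to a file; rotate() closes the current file,
    moves it into an archive directory, and opens a fresh one --
    without touching the network connection at all."""

    def __init__(self, path, archive_dir):
        self.path = path
        self.archive_dir = archive_dir
        self._fh = open(path, "a")

    def write(self, chunk):
        self._fh.write(chunk)

    def rotate(self):
        # Close so buffered data is flushed, rename the full file away,
        # then reopen the same path for the next hour's data.
        self._fh.close()
        archived = os.path.join(
            self.archive_dir, "stream-%d.xml" % int(time.time()))
        os.rename(self.path, archived)
        self._fh = open(self.path, "a")
        return archived

    def close(self):
        self._fh.close()
```

Whether this is usable depends on the client software being able to swap file handles mid-stream, which is exactly what the poster's software apparently cannot do -- in which case the hourly reconnect (blessed by Alex above) remains the fallback.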
[twitter-dev] Re: How to insure that all tweets are retrieved in a search?
First, I wouldn't expect that thousands are going to post your promo code per minute. That doesn't seem realistic. Hi John, It's more than just a promo code. There are other aspects of this promotion that might create an issue with thousands of tweets per minute. If it happens and I haven't planned ahead to deal with it, then I'm screwed because some data will be missing that I really should be retrieving, and apparently I won't have any way to retrieve it later. Second, you can use the /track method on the Streaming API, which will return all keyword matches up to a certain limit with no other rate limiting. I guess this is what I need ... unless you or someone can reduce or eliminate the Search API limits. It really seems inappropriate to tie up a connection for streaming data 24 hours a day when I do not need streaming data. All I really need is a search that doesn't restrict me so much. If I had this capability I could easily reduce my promotion's impact on Twitter by 2-3 orders of magnitude. From my perspective this seems like something Twitter might want to support, but then again I do not work at Twitter so I'm not as familiar with their priorities as you are. Contact us if the default limits are an issue. I'm only guessing that they will become a problem, but it is very clear to me how easily they might become a problem. The unfortunate situation here is that *IF* these limits become a problem it's already too late to do anything about it -- because by then I've permanently lost access to some of the data I need -- and even though the data is still in your database there's no way for me to get it out because the search restrictions get in the way again. It's just that the API is so limited that the techniques I might use with any other service are simply not available at Twitter. 
For example, imagine this, which is a far better scenario for my needs: I run ONE search every day for my search terms, and Twitter responds with ALL the matching records no matter how many there are -- not just 100 per page or 1500 results per search but ALL matches, even if there are hundreds of thousands of them. If this were possible I could easily do only one search per day and store the results in a local database. Then the next day I could run the same search again -- and limit this new search to the last 24 hours so I don't have to retrieve any of the same records I retrieved the previous day. Can you imagine how much LESS this would impact Twitter's servers when I do not have to keep a connection open 24 hours a day as with Streaming API ... and I do not have to run repetitive searches every few seconds all day long as with Search API? The load savings on your servers would be huge, not to mention the bandwidth savings!!!

The bottom line here is that I hope you have people who understand this situation and are working to improve it, but in the meantime my only options appear to be: 1- Use the Streaming API, which is clearly an inferior method for me because a broken connection will cause me to lose important data without warning. 2- Hope that someone at Twitter can raise the limits for me on their Search API so I can achieve my goals without running thousands of searches every day.

As you can see I'm trying to find the best way to get the data I need while minimizing the impact on Twitter, that's why I'm making comments / suggestions like the ones in this email. So who should I contact at Twitter to see if they can raise the search limits for me? Are you the man? If not, please let me know who I should contact and how. Thanks! Owkaye
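[Editor's note: the one-search-per-day scheme described above amounts to remembering the highest tweet id already stored and asking only for newer tweets (what the Search API exposed as since_id). A simulated sketch; `search_fn` stands in for the API call.]

```python
class DailyCollector:
    """Incremental daily harvest: never re-download what we already have."""

    def __init__(self):
        self.stored = []      # locally cached tweets
        self.since_id = 0     # highest id already stored

    def run_daily_search(self, search_fn):
        # search_fn stands in for the Search API with a since_id
        # parameter: it returns only tweets with id > since_id.
        new = search_fn(self.since_id)
        self.stored.extend(new)
        if new:
            self.since_id = max(t["id"] for t in new)
        return len(new)
```

Each day's run retrieves only the delta, which is the bandwidth/load saving the post is arguing for; the open question in the thread is whether the server will hand back an unbounded delta at all.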
[twitter-dev] Re: How to insure that all tweets are retrieved in a search?
We tried allowing access to follower information in a one-query method like this and it failed. The main reason is that when there are tens of thousands of matches things start timing out. While all matches sounds like a perfect solution, in practice staying connected for minutes at a time and pulling down an unbounded size result set has not proved to be a scalable solution. Maybe a different data system would allow this capability. But you have the system you have so I understand why you've done what you've done. There is no way for anyone at Twitter to change the pagination limits without changing them across the board. This is too bad. Are you working on changing this in the future or is this going to be a limitation that persists for years to come? As a side note: The pagination limits exist as a technical limit and not something meant to stifle creativity/usefulness. When you go back in time we have to read data from disk and replace recent data in memory with that older data. The pagination limit is there to prevent too much of our memory space being taken up by old data that a very small percentage of requests need. Okay, this makes sense. It sounds like the original system designers never gave much consideration to the value of historical data search and retrieval. Too bad there's nothing that can be done about this right now, but maybe in the future ... ? The streaming API really is the most scalable solution. No doubt. It's disappointing that my software probably cannot handle streaming data too, but that's my problem not yours. Does anyone have sample PHP code that successfully uses the twitter Streaming API to retrieve the stream and write it to a file or database? I hate PHP but if it works then that's what I'll use, especially if some helpful soul can post some code to help me get started. Thanks. Owkaye
[twitter-dev] Re: How to insure that all tweets are retrieved in a search?
I concur with Matt. Track in the Streaming API is, in part, intended for applications just like yours. Hit the Search API and use track together to get the highest proportion of statuses possible. The default track limit is intended for human-readable-scale applications. Email me about elevated track access for services. I would use the Streaming API if I could, but now the problem is that my server side scripting language probably won't be able to use the Streaming API successfully ... My software hasn't been upgraded in years, and when it was first coded streaming data via http didn't even exist. The software has been upgraded once in a while over the past decade or so, but the last significant upgrade was more than 5 years ago and it didn't have anything added to allow streaming data access at that time, so I doubt it can handle this task now. I have an email request in to the current owners but I doubt they know how it works either. They never coded the original software or any of the upgrades. They just bought the software without possessing the expertise to understand the code, so they really don't know how it works internally either. My best guess is that it cannot write streaming data to a database as that data is transmitted, and that's what it needs to do if I have any chance of using the Streaming API instead of a search. So I'll probably have to use some other software to accomplish this task. Any suggestions which software I should use to make this as fast and easy to code as possible? It's possible that you are worrying about an unlikely event. Sustained single topic statuses in the thousands per minute are usually limited to things like massive social upheaval, big political events, celebrity death, etc. You may be correct, but to plan for the possibility that this may be bigger than expected is simply the way I do business. It doesn't make sense for me to launch a promo like this until I'm prepared for the possibilities, right? Owkaye
[twitter-dev] How to insure that all tweets are retrieved in a search?
I'm building an app that uses the Atom search API to retrieve recent posts which contain a specific keyword. The API docs say: "Clients may request up to 1,500 statuses via the page and rpp parameters for the search method." But these 1,500 hits per search cannot be fetched in a single request because of the rpp limit. Instead I have to perform 15 sequential requests, getting only 100 items back on each page, for a total of 1,500 items. This is certainly a good way to increase the server load, since 15 connections at 100 results each take far more server resources than 1 connection returning all 1,500 results. So I'm wondering if I'm misunderstanding something here, or if this is really the only way I can get the maximum of 1,500 items via Atom search?
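The 15-request sweep described above amounts to walking the `page` parameter while holding `rpp` at its maximum. A minimal sketch of the URL construction; the endpoint is the Atom search URL of that era, and the query value is a placeholder:

```python
from urllib.parse import urlencode

SEARCH_URL = "http://search.twitter.com/search.atom"  # Atom search endpoint
MAX_RESULTS = 1500  # documented cap per query
RPP = 100           # maximum results per page

def search_page_urls(query):
    # The 1,500-status cap can only be reached by walking rpp-sized
    # pages, so a full sweep is 1500 / 100 = 15 separate requests.
    pages = MAX_RESULTS // RPP
    return [
        SEARCH_URL + "?" + urlencode({"q": query, "rpp": RPP, "page": page})
        for page in range(1, pages + 1)
    ]

urls = search_page_urls("mykeyword")
```

A client would then fetch these URLs in order, stopping early once a page comes back with fewer than `rpp` entries.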
[twitter-dev] Re: How to insure that all tweets are retrieved in a search?
Thanks Chad, that's what I was afraid of. I wonder if you know the answer to this next question: the Twitter API docs say search is rate limited to something higher than the REST limit of 150 requests per hour, but for the sake of argument let's say the search rate limit is actually 150 requests per hour. Since I have to do 15 consecutive requests to make sure I've retrieved the last 1,500 matching items, does this mean I can only do 10 sets of 15 searches per hour, i.e. 150 requests per hour? If so, that's only one full set of searches every 6 minutes, and it seems to me that a trending topic might produce far more than 1,500 new tweets in 6 minutes.

How can I get around this limit? I'm not trying to hurt Twitter, but business applications that require ALL tweets to be recorded cannot deal with these kinds of limitations on a practical basis, and if Twitter doesn't come up with a better way, I can see this hindering its future revenue streams from businesses like mine that want to build on a solid and easy-to-use foundation.

So getting back to my question of what do I do now: do I have to put my automated search code on a bunch of separate servers so the IPs are spread around, such that none of them hits the limit of 150 searches per hour? It seems to me that this is the only realistic way to ensure I can always retrieve all the matching results I need without hitting the API limits, but if you or others have a better suggestion, please let me know. Thanks.

On Jul 9, 5:52 pm, Chad Etzel jazzyc...@gmail.com wrote:
Yep, you gotta do 15 requests at 100 rpp each. -Chad

On Thu, Jul 9, 2009 at 5:45 PM, owkaye owk...@gmail.com wrote:
I'm building an app that uses the Atom search API to retrieve recent posts which contain a specific keyword. The API docs say: "Clients may request up to 1,500 statuses via the page and rpp parameters for the search method." But these 1,500 hits per search cannot be fetched in a single request because of the rpp limit.
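The arithmetic behind the "one set of searches every 6 minutes" figure, assuming (as the poster does for the sake of argument) a 150-request hourly limit:

```python
def polling_budget(rate_limit_per_hour=150, pages_per_sweep=15):
    """How often a full 1,500-result sweep can run under a per-hour limit."""
    sweeps_per_hour = rate_limit_per_hour // pages_per_sweep  # 150 / 15 = 10
    minutes_between_sweeps = 60 / sweeps_per_hour             # 60 / 10 = 6
    return sweeps_per_hour, minutes_between_sweeps

sweeps, gap = polling_budget()
```

At 1,500 results per sweep and one sweep every 6 minutes, any topic producing more than 250 matching tweets per minute will outrun the sweep, which is exactly the gap the poster is worried about.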
[twitter-dev] Re: How to insure that all tweets are retrieved in a search?
You are correct, you have to do 15 requests. However, you can cache the results on your end, so when you come back, you are only getting the new stuff.

Thanks Scott. I'm storing the results in a database on my server, but that doesn't stop the search from retrieving the same results repeatedly, because the search string/terms are still the same. My problem is going to occur when thousands of people start tweeting my promo codes every minute and I'm not able to retrieve all those tweets because of the search API limitations. If I'm limited to retrieving 1,500 tweets every 6 minutes and people post 1,000 tweets every minute, I need some way of retrieving the missing 4,500 tweets. Apparently Twitter doesn't offer anything even remotely close to this capability, so I can see it has a long way to go before it's ready to support the kind of search capabilities I need.

Twitter has pretty good date handling, so you can specify your last date and pull forward from there. You may even be able to get the id of the last tweet you pulled, and just tell it to get you all the new ones.

Yep, that's what I'm doing: pulling the records I haven't already retrieved, based on the since_id value. But when the new tweets total more than 1,500 in a short time, the excess tweets get lost and there's no way to retrieve them, unless I run my searches from multiple servers to avoid Twitter's IP address limits, and doing that would be a real kludge that I'm not tempted to bother with.

-- Scott * If you contact me off list replace talklists@ with scott@ *
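The since_id approach Owkaye describes can be sketched with a stubbed fetch function standing in for the real Search API call; the timeline data and helper names here are invented for illustration:

```python
def poll_since(fetch, since_id):
    """Return only statuses newer than `since_id`, plus the new high-water mark.

    `fetch` is a stand-in for a search call that accepts a since_id and
    returns the matching statuses as dicts with an "id" key.
    """
    new = fetch(since_id)
    if new:
        # Advance the mark to the highest id seen so the next poll
        # skips everything already stored.
        since_id = max(s["id"] for s in new)
    return new, since_id

# Simulated timeline of seven matching tweets:
timeline = [{"id": i, "text": "tweet %d" % i} for i in range(1, 8)]
fake_fetch = lambda sid: [s for s in timeline if s["id"] > sid]

first, mark = poll_since(fake_fetch, 0)      # initial poll retrieves everything
repeat, mark = poll_since(fake_fetch, mark)  # nothing new on the next poll
```

This avoids re-fetching stored results, but as the thread notes it does not help when more than 1,500 new statuses arrive between polls: those beyond the cap are simply unreachable through the paged search interface.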