[twitter-dev] Streaming API Basics ...
Dear Experts, I have been developing a Twitter application for quite a while now and have been using the Twitter Search API for my goals. Here is my business overview: I have over 20K subscribers, and their profiles contain their interest keywords, location, and other geographic information. I use OAuth for authentication and then get the following information for each subscriber:

1. Mentions (cache each mention locally)
2. Retweets (cache each retweet locally)
3. Search tweets for subscriber interests using their keywords, location, etc.

All these activities are performed periodically, and I use since_id when fetching mentions and retweets so that I keep historical data and do not lose any mention or retweet of a user. Now I have read the API documentation and can see that the Streaming API is the API most recommended by Twitter, and I want to convert my application to use it. As I understand it, with the default access level I can subscribe to the statuses/sample or statuses/filter method using any of my accounts (with basic authentication) and fetch whatever I want; since the API is event based, this is definitely going to be fast. A few questions, though:

1. What is the difference between the sample and filter methods? When should I use which?
2. What is the best approach to get retweets and mentions: tracking my subscribers' screen names, or specifying their user ids in the follow predicate?
3. If I have 20,000 subscribers, that means I have at least 20,000 screen names to track or follow, and if I have 3 keywords per subscriber on average, that makes 60,000 keywords to track as well. How do I manage this?
4. If any subscriber changes location or keywords, I have to reconnect to update the predicates, right? I have read the documentation and can follow the best practices, but I am unable to understand the logic of the count parameter. If any mentions or retweets are missing from my storage, what is the best approach to get them back?
5. How do I track or follow based on users' location?

So basically I am confused :) Any recommendations on where to go from here, or quick answers to the above, will help. I'll be grateful for any help. Regards, Alam Sher
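On questions 2 and 3, one common approach is to partition the user ids and keywords across multiple statuses/filter connections. The sketch below is illustrative only (the function name is mine, and the per-connection limits shown are the commonly cited defaults for the basic access level; verify them against your own account's access level):

```python
def build_filter_bodies(user_ids, keywords, max_follow=5000, max_track=400):
    """Partition follow ids and track keywords into POST bodies for
    statuses/filter, one body per streaming connection.

    max_follow/max_track are assumed default-access limits; check the
    limits granted to your account before relying on them."""
    bodies = []
    # Chunk the numeric user ids for the 'follow' predicate.
    for i in range(0, len(user_ids), max_follow):
        chunk = user_ids[i:i + max_follow]
        bodies.append({"follow": ",".join(str(u) for u in chunk)})
    # Chunk the keywords for the 'track' predicate.
    for i in range(0, len(keywords), max_track):
        bodies.append({"track": ",".join(keywords[i:i + max_track])})
    return bodies
```

With 20,000 ids and 60,000 keywords this yields several connections' worth of predicates, which is why elevated access levels (or trimming the keyword list) matter at that scale.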
[twitter-dev] Why does friends/ids.xml require auth for a protected user?
Even if a user is protected, you can see who they are following from the Twitter website. So why is authorization required to get a protected user's friends' ids?
Re: [twitter-dev] Re: Twitter Oauth Issues
Yes, actually a 10-minute difference may be enough to break OAuth. 2010/1/16 Mark McBride mmcbr...@twitter.com Is the system time on your machine correct? We've heard reports of issues when system clocks are wildly divergent from reality. ---Mark http://twitter.com/mccv On Fri, Jan 15, 2010 at 11:51 AM, Proxdeveloper prox.develo...@gmail.com wrote: Hey man, what do you mean by that? On Jan 13, 6:59 pm, Andrew Badera and...@badera.us wrote: Server timestamp difference? ∞ Andy Badera ∞ +1 518-641-1280 Google Voice ∞ This email is: [ ] bloggable [x] ask first [ ] private ∞ Google me: http://www.google.com/search?q=andrew%20badera On Wed, Jan 13, 2010 at 4:16 PM, Proxdeveloper prox.develo...@gmail.com wrote: Hello folks, I'm developing a twitter desktop client for Windows using the OAuth method, but for some reason I'm getting this error while requesting an access token: The remote server returned an error: (401) Unauthorized. This issue is only happening on my development PC; I've tried the app on other computers and Internet connections and it works great. I'm guessing this is happening because I make too many requests to twitter from the same computer. Could anyone help me with this issue? Thanks.
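To illustrate the clock-skew point above: a client can estimate its drift by comparing its local clock with the Date header of any server response. A minimal sketch (the function name is mine):

```python
import datetime
from email.utils import parsedate_to_datetime

def clock_skew_seconds(date_header, now=None):
    """Return local clock minus server clock, in seconds, using the Date
    header of an HTTP response. OAuth endpoints typically reject requests
    whose oauth_timestamp drifts by more than a few minutes."""
    server = parsedate_to_datetime(date_header)
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    return (now - server).total_seconds()
```

If the returned skew is more than a couple of minutes either way, fixing the system clock (or syncing via NTP) is the first thing to try before debugging signatures.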
[twitter-dev] Is there any way to get update profile background image to work with OAuth?
Is there any way to get the update profile background image method to work with OAuth? No one from the Twitter API team seems to be trying to help people out, either.
[twitter-dev] Best practice - Stream API into a FILE or MySQL or neither?
Just looking for thoughts on this. I am consuming the gardenhose via a PHP app on my web server. So far so good. The script simply creates a new file every X amount of time and starts feeding the stream into it, so I get a continuous stream of fresh data and can delete old data via cron. I plan to access the stream (files) with separate processes for further JSON parsing and data mining. But then that got me thinking about simply feeding the data into a MySQL database for easier data manipulation and indexing. Would that cause a more stressful server load with the constant INSERT queries vs. a process just dumping the data into a file [via PHP fputs()] that is perpetually open? What about simply running the PHP process and accessing the stream directly, only grabbing a snapshot of the data when a process needs it? I'm not really concerned with historical data, as my web-based app is more focused on trends at a given moment. Just wondering out loud if simply letting the process run in the background grabbing data would eventually fill up any caches or system memory.
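The create-a-new-file-every-X-seconds pattern described above can be sketched as a small rotating writer. This is a Python illustration of the idea, not the poster's PHP code; the class name and file naming are my own:

```python
import os
import time

class RotatingStreamWriter:
    """Append stream lines to a file, rotating to a new timestamped file
    every `interval` seconds, so old files can be deleted via cron."""

    def __init__(self, directory, interval=300, now=time.time):
        self.directory = directory
        self.interval = interval
        self.now = now          # injectable clock, eases testing
        self._fh = None
        self._opened_at = 0.0

    def write(self, line):
        t = self.now()
        # Rotate when no file is open yet or the current one is too old.
        if self._fh is None or t - self._opened_at >= self.interval:
            if self._fh:
                self._fh.close()
            path = os.path.join(self.directory, "stream-%d.json" % int(t))
            self._fh = open(path, "a")
            self._opened_at = t
        self._fh.write(line.rstrip("\n") + "\n")

    def close(self):
        if self._fh:
            self._fh.close()
```

Keeping one file handle perpetually open and rotating it is generally cheaper than a per-status INSERT, which is the trade-off the post is weighing.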
[twitter-dev] New Twitter VB.NET 2005 Source Code
Yeah, I developed an OAuth app using VB.NET. For the OAuth process I used a library recommended to me from this group. It's pretty good. The website for my app source code is www.twitterdesktop.net Even though there are binaries on this site, it would help if you downloaded the binaries from www.download.com! Search for CYK Desktop. There is the simple HCI version of CYK Desktop. This version was designed with Human Computer Interaction in mind and is really easy to use. Then there is the advanced version of CYK Desktop, which takes over the desktop! It's a pretty cool feature. There are some problems that can arise with the advanced version. 1. Firewalls need to accept the dumpurge.exe before logging into CYK Desktop Advanced. 2. On Vista / Windows 7, UAC (User Account Control) needs to be turned OFF before logging in. The source code is for CYK Desktop Advanced. The different versions are pretty similar under the hood. Okies, what I would LOVE is some feedback on my program in this thread by EVERY ONE!!! I recommend downloading the simple version to test because it does not hijack the desktop. But if you are technically minded, and want your mind blown, download the advanced version! Love in - peace out - Catcalls
[twitter-dev] Re: Best practice - Stream API into a FILE or MySQL or neither?
It's obviously going to depend on your configuration, time and hardware budget, but I think the basic grab-the-stream-to-timestamped-flat-files-and-post-process-later approach has a lot going for it. Especially on a Linux server, scripting languages are really good and efficient at the post-processing, and the tweets are coming in at such a high rate that you might well want to be using something other than a conventional RDBMS like MySQL or PostgreSQL for your data analysis and management anyhow. So why stuff things into MySQL, only to need to pull them out again for a MapReduce? My original design called for a process to sit on the streaming API and dump the tweets into PostgreSQL one at a time, but I ended up just collecting them into hourly JSON files, converting them to CSV with a simple Perl script, then putting them into PostgreSQL with the blazingly fast COPY operator. It doesn't take very long to build a sizable tweet database that way. And I can do filtering at the JSON or CSV level before anything even goes into PostgreSQL. On Jan 16, 10:13 am, GeorgeMedia georgeme...@gmail.com wrote: [snip]
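The hourly-JSON-to-CSV-to-COPY pipeline described above can be sketched in a few lines. This is an illustrative Python version, not the author's Perl script, and the chosen fields are assumptions (real statuses nest the author under a `user` object):

```python
import csv
import io
import json

def json_lines_to_csv(lines):
    """Flatten one-status-per-line JSON into CSV rows suitable for
    PostgreSQL's COPY. Fields here are illustrative; pick the columns
    your analysis actually needs and filter before loading."""
    out = io.StringIO()
    writer = csv.writer(out)
    for line in lines:
        if not line.strip():
            continue  # skip the blank keep-alive lines the stream emits
        status = json.loads(line)
        writer.writerow([
            status.get("id"),
            status.get("created_at"),
            status.get("user", {}).get("screen_name"),
            status.get("text"),
        ])
    return out.getvalue()
```

The resulting CSV can then be bulk-loaded with something like `COPY tweets FROM 'hourly.csv' WITH CSV`, which is far faster than row-at-a-time INSERTs.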
[twitter-dev] Re: Anyone using phirehose?
Same for us, George. But what are the alternatives? On Jan 15, 6:17 pm, GeorgeMedia georgeme...@gmail.com wrote: I'm looking for a solid PHP library to access the gardenhose and just wondering if anyone is successfully implementing this using phirehose. It seems to be the only one out there... This fairly dated code seems to work for random periods of time then stops. http://hasin.wordpress.com/2009/06/20/collecting-data-from-streaming-...
[twitter-dev] List of Common Error messages and possible causes, ie 'Failed to validate oauth signature and token'.
Hi, I've read the FAQ and all the documentation. I am attempting to get an AS3 client working using OAuth. I am getting the following error message: 'Failed to validate oauth signature and token'. I tried resetting my consumer key and secret, and also checked my system clock, which seems fine. After a quick search this seems to be a VERY common error message with many possible causes. Is there a list somewhere of common error messages such as this, with probable causes?
[twitter-dev] Failed to validate oauth signature and token
Ok, yes, this IS a common error message. I've read most of the posts, the entire OAuth beginner's documentation, registered my application, checked for capitalization, and checked my system clock. So far, no luck. As a base library I am using Sönke Rohde's open source Twitter library http://github.com/srohde/Twitter, though I might switch to Tweetr and see if I make better progress. This is my header: GET /oauth/request_token?oauth_consumer_key=C4eEz9MqGy28wuCj8hJC4w&oauth_nonce=0020a00%2001&oauth_signature=gX9Uk20RF70D6sxljfvcIK4szr4%3D&oauth_signature_method=HMAC-SHA1&oauth_timestamp=1263675366 HTTP/1.1 Also, I am testing from the desktop at the moment, so needing a proxy for security sandbox issues isn't a problem. Can anyone help with troubleshooting?
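For reference when debugging requests like the one above: OAuth 1.0a computes the signature over a base string built from the sorted, percent-encoded parameters (excluding oauth_signature itself), keyed by the consumer secret and token secret. A minimal sketch of that computation, assuming HMAC-SHA1:

```python
import base64
import hashlib
import hmac
from urllib.parse import quote

def oauth_signature(method, url, params, consumer_secret, token_secret=""):
    """Compute an OAuth 1.0a HMAC-SHA1 signature (RFC 5849 style sketch):
    percent-encode and sort every parameter except oauth_signature, join
    them into a base string, and sign with 'consumer_secret&token_secret'."""
    enc = lambda s: quote(str(s), safe="")
    pairs = sorted((enc(k), enc(v))
                   for k, v in params.items() if k != "oauth_signature")
    param_str = "&".join("%s=%s" % p for p in pairs)
    base = "&".join([method.upper(), enc(url), enc(param_str)])
    key = "%s&%s" % (enc(consumer_secret), enc(token_secret))
    digest = hmac.new(key.encode(), base.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()
```

A request also normally carries oauth_version and, after the request-token step, an oauth_token; any mismatch between the signed base string and what is actually sent (ordering, encoding, missing parameters) produces exactly the 'Failed to validate oauth signature and token' error.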
[twitter-dev] Re: Sent URLs received incompletely if not urlencoded - how to fix?
Well, the ? right after the / is no problem on your site. This is a link sent from Joomla. I use URL rewriting to shorten the URL and didn't activate appending 'topic.html' after the last / so far, so the URL I mentioned would look something like http://www.domain.xy/category/this-is-a-topic.html?key1=value1&key2=value2&key3=value3 when development is done. But this does not reference the main problem. I used to send every possible kind of URL, and only the urlencoded(url) version was received in its full length by Twitter. All non-urlencoded strings like the one mentioned above get cut off after value1 and look like http://www.domain.xy/category/this-is-a-topic.html?key1=value1 . So the rest, '&key2=value2&key3=value3', is getting lost even if the whole string is less than 140 chars long. I have no idea about that. Regards, tinobee
[twitter-dev] search api: best practice to capture all tweets.
I would like to capture and store all tweets that match a search query, and do so from this time forward. My first attempt was to query and store the matching results (tweets); additional queries include the parameter since_id set to the max id value already stored. However, the Search API does not seem reliable enough to code this way. I am missing tweets because apparently the API does not always return all matches for every query. Coding this way, if a tweet is missed but the next one is captured, the missed tweet will never be recovered, because the next one has a higher id. This is discussed here: http://groups.google.com/group/twitter-development-talk/browse_thread/thread/b7b6859620327bad/a31a88f8125c1c4e?lnk=gstq=search+api+store+#a31a88f8125c1c4e This is my code; I then just run it as a cron job once a minute. http://pastebin.com/f6207f43 So if this is not a reliable method, what is? I was thinking I could just remove the since_id parameter, which would return the 100 most recent results. Then, in my code, I could check whether each tweet was already stored and update/insert accordingly. If a tweet is missing from one query, maybe it will be there next time and will be added. However, this approach would fail if there were more than 100 results a minute; the script would not keep up. I really appreciate any advice.
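The overlap-and-dedupe idea described above (re-fetch a window that overlaps what is already stored, and insert only unseen ids) can be sketched like this; the function names and the dict-as-storage are illustrative assumptions:

```python
def upsert_tweets(store, results):
    """Merge a page of search results into local storage keyed by tweet id,
    so a tweet the index returned late is backfilled instead of lost.
    Returns the number of newly added tweets."""
    added = 0
    for tweet in results:
        if tweet["id"] not in store:
            store[tweet["id"]] = tweet
            added += 1
    return added

def next_since_id(store, overlap=100):
    """Instead of paging from the max stored id, page from `overlap` tweets
    back, leaving a window in which late-arriving tweets can still appear."""
    ids = sorted(store)
    return ids[-overlap] if len(ids) >= overlap else 0
```

This still has the limitation the post identifies: if more tweets match per minute than one page can hold, a single cron'd query cannot keep up, which is where the Streaming API becomes the better fit.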
Re: [twitter-dev] Failed to validate oauth signature and token
The signature needs to be the very last parameter. You put all of the parameters in order except for the signature. Then you create the signature and append it to the end of the query string. Ryan Sent from my DROID On Jan 16, 2010 9:48 PM, eco_bach bac...@gmail.com wrote: [snip]
Re: [twitter-dev] List of Common Error messages and possible causes, ie 'Failed to validate oauth signature and token'.
Going by your other email, your query string parameters are not in the correct order. This is a very important part of OAuth. Ryan Sent from my DROID On Jan 16, 2010 9:48 PM, eco_bach bac...@gmail.com wrote: [snip]
Re: [twitter-dev] Re: Sent URLs received incompletely if not urlencoded - how to fix?
Are you absolutely certain that the entire URL is being posted to Twitter? Is it possible that some filter is interpreting the "&" character and stripping off the remaining URL before you post it to Twitter? Do you have a log of what is being transmitted to Twitter? Are you transmitting through any proxies which could potentially be stripping the data off? Is Twitter the only site with which this problem is occurring? I can't reproduce the problem, including posting the URL you listed, but I am URL encoding "&" to "%26". By definition (see http://apiwiki.twitter.com/Things-Every-Developer-Should-Know#5Parametershavecertainexpectations) tweets are supposed to be URL encoded before transmitting to Twitter, so I don't understand what you mean by URL encoding. If you want the "&" to have meaning within your tweet (regardless of whether it's in a URL or just text), you MUST convert it to %26, otherwise it will appear to Twitter as a variable on par with source, geo, status and in_reply_to_status_id. If you are not URL encoding the tweet, then start doing so. -- -ed costello
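The encoding requirement described above is easy to demonstrate: percent-encoding the status value turns a literal `&` into `%26`, so it survives as tweet content instead of being parsed as a parameter separator. A small sketch (the helper name is mine):

```python
from urllib.parse import parse_qs, quote

def encode_status(status):
    """Build the status parameter for a POST body. quote(..., safe="")
    encodes '&' as %26, so an ampersand inside the tweet is not mistaken
    for the start of another POST parameter."""
    return "status=" + quote(status, safe="")
```

Without this encoding, a server parsing the body sees `key2=value2` and `key3=value3` as separate parameters and the tweet text is truncated after `value1`, exactly as reported in the thread.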
Re: [twitter-dev] Re: Anyone using phirehose?
I'd strongly suggest consuming the Streaming API only from persistent processes that write into some form of durable asynchronous queue (of any type) for your application to consume. Running curl periodically is unlikely to be a robust solution. Select one of the existing Streaming API clients out there and wrap it in a durable process. Write to rotated log files, a message queue, or whatever other mechanism you choose, to buffer the arrival of new statuses before consumption by your application. This will allow you to restart your application at will without data loss. -John Kalucki http://twitter.com/jkalucki Services, Twitter Inc. On Sat, Jan 16, 2010 at 10:27 AM, Jacopo Gio jacopo...@gmail.com wrote: [snip]
[twitter-dev] Re: Basic Auth Deprecation in June
On Jan 14, 8:30 am, twittme_mobi nlupa...@googlemail.com wrote: Hello, regarding the Basic Auth deprecation in June: is this announced anywhere? -- Hwee-Boon
[twitter-dev] Re: Failed to validate oauth signature and token
Solved. Apparently my oauth_nonce value was incorrect; I assumed it was simply a random string, and I didn't use the mx.utils.UIDUtil class to generate it. I'll also try switching the order so the signature is at the end.
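For anyone hitting the same issue: the nonce just needs to be a unique, URL-safe string per request (mx.utils.UIDUtil is the AS3 route mentioned above). A Python equivalent, as a sketch:

```python
import time
import uuid

def oauth_nonce():
    """A unique, URL-safe random string per request; a UUID4 hex string
    is a common, collision-resistant choice."""
    return uuid.uuid4().hex

def oauth_timestamp():
    """Seconds since the epoch, as a string, per OAuth 1.0a conventions."""
    return str(int(time.time()))
```

Problems usually arise when the nonce contains characters that are not consistently percent-encoded between the signed base string and the actual request.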
[twitter-dev] Re: Anyone using phirehose?
On Jan 16, 7:28 pm, John Kalucki j...@twitter.com wrote: [snip] I don't know that there are any open source libraries out there yet that are robust enough to do that. At the moment, I'm working exclusively in Perl, and AnyEvent::Twitter::Stream seems to be the only Perl Streaming API consumer with any kind of mileage on it. As you point out, real-time programming for robustness is a non-trivial exercise. It would be nice if someone would build a C library and SWIG .i files. ;-) -- M. Edward (Ed) Borasky http://borasky-research.net/smart-at-znmeb A mathematician is a device for turning coffee into theorems. ~ Paul Erdős
Re: [twitter-dev] Re: Anyone using phirehose?
Given a reasonable stack, it shouldn't be all that hard to build something robust. Our internal streaming client, which transits every tweet that you see on the streaming API, seems to work just fine through various forms of abuse, and it's, roughly, a few hundred lines wrapped around Apache HttpClient. On the other hand, I suspect that dependability is all but impossible on some stacks, or will require some heroism on the part of a library developer. As a community, we need clients that trivially allow robustness in a variety of stacks. We'll get there soon enough. On Sat, Jan 16, 2010 at 10:05 PM, M. Edward (Ed) Borasky zzn...@gmail.com wrote: [snip]
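The core of the "few hundred lines of robustness" being discussed is a reconnect loop with exponential backoff, which the Streaming API documentation of the era recommended. A sketch of the skeleton (the names and the OSError-only error handling are illustrative; `connect` and `handle` are caller-supplied):

```python
import time

def backoff_delays(initial=1.0, cap=240.0):
    """Yield an exponential backoff schedule: double the wait after each
    consecutive failure, up to a cap."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * 2, cap)

def consume_forever(connect, handle, sleep=time.sleep):
    """Wrap a connect() iterable of stream lines with reconnect logic.
    On a clean disconnect the backoff schedule restarts; on a network
    error we sleep for the next backoff delay before retrying."""
    delays = backoff_delays()
    while True:
        try:
            for line in connect():
                handle(line)
            delays = backoff_delays()  # healthy run: reset the schedule
        except OSError:
            sleep(next(delays))
```

A production client additionally distinguishes HTTP-level rejections (which call for much longer backoffs) from TCP-level drops, which is part of why a shared, battle-tested library per stack is so valuable.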
Re: [twitter-dev] Cursor Expiration
* John Kalucki j...@twitter.com [091209 09:28]: A cursor should be valid forever, but as it ages and rows are removed, you might see some minor data loss and probably more duplicates. Out of curiosity, what is a cursor? From our (the users') perspective, it's just an opaque number. But I'm curious. How is it generated? What does it represent internally? -Marc
[twitter-dev] DELETE list members API is rate limited
Hi, I'm @ono_matope. I found a bug in the Lists API and I want to report it. Even though the API documentation says the DELETE list members API is not rate limited, my DELETE list members requests like the following are rate limited. I sent the following DELETE request:

curl -u ono_matope:X -X DELETE -d id=17130681 http://api.twitter.com/1/ono_matope/hoge/members.json -i

And the response headers are:

HTTP/1.1 200 OK
Date: Thu, 14 Jan 2010 15:05:30 GMT
Server: hi
X-RateLimit-Limit: 150
X-Transaction: 1263481531-77276-28451
Status: 200 OK
ETag: 77550e3c9975d610529f85edff0913e9
Last-Modified: Thu, 14 Jan 2010 15:05:31 GMT
X-RateLimit-Remaining: 147
X-Runtime: 0.14168
Content-Type: application/json; charset=utf-8
Pragma: no-cache
Content-Length: 1109
X-RateLimit-Class: api
Cache-Control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0
Expires: Tue, 31 Mar 1981 05:00:00 GMT
X-Revision: DEV
X-RateLimit-Reset: 1263482054
Set-Cookie: lang=en; path=/
Set-Cookie: _twitter_sess=ABBR; domain=.twitter.com; path=/
Vary: Accept-Encoding
Connection: close

This request is rate-limited, so I can't rebuild my lists. I think this is a bug, and I hope it will be fixed.
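The X-RateLimit-* headers shown above are how a client can detect this situation and pause before exhausting its quota. A small helper sketch (the function name is mine; header matching is case-insensitive by design):

```python
def rate_limit_info(headers):
    """Extract the rate-limit headers from a response's header dict.
    'remaining' counts down toward 0; 'reset' is the epoch time at
    which the quota is restored."""
    h = {k.lower(): v for k, v in headers.items()}
    return {
        "limit": int(h.get("x-ratelimit-limit", 0)),
        "remaining": int(h.get("x-ratelimit-remaining", 0)),
        "reset": int(h.get("x-ratelimit-reset", 0)),
    }
```

Seeing X-RateLimit-Remaining decrease on each DELETE, as in the headers above, is exactly the evidence that the endpoint is being counted against the standard quota despite what the documentation says.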