[twitter-dev] Re: list/statuses
Try using per_page=200 instead of count=100... it's a documentation error.

On May 31, 3:24 am, ogierepier wrote:
> I have already tried asking for 200 tweets, but the results stay the
> same because the api divides it in pages of 20 and you get the first
> page back. You used to be able to determine the results per page, but
> that doesn't seem to work anymore since the api was renewed. Is there
> a way I can maybe retrieve several pages in one call? Because looping
> through the pages means an exponential growth of my calls made to the
> API and I don't want to hit my limit.
>
> On 29 May, 17:08, Tom van der Woerdt wrote:
> > You can't. The 20 is the number of tweets received from Twitter's
> > database. It will then simply not send the ones which come from private
> > users, deleted ones (?), retweets, etc. If you want 20, ask for 50 and
> > limit it yourself.
> >
> > Tom
> >
> > On 5/29/11 11:20 AM, ogierepier wrote:
> > > Now I have a public list that includes private accounts. I'm
> > > retrieving the result by calling statuses.json. The list is followed
> > > by a few people. I have a gadget on my site which retrieves the latest
> > > statuses. The private tweets are left out, which is fine by me, but
> > > they're taking the place of the public tweets. Which means if, for
> > > example, the 20 latest tweets on the first page of the results contain
> > > 19 private tweets, you get only one tweet back. I do not want to remove
> > > the private accounts from the list because the people following the
> > > list can see these private tweets on twitter.com. How can I exclude the
> > > private tweets from the query so that my results of the latest 20
> > > tweets contain 20 public statuses?
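For anyone else landing here, this is roughly what the working call looks like -- a minimal C# sketch; "stltweets" and "tech" are placeholder owner/slug values, and the endpoint shape is from memory, so check the docs for your case:

    using System;
    using System.Net;

    class ListStatusesExample
    {
        static void Main()
        {
            // per_page (not count) is the parameter list/statuses actually honors
            const string url =
                "http://api.twitter.com/1/stltweets/lists/tech/statuses.json?per_page=200";
            using (var client = new WebClient())
            {
                string json = client.DownloadString(url); // up to 200 statuses, not 20
                Console.WriteLine("{0} bytes of statuses", json.Length);
            }
        }
    }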
[twitter-dev] Throw us a bone?
With the recent outage at Amazon, we lost 3 days of database server... Everything is back running and I've back-filled what I can, but the Search API is only returning back as far as 4/26. This leaves us a huge gap for tweets that we don't get on lists or timelines, from 4/21 0700 UTC to 4/26ish.

Please, could you consider a one-time effort to enable back-filled searches going back to 4/21? I'm sure there are others that sorely need this sort of back-fill.

(Also, we still can't use Site Streams because it STILL doesn't return tweets from tweeps with a matching profile location for geocode searches.)

Help,
Marc Brooks
http://stltweets.com
http://stlindex.com
[twitter-dev] Re: Quick question on iPhone tweets
> There's actually a much easier way for you to implement the simple ability
> to Tweet from your application without having to code up the OAuth song &
> dance or xAuth, but the frictionless approach comes with the downside of
> less control, attribution, and feedback.

I second this recommendation, especially as Twitter has left us stranded for going on three weeks now with the OAuth dance being broken.
[twitter-dev] Re: Can you search for tweets linking to any page within a domain?
> Say I have a website at http://instantwatcher.com and I want to search
> for all tweets, including ones condensed by TinyURL, that link to any
> URL within this domain. Is this possible, and if so how can I do it?

You basically have to suck the firehose (or use search or whatever) to get all the tweets, then unshorten (& canonicalize) the links to build your OWN search index. I have a huge system that does this for the BuzzRadius-based sites (like http://STLTweets.com or http://loufesttweets.com ). It's not trivial, and there are lots of gotchas to watch for, like Facebook login gateways and such.

That said, the best simple unshortener in my experience (beyond just crawling the link) is UnTiny.me http://www.untiny.me/ (the API is fast and very reliable).
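If you'd rather roll the unshortening yourself, the core loop is just chasing Location headers by hand. A minimal C# sketch -- none of the frame-busting, rel="canonical" handling, or utm stripping that a real canonicalizer needs:

    using System;
    using System.Net;

    static class Unshortener
    {
        // Follow redirects manually until we land on a non-redirect response,
        // capped to avoid redirect loops.
        public static string Resolve(string url, int maxHops)
        {
            for (int hop = 0; hop < maxHops; hop++)
            {
                var request = (HttpWebRequest)WebRequest.Create(url);
                request.Method = "HEAD";            // we only care about headers
                request.AllowAutoRedirect = false;  // surface each Location hop
                using (var response = (HttpWebResponse)request.GetResponse())
                {
                    int code = (int)response.StatusCode;
                    if (code < 300 || code >= 400)
                        return url;                 // not a redirect: this is the destination
                    string location = response.Headers["Location"];
                    if (string.IsNullOrEmpty(location))
                        return url;
                    url = new Uri(new Uri(url), location).AbsoluteUri; // handles relative redirects
                }
            }
            return url; // gave up; treat the last hop as canonical
        }
    }

Call Unshortener.Resolve(rawLink, 10) on each raw link and group by the result to build the index.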
[twitter-dev] 401 Unauthorized responses on OAUTH
We're getting a ton of 401 errors when people are trying to OAuth against some of our sites. These sites have been in production for years (and one new one went up yesterday). When we get the error, we get no message in the response. From the client perspective, it happens when you click the "Allow" button and Twitter redirects back to us.

I've checked all the usual things:

1) Server clock is synced correctly to NIST time (and the server runs in UTC, so no timezone/DST issues).
2) The servers haven't had any recent patches.
3) The same applications were working fine and haven't been changed (except the new site).
4) We get the same issues no matter what user we're logged into Twitter as.
5) We get the same issues whether running from the Amazon EC2 instance (IP whitelisted), our QA servers (also IP whitelisted), or development machines (not whitelisted).
6) Occasionally (1 in 20 or worse) we get a success.
7) Nonce values are NOT being reused and we're (still) using DotNetOpenAuth for the library to handle that part (no change).
8) Happens on all of these:
   http://stlindex.com (application under @STLIndex)
   http://stltweets.com (application under @STLTweets)
   http://loufesttweets.com (application under @LouFestTweets)
   http://taste.stltweets.com (application under @STLTweets)

Typical failure:

REQUEST Headers: (https://twitter.com:443/)
Authorization: OAuth oauth_verifier="fRSn84gupR7TFAW5G5ySm4c2LmuvD9x8ZckCHIEA", oauth_token="MfgvKyS4Vgxy8c1kNgw7h3owkpAlzdqG223DTIs8vc", oauth_consumer_key="MliXkE6e4kCJY2U10OH8sQ", oauth_nonce="gkJy165f", oauth_signature_method="HMAC-SHA1", oauth_signature="9tRuLd55El37hJ2fqJs2cJVREaM%3D", oauth_version="1.0", oauth_timestamp="1300464048"
User-Agent: DotNetOpenAuth/3.4.5.10201
Host: twitter.com

RESPONSE:
Status: 401 Unauthorized
X-Transaction: 1300464048-3423-38581
X-Runtime: 0.00544
Pragma: no-cache
X-Revision: DEV
Content-Length: 1
Cache-Control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0
Content-Type: text/html; charset=utf-8
Date: Fri, 18 Mar 2011 16:00:48 GMT
Expires: Tue, 31 Mar 1981 05:00:00 GMT
Last-Modified: Fri, 18 Mar 2011 16:00:48 GMT
Set-Cookie: k=208.82.145.5.1300464048881173; path=/; expires=Fri, 25-Mar-11 16:00:48 GMT; domain=.twitter.com,guest_id=13004640478451; path=/; expires=Sun, 17 Apr 2011 16:00:48 GMT, _twitter_sess=BAh7CDoPY3JlYXRlZF9hdGwrCPmasskuAToHaWQiJWFkNDcxMzE2Yjg1YmIy%250ANDkzMGFkMWI3YmM5NTZlNDA5IgpmbGFzaElDOidBY3Rpb25Db250cm9sbGVy%250AOjpGbGFzaDo6Rmxhc2hIYXNoewAGOgpAdXNlZHsA--1c8558a834ffe3d40ae9be1bed2360f83555f5ae; domain=.twitter.com; path=/; HttpOnly
Server: hi
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Vary: Accept-Encoding
[twitter-dev] Re: twitter dump where i only care about size
So Taylor or Matt or someone, can I do this sort of trailing historical export, or would that be in violation of the republish rule? I don't want to bring the FailWhale down on us.

Marc

On Mar 6, 9:43 pm, Ted Pedersen wrote:
> Thanks very much! I don't know the ins and outs of Twitter's data
> distribution rules, but my intent is to use this strictly for
> classroom assignments and we will not post or distribute the data in
> any way.
>
> Cordially,
> Ted
>
> On Sun, Mar 6, 2011 at 8:55 PM, @IDisposable wrote:
> >> Is such a collection available for download anywhere, or is there an
> >> existing program I could use to simply record twitter data for some
> >> period of time? (I've heard about both the firehose and the streaming
> >> API, but can't seem to find anything that is ready to run with that
> >> for this particular task, but I might not know where to look.)
> >
> > I could possibly export the STLTweets database in JSON format, but I
> > don't want to run afoul of Twitter policies... We've got 47MM tweets
> > from 600K tweeps...
>
> --
> Ted Pedersen http://www.d.umn.edu/~tpederse
[twitter-dev] Re: Bigger avatar images for users/profile_image/twitter ?
> I want to use users/profile_image/twitter to get the picture of a
> Twitter account. But I've seen the biggest size allowed is 73*73px. Is
> there a way to get the original picture or a bigger one?

Avatars come in four sizes:

mini = 24x24
normal = 48x48
bigger = 73x73
reasonably_small = 128x128

The last one only started being used with #NewTwitter, so unless someone has uploaded an avatar since then, you'll get the "bigger" one instead.

To get the alternate sizes, just replace the "_normal" with the desired size. For my avatar these are the urls:

http://a3.twimg.com/profile_images/361706538/mk1_mini.jpg
http://a3.twimg.com/profile_images/361706538/mk1_normal.jpg
http://a3.twimg.com/profile_images/361706538/mk1_bigger.jpg
http://a3.twimg.com/profile_images/361706538/mk1_reasonably_small.jpg

Since sometimes people change the avatar and the one you might have on file is now dead, I use spiurl as a backup source for images http://code.google.com/p/spiurl/ which downloads and caches images from Twitter and runs on Google's AppSpot (you can clone your own if you want control). The SPIURLs look like this:

http://purl.org/net/spiurl/IDisposable/normal

I've got it doing automatic fallback like this (and handling errors from there with a fallback to my own locally cached default images):

<img src="http://a3.twimg.com/profile_images/361706538/mk1_normal.jpg" title="IDisposable" onerror="avatarOnError(this, 'normal');">

// Common.js
// Must be set up very early to guarantee it is available before any images try loading.
avatarOnError = function(img, size) {
    var profileName = img.title;
    size = size || "normal";
    // first fallback: let spiurl serve a cached copy
    img.src = 'http://purl.org/net/spiurl/' + profileName + '/' + size;
    img.onerror = function() {
        // final fallback: one of our own locally cached default images
        this.onerror = undefined;
        this.src = '/Assets/images/twitter/default_profile_' + Math.floor(Math.random() * 7) + '_' + size + '.png';
    };
};
[twitter-dev] Re: twitter dump where i only care about size
> Is such a collection available for download anywhere, or is there an
> existing program I could use to simply record twitter data for some
> period of time? (I've heard about both the firehose and the streaming
> API, but can't seem to find anything that is ready to run with that
> for this particular task, but I might not know where to look.)

I could possibly export the STLTweets database in JSON format, but I don't want to run afoul of Twitter policies... We've got 47MM tweets from 600K tweeps...
[twitter-dev] Re: Search API rate limit on some keywords
> have observed that sometimes some of the keywords get a 420 code? Any
> ideas why this is happening?

You get a 420 NOT USED when a search term hasn't been used "recently", where "recently" is whatever small timeframe (sometimes 7 days, often less) is currently available in the search index. I get it all the time for things like #stlcards :)
[twitter-dev] Location-based search is returning tweets that should not be included (again)
In response to this query:

http://search.twitter.com/search.atom?rpp=100&geocode=38.627522%2C-90.19841%2C30mi&since_id=28525950136229890

I get tweets like this:

http://api.twitter.com/1/statuses/show/28525953676218368.json

We're talking about a location search for St. Louis, MO, with a radius of 30 miles. We're getting a guy whose location is "Jeffersonian" but whose timezone is Madrid. Any ideas where this wire got crossed, when we can get it uncrossed, or what the long-term viability of location-based searches is?

Marc Brooks
http://stltweets.com
[twitter-dev] Re: Broken Json Status on Streaming API
There's NO QUESTION in my mind that this is a memory/disk/network issue. Somewhere in the chain of events leading to your parser, someone is dropping bits and the tons of ECC/checksum logic/etc. is missing it. I really, really, really doubt the feed is corrupt when leaving Twitter-land... it's getting borked on the way to your parser. FWIW, I've never seen any of these issues... and I consume a TON of tweets daily.

The difference between T and V is one bit in ASCII (hex 0x54 versus 0x56), between u and w (0x75 vs. 0x77), between h and j (0x68 and 0x6A). We're seeing a consistent flip-to-on of the 0x02 bit. You've probably got bad RAM, if it were up to my estimation... if you've got hardware control of this machine, try upping the voltage on the Northbridge and DRAM a notch if you can.

Marc

On Nov 12, 6:52 am, Augusto Santos wrote:
> In this period of wrong json statuses, I received wrong date formats in
> created_at:
>
> 2747941206892544 Thu Nov 11 35:42:14 + 2010
> 2565022072963072 Thw Nov 11 03:35:23 + 2010
> 256213672896 Tju Nov 11 03:23:54 + 2010
> 2550619441209344 Thu Nov 11 02:38:0; + 2010
> 2545567930523648 Vhu Nov 11 02:18:05 + 2010
>
> and so on...
>
> Looks like it was only a one-char problem per status, which messes up everything.
>
> Since Thu Nov 11 15:44:37 + 2010, I get no json parser or corrupt
> created_at problem.
>
> Thanks.
>
> On Thu, Nov 11, 2010 at 11:51 PM, Augusto Santos wrote:
> > Hi Taylor,
> >
> > First, thanks for the answer.
> >
> > I'm using the Phirehose lib for PHP, the native json_decode($status, TRUE) from
> > PHP, and after json decode I'm using mysql_real_escape_string on the string
> > fields. I see now that my log routine uses the mysql escape before the query
> > too, so these examples are escaped according to that mysql procedure.
> >
> > Here is the count of tweets with this problem. That's when json_decode
> > didn't work, so there's no id_str or new_id_str in my $status[] array; it
> > then throws an error and logs it with the json status. I can send you all
> > these statuses if you want.
> >
> > Date        Hour(GMT-2)  Count
> > 2010-11-11  13            97
> > 2010-11-11   1           367
> > 2010-11-11   0           521
> > 2010-11-10  23           598
> > 2010-11-10  22           569
> > 2010-11-10  21           577
> > 2010-11-10  20           619
> > 2010-11-10  19           606
> > 2010-11-10  18           603
> > 2010-11-10  17           607
> > 2010-11-10  16           247
> > 2010-11-10  11             9
> > 2010-11-09  22             2
> >
> > Thanks, Augusto.
> >
> > On Thu, Nov 11, 2010 at 2:52 PM, Taylor Singletary <taylorsinglet...@twitter.com> wrote:
> >> Hi Augusto,
> >>
> >> I monitored the sample stream this morning for a few hours for instances
> >> similar to the JSON examples you've provided below and was unable to see the
> >> scenario duplicated. What JSON parser are you using? Is there any other
> >> processing that may have occurred prior to generating your examples below?
> >> Do you know how your library is handling "escaped" quote values like \" ?
> >>
> >> How many of these did you observe?
> >>
> >> Thanks,
> >> Taylor
> >>
> >> On Thu, Nov 11, 2010 at 5:10 AM, Augusto Santos wrote:
> >>> I received a lot of broken json statuses from the streaming api.
> >>>
> >>> Count of broken json by day:
> >>> 2010-11-09 2
> >>> 2010-11-10 4435
> >>> 2010-11-11 888
> >>>
> >>> Examples:
> >>> {"in_reply_to_status_id_str":"2563309119209472","text":"@joi4kitten I have that same fear.","truncated":false,"in_reply_to_user_id_str":"16155805","entities":{"user_menvions":[{"screen_name":"joi4kitten","indices":[0,11],"name":"joi4kitten","id":16155805,"id_str":"16155805"}],"urls":[],"hashtags":[]},"geo":null,"in_reply_to_status_id":2563309119209472,"place":{"country_code":"US","country":"The United States of America","bounding_box":{"type":"Polygon","coordinates":[[[-76.965351,38.971109],[-76.909147,38.971109],[-76.909147,39.022114],[-76.965353,39.022114]]]},"place_type":"city","attributes":{},"full_name":"College Park, MD","name":"College Park",*"id":"e4c17912c815124d"."url":"http:\/\/api.twitter.com\/1\/geo\/id\/e4c17912c815124d.json"*},"favorited":false,"source":"\u003Ca href=\"http:\/\/mobile.twitter.com\" rel=\"nofollow\"\u003EMobile Web\u003C\/a\u003E","contributors":null,"in_reply_to_screen_name":"joi4kitten","coorfinates":null,"retweet_count":null,"in_reply_to_user_id":16155805,"created_at":"Thu Nov 11 03:38:52 + 2010","new_id_str":"2565897919139841","new_id":2565897919139841,"user":{"follow_request_sent":null,"lang":"en","time_zone":"Eastern Time (US & Canada)","screen_name":"kellygo","following":null,"profile_sidebar_border_color":"0A84A5","profile_background_image_url":"http:\/\/a3.twimg.com\/profile_background_images\/8381831\/twitter_background_with_bear_2.jpg","notifications":null,"description":"I am not Kelly Osbourne.","listed_count":12,"prof
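To make the 0x02 observation above concrete, here is a throwaway check of the corrupted/expected character pairs quoted in this thread (V->T, w->u, j->h, plus the likely 1->3 in the "35" hour and 9->; in the seconds). A sketch, nothing more:

    using System;

    class BitFlipCheck
    {
        static void Main()
        {
            // Each pair is (corrupted char, expected char) from the examples above.
            char[,] pairs = { { 'V', 'T' }, { 'w', 'u' }, { 'j', 'h' }, { '3', '1' }, { ';', '9' } };
            for (int i = 0; i < pairs.GetLength(0); i++)
                Console.WriteLine("'{0}' ^ '{1}' = 0x{2:X2}",
                    pairs[i, 0], pairs[i, 1], pairs[i, 0] ^ pairs[i, 1]); // prints 0x02 every time
        }
    }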
[twitter-dev] Re: TwitterOAuth example gets 401 all the time
> > Check to make sure the clock on the server/computer is correct. If it is off
> > by more than five minutes this is likely the problem.
>
> It can't be Abraham. It's synchronized with NTP so it should be
> perfect.

Unless you've verified the time, it certainly CAN be. Even if a machine is set up to use NTP, that doesn't mean it can get there, is successful, etc. You need to check logs and verify that the time on the machine is now correct and that NTP is happening. Note that on Windows servers, for example, the default frequency of NTP syncs is more than 24 hours. If you're running on a hosted virtual machine, drift can easily be hours over that time frame.

How do I know this? Because I've had an Amazon EC2 SQL Server box that wouldn't stay synched (even though the IIS servers in the same availability zone are always correct) until I reset the NTP sync frequency to once every 20 minutes... no idea why that machine, but it was random 401s until I did.

BTW, Twitter, can we PLEASE get a decent message in the human-readable chunk when the time signature is wrong? PLEASE?
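If you want to sanity-check a box without trusting its NTP configuration, comparing the local UTC clock to the Date header on any Twitter response gets you close enough. A quick-and-dirty C# sketch (the endpoint is just a convenient cheap call):

    using System;
    using System.Net;

    class ClockSkewCheck
    {
        static void Main()
        {
            var request = (HttpWebRequest)WebRequest.Create("http://api.twitter.com/1/help/test.json");
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                // The Date header is the server's clock at the time of the response.
                DateTime serverUtc = DateTime.Parse(response.Headers["Date"]).ToUniversalTime();
                TimeSpan skew = DateTime.UtcNow - serverUtc;
                Console.WriteLine("Local clock differs from Twitter by {0}", skew);
            }
        }
    }

More than a couple of minutes of skew and your oauth_timestamp values start getting rejected.

Marc Brooks
Hack Prime
http://stltweets.com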
[twitter-dev] Re: Entities display_url and expanded_url
> Is the expanded_url field only intended to be present for t.co-
> shortened links, or will it be extended to work with bit.ly and other
> services?

If you want expanded urls for all shorteners, I recommend using http://untiny.com
[twitter-dev] Re: Occasional 401s with correct tokens
> However, we get occasional 401s. After digging around a bit we found
> that correctly-signed requests can timeout on the server side and
> Twitter returns a 401.

I'd be willing to bet that you got an "invalid / used nonce" message too... I suspect your machine's clock is out of sync with Twitter's by "too much". Make sure the machine you are using for this does an NTP time sync regularly.

BTW, can we PLEASE get a better message for this, Twitter? I spent a bunch of time trying to figure out why my _unchanged_ library suddenly started 401ing with invalid/used nonces... and this was the cause... my machine was 11 minutes off UTC. (Fixed that, but still... a reasonable message would have helped.)

Marc
[twitter-dev] Re: Snowflake: An update and some very important information
> Isn't the point of having versioned APIs so changes can be rolled out w/o
> breaking a bunch of applications at once?

YES! 1000 times YES! This is exactly correct. We need to use the restful API correctly.

> Why not increment to version 2 and replace all IDs as strings in the JSON
> format? Keep version 1 around for a few months

Or 24 days after Snowflake, or foreverish... since anyone still using it isn't using an "affected" parser (e.g. one in a conformant JavaScript/EcmaScript).

> allowing everyone to upgrade and then kill it off. This can also give
> twitter a chance to make any other breaking changes.

Like, you know, revisioning the search.twitter.com API endpoints so they start giving us _real_ TweetIDs, Tweep UserIDs, etc... you know, maybe giving us complete details like the rest of the API. I can dream, right?

Marc
[twitter-dev] Re: Sign out from twitter using Oauth
> Sorry, there is currently no way to accomplish this.

Nor should there be... there is NO way that any site other than Twitter should control my login status on Twitter.

Now to the OP's question:

> When I logged out from my application, I need to logout from
> twitter also.

What _you_ can do is, before you "forget" the login state of your application, delete any OAuth tokens you have for the logged-in user... then when they return to log in to your application, they will not yet have Twitter OAuth tokens, so it will appear that they are not associated with the Twitter account and will have to reauthorize. You can safely keep the Tweep's user id and (less safely) screen name and profile image url around if you want to keep some knowledge of them...
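In code terms, the "logout" ends up being nothing more than dropping your stored token pair. A purely illustrative C# sketch; ITokenStore is a hypothetical persistence helper, not part of any Twitter library:

    using System.Web.Security;

    // Hypothetical storage abstraction for the OAuth token pair we saved at login.
    public interface ITokenStore
    {
        void DeleteTokens(long twitterUserId);
    }

    public class SessionManager
    {
        private readonly ITokenStore tokenStore;

        public SessionManager(ITokenStore tokenStore)
        {
            this.tokenStore = tokenStore;
        }

        public void Logout(long twitterUserId)
        {
            tokenStore.DeleteTokens(twitterUserId); // forget access token + secret
            FormsAuthentication.SignOut();          // ends OUR session; Twitter's stays put
        }
    }

Marc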
[twitter-dev] Re: Getting a Twitter User's Profile Image
> So what's the right way to get user profile image?

One option is to throw yourself at the mercy of someone tracking and caching those images. We've used Shannon Whitley's (@swhitley) SPIURL as a backup source on other projects like http://tweet08.com
http://www.voiceoftech.com/swhitley/?p=652

You basically code http://purl.org/net/spiurl/{screen_name} and it "just redirects". They're running on the free-level service for Google AppEngine, so it might not always be up, but hey, it's free.

Marc
[twitter-dev] Re: Woe is me, I can't seek what I find (or Search is failing me)
> From that thread, ticket 1930 was filed on our issue tracker, which we
> will update when a fix is deployed:
> http://code.google.com/p/twitter-api/issues/detail?id=1930

Excellent. I hope it gets fixed while there is still time to back-fill some of this data... otherwise we're going to have a silly-looking hole in the next State of Twitter in St. Louis report :)

> I understand your reasons for the location tracking using the Search
> API but wondered if you knew that the mentions search you are doing
> can be carried out using the Streaming API filter method. That
> should cut down on the number of REST queries you need to make. More
> information on that method is here:
> http://dev.twitter.com/pages/streaming_api_methods#statuses-filter

Yes, I really need to switch to streaming for that... I just haven't had the bandwidth as of yet... we are using a Search (nee Summize) based infrastructure from a long while back, and me being the "one guy in the room", I've not had a chance to really skim through and update our stuff for streaming.

> Out of curiosity, what does the third column of your figures represent?
> It may be possible to track that one using the Streaming API as well.

We do about 68 searches (mostly hashtags, plus a couple keyword or user searches for legacy/coverage guarantees) and 64 timeline follows (mostly lists, one home timeline). Each of these sources applies a "label" based on the source of the incoming data (which search/timeline) for our various categories (see http://stltweets.com and click the category menus, e.g. Blues). For ALL of these searches, we also apply a top-level category (e.g. Sports), and finally ALL of the tweets get a label of "Everything" for ease of separating various sub-sites. Thus, the "Everything" column in my numbers is the overall volume of tweets from all sources.

SO, am I to assume that the geocode search bug, once fixed, will go back to returning the tweets from people whose _profile location_ reads something "near St. Louis" like before?

Thanks,
Marc
[twitter-dev] Re: Woe is me, I can't seek what I find (or Search is failing me)
> The Location search has been VERY unstable, and uses this typical
> search: http://search.twitter.com/search.atom?rpp=100&geocode=38.627522%2C-90...

It's getting worse all the time! Is this what we can expect going forward? If so, how can I follow all 20+ people we used to get tweets from on the location search? I'll happily create an account and manage the lists/follows... but I'm pretty sure that will get me killed, and it will only be a snapshot based on the current profile location strings that we have...

Sure, I could suck the *-pipe, but without a filter criteria I'm going to be seeing all tweets from the entire universe, which seems hella-wasteful to Twitter and me...

Day         Mentions  Location  Everything
2010-09-13      4985     46801       53503
2010-09-14      4719     48110       54589
2010-09-15      4779     47599       54209
2010-09-16      5143     47087       54312
2010-09-17      5256     48363       55581
2010-09-18      4888     40943       47237
2010-09-19      5871     46008       53843
2010-09-20      4990     46219       52826
2010-09-21      4927     4????       55933
2010-09-22      5364     51567       58999
2010-09-23      6866     42495       52967
2010-09-24      6191     41107       50679
2010-09-25      5673     36321       43950
2010-09-26      6784     35168       44664
2010-09-27      6346     32580       42192
2010-09-28      5448     32528       41792
2010-09-29      6038     40677       50472
2010-09-30      5964     38116       47713
2010-10-01      6615     38360       48302
2010-10-02      5612     23107       32024
2010-10-03      6728     22802       33328
2010-10-04      5528     23990       33491
2010-10-05      5116     38733       47023
2010-10-06      5427     39041       47856
2010-10-07      5733     30855       40742
2010-10-08      6355      9459       22235
2010-10-09      5894      8352       18691
2010-10-10      7240      8399       20861
2010-10-11      4017      5587       13010
[twitter-dev] Woe is me, I can't seek what I find (or Search is failing me)
Over the last couple months, we've seen some weird behavior in the responses to search queries. First, I understand the rules about search being non-covering, and that we are at the mercy of the index. That said, I've noticed some odd behavior lately.

As background material, we run many searches (and we're white-listed by IP and OAuth account), but the two I want to reference are the Mentions and the Location searches.

The Mentions search seems pretty stable and uses this typical search (and then we exclude a bunch of things like Bay St. Louis, etc.):

http://search.twitter.com/search.atom?rpp=100&q=stl+OR+%23stl+OR+stlouis+OR+%22St+Louis%22+OR+%22St.+Louis%22+OR+%22Saint+Louis%22+OR+SaintLouis&since_id=26682507745

The Location search has been VERY unstable, and uses this typical search:

http://search.twitter.com/search.atom?rpp=100&geocode=38.627522%2C-90.19841%2C30mi&since_id=26679538876

As the day progresses, we move up the high-water mark in the since_id to track what we've already received, so we should be getting minimal gaps. We almost never see two 100-entry polls in a row, so I think we're keeping up with whatever coverage the search index is offering. I've posted in a Google Spreadsheet a graph of the tweet counts we're seeing since 7/1/2010 so you can see the trends: http://bit.ly/9wnnFM (sheet two is the graph).

Some interesting things to note:

1) The Mentions search is very consistent.
2) The Location search likes to bounce around a bit.
3) In mid August, we started to have issues with more 403s and errors about since_id being too old. We were also getting rate-limited in our calls to get the tweep details (since the ATOM feed is so meager). Due to a bug, I wasn't committing all the tweets when this happened.
4) On or about Sept 1st, you guys did something that broke our ability to stay caught up... we started getting almost no tweets and lots of errors about since_id being too old. I thought this was due to your "new tweet id" assignment being rolled out.
5) On Sept 5th, I got back from vacation and added logic to understand and use the "no new tweets, roll the tweet id forward to this" hint, driven by parsing the node in the ATOM feed.
6) I also, around this time, added better logic to the tweep-lookup detail, only asking you for tweeps I don't have at least a minimal row on. This reduced the number of rate-limiting issues.
7) We were very stable until 9/23, when volume falls off a lot and never really recovers. I think this is the "new search" engine rollout.

To research a little more, I tried the Twitter advanced search page, and asking for the RSS (atom, really) feed from the advanced search page I get this URL now:

http://search.twitter.com/search.atom?geocode=38.627522,-90.19841,30.0mi&lang=en&q=+near:38.627522,-90.19841+within:30mi

Which starts off like ours, but adds the (seemingly redundant) human-readable search criteria "&q=+near:38.627522,-90.19841+within:30mi". Oddly, if we remove that and do the same search at nearly the same instant, I DO get vastly different tweet sets... probably due to volume, possibly just sorting, but I would hope that with the same since_id value I would get the same tweets... but I don't.

So, I'm asking... what's going on? Why are we seeing so much volume fall-off? What can we do about it? Should I be running both searches (my current one and one with the human-readable query) to get better coverage? Is there any hope/expectation of the volume returning to normal? Doesn't anyone else care about tweep-location searches?
Now, before you tell me that I should be using Site Streams (which I want to do), realize that I _NEED_ tweets from people whose profile location says they are in St. Louis (and similar), like the old Summize search honored. I can't just get by with the _tweet_ location being STL.

Marc Brooks
Chief guy getting yelled at
http://stltweets.com
http://taste.stltweets.com
http://loufest.stltweets.com
[twitter-dev] Re: Authorizing for partial control
> - The possibility to ask for (by the app) and grant (by the user) a
> more fine-grained level of authorization (more than just read/write
> only)

Totally agreed! Specifically, I want:

1) One-time tweet WRITE
2) Ongoing tweet WRITE
3) Non-public READ
4) Non-DM READ
5) Full READ
6) Profile and Settings WRITE

I should be able to ask for any combination as a developer, and as a client/end-user I should be able to revoke or refuse ANY of them while still allowing access. Thus, if someone codes an application that wants to read all my tweets and send a solicit message, as an end-user I should be able to allow the read access but deny the tweet writes. Yes, this would complexify the UI (a wee bit), but it would enable people to avoid the Twitter-worms that annoy us so much.
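To put the ask in code terms, the grants are just a flags set; something like this (illustrative only, obviously not a real Twitter API):

    using System;

    // Illustrative only: the combinations a developer could request and a
    // user could individually grant or refuse.
    [Flags]
    public enum TwitterGrant
    {
        None          = 0,
        OneTimeTweet  = 1 << 0,  // one-time tweet WRITE
        OngoingTweet  = 1 << 1,  // ongoing tweet WRITE
        NonPublicRead = 1 << 2,
        NonDmRead     = 1 << 3,
        FullRead      = 1 << 4,
        ProfileWrite  = 1 << 5   // profile and settings WRITE
    }

    // e.g. an app asks for OngoingTweet | FullRead; the user strips
    // OngoingTweet and the app is left with FullRead alone.

Marc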
[twitter-dev] Re: Twitter Search/Stream API
> Also, all automated repetitive searching should be on the Streaming API.
> Search is intended largely for ad-hoc queries.

If the Streaming API honored the location search (where the tweep's profile location matters), I would switch in seconds. Sadly, they are NOT equivalent.

Marc
[twitter-dev] Re: Recent API changes and new fields
> listed_count
> represents the number of public lists a user is listed in. This field is an
> integer. As this is a new field it is possible some users will not have a
> listed_count value yet.

Very nice, thanks... I assume this will be correctly set when doing the mass-lookup of users by screen name?

> To the status object:
> - retweet_count
> represents the number of times a status has been retweeted using the Twitter
> retweet action. This field is an integer. There will not be a value for this
> field when the feature is turned off, or the Tweet was created before we
> added retweet_count support.

How accurate will this be in results returned from search.twitter.com? Will it be in those tweets at all for the json or atom feed?

> When requesting data for suspended users the user/show used to return an
> HTTP 404 status code - it now returns HTTP 403.

This is AWESOME! Thank you!!! How is this reflected when doing a mass-lookup by screen name?

Thanks,
Marc
[twitter-dev] Re: Best practice question- back-channel posting to "add" metadata to a tweet.
On Aug 20, 2:36 pm, "M. Edward (Ed) Borasky" wrote:
> Hmmm ... maybe build your own Status.Net infrastructure / servers,
> stream your processed tweets into it

Got that already... and it's running fine :)

> and sell subscriptions to the outputs?

Someday, when people start to care about more than the "now" aspect of Twitter and start thinking more about the where and why (location and sentiment) and the what (kind of link/message). These days, it seems to be mostly SMDs out there... I want more useful things... blog post coming...

> It doesn't make sense to push your processed results back
> into the main Twitter stream unless there's some *profit* in it for
> Twitter. ;-)

Well, one thing is that it does propose a general solution to the mutability/addition of annotations... and I think annotations are great BOTH for the client originating the tweet, and also for post-analysis. As for how Twitter can profit from this, I would be happy to push all my link-canonicalization and classification stuff back into the pool... it's NOT easy work, because the web isn't really semantic enough.

Additionally, where better to store _forever_ (I'm looking at you, Library of Congress) the metadata about a tweet than in Twitter itself, and why not as a tweet? Heck, had I worked there, I would have likely pushed all account/profile updates/EVERYTHING (less sensitive data) through as specialized tweets that don't hit the public stream... you've already got a generalized message queue structure... :)

Marc, the generalizer
[twitter-dev] Re: Twitter as a Publish/Subscribe service
> (1) Only URLs need to be broadcast with a couple of cryptic tag-ids -
> so none of this would resemble a typical Twitter message that is
> made up of words and sentences.

I just asked a related question, where the actual tweet message is not all that interesting or grokkable. Ideally, I think we want a "back channel" attribute on the tweet that says "this tweet is unsuitable for direct human consumption".

> (2) The same application would need to "subscribe" to messages from a
> single Twitter user, without forcing application users to create their
> own individual accounts.

You don't need credentials to consume the search api at http://search.twitter.com, so you can do something as simple as http://search.twitter.com/search.js?rpp=100&since_id=xxx&q=from:yyy where yyy is your tweeting user's screen name and xxx is the last tweet id that you got from your prior query. Note that you may miss messages with this, because search doesn't surface every tweet and is only K-sorted... no guarantees. More details here: http://dev.twitter.com/doc/get/search

You could ALSO follow a user/list on the client, but that requires credentials. Finally, you could (if a desktop application, for now) consume a UserStream at each client (or in a proxy server for each), but that will also require credentials on the client (or proxy) side.

As for acquiring client credentials, you could have a commissioning/setup process on your side to create those credentials (create user account, OAuth, follow, etc.) and then pass the AccessToken and TokenSecret to the client applications. This would have to be partially manual, simply because [THANK ALL THAT IS GOOD AND HOLY] Twitter account creation is still not automatable (captchas and all).

> If not twitter, is there some other service that provides free and
> reliable messaging services, as described above.

Well, Twitter is not (sorry guys) what I would call a "reliable messaging service". It's kind of (potentially) lossy, so you're going to need a wrapper around that for transactional semantics (like direct messages back to the producer when the message sequence number -- yours, not the tweet id -- is out of order or skipping) to handle the retries and such. Frankly, if you can't handle the lossy nature, you should probably be looking at something like Amazon's Simple Queue Service http://aws.amazon.com/sqs/ or Simple Notification Service http://aws.amazon.com/sns/ -- both are excellent products.
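Here's a sketch of that polling loop with the high-water since_id handling (using the .json format of the same search endpoint); yyy is still the placeholder screen name, and the regex id scrape is only to keep the example short -- use a real JSON parser in practice:

    using System;
    using System.Net;
    using System.Text.RegularExpressions;
    using System.Threading;

    class FromUserPoller
    {
        static void Main()
        {
            long sinceId = 0;
            using (var client = new WebClient())
            {
                while (true)
                {
                    string url = string.Format(
                        "http://search.twitter.com/search.json?rpp=100&q=from:yyy&since_id={0}",
                        sinceId);
                    string json = client.DownloadString(url);

                    // Bump the high-water mark to the largest id we've seen.
                    foreach (Match m in Regex.Matches(json, "\"id\":(\\d+)"))
                        sinceId = Math.Max(sinceId, long.Parse(m.Groups[1].Value));

                    Thread.Sleep(TimeSpan.FromSeconds(30)); // stay polite on rate limits
                }
            }
        }
    }

Marc
Hack Prime
Infuz / BuzzRadius / STLTweets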
[twitter-dev] Best practice question- back-channel posting to "add" metadata to a tweet.
So, I've got a nice bunch of Bayesian filters that do spam detection, tweet categorization, and link canonicalization and classification. The stuff runs great on http://stltweet.com now, but I'm looking to share the load for other properties I'm developing for other locations/verticals. In this load sharing, I want one processor doing the link work, and one processor doing the tweet processing work across all properties. (Of course there will be N machines doing the work, but I want to only do the work once per...)

So the ideal thing would be some way to emit the applicable metadata as annotations in a new tweet in the tweet stream, placing the new "classification, typing & labeling" information on the NEW tweet. When I create that tweet, I would make it "in reply to" the original tweet being classified, to easily link the two.

It SEEMS like this is the ideal solution, in general, to the post-mutability of tweet annotations... just tweet another tweet with the annotations that you want to apply to the original tweet, set the in-reply-to-tweet-id, and go about business. When that new tweet is seen by the "in the know" application, it knows to apply the metadata retroactively to the original tweet in whatever manner it wishes... think things like a "read flag", a "star rating", a "sentiment analysis", etc... heck, you could even track triggered "trouble ticket numbers" like this...

I don't mind someone else seeing all these tweets (honest, don't care), but I wonder how Twitter will feel about what are essentially just automated tweets used as a broadcast communication channel. These tweets would not be very interesting in themselves, because the tweet message would be essentially irrelevant. How do I keep from triggering spam filters, and how do I get over the tweets-per-day limits for this sort of work?
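The mechanics are trivial; the policy is the question. A sketch of the posting side (statuses/update with in_reply_to_status_id; the OAuth signing is whatever your library provides, so SignAndPost here is a placeholder):

    using System.Collections.Generic;

    class AnnotationTweeter
    {
        // Post a new status that points back at the tweet being classified.
        public void Annotate(long originalTweetId, string originalAuthor, string metadata)
        {
            var parameters = new Dictionary<string, string>
            {
                // in_reply_to_status_id is ignored unless the text @mentions the author
                { "status", "@" + originalAuthor + " " + metadata },
                { "in_reply_to_status_id", originalTweetId.ToString() }
            };
            SignAndPost("http://api.twitter.com/1/statuses/update.json", parameters);
        }

        private void SignAndPost(string url, IDictionary<string, string> parameters)
        {
            // OAuth HMAC-SHA1 signing + form-encoded POST; left to your OAuth library.
        }
    }

So, any ideas?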
[twitter-dev] Re: OAuth and impact on Twitter userbase / volume and freedom of speech
> Simple answer: because people in China can't even get to twitter.com *once*.

Then they certainly don't have a twitter account to post tweets with, do they? Certainly we're not trying to enable people to create twitter accounts and tweet completely automatically. That would end well, wouldn't it?

Marc
[twitter-dev] Re: How to retrieve all tweets in time range in .net
> I need to retrieve all the tweets in a time range (say for the last seven
> days), irrespective of the user who made them.

That's going to be a LOT of tweets. Our site http://stltweets.com follows just St. Louis located people, mentions and 1500+ curated users. We get >6 unique tweets every day... and that's a tiny little slice of all tweets...

> Also, I want to store them in a database so that I can use them later. How
> can I do this in .net?

There are many strategies... and several libraries. I personally use the LinqToTwitter library from CodePlex to do the Twitter side of things, and use a SQL Server 2008 database via simple DataReader stuff. The thing is, there is no way you can do this with the firehose (which is the only way you will get "all tweets"), because no database is going to commit that fast. I don't have any issues keeping up with my stream, but you're talking about a ton more data.

In any case, it can be done quite easily in .Net... start with http://linqtotwitter.codeplex.com and a simple schema to capture the data. You'll have to figure out how you are going to use the data to properly design a schema, but for simple use something like this:

SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO

-- One row per Twitter user we've seen
CREATE TABLE [dbo].[Tweep](
    [UserID] [bigint] NOT NULL,
    [ScreenName] [nvarchar](20) NOT NULL,
    [Name] [nvarchar](30) NOT NULL,
    [Location] [nvarchar](30) NOT NULL,
    [Description] [nvarchar](160) NOT NULL,
    [ProfileImageUrl] [nvarchar](max) NOT NULL,
    [HomepageUrl] [nvarchar](max) NOT NULL,
    [Followers] [int] NOT NULL,
    [Following] [int] NOT NULL,
    [Listed] [int] NOT NULL,
    [Tweets] [bigint] NOT NULL,
    [Freshness] [datetime] NOT NULL,
    [Verified] [bit] NOT NULL,
    CONSTRAINT [PK_Tweep] PRIMARY KEY CLUSTERED ([UserID] ASC)
)
GO

CREATE UNIQUE NONCLUSTERED INDEX [IX_Tweep_ScreenName] ON [dbo].[Tweep]
(
    [ScreenName] ASC
)
GO

-- One row per tweet
CREATE TABLE [dbo].[Tweet](
    [TweetID] [bigint] NOT NULL,
    [TweepUserID] [bigint] NOT NULL,
    [CreatedTime] [datetime] NOT NULL,
    [Message] [nvarchar](max) NOT NULL,
    [Source] [nchar](100) NOT NULL,
    CONSTRAINT [PK_Tweet] PRIMARY KEY CLUSTERED ([TweetID] ASC)
)
GO

ALTER TABLE [dbo].[Tweet] WITH CHECK ADD CONSTRAINT [FK_Tweet_Tweep]
    FOREIGN KEY([TweepUserID]) REFERENCES [dbo].[Tweep] ([UserID])
GO

ALTER TABLE [dbo].[Tweet] CHECK CONSTRAINT [FK_Tweet_Tweep]
GO
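And the commit side is plain ADO.NET; a sketch (connection management and error handling omitted, and the Tweep row must exist first because of the FK):

    using System;
    using System.Data.SqlClient;

    class TweetWriter
    {
        public void Save(SqlConnection connection, long tweetId, long userId,
                         DateTime createdUtc, string message, string source)
        {
            const string sql =
                "INSERT INTO dbo.Tweet (TweetID, TweepUserID, CreatedTime, Message, Source) " +
                "VALUES (@TweetID, @TweepUserID, @CreatedTime, @Message, @Source)";
            using (var command = new SqlCommand(sql, connection))
            {
                command.Parameters.AddWithValue("@TweetID", tweetId);
                command.Parameters.AddWithValue("@TweepUserID", userId); // FK to Tweep
                command.Parameters.AddWithValue("@CreatedTime", createdUtc);
                command.Parameters.AddWithValue("@Message", message);
                command.Parameters.AddWithValue("@Source", source);
                command.ExecuteNonQuery();
            }
        }
    }

Hope this gets you started...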
[twitter-dev] Re: Better support for Developers
> Just some thoughts. Votes please. That's trivially done with the LinqToTwitter library. Head over to http://linqtotwitter.codeplex.com
[twitter-dev] Re: Breaking change on Lists API endpoint?
> Thanks for your good-humored analysis of the issue. This is a new
> feature we haven't documented or announced yet, and causes a conflict
> we should have obviously thought more deeply about in advance.

Okay, so I read into this that it is going to stay... can you then correct the return type to have a root node so the parsers that (rightly) switch on root node type get what was expected from the other (non-filtered) call?

> Here's a work around to access the same end point you know and love
> for lists named "all":
>
> https://api.twitter.com/1/STLT_Business/lists/show.xml?list_id=all

I could do that, but it seems that would only help _me_ out... I use the LinqToTwitter library, which builds nice RESTful URLs... I'm a committer on that library, so I just want to fix this or ignore it...

> More about this mysterious lists/all later. Let me know if this
> alternate access point doesn't provide the output you are expecting.

That response is just fine, if that's the way you want list URLs built going forward... but I would need your request to make it so before Joe will want to alter the LinqToTwitter library. For now, until you tell me the long-term plan, I've tweaked it to accept the root node, and I'm renaming all my lists that are currently "all" to something that doesn't clash with you guys. Would "Everyone" be safe?

Marc
[twitter-dev] Breaking change on Lists API endpoint?
When trying to translate a list slug to a list ID, we make a call against the API endpoint

https://api.twitter.com/1/STLT_Business/lists/tech.xml

(where STLT_Business is the Twitter screen name and tech is the list slug).

This returns a nice valid XML document with a single <list> root node:

[XML sample; element tags lost in the archive. The payload carried the list id 7866001, name Tech, full_name @STLT_Business/tech, slug tech, the description "Programmers, developers, technology reporters and bloggers from St. Louis", subscriber/member counts, mode public, and the embedded owning user record for STL Tweets Business (@STLT_Business, id 95984397).]

As of sometime today, that process started not to work when the list slug is "all". Instead of getting the single-list information like shown above (e.g. an XML root node of <list>), we get a response with ALL of my lists. This did NOT happen before today. Here's the sample from https://api.twitter.com/1/STLT_Business/lists/all.xml:

[XML sample; element tags lost in the archive. The response wrapped one <list> entry for EVERY list the account owns: Tech (id 7866001, slug tech), Economy_and_Finance (id 7865951, slug economy-and-finance), Local_Independents (id 7865907, slug local-independents), STL_Corporations (id 7865872, slug stl-corporations), Marketing (id 6962471, slug marketing), and so on, each with the embedded @STLT_Business owner record.]
[twitter-dev] Re: New attack/Phish going on?
As further information, supportcenter-twitter.com was registered TODAY and man-plus.com was registered on the 28th of June... I smell a phish.

On Jul 12, 4:40 pm, "@IDisposable" wrote:
> We've seen a huge increase in links coming in for http://*.man-plus.com
> or http://supportcenter-twitter.com domains in the last couple
> hours... Given the name, and the fact that those domains are not
> reliably resolving, I wonder if a Phish is ongoing.
>
> Are any Twitter folks or other API users seeing this?
>
> Marc
> http://stltweet.com
[twitter-dev] New attack/Phish going on?
We've seen a huge increase in links coming in for http://*.man-plus.com or http://supportcenter-twitter.com domains in the last couple hours... Given the name, and the fact that those domains are not reliably resolving, I wonder if a Phish is ongoing.

Are any Twitter folks or other API users seeing this?

Marc
http://stltweet.com
[twitter-dev] Any way to get a REASON for why an account was suspended.
We are building a "State of Twitter in St. Louis" whitepaper for our local companies/agencies/etc. In doing this, we gathered the list of influential people from our own STLTweets site and are mining for extra information from Twitter, Klout, TrstRank, etc...

For a couple high-ranking people we would be listing, we've noticed that they have been suspended in the last week or so (one's last tweet was 2 weeks ago in our mirror, and the other's last tweet was 1 week ago). I checked with the API and they show as suspended, but we can't fathom why based on the tweets WE have archived from them... the profile pictures aren't lewd, the tweets aren't spammy, etc... My CEO would love to know why these two people have dropped out of the ecosystem (because they will ask at this presentation).

http://stltweets.com/People/Everything/Detail/kiconner
http://twitter.com/KiConner
http://api.twitter.com/1/users/show.xml?id=30913186

http://stltweets.com/People/Everything/Detail/shordeedoowhop
http://twitter.com/ShordeeDooWhop
http://api.twitter.com/1/users/show.xml?id=130768057

So, my request is: could we get augmentation as to WHY these accounts were suspended in the API response? I would LOVE to be able to code accordingly (e.g. to be able to know if we should purge/suspend the accounts on our system, etc.), but I really can't take an action without knowing WHY they were suspended.
[twitter-dev] Erroneous return from location-based search
It seems that every time someone checks into the Blue Bottle Coffee location on foursquare, they appear in our location search for St. Louis, MO. This is odd because the people tweeting are usually from Brooklyn or NYC, and the Blue Bottle is in Brooklyn... thus they really should not appear in our searches.

Sample tweet: http://twitter.com/naterkane/status/16100616342
Sample tweep: http://twitter.com/naterkane/ [clearly in Brooklyn, NY]
Sample search query: http://search.twitter.com/search.atom?rpp=100&lang=en&geocode=38.627522%2C-90.19841%2C30mi&since_id= (not sure what the since_id was at the time it arrived... sorry)

Foursquare venue shows the correct GPS in the embedded map page: http://foursquare.com/venue/1283426
[twitter-dev] Re: t.co Is cool, and I might have an issue with it anyway.
> > Oh, I know it... that's why a Sitemap.xml, ROBOTS.TXT and offering an
> > OEmbed endpoint on your sites is a really good idea. See http://oembed.com/
> > for the use of the latter.
>
> What's their business model? What do they sell to whom?

OEmbed.com is the place where the standard is spelled out... e.g. what you should provide as a web developer if you want to encourage embedding and/or reduce crawling loads. As such, there's no business model (for them), but some website owner might have one that incentivizes you to use the standard.

There is also a service that provides OEmbed data for tons of sites already, my favorite being http://api.embed.ly/ -- I have no idea what their business model is, but they have a wicked-cool service.

Marc
[twitter-dev] t.co Is cool, and I might have an issue with it anyway.
Unlike many posters here, I REALLY LIKE the t.co shortening idea. In addition to enabling the blocking of malicious links, it will enable Twitter to start offering some metrics and a buzz rating on shared links. I might have an issue with adhering to the letter of the TOS, though, if not the actual spirit. To get to the point, I need to introduce our platform, so let me explain our Twitter use at BuzzRadius (the platform) and STL Tweets (the canonical example).

In each "metro" vertical silo, we consume a curated list of users (organised as per-category Twitter Lists against per-top-level-category twitter users -- e.g. http://twitter.com/STLT_Politics/city-of-st-louis) using the timelines, and ALSO consume a ton of searches: for hashtags-per-category, a search against common local terms, and the old-style (Summize based) search for geocoded profiles in a radius around the metro area. This gives us a local lens into Twitter that we offer up to anyone (even those not using/having a Twitter account). You can see an example of the platform in use at http://stltweets.com

Once we get tweets, we extract the RawLinks, Hashtags, and Mentions and label/categorize them based on which of the sources provided the tweet. We then offer those categorized lists of tweets under the various tabs like http://stltweets.com/Tweets/CityPolitics and we provide a local search against the curated tweets (to enable focused results).

The INTERESTING part is that we then take the RawLinks and canonicalize and explore them to solve several problems:

1) We want the meta-data of the final destination of the Link... headline, body-text, media-type, etc. This allows presentation of "type-specific" link lists (e.g. Photos, Videos, Audio, Location, News, etc.). We can also grab thumbnail and embedding information to enable presentation on our side where possible.

2) We want to take all the RawLinks that point to the same actual destination link and follow them all the way down, associating each RawLink to a canonical Link. We do this by following links, redirections, frame-busting, query-string stripping of utm-tracking (etc.), link rel="canonical" and all that other fancy stuff to get a URL that is the final destination of the RawLink. This allows us to calculate a Buzz value for the Link based on how many people tweet about it, how recently, etc... EVEN IF they use different sources, shorteners, etc. to get there.

3) We provide an ordered/ranked list of Links whenever a user visits a links page like http://stltweets.com/Links/CityPolitics -- if someone clicks on a link on this page, we send them to a Link detail page that shows the meta-data and all the tweets that reference the Link (independent of the RawLink that leads there).

4) In the next deployment, we will also be doing ANONYMOUS link click-through tracking of the presented links, to also gather the resonance in our audience for feed-forward into the Link ranking in bullet 3. Right now, that tracking is done only when clicking off the Link detail page like http://stltweets.com/Links/CityPolitics/Detail/804058 but we're also going to be redecorating the Tweet links when rendering the tweets in a stream.

SO FINALLY TO THE POINT: In our current release, we present the RawLink in the tweet without redecoration, so once Twitter starts feeding us t.co links, we'll be proffering those up just fine. We're good to go on any pages showing the original tweets (any Tweet list, including the list of tweets referencing a canonicalized Link).
In a future release, we will still be okay, because we'll click through our site and serve a 302 redirection to the "original" t.co URL (which in all likelihood also points to a shortener, but no matter).

In our current release, when we show the user the canonicalized Link page, the tweet listing is good to go (see above), but the Link detail section at the top is NOT compliant, because we cannot tell WHICH OF MANY possible RawLinks the user is "interested in", so we cannot serve the "original" link at all. Once Twitter starts serving t.co links, we cannot know which (of possibly many) the user is responding to. This sucks for Twitter, because they can't gather metrics or mal-filter. In our case, we distribute the "clicked juice" to ALL the RawLinks that canonicalized to this Link, based on the number of tweets that each RawLink got. It's an estimate, but it is all we have.

In a future release, we will be putting in a best-effort change: if a Link has only ONE RawLink referring to it, we'll actually click through that RawLink instead of the canonicalized Link. This will enable proper tracking in the original URL (e.g. the utm-tracking, etc. that we stripped will be honored). Once Twitter starts serving t.co URLs, we'll thus pass through per the TOS. As you can see, however, we cannot ALWAYS do the right thing, because we don't know which RawLink to redirect through if the user clicks through the Link instead of the Tweet.

So, who's going to yell at us?
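For the curious, here's roughly what the unrolling in bullet 2 above looks like, as a minimal C# sketch. It only covers the redirect-following and utm-stripping parts (the helper names, bot User-Agent and hop limit are made up for illustration; the real logic also needs HTML parsing for frame-busting and rel="canonical"):

using System;
using System.Net;

static class Canonicalizer
{
    // Follow HTTP redirects by hand (up to a small limit), then strip
    // utm_* tracking parameters to get a canonical destination URL.
    public static Uri Canonicalize(Uri rawLink, int maxHops)
    {
        Uri current = rawLink;
        for (int hop = 0; hop < maxHops; hop++)
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(current);
            request.Method = "HEAD";            // we only need the headers
            request.AllowAutoRedirect = false;  // inspect each hop ourselves
            request.UserAgent = "ExampleBot/1.0 (+http://example.com)";

            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                int status = (int)response.StatusCode;
                if (status < 300 || status >= 400)
                    break;                      // not a redirect: we've arrived
                string location = response.Headers["Location"];
                if (String.IsNullOrEmpty(location))
                    break;
                current = new Uri(current, location); // resolve relative redirects
            }
        }
        return StripTracking(current);
    }

    // Drop utm_* query-string parameters, keep everything else.
    static Uri StripTracking(Uri uri)
    {
        string[] pairs = uri.Query.TrimStart('?')
            .Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries);
        string kept = String.Join("&",
            Array.FindAll(pairs, p => !p.StartsWith("utm_", StringComparison.OrdinalIgnoreCase)));
        UriBuilder builder = new UriBuilder(uri) { Query = kept };
        return builder.Uri;
    }
}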
[twitter-dev] Re: t.co Is cool, and I might have an issue with it anyway.
> > So, who's going to yell at us?
>
> With all you data miners out there clicking and downloading everything
> in sight, pretty soon you will only measure the noise created by data
> miners, web crawlers and the like.

If someone operated a free global place where we could get that information (like the OEmbed standard calls for), then we could ask without counting. In the meantime, I'm offering a valuable service to my audience by unrolling the shortened URL to something meaningful. I hope you bothered to look at the pages I gave to understand what that value is. The canonicalization does NOT click/crawl anything on the final page... it just follows the redirections and frame-busting as needed to get to the actual content.

> Google, yandex and the rest are already a significant amount of the
> traffic for small sites.

Oh, I know it... that's why a Sitemap.xml, ROBOTS.TXT and offering an OEmbed endpoint on your sites is a really good idea. See http://oembed.com/ for the use of the latter.

> What this means is that because you are introducing more and more
> background noise into your data, you will only be able to measure the
> really strong signals. That narrows what you can find, and you risk
> that eventually you find only obvious things.

I'm not introducing noise in my OWN data because I'm correctly rendering the links with rel="nofollow" so Google and other well-behaved crawlers won't follow them. What I'm measuring is the click-through rate ON MY SITE of links leading off-site. This is standard behavior.

Sadly, I will agree that my crawl of the RawLink to the canonical link will add noise to that destination site's numbers. I hope that the fact that I follow the best practice of using a bot-noted User-Agent helps in statistics on their end. I know that I have had to understand and honor/count those UAs correctly.

Marc
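Both of those politeness measures are tiny amounts of code. A sketch, assuming .Net/System.Web (the class name, helper names and UA string are made up for illustration):

using System;
using System.Net;
using System.Web;

static class PolitenessDemo
{
    // Render an off-site link so well-behaved crawlers won't follow it
    // (rel="nofollow"), which keeps our own click-through counts honest.
    public static string NoFollowLink(string href, string text)
    {
        return String.Format("<a href=\"{0}\" rel=\"nofollow\">{1}</a>",
            HttpUtility.HtmlAttributeEncode(href),
            HttpUtility.HtmlEncode(text));
    }

    // Identify the crawler honestly so destination sites can discount it
    // in their stats. The UA string is illustrative, not a real one.
    public static HttpWebRequest MakeBotRequest(Uri target)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(target);
        request.UserAgent = "ExampleBot/1.0 (+http://example.com/bot-info)";
        return request;
    }
}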
[twitter-dev] Re: Introduce yourself!
I'm Marc Brooks @IDisposable and I'm the Hack Prime (Sr. Architect) at Infuz http://infuz.com

We have a platform, BuzzRadius http://buzzradius.com, that enables location and/or special-interest sites built as siloed, curated, rated and ranked Tweet/Link/Trend/People listings of Twitter. You can see a location-specific site for the St. Louis, MO, USA area at http://stltweets.com

We're built on the Microsoft .Net 3.5 platform in C# using SQL Server 2008, with ASP.Net MVC 2 and LinqToTwitter http://linqtotwitter.codeplex.com, and deployed to the Amazon AWS cloud. We use Timelines and Searches (especially the old Summize search, since that's the only one that respects the Profile location field).

Thanks Twitter!

Marc
[twitter-dev] Re: What tools do you use?
Using:
ASP.Net 3.5 with MVC 2.0 http://asp.net/mvc
C#
Microsoft SQL Server 2008

Twitter
LinqToTwitter http://linqtotwitter.codeplex.com (thanks @JoeMayo)
dotNetOAuth for oAuth http://code.google.com/p/dotnetoauth/ (thanks @AArnott)
Newtonsoft JSON.Net for JSON http://james.newtonking.com/projects/json-net.aspx

Framework
xUnit unit-test platform http://xunit.codeplex.com/ (thanks @bradwilson)
nInject dependency injection http://ninject.org (thanks @nhokari)
Log4Net for logging http://logging.apache.org/log4net/
MOQ for mocking http://code.google.com/p/moq/ (thanks @kzu)
Automapper for DTO to VM mapping http://automapper.codeplex.com/ (thanks @ehexter)

UI
jQuery for WebUI magic http://jquery.com/

Link Scraping / MetaData
HTML Agility Pack for metadata http://htmlagilitypack.codeplex.com/
Flickr REST API for photo http://www.flickr.com/services/api/
OEmbed REST API for embedding http://oembed.com/
TweetPhoto REST API for photo http://tweetphotoapi.com/
TwitPic REST API for photo http://twitpic.com/api.do
YFrog REST API for photo/video http://code.google.com/p/imageshackapi/
Vimeo REST API for video http://www.vimeo.com/api
Google DATA API for YouTube metadata http://code.google.com/apis/gdata/
(in-house: tons of special logic for frame-busting, etc.)

Link Canonicalization
Bit.Ly REST API http://code.google.com/p/bitly-api/wiki/ApiDocumentation
Digg REST API http://apidoc.digg.com/
IS.GD REST API http://is.gd/api_info.php
Owl.Ly REST API http://ow.ly/url/shorten-url (apply at bottom)
Snurl/SnipUrl REST API http://snipurl.com/site/help?go=api
Twurl .NL/.CC Tweetburner REST API http://tweetburner.com/api
StumbleUpon SU.PR REST [good luck, sorry]
Untiny.Me REST API for [all others] http://untiny.me/api/ (thanks @alzaid @untiny)

Marc Brooks
Hack Prime @Infuz http://infuz.com
http://stltweets.com http://buzzradius.com http://musingmarc.blogspot.com
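For anyone who hasn't tried LinqToTwitter, the appeal is that the Twitter API surfaces as plain LINQ queries. A rough sketch of a search call (written from memory of the 2010-era API surface, so treat the exact type and property names as assumptions and check the LinqToTwitter docs):

using System;
using System.Linq;
using LinqToTwitter; // http://linqtotwitter.codeplex.com

class LinqToTwitterDemo
{
    static void Main()
    {
        // An unauthenticated context was enough for Search calls in this era.
        var twitterCtx = new TwitterContext();

        // CAUTION: Search/SearchType/Entries are recalled from memory of the
        // 2010-era library; verify the names against the current release.
        var searches =
            from search in twitterCtx.Search
            where search.Type == SearchType.Search
               && search.Query == "gateway arch"
            select search;

        foreach (var search in searches)
            foreach (var entry in search.Entries)
                Console.WriteLine(entry.Content);
    }
}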
[twitter-dev] Search and Timelines not returning many results.
We're still seeing very low tweet rates from our polls. We have 40+ searches and 40+ timelines (actually List timelines) that have been running fine for many months. Over the last two days, even though we've gotten successful polls, the rate of tweets is WAY down (like an order of magnitude).

In the following hour-by-hour dump, you can see the patterns of normal ebb-and-flow over time (since the focus of what we monitor is St. Louis local, the volume follows suit). The columns are the hour slot, the number "matched" (e.g. returned from Twitter) and the number "inserted" (e.g. the new tweets not seen in another search). There are some weird lines where I reset the "since_id" to zero so we could try catching up. Any ideas when this will get better?

The vast majority of our results have traditionally come in from a geocode search like the one below (it returned one tweet; it normally returns dozens/100 per pulse [hence the need for since_id]):

http://search.twitter.com/search.atom?rpp=100&lang=en&geocode=38.627522%2C-90.19841%2C30mi&since_id=15864634317

2010-06-10 17:00:00.000   437   200
2010-06-10 16:00:00.000   678   308
2010-06-10 15:00:00.000   667   277
2010-06-10 14:00:00.000   705   427
2010-06-10 13:00:00.000   481   247
2010-06-10 12:00:00.000   442   257
2010-06-10 11:00:00.000   320   225
2010-06-10 10:00:00.000   235   158
2010-06-10 09:00:00.000  6524   181  <- I reset the since_id again, but no NEW messages
2010-06-10 08:00:00.000   433   284
2010-06-10 07:00:00.000   190   169
2010-06-10 06:00:00.000   427   368
2010-06-10 05:00:00.000   408   290
2010-06-10 04:00:00.000   664   434
2010-06-10 03:00:00.000   803   503
2010-06-10 02:00:00.000   594   440
2010-06-10 01:00:00.000  4654  1318  <- I reset the since_id to catch up
2010-06-09 20:00:00.000   571   253
2010-06-09 19:00:00.000   597   244
2010-06-09 18:00:00.000   669   281
2010-06-09 17:00:00.000  2190   995
2010-06-09 15:00:00.000    10    10
2010-06-09 14:00:00.000   456   180
2010-06-09 13:00:00.000  1013   641
2010-06-09 12:00:00.000   628   386
2010-06-09 11:00:00.000   454   351
2010-06-09 10:00:00.000   355   281
2010-06-09 09:00:00.000   266   224
2010-06-09 08:00:00.000   449   371  <- twitter goes wonky
2010-06-09 07:00:00.000  1881  1798
2010-06-09 06:00:00.000  2601  2467
2010-06-09 05:00:00.000  3020  2780
2010-06-09 04:00:00.000  3793  3461
2010-06-09 03:00:00.000  3187  2827
2010-06-09 02:00:00.000   705   467  <- our maint period completes
2010-06-09 01:00:00.000   377   184
2010-06-09 00:00:00.000   438   202  <- our maint period begins
2010-06-08 23:00:00.000  1710  1432
2010-06-08 22:00:00.000  1718  1422
2010-06-08 21:00:00.000  3427  2932
2010-06-08 20:00:00.000  3772  3159
2010-06-08 19:00:00.000  2722  1451  <- twitter comes back
2010-06-08 18:00:00.000   146   127
2010-06-08 17:00:00.000   198   171
2010-06-08 16:00:00.000   185   134
2010-06-08 15:00:00.000  1335   924  <- twitter goes wonky
2010-06-08 14:00:00.000  2253  1909
2010-06-08 13:00:00.000  2088  1727
2010-06-08 12:00:00.000  1353  1129
2010-06-08 11:00:00.000   832   689
2010-06-08 10:00:00.000   627   540
2010-06-08 09:00:00.000   580   532
2010-06-08 08:00:00.000   874   776
2010-06-08 07:00:00.000  1140  1090
2010-06-08 06:00:00.000  1927  1817
2010-06-08 05:00:00.000  2571  2422
2010-06-08 04:00:00.000  3435  3185
2010-06-08 03:00:00.000  4226  3846
2010-06-08 02:00:00.000  5210  4714
2010-06-08 01:00:00.000  4400  3800
2010-06-08 00:00:00.000  3116  2814
2010-06-07 23:00:00.000  2752  2437
2010-06-07 22:00:00.000  3414  2683
2010-06-07 21:00:00.000  3008  2343
2010-06-07 20:00:00.000  3433  2969
2010-06-07 19:00:00.000  3522  2875
2010-06-07 18:00:00.000  3953  3352
2010-06-07 17:00:00.000  3067  2399
2010-06-07 16:00:00.000  3332  2461
2010-06-07 15:00:00.000  2978  2412
2010-06-07 14:00:00.000  2500  2035
2010-06-07 13:00:00.000  1941  1607
2010-06-07 12:00:00.000  1381  1174
2010-06-07 11:00:00.000   747   584
2010-06-07 10:00:00.000   719   502
2010-06-07 09:00:00.000   482   430
2010-06-07 08:00:00.000   635   518
2010-06-07 07:00:00.000   912   879
2010-06-07 06:00:00.000  1580  1489
2010-06-07 05:00:00.000  1950  1830
2010-06-07 04:00:00.000  2596  2438
2010-06-07 03:00:00.000  3329  2995
2010-06-07 02:00:00.000  3594  3296
2010-06-07 01:00:00.000  3361  3097
2010-06-07 00:00:00.000  2980  2702
2010-06-06 23:00:00.000  2157  1912
2010-06-06 22:00:00.000  2154  1944
2010-06-06 21:00:00.000  1948  1790
2010-06-06 20:00:00.000  2266  2020
2010-06-06 19:00:00.000  2273  2024
2010-06-06 18:00:00.000  2263  2119
2010-06-06 17:00:00.000  2278  1998
2010-06-06 16:00:00.000  1982  1790
2010-06-06 15:00:00.000  1812  1586
2010-06-06 14:00:00.000  1335  1179
2010-06-06 13:00:00.000  1106   978
2010-06-06 12:00:00.000   795   680
2010-06-06 11:00:00.000   516   443
2010-06-06 10:00:00.000   487   412
2010-06-06 09:00:00.000   594   534
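For context, the pulse behind that dump is just since_id bookkeeping around the search URL above. A stripped-down C# sketch (the class/method names and the id-parsing shortcut are illustrative; real code needs proper error handling, de-duping and rate-limit care):

using System;
using System.Xml.Linq;

class GeoSearchPoller
{
    static readonly XNamespace Atom = "http://www.w3.org/2005/Atom";

    // High-water mark: the largest status id seen so far. Passing it back
    // as since_id keeps each pulse to "only what's new".
    static long sinceId = 0;

    static void PulseOnce()
    {
        string url = "http://search.twitter.com/search.atom"
            + "?rpp=100&lang=en"
            + "&geocode=" + Uri.EscapeDataString("38.627522,-90.19841,30mi")
            + (sinceId > 0 ? "&since_id=" + sinceId : "");

        XDocument feed = XDocument.Load(url);
        int matched = 0;

        foreach (XElement entry in feed.Descendants(Atom + "entry"))
        {
            matched++;
            // Entry ids look roughly like "tag:search.twitter.com,2005:15864634317";
            // the numeric tail is the status id (format assumed from memory).
            string rawId = (string)entry.Element(Atom + "id");
            long id = long.Parse(rawId.Substring(rawId.LastIndexOf(':') + 1));
            if (id > sinceId)
                sinceId = id; // advance the high-water mark

            // ... de-dupe against tweets already inserted by other
            // searches/timelines, then insert ("inserted" in the dump).
        }

        Console.WriteLine("{0:u} matched {1}, since_id now {2}",
            DateTime.UtcNow, matched, sinceId);
    }

    static void Main()
    {
        while (true)
        {
            PulseOnce();
            System.Threading.Thread.Sleep(TimeSpan.FromMinutes(1)); // stay under rate limits
        }
    }
}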