Re: [twitter-dev] tco crawler details
t.co is not a crawler. Are you referring to the URL unpacking process or something else?

-john

On Thu, Jun 10, 2010 at 11:46 PM, Ken k...@cimas.ch wrote:

If tco is to be the new three-letter agency and gatekeeper, we would like to treat it nicely and whitelist its crawler. If tco is inadvertently blocked, what happens?

I do not know if we have already been checked by tco, as I have not sent or received a DM with one of our own URLs.

What are the user-agent and IP addresses used by this crawler? Does it check robots.txt?

And since, for some, a tco thumbs-down could be a problem, is there a (speedy) appeals process?
RE: [twitter-dev] tco crawler details
Of course it is.

Twitter were asked what “defines” a “bad” site on the second day, but I haven't seen a reply apart from more questions about who is making the choice, e.g. will pornography be classed as “bad”, will religious free speech be classed as “bad”.

I don't think the Twitheads thought through what it means to now offer an “AOL” version of the web and the long-term responsibilities that this entails through implicit guarantees to their users.

Of course, Ken, you don't expect them to publish their IP address list, do you… otherwise some smartass would route this IP address to a “clean” site and everyone else to the “bad” content.

Regards,
Dean Collins
Cognation Inc
d...@cognation.net
+1-212-203-4357 New York
+61-2-9016-5642 (Sydney in-dial).
+44-20-3129-6001 (London in-dial).

From: twitter-development-talk@googlegroups.com On Behalf Of John Adams
Sent: Friday, 11 June 2010 6:00 AM
To: twitter-development-talk@googlegroups.com
Subject: Re: [twitter-dev] tco crawler details

t.co is not a crawler. Are you referring to the URL unpacking process or something else?

-john

On Thu, Jun 10, 2010 at 11:46 PM, Ken k...@cimas.ch wrote:

If tco is to be the new three-letter agency and gatekeeper, we would like to treat it nicely and whitelist its crawler. If tco is inadvertently blocked, what happens?

I do not know if we have already been checked by tco, as I have not sent or received a DM with one of our own URLs.

What are the user-agent and IP addresses used by this crawler? Does it check robots.txt?

And since, for some, a tco thumbs-down could be a problem, is there a (speedy) appeals process?
Re: [twitter-dev] tco crawler details
We've been checking for bad links for at least a year, if not 18 months. It's been so long, I can't remember when it went into production. Link checking seems to work very well.

-John Kalucki
http://twitter.com/jkalucki
Infrastructure, Twitter Inc.

On Fri, Jun 11, 2010 at 6:21 AM, Dean Collins d...@cognation.net wrote:

Of course it is.

Twitter were asked what “defines” a “bad” site on the second day, but I haven’t seen a reply apart from more questions about who is making the choice, e.g. will pornography be classed as “bad”, will religious free speech be classed as “bad”.

I don’t think the Twitheads thought through what it means to now offer an “AOL” version of the web and the long-term responsibilities that this entails through implicit guarantees to their users.

Of course, Ken, you don’t expect them to publish their IP address list, do you… otherwise some smartass would route this IP address to a “clean” site and everyone else to the “bad” content.

Regards,
Dean Collins
Cognation Inc
d...@cognation.net
+1-212-203-4357 New York
+61-2-9016-5642 (Sydney in-dial).
+44-20-3129-6001 (London in-dial).

From: twitter-development-talk@googlegroups.com On Behalf Of John Adams
Sent: Friday, 11 June 2010 6:00 AM
To: twitter-development-talk@googlegroups.com
Subject: Re: [twitter-dev] tco crawler details

t.co is not a crawler. Are you referring to the URL unpacking process or something else?

-john

On Thu, Jun 10, 2010 at 11:46 PM, Ken k...@cimas.ch wrote:

If tco is to be the new three-letter agency and gatekeeper, we would like to treat it nicely and whitelist its crawler. If tco is inadvertently blocked, what happens?

I do not know if we have already been checked by tco, as I have not sent or received a DM with one of our own URLs.

What are the user-agent and IP addresses used by this crawler? Does it check robots.txt?

And since, for some, a tco thumbs-down could be a problem, is there a (speedy) appeals process?