> > So, who's going to yell at us?
>
> With all you data miners out there clicking and downloading everything
> in sight, pretty soon you will only measure the noise created by data
> miners, web crawlers and the like.

If someone operated a free global place where we could get that information (like the OEmbed standard calls for), then we could ask without being counted. In the meantime, I'm offering a valuable service to my audience by unrolling the shortened URL into something meaningful. I hope you bothered to look at the pages I gave to understand what that value is. The canonicalization does NOT click/crawl anything on the final page... it just follows the redirections and frame-busting as needed to get to the actual content.

> Google, Yandex and the rest are already a significant amount of the
> traffic for small sites.

Oh, I know it... that's why providing a Sitemap.xml and a robots.txt, and offering an OEmbed endpoint on your sites, are really good ideas. See http://oembed.com/ for the use of the latter.

> What this means is that because you are introducing more and more
> background noise into your data, you will only be able to measure the
> really strong signals. That narrows what you can find, and you risk
> that eventually you find only obvious things.

I'm not introducing noise into my OWN data, because I'm correctly rendering the links with rel="nofollow" so Google and other well-behaved crawlers won't follow them. What I'm measuring is the click-through rate ON MY SITE of links leading off-site. This is standard behavior.

Sadly, I will agree that my crawl from the RawLink to the canonical link will add noise to that destination site's numbers. I hope that my following the best practice of sending a User-Agent that identifies itself as a bot helps with the statistics on their end. I know that I have had to understand and honor/count those UAs correctly.

Marc
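
For readers who want something concrete, here is a minimal sketch of the kind of redirect-following canonicalization described above. It is not the code behind the service in this thread. It assumes Python's standard library; the names `unroll`, `head` and `BOT_USER_AGENT`, and the User-Agent string itself, are hypothetical. It follows only HTTP redirects, one hop at a time, and never downloads the final page body. Frame-busting is not handled.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin

# Hypothetical bot User-Agent (an assumption for this sketch). A real one
# should name the bot and point to a page that explains what it does.
BOT_USER_AGENT = "ExampleLinkUnroller/0.1 (+http://example.com/bot.html)"

REDIRECT_CODES = (301, 302, 303, 307, 308)


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so each hop can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # the 3xx response then surfaces as an HTTPError


def head(url):
    """Send one HEAD request; return (status, Location header or None)."""
    opener = urllib.request.build_opener(_NoRedirect)
    req = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": BOT_USER_AGENT}
    )
    try:
        with opener.open(req, timeout=10) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Location")


def unroll(url, fetch=head, max_hops=10):
    """Follow redirects from a shortened URL and return the last URL reached.

    `fetch` is injectable so the hop logic can be exercised without a network.
    """
    seen = {url}
    for _ in range(max_hops):
        status, location = fetch(url)
        if status not in REDIRECT_CODES or not location:
            return url  # not a redirect: treat this as the destination
        url = urljoin(url, location)  # Location may be a relative reference
        if url in seen:
            return url  # redirect loop: stop rather than spin
        seen.add(url)
    return url
```

Using HEAD keeps the footprint on the destination site small, though some servers answer HEAD differently from GET, so a real implementation may need a GET fallback.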