It seems to me that Looksmart is doing the right thing. A record for user-agents named "Due to a deficiency in Java it's not currently possible to set the User-Agent." will match all Java-based "browsers" that are unable to set the User-Agent property using the java.net.URLConnection.setRequestProperty method.
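
For reference, a minimal sketch of how a Java client would normally set that property, assuming a runtime where setRequestProperty honours the header. The class name, the robot name and the example.com URLs are made up for illustration; nothing is sent over the network here.

```java
import java.net.URI;
import java.net.URLConnection;

class UserAgentSketch {
    // Hypothetical robot name, for illustration only.
    static final String AGENT = "ExampleBot/0.1 (+http://www.example.com/bot.html)";

    // Creates a connection object and sets the User-Agent request property.
    // openConnection() does not contact the server; that happens on connect().
    static URLConnection open(String url) {
        try {
            URLConnection conn = new URI(url).toURL().openConnection();
            conn.setRequestProperty("User-Agent", AGENT);
            return conn;
        } catch (Exception e) {
            throw new IllegalStateException("could not prepare connection", e);
        }
    }

    public static void main(String[] args) {
        URLConnection conn = open("http://www.example.com/");
        // Read back the value that would be sent with the request.
        System.out.println(conn.getRequestProperty("User-Agent"));
    }
}
```

A client that cannot do this ends up sending whatever default string its runtime or library supplies, which is presumably how the sentence above came to be logged as a user-agent name.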
--------------------------------------------------------------
Rasmus T. Mohr            Direct  : +45 36 910 122
Application Developer     Mobile  : +45 28 731 827
Netpointers Intl. ApS     Phone   : +45 70 117 117
Vestergade 18 B           Fax     : +45 70 115 115
1456 Copenhagen K         Email   : mailto:[EMAIL PROTECTED]
Denmark                   Website : http://www.netpointers.com

"Remember that there are no bugs, only undocumented features."
--------------------------------------------------------------

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Alan Perkins
> Sent: Tuesday, May 28, 2002 12:41 PM
> To: [EMAIL PROTECTED]
> Subject: [Robots] Looksmart's robots.txt file
>
> Hi there
>
> I'm sure most of you are aware of the furore following looksmart.com's
> recent shift to pay-per-click (PPC). One of the issues reported by many
> people is the number of "false clicks" reported by Looksmart, i.e.
> advertisers just cannot reconcile Looksmart's reported clickthroughs with
> clickthroughs derived from their Web logs. These same advertisers can
> reconcile clickthroughs from other PPC providers such as Overture or Google
> so the problem doesn't appear to lie with the advertiser.
>
> I've been looking at Looksmart's robots.txt file and it is - well, shall we
> say unusual?
>
> www.looksmart.com/robots.txt
>
> In my opinion this file demonstrates a lack of understanding of robots in
> several different respects, e.g. lines like:
>
> <snip>
> User-agent: Due to a deficiency in Java it's not currently possible to set the User-Agent.
> Disallow:
> </snip>
>
> I'm wondering if this lack of understanding permeates through to Looksmart's
> PPC-accounting department.
>
> In other words, I'm wondering how many of the false clicks seen by
> advertisers are from robots (particularly robots masquerading as a Mozilla
> browser). Looksmart's robots.txt does not prevent robots from reading the
> URLs that cause advertisers to incur a fee. So if Looksmart cannot
> recognise the robot as a robot (and especially if they aren't even checking
> for robots) advertisers could be incurring fees from robot-clickthroughs.
> Most robots do not send a referrer in their HTTP request so this would
> explain why advertisers could not reconcile clickthroughs.
>
> Looksmart's URLs are featured in the SERPs (search engine results pages) of
> its search engine partners, as well as throughout looksmart.com itself. So
> any robot that crawls SERPs and/or the web could cause these false
> clickthroughs. I know of at least two robots that crawl out from SERPs
> masquerading as browsers to analyse why pages rank well.
>
> So your thoughts please on
>
> a) how many robots, given www.looksmart.com/robots.txt, would read those
>    looksmart.com PPC URLs?
> b) how many of those robots would be recognisable as robots, i.e. use a
>    unique User Agent?
>
> Alan Perkins
> CTO, e-Brand Management Limited
> http://www.ebrandmanagement.com/
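
On question (a), it may help to recall how the robots exclusion convention reads a record: a Disallow line with an empty value restricts nothing, so a robot matching the snipped record may still fetch any URL. Below is a minimal sketch of that prefix check for a single record. The class name, method name and paths are hypothetical, and real parsers handle more (record selection by user-agent, comments, and so on).

```java
import java.util.Arrays;
import java.util.List;

class RobotsRecordSketch {
    // Returns true when no Disallow value of the record is a prefix of the path.
    // An empty Disallow value restricts nothing, so it is skipped.
    static boolean isAllowed(List<String> disallowValues, String path) {
        for (String value : disallowValues) {
            String prefix = value.trim();
            if (!prefix.isEmpty() && path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical path standing in for a fee-incurring PPC URL.
        String ppcPath = "/hypothetical/ppc-redirect";
        // Record with an empty Disallow, as in the snipped lines: allowed.
        System.out.println(isAllowed(Arrays.asList(""), ppcPath));
        // Record that names the path prefix: not allowed.
        System.out.println(isAllowed(Arrays.asList("/hypothetical/"), ppcPath));
    }
}
```

Run as written, this prints true and then false. Only a record that actually names the PPC path prefix would keep a well-behaved robot away from it, and a robot that ignores robots.txt is unaffected either way.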