On Fri, 6 Aug 2004, [EMAIL PROTECTED] wrote:

>>> That's not a typical web crawler, and obviously not what I meant.
>>> Such databases already exist (e.g. bugmenot) but using them to rip a
>>> page is definitely abusive.
>
> Not abusive at all. It's a public service.

It's abusive to the content provider who pays the network connectivity bills and expects ad revenue, regardless of how you or anyone else feels.

Note the context is on a major site's ripping of a page so visitors never see the original site, NOT general web visitors. I'm not interested in discussing the latter's attitude towards web registrations because that's completely irrelevant to Slashdot caching.

>>> Think Google, not rip-off.
>
> Go to news.google.com and you will see many results that say things like
>
> Kansas City Star (subscription)
>
> So the Google crawler does indeed subscribe to some
> registration-required sites and crawl them.

I'm not sure how that matters. We're talking about Google's HTTP caching of ANY page, not their news items; furthermore the focus is on *intent* and not on *mechanism*. Google's intent with cached HTTP crawling is clearly not to rip off advertisers.

Ted
_______________________________________________
Boston-pm mailing list
[EMAIL PROTECTED]
http://mail.pm.org/mailman/listinfo/boston-pm

