Re: why doesn't altavista block metacrawler.com's IP's?

Marc Slemko Sat, 8 Apr 2000 11:31:26 -0700

On Sat, 8 Apr 2000, Adam Megacz wrote:

> Ok, I guess it's not really a robot, but metacrawler does connect to
> sites such as altavista.com and masquerade as a client -- somewhat
> similar to what robots do.


Mmm.  Not really.  There is no "masquerade" involved; they _are_ a client.
This isn't really on topic here, since metasearch engines don't care about
what 95% of being a robot is about (knowing what URLs to query and how,
following links in retrieved documents, etc.) but...

>
> I've always been curious -- why doesn't altavista block access from
> the IP's owned by metacrawler.com? Metacrawler slows down altavista,
> and they don't make any money from banner ads (since metacrawler
> strips the banner ads off of the pages returned by altavista).

While it would not be appropriate for me to comment on the particulars of
the situation with MetaCrawler and AltaVista, there are a few general
comments I can make about metasearch engines and the backend engines they
query.

MetaCrawler does indeed pass through certain banner ads from certain
engines.  Take a look on the second and later pages of some MetaCrawler
results pages and you will see them in the middle of the results
sometimes.

MetaCrawler includes attribution, beside each result, of which engine it
came from, which can serve as a form of advertising and a way to drive
some amount of traffic to the particular engine.  For lesser known
engines, the exposure that metasearch engines give could be a reason for
them to want such traffic.  For a lot of Internet companies, the revenue
thing is still something they are trying to figure out; they figure if
they can just get the traffic in whatever way they can, they will figure
out how to do something with it later.

Some search engines, such as goto.com, receive money themselves from the
sites being linked to for clickthroughs on certain paid results.  They
still get these payments even if the clickthrough comes from a metasearch
engine.

I think you will find that every metasearch engine with a significant
volume of traffic has business agreements in place with the engines they
query.  The nature of these agreements, including such things as whether the
search engine pays the metasearcher, vice versa, or neither, would depend
on the situation and the clout of the various companies involved.  All of
these situations are options, however.

For all the random tiny metasearch engines out there that just point
themselves at some search engines and query away, their small volume of
queries isn't worth the bother to investigate or block.  You can be
certain, however, that if they grow to be a significant volume of queries,
most search engines will notice and will take action.

One view is reflected in the response David Filo (one of the founders of
Yahoo and a "Chief Yahoo") gave to a similar question about MetaCrawler at
ApacheCon in 1998.  He said "MetaCrawler doesn't matter, and will never
matter" in terms of Yahoo's traffic.  You will note, however, that
MetaCrawler doesn't query Yahoo at the moment, and 1998 is a very long
time ago in net-time.

You will also probably note that some search engines are not queried by
any major metasearch engine; obviously, there is a reason for that.

I did work at Go2Net (http://www.go2net.com) in the past, which is the
company that owns both MetaCrawler (http://www.metacrawler.com) and
Dogpile (http://www.dogpile.com), which are probably the two most
trafficked metasearch engines around.  While this helps give me
perspective on the issue, everything I said is based on public
information.
