The robots.txt file allows you to exclude pages on THIS site that you don't want indexed.
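A minimal example of what that file looks like at the site root (the paths here are just illustrative):

```
User-agent: *
Disallow: /search.cfm
Disallow: /admin/
```

Note that each Disallow line takes a path relative to the root of the site the file sits on, not a full URL.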
-Mark
-Original Message-
From: Dave Watts [mailto:[EMAIL PROTECTED]
Sent: Sunday, April 04, 2004 5:23 PM
To: CF-Talk
Subject: RE: user agent checking and spidering...
Sequelink (the access service...
P.S. Actually he had NO caching and that is our first step - and it has been quite successful.
-Mark
Mark A. Kruger - CFG wrote:
Dave,
That's not what I'm finding. If you have a robots.txt file that says:
disallow /search.cfm
It will not index the search.cfm file from the root of the server. But
I cannot find anywhere where you can put in
something like this:
disallow http://www.someothersite.com
Stephen,
Thanks for the URL. We do have a robots file in the root of each site. Perhaps the meta tags will help. I'll check it
out.
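For reference, the robots meta tag Stephen is pointing at goes in the head of each page you want excluded - something like this (noindex/nofollow are the standard directive values):

```html
<meta name="robots" content="noindex,nofollow">
```

Unlike robots.txt, this works per page, so it can be dropped into a shared header template.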
-Mark
-Original Message-
From: Stephen Moretti [mailto:[EMAIL PROTECTED]
Sent: Monday, April 05, 2004 8:32 AM
To: CF-Talk
Subject: Re: user agent checking
That's not what I'm finding. If you have a robots.txt file
that says:
disallow /search.cfm
It will not index the search.cfm file from the root of the
server. But I cannot find anywhere where you can put in
something like this:
disallow http://www.someothersite.com
You see what...
Mark A. Kruger - CFG wrote:
I have a client with many many similar sites on a single server using CFMX. Each of the sites is part of a network of
sites that all link together - about 150 to 200 sites in all. Each home page has links to other sites in the network.
Periodically, it appears that...
I'm not sure if it's the best way to do things but I may be able to help
with the user agents. Basically what I've done is capture all the user
agents to hit my sites over the past few years. I go through periodically
and (using a bit column in the table) mark whether the agents are bots or
not.
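Jim's scheme could be sketched roughly like this - the table is stood in for by a dict, and the bot token list is an illustrative guess, not his actual data:

```python
# Known bot substrings to match against user-agent strings.
# This list is illustrative; a real one would come from Jim's table.
BOT_TOKENS = ["googlebot", "slurp", "msnbot", "crawler", "spider"]

def looks_like_bot(user_agent: str) -> bool:
    """Return True if the user-agent string contains a known bot token."""
    ua = user_agent.lower()
    return any(token in ua for token in BOT_TOKENS)

# Stand-in for the database table: user-agent string -> is_bot bit.
seen_agents = {}

def record_agent(user_agent: str) -> None:
    """Record a user agent the first time it is seen, flagging bots."""
    if user_agent not in seen_agents:
        seen_agents[user_agent] = looks_like_bot(user_agent)
```

Substring matching is crude, but it catches most well-behaved spiders, since they identify themselves in the user-agent header.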
Jim,
Thanks - that might be a good place to start. Can you send it to my email? Thanks!
-Mark
-Original Message-
From: Jim Davis [mailto:[EMAIL PROTECTED]
Sent: Sunday, April 04, 2004 1:27 PM
To: CF-Talk
Subject: RE: user agent checking and spidering...
I'm not sure if it's the best...
...) - there's really no way to tell if these are homemade bots or homemade
browsers.
Hope this helps,
Jim Davis
_
From: Mark A. Kruger - CFG [mailto:[EMAIL PROTECTED]
Sent: Sunday, April 04, 2004 2:35 PM
To: CF-Talk
Subject: RE: user agent checking and spidering...
Jim,
Thanks - that might...
Sequelink (the access service for Jrun I think) locks up
quickly trying to service hundreds of requests at once to
the same access file.
As a short-term fix, have you considered a more aggressive caching strategy?
That might be pretty easy to implement.
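In CFMX the usual mechanism for this is cfquery's cachedwithin attribute; the general idea - serve a stored result until a time-to-live expires instead of hitting the datasource on every request - can be sketched generically like this (the fetch callback and one-hour TTL are illustrative):

```python
import time

# Simple time-bounded result cache: key -> (expires_at, value).
_cache = {}

def cached(key, fetch, ttl_seconds=3600):
    """Return a cached value for key, calling fetch() only after the TTL expires."""
    now = time.time()
    entry = _cache.get(key)
    if entry is not None and entry[0] > now:
        return entry[1]  # still fresh, skip the expensive call
    value = fetch()
    _cache[key] = (now + ttl_seconds, value)
    return value
```

With hundreds of concurrent requests for the same data, a cache like this means only the first request per TTL window ever touches the Access file.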
Each site has a pretty well thought...