----- Original Message -----
From: "Tim Bray" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, March 09, 2000 3:30 AM
Subject: Re: What happens once robots are barred?


> At 11:59 AM 3/8/00 -0800, Mark Bennett wrote:
>
> >* It should also keep track of "orphan" pages - pages that are still
> >accessible via the direct URL, but are no longer linked-to by other
> >pages on the site.
> >
> >I believe all 3 classes of pages should be removed from the index.
> >
> >The third item is an interesting one.  I know some spiders do NOT
> >realize that pages are no longer "linked to" and keep indexing them.
>
> When you're indexing a web *site* (i.e. you don't care about anything
> outside the web site), this is sensible.  When you're trying to build a
> large scale index of the whole web, it gets more complex.  If a site has
> ever been announced to the outside world, the assumption is that it may
> have been linked to from elsewhere; the publishing of a page *should*
> represent a commitment on the part of the publisher to maintain it.  If
> the page needs to be removed, merely removing links to it is violently
> unsatisfactory since there is no way an incoming link from outside can
> know that it's now an orphan.  So such pages are a live part of the web
> until removed.  -T.

Yes, so my question is: "Is it possible to remove a page from a search
engine's view of the web without removing the resource?"

To give an example, a company's annual report is published
(http://www.acme.com/annual-report-1999/) and submitted to several
search engines.  The following year the report is unlinked from the main
area, but linked from an archive area.  The company wishes to remove the
report from search engines.  Can this be done in a more elegant
fashion than going to every search engine and submitting a load of
unsubmit requests?

What effect does having a <meta name="robots" content="noindex"> element
have for a resource which has already been indexed?
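
To make the question concrete, here is a minimal sketch in Python of what
I would hope a spider does when it re-visits an already-indexed page.  It
is an illustration only, not a description of how any real search engine
behaves: the names RobotsMetaParser, has_noindex and refresh_index are
hypothetical, and the "index" is just a dictionary keyed by URL.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html_text):
    """Return True if the page carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noindex" in parser.directives


def refresh_index(index, url, html_text):
    """Hypothetical re-crawl step: drop the URL if the page now says noindex.

    `index` is a plain dict mapping URL -> stored page text (an assumption
    made purely for illustration).
    """
    if has_noindex(html_text):
        index.pop(url, None)
    else:
        index[url] = html_text
    return index
```

Under this sketch, adding the element to the annual report and waiting for
the next visit would be enough to get it dropped.  Whether spiders actually
re-fetch an indexed page and act on the tag in this way is exactly what I
am asking.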

Brian

--------------------------------------------------------------------
Brian Kelly, UK Web Focus
UKOLN, University of Bath, BATH, England, BA2 7AY
Email:  [EMAIL PROTECTED]     URL:    http://www.ukoln.ac.uk/
Homepage: http://www.ukoln.ac.uk/ukoln/staff/b.kelly.html
Phone:  01225 323943            FAX:   01225 826838
