I like the proposal for <!-- noindex -->junk here<!-- /noindex -->. I
think a lot of people have a hard time getting their heads around a
stop after an implicit start.
Avi
--
Complete Guide to Search Engines for Web Sites and Intranets
http://www.searchtools.com
--
It was thus said that the Great Walter Underwood once stated:
As for the anti-thesaurus proposal, many search engines already provide
something that does a similar job. You can mark sections of a document
to not be indexed. Usually, you want to do this for the topnav, sidebars,
ads, and
For example, Inktomi Enterprise Search uses <!--stopindex--> and
<!--startindex--> to turn indexing off and on within a page. Other
engines use different tags.
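
To make the idea concrete, here is a minimal sketch of how an indexer might drop the marked sections before indexing a page. It assumes the markers are HTML comments written exactly as <!--stopindex--> and <!--startindex-->; the function name indexable_text is hypothetical, and this is not Inktomi's actual implementation.

```python
import re

# Assumed marker spelling; other engines use different tags.
STOP = "<!--stopindex-->"
START = "<!--startindex-->"

# Match from a stop marker up to the next start marker,
# or to the end of the page if indexing is never turned back on.
_SKIP = re.compile(
    re.escape(STOP) + r".*?(?:" + re.escape(START) + r"|\Z)",
    re.DOTALL,
)

def indexable_text(html):
    """Return the page with every stopindex..startindex region removed."""
    return _SKIP.sub("", html)
```

Indexing is on by default in this sketch, so a page with no markers passes through unchanged.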
That's only for the Enterprise Search, not main Inktomi indexes - is that
correct? I don't know any global indexes that support such a
You may have more than just two scans on the resource, as URLs such as
http://www.abc.de/xyz/index.html will also return the same document.
Calculate a checksum for each url retrieved, and compare for identical
checksums. If you find that one page is identical to another, the second
can
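
The checksum comparison described above could look roughly like the sketch below. It assumes the pages have already been fetched into a mapping from URL to body bytes; the function name find_duplicates and the choice of SHA-256 are illustrative assumptions, since the message only says "checksum".

```python
import hashlib

def find_duplicates(pages):
    """Map each duplicate URL to the first URL seen with an identical body.

    pages: dict of URL -> response body (bytes), assumed already retrieved.
    """
    first_seen = {}   # checksum -> first URL that produced it
    duplicates = {}   # later URL -> earlier URL with the same checksum
    for url, body in pages.items():
        digest = hashlib.sha256(body).hexdigest()
        if digest in first_seen:
            duplicates[url] = first_seen[digest]
        else:
            first_seen[digest] = url
    return duplicates
```

This only catches byte-identical responses; pages that differ by a timestamp or a rotating ad will get different checksums.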
In [EMAIL PROTECTED], Matthias Jaekle [EMAIL PROTECTED] writes:
I read about adding a slash at the end of URLs if there is no
absolute path present.
But what about paths ending in subdirectories (xyz)?
A link to http://www.abc.de/xyz/ might be more correct than the link
to
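
The slash rule mentioned in the quoted message can be sketched with Python's standard urllib.parse. The function name add_root_slash is hypothetical. The sketch only handles the case where the URL has no path at all; it deliberately leaves http://www.abc.de/xyz alone, because a client cannot tell from the URL by itself whether xyz is a directory.

```python
from urllib.parse import urlsplit, urlunsplit

def add_root_slash(url):
    """Add "/" as the path when the URL names a host but has an empty path."""
    parts = urlsplit(url)
    if parts.netloc and parts.path == "":
        parts = parts._replace(path="/")
    return urlunsplit(parts)
```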
I guess it depends on what you are asking to have returned. (And this brings
up another robots.txt question, below.)
http://www.abc.de/xyz
Asking for the directory. (where the service is allowed redirection to a
temporary default file list or another default file as a reply if the service