Use <span> instead of <div> because the use of div will affect page
formatting, whilst span will not.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On
Behalf Of [EMAIL PROTECTED]
Sent: Thursday, November 22, 2001 12:48 PM
To: [EMAIL PROTECTED]
Subject: [Robots] Re: Anti-thesaurus proposal 



Human Resources         Développement des ressources
Development Canada              humaines Canada
______________________________________________________

anti-keywords 
or 
anti-keywordareas   

http://www.w3.org/TR/html4/struct/global.html#edef-DIV   

Does anyone use the DIV tags in HTML to mark the "noindex, nofollow,
follow, index" parts by way of block areas? The emerging web content
management systems may have done something along these lines for their
own embedded search/retrieval benefits, but this group should have a
better idea on the subject.

So has anyone seen/done anything like....

<div id="robots-txt-noindex-follow" class="robots">
{headers/footer/sidebars} </div>

<div id="robots-txt-noindex-nofollow" class="robots">
{a banner area }
</div>

<div id="robots-txt-index-nofollow" class="robots">
{ content for the index, but holds looping links or dynamically
generated links which are best navigated via the statedataless sitemaps
links. }
</div>

The following is assumed for all areas but can be explicitly stated:
<div id="robots-txt-index-follow" class="robots">
{ content for the index }
</div>
and if done so on a single block then all other blocks not already
defined as above are then treated as being "noindex, follow".
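
To make the rules above concrete, here is a rough Python sketch of how an
indexer might read such blocks. It is only an illustration under stated
assumptions: the id convention "robots-txt-<index|noindex>-<follow|nofollow>"
is the one proposed in this message, while the names BlockRobotsParser and
classify are hypothetical, and "innermost marked div wins" for nested blocks
is my own guess, since nesting is not covered above.

```python
import re
from html.parser import HTMLParser

# Id convention taken from the proposal; nothing here is an existing standard.
ID_PATTERN = re.compile(r"^robots-txt-(index|noindex)-(follow|nofollow)$")


class BlockRobotsParser(HTMLParser):
    """Records text and links with the directive of the enclosing marked div."""

    def __init__(self):
        super().__init__()
        self.stack = []   # one entry per open <div>: directive tuple or None
        self.text = []    # (directive or None, text)
        self.links = []   # (directive or None, href)
        self.explicit_index_follow = False

    def current(self):
        # Assumption: the innermost marked div wins; None means "unmarked".
        for directive in reversed(self.stack):
            if directive is not None:
                return directive
        return None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            match = ID_PATTERN.match(attrs.get("id") or "")
            directive = match.groups() if match else None
            if directive == ("index", "follow"):
                self.explicit_index_follow = True
            self.stack.append(directive)
        elif tag == "a" and attrs.get("href"):
            self.links.append((self.current(), attrs["href"]))

    def handle_endtag(self, tag):
        if tag == "div" and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():
            self.text.append((self.current(), data.strip()))


def classify(html):
    """Return (indexed text chunks, links to follow) for one page."""
    parser = BlockRobotsParser()
    parser.feed(html)
    parser.close()
    # Unmarked areas are "index, follow" unless some block states
    # index-follow explicitly, in which case they become "noindex, follow".
    if parser.explicit_index_follow:
        default = ("noindex", "follow")
    else:
        default = ("index", "follow")
    indexed = [t for d, t in parser.text if (d or default)[0] == "index"]
    followed = [h for d, h in parser.links if (d or default)[1] == "follow"]
    return indexed, followed
```

Calling classify() on a page returns the text an indexer would keep and the
links it would follow. Note that the default for unmarked areas can only be
decided after the whole page has been read, which is why the sketch resolves
it at the end rather than while parsing.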

I would like to get comments and suggestions on the use of defined DIV
id names to improve index processes (global or local).

-Thomas Kay
Information Resource Management, Corporate Systems, Systems, National 
Headquarters, Human Resources Development Canada, Government of Canada.

[EMAIL PROTECTED]
---------- Original Text ----------

From: "Andrew Daviel" <[EMAIL PROTECTED]>, on 21/11/2001 9:00 AM:


On Tue, 20 Nov 2001, Alan Perkins wrote:

> 
> > For example, Inktomi Enterprise Search uses <!--stopindex--> and 
> > <!--startindex--> to turn indexing off and on within a page. Other 
> > engines use different tags.
> 

htDig supports by default <!--htdig_noindex--> , <!--/htdig_noindex--> 
(configurable), plus (older?) non-DTD <noindex> and </noindex>
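
For illustration, comment markers of this kind can be honoured with a few
lines of Python. This is a minimal sketch and not how Inktomi or htDig
actually implement it; the function name strip_noindex is hypothetical, and
treating an unclosed start marker as "skip the rest of the page" is an
assumption on my part.

```python
import re


def strip_noindex(html, start="<!--stopindex-->", end="<!--startindex-->"):
    """Return html with every start..end region removed.

    Assumption: an unclosed start marker suppresses the rest of the page.
    """
    pattern = re.compile(
        re.escape(start) + r".*?(?:" + re.escape(end) + r"|\Z)",
        re.DOTALL,
    )
    return pattern.sub("", html)
```

The htDig-style markers can be passed in as the start and end arguments,
which mirrors the "configurable" behaviour mentioned above.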

> It would be useful to have a "standard" for this over for all global 
> search engines.  Something like <robot instruc="noindex,nofollow"> ...
> </robot> to allow finer grained manipulation than the meta robots tag 
> allows.  NOINDEX and NOFOLLOW attributes for all tags that supported 
> HREF attributes would also be handy...particularly for e-mail 
> addresses.

Agreed. I also think the per-page anti-keyword list might be useful, if
a name or word occurs multiple times in a page. I don't share Nicholas
Carroll's reservations about "stopword" and think that <meta
name="stopwords" content="key1, key2 .."> as the opposite of "keywords"
would not cause any confusion - it's implicit that meta-tags are
per-page elements. "nonwords" to me conjures up images of, well,
non-words like "23.446" or "#%$!!@@@@!".
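
As a rough sketch of what a per-page stopword list could mean for an indexer,
the Python below reads the proposed <meta name="stopwords"> tag and drops
those words from a page's candidate terms. The meta tag is only the proposal
discussed here, not something any engine is known to support, and the names
StopwordsMeta and index_terms are hypothetical. Matching is assumed to be
case-insensitive.

```python
from html.parser import HTMLParser


class StopwordsMeta(HTMLParser):
    """Collects the words listed in <meta name="stopwords" content="...">."""

    def __init__(self):
        super().__init__()
        self.stopwords = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "stopwords":
            content = attrs.get("content") or ""
            self.stopwords.update(
                word.strip().lower() for word in content.split(",") if word.strip()
            )


def index_terms(html, terms):
    """Return the candidate terms that the page has not listed as stopwords."""
    parser = StopwordsMeta()
    parser.feed(html)
    parser.close()
    return [term for term in terms if term.lower() not in parser.stopwords]
```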

Regarding a <robot> HTML element, it would I think be naturally ignored
by existing agents and browsers yet parsable within a DTD. Questions of
precedence would need to be addressed. I believe that if a page is
listed in robots.txt it is never even visited, so robots.txt has
precedence over <meta name=robots content=index>. That in turn may
prevent the body of the page being parsed, otherwise I was wondering if
it made sense to be able to say

<head><meta name=robots content=noindex></head><body>
don't index this page
<robot instruc="index">
except this bit
</robot>
</body>

otherwise the tag could be possibly simplified yet further to e.g.
<noindex>don't index this</noindex> (just have to get it in the DTD)
(Hmm, maybe we still want to distinguish "index" from "follow" ...)

(I don't really care for the wordfragment "instruc". "action" maybe?)
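
One way to picture the precedence question is the Python sketch below. It
assumes the order robots.txt first, then the page-wide meta tag, then any
enclosing <robot> element, and it assumes the parser keeps reading the body
even when the meta tag says noindex, which is exactly the open question
above. The <robot> element and its "instruc" attribute are only the proposal
from this thread; RobotElementParser and indexable_text are hypothetical
names. The standard library's urllib.robotparser is used for the robots.txt
check.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def _values(raw):
    return [v.strip().lower() for v in (raw or "").split(",")]


class RobotElementParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.page_index = True  # page-wide setting from <meta name=robots>
        self.overrides = []     # index flags of open <robot> elements
        self.chunks = []        # (text, override flag or None)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            if "noindex" in _values(attrs.get("content")):
                self.page_index = False
        elif tag == "robot":
            values = _values(attrs.get("instruc"))
            if "noindex" in values:
                flag = False
            elif "index" in values:
                flag = True
            else:
                # No index instruction: inherit from the enclosing element.
                flag = self.overrides[-1] if self.overrides else None
            self.overrides.append(flag)

    def handle_endtag(self, tag):
        if tag == "robot" and self.overrides:
            self.overrides.pop()

    def handle_data(self, data):
        if data.strip():
            flag = self.overrides[-1] if self.overrides else None
            self.chunks.append((data.strip(), flag))


def indexable_text(url, html, robots_txt, agent="*"):
    """Return the text chunks to index, or [] if robots.txt forbids the visit."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(agent, url):
        return []  # robots.txt wins: the page is never fetched or parsed
    parser = RobotElementParser()
    parser.feed(html)
    parser.close()
    return [
        text
        for text, flag in parser.chunks
        if (parser.page_index if flag is None else flag)
    ]
```

Fed the example page above, this keeps only the text inside the <robot>
element. Handling "follow" separately from "index" is left out to keep the
sketch short.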


Andrew Daviel, TRIUMF, Canada
also Vancouver Webpages


--
This message was sent by the Internet robots and spiders discussion list
([EMAIL PROTECTED]).  For list server commands, send "help" in the
body of a message to "[EMAIL PROTECTED]".