According to Matthew Nuzum:
> According to Greg Lepore:
> > Good point, but for Mich<sup>l</sup> I would add entries in the
> > synonyms database for Mich, Michael and Michl, since they are all the same
> > word.  For that example, there's no reason to want to search for each entry 
> > individually.  I would vote for treating the <sup> and <sub> as hyphens but 
> > I can see where this might cause trouble (mathematical equations might be 
> > one such case).
> 
> I have a website for heart surgeons.  They have all kinds of subscripts
> in the names of their medicines.  People will often search for HDL3, HDL
> 3 and just HDL in hopes of finding what in HTML would be
> HDL<sub>3</sub>.

OK, this is another good reason to treat <sup> and <sub> the same way as
valid_punctuation characters, so that HDL<sub>3</sub> would be indexed as
hdl, 3 and hdl3.
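
To make that concrete, here is a minimal sketch of the indexing behaviour I
have in mind. It is not ht://Dig code: indexTerms is a hypothetical helper,
and it assumes every tag inside the token is one we want to see through
(which is only true for tags like <sub>, <sup> and <font>).

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper, not part of ht://Dig: split one HTML token at its
// inline tags and return each lowercased piece plus the pieces joined.
std::vector<std::string> indexTerms(const std::string& token) {
    std::vector<std::string> parts;
    std::string current;
    bool inTag = false;
    for (char c : token) {
        if (c == '<') {                 // a tag starts: close the current piece
            inTag = true;
            if (!current.empty()) {
                parts.push_back(current);
                current.clear();
            }
        } else if (c == '>') {          // the tag ends: resume collecting text
            inTag = false;
        } else if (!inTag) {
            current += static_cast<char>(
                std::tolower(static_cast<unsigned char>(c)));
        }
    }
    if (!current.empty()) parts.push_back(current);

    std::vector<std::string> terms = parts;
    if (parts.size() > 1) {             // also index the pieces run together
        std::string joined;
        for (const std::string& p : parts) joined += p;
        terms.push_back(joined);
    }
    return terms;
}
```

With this, indexTerms("HDL<sub>3</sub>") yields hdl, 3 and hdl3, while a
plain "HDL3" yields just hdl3.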

> I've encouraged them to use HDL<span
> style="font-size:8pt">3</span> instead.  It looks better because it
> doesn't increase the line spacing. They now have several of each
> throughout the site. 
> 
> I just tested a search and found that none of the above show up in the
> results.  Only HDL3 shows up as a result for a search "hdl3".
> HDL<sub>3</sub> and HDL<span style="font-size:8pt">3</span> are
> excluded.
> 
> I'm glad someone caught this.  Can we have spans treated the same way?
> I often use <span> tags with stylesheet rules in place of some HTML
> equivs.

It's a pretty easy fix:  just add "|span|/span" to the end of the string
passed to nobreaktags.Pattern("...") on line 85 of HTML.cc (in 3.1.6).
I had put font tags in the list for this very reason, but I didn't put in
span.
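
For illustration only, this is roughly what the change amounts to. The
sketch below does not use the real matching class from ht://Dig, and the
"font|/font" string is just a stand-in: I am not reproducing the actual
string from HTML.cc here. isNoBreakTag is a hypothetical name.

```cpp
#include <sstream>
#include <string>

// Returns true when `tag` (e.g. "span" or "/span") is one of the
// '|'-separated names in `pattern`. A stand-in for the lookup done
// against nobreaktags; the real ht://Dig class is not used here.
bool isNoBreakTag(const std::string& tag, const std::string& pattern) {
    std::istringstream names(pattern);
    std::string name;
    while (std::getline(names, name, '|')) {
        if (name == tag) return true;
    }
    return false;
}

// Illustrative pattern strings (assumed, not copied from HTML.cc).
const std::string kBefore = "font|/font";
const std::string kAfter  = kBefore + "|span|/span";
```

With kBefore, isNoBreakTag("span", kBefore) is false, so a <span> would
break the word; with kAfter it is true for both "span" and "/span".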

I'm really ignorant when it comes to CSS tags in HTML, but are span tags
just used for font and style changes which shouldn't trigger a word break,
or are there other uses of <span ...> which should cause word breaks?

In any case, it would probably make sense to have this list settable by
config attribute, to allow adapting to varying needs and evolving standards.
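
As a rough sketch of how that might look, assuming a hypothetical attribute
whose value is a whitespace-separated list of tag names (no such attribute
exists in 3.1.6), the pattern string could be built like this:

```cpp
#include <sstream>
#include <string>

// Hypothetical: turn a config value such as "font span" into the
// '|'-separated pattern "font|/font|span|/span", adding the closing
// form of each tag automatically.
std::string buildNoBreakPattern(const std::string& configValue) {
    std::istringstream names(configValue);
    std::string name;
    std::string pattern;
    while (names >> name) {
        if (!pattern.empty()) pattern += '|';
        pattern += name + "|/" + name;
    }
    return pattern;
}
```

The result would then be passed to nobreaktags.Pattern() in place of the
hard-coded string, so that adding span becomes a config change rather than
a source patch.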

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
