According to Antun Karlovac:
> Hey Gilles
> 
> Sorry for the huge delay, but here is an example of the code that the
> noindex_start/end tags didn't work on:
> 
> <!-- ignoreThis --><span class="IndexTable"><b><a href="someLink.php"
> target="_top">Link address</a><br />
> </b></span>
> <span class="IndexTable">&nbsp;&nbsp;&nbsp;&nbsp;<a href="someLink.php"
> target="_top">Link address</a><br />
> </span>
> ... skip a few ...
> <span class="IndexTable">&nbsp;&nbsp;&nbsp;&nbsp;<a href="someLink.php"
> target="_top">Link address</a><br />
> </span>
> <span class="IndexTable">&nbsp;&nbsp;&nbsp;&nbsp;<a href="someLink.php"
> target="_top">Link address</a><br />
> </span>
> <!-- /ignoreThis -->
> 
> Sorry, but I had to strip out the names of the links (and urls) for security
> reasons. The link names (descriptions) were just text. The link addresses
> were one of two types (in different examples) - one was absolute from the
> server root; e.g. /developers/documentation/... and the other was relative,
> e.g. ../../images/...
> 
> I initially tried using the default "<!--noindex_start-->" and
> "<!--noindex_end-->", but replaced them because not having a space between
> the dash and first comment character can cause problems with some older
> browsers. They didn't help anyway.
> 
> What I ended up doing was putting the comments inside the link on each line.
> (Not a problem, because this is all dynamically generated).
> 
> I'm using 3.1.6

Well, I just tried htdig 3.1.6 on an html file that included the HTML
excerpt above, verbatim, with the following settings in htdig.conf:

noindex_start:  <!-- ignoreThis -->
noindex_end:    <!-- /ignoreThis -->

Quite predictably, it skipped over the whole excerpt above without any
difficulty at all.  It's not like this feature has never been tested
before, after all.

The very idea of trying to reproduce an unexpected problem with a highly
edited excerpt of a problem file is rather ridiculous.  It's pretty
much guaranteed NOT to reproduce the problem you're trying to isolate.
There are just too many variables that are out of your control when you
do that.  The parts you skipped, both inside the excerpt and on either
side, could potentially have an impact on how the parser sees the whole
picture - I can't reproduce the whole picture if substantial parts are
hidden from me.  You also can't count on e-mail to faithfully reproduce
a character for character exact duplicate of the file that's giving you
problems, unless you encode it somehow to prevent the mail program from
mangling space and control characters.
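
As a rough illustration of what "encode it somehow" could look like, here is a minimal Python sketch that base64-encodes a file's bytes for pasting into a message and decodes them again on the other end. The helper names encode_for_mail and decode_from_mail and the sample bytes are made up for this example; a MIME attachment or uuencode would serve the same purpose.

```python
import base64


def encode_for_mail(data: bytes) -> str:
    """Return base64 text (lines of at most 76 characters) safe to paste into e-mail."""
    return base64.encodebytes(data).decode("ascii")


def decode_from_mail(text: str) -> bytes:
    """Recover the exact original bytes from the base64 text."""
    return base64.decodebytes(text.encode("ascii"))


# Hypothetical sample with a tab, trailing spaces and CRLF line endings,
# the kind of detail a mail program may silently alter.
sample = b"<!-- ignoreThis -->\t<span> </span>  \r\n<!-- /ignoreThis -->\r\n"

encoded = encode_for_mail(sample)
restored = decode_from_mail(encoded)
```

In practice you would read the problem file in binary mode (open(path, "rb")) and encode those bytes rather than a literal sample.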

If you're able to come up with the simplest test file that clearly
and reproducibly demonstrates the error on your system, and you can
provide an exact copy of that file, that would be a big help.  Without
that, I can't imagine how I'd be able to reproduce the problem here.  My guess
is that more likely than not, you had a mismatch somewhere between
the starting and ending delimiter strings you were using in your HTML
file, and the ones you had defined in the noindex_start and noindex_end
attributes in your htdig.conf file.  Either that or there's some other
problem preventing htdig from recognizing your attribute settings, as
described in http://www.htdig.org/FAQ.html#q5.31
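
One quick way to check for that kind of mismatch is sketched below in Python. It is not how htdig itself parses anything: parse_attrs and count_delimiters are hypothetical helpers, the config parsing is deliberately simplified (plain "name: value" lines only, no line continuations or includes), and the comparison is an exact, case-sensitive substring count, which is an assumption on my part.

```python
def parse_attrs(conf_text):
    """Parse simple 'name: value' lines, skipping blank lines and '#' comments."""
    attrs = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition(":")
        if sep:
            attrs[name.strip()] = value.strip()
    return attrs


def count_delimiters(html, attrs):
    """Count exact occurrences of the configured delimiters in the HTML text."""
    # Fall back to the documented defaults when the attributes are not set.
    start = attrs.get("noindex_start", "<!--htdig_noindex-->")
    end = attrs.get("noindex_end", "<!--/htdig_noindex-->")
    return {"start": html.count(start), "end": html.count(end)}


conf = """
noindex_start:  <!-- ignoreThis -->
noindex_end:    <!-- /ignoreThis -->
"""
attrs = parse_attrs(conf)

good = '<!-- ignoreThis --><span class="IndexTable">x</span><!-- /ignoreThis -->'
bad = '<!--ignoreThis--><span class="IndexTable">x</span><!-- /ignoreThis -->'

good_counts = count_delimiters(good, attrs)
bad_counts = count_delimiters(bad, attrs)
```

If the two counts differ, or either is zero, the strings in the HTML file don't match the ones in the config, down to the spacing inside the comment.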

By the way, the defaults are not "<!--noindex_start-->" and
"<!--noindex_end-->", but rather "<!--htdig_noindex-->" and
"<!--/htdig_noindex-->".

See http://www.htdig.org/attrs.html#noindex_start
and http://www.htdig.org/attrs.html#noindex_end

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)


_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html