According to Stefan Nehlsen:
> On Fri, Sep 07, 2001 at 01:05:11PM +0200, Ferenc VERES wrote:
> > Hello Users and Authors!
> >
> > On the result page you can have 3 different title lines:
> > 1. "No title" configurable
> > 2. <TITLE></TITLE> value from header
> > 3. filename
> >
> > Since the page header is always the same (same file!) on
> > one of our homepages, it would be good to have a custom
> > title line, which can be set inside the content of each
> > page. Is there a way to do this? For example, if I
> > find the first <H3> on the page, that would be great.
>
> I would think of something like this:
>
> NOT TESTED!
>
> htdig may use something like this as external parser for
> text/html. --- no, this will not work
>
> (
> Is it possible to do something like this:
>
> external_parser: text/html->text/html /usr/local/bin/foobar.pl
>
> where the external parser is only used the first time?
> )

Actually, there is a currently undocumented trick to do this. If you use

    external_parser: text/html->text/html-internal /usr/local/bin/foobar.pl

htdig will run the foobar.pl script to preprocess the HTML before
parsing it with the internal HTML parser. The drawback is the performance
penalty: every HTML file has to go through this Perl script, and Perl is
restarted for each one. If that is a big problem, the only other option is
to patch the htdig/HTML.cc parser code to treat the <h3>...</h3> tags the
way it currently treats the <title>...</title> tags.
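
For illustration, here is a rough, untested-against-htdig sketch of what
such a preprocessing filter might look like, written in Python instead of
Perl. It assumes that htdig passes the path of a temporary file holding the
document as the first command-line argument and reads the converted
document from standard output; check that against your htdig version. The
function name retitle and the script name are made up for this example. It
copies the text of the first <h3> into the <title> element, adding a
<title> if the page has none.

```python
#!/usr/bin/env python3
"""Hypothetical stand-in for foobar.pl: use the first <h3> as the page title."""
import re
import sys

H3_RE = re.compile(r"<h3\b[^>]*>(.*?)</h3\s*>", re.IGNORECASE | re.DOTALL)
TITLE_RE = re.compile(r"<title\b[^>]*>.*?</title\s*>", re.IGNORECASE | re.DOTALL)
HEAD_RE = re.compile(r"<head\b[^>]*>", re.IGNORECASE)
TAG_RE = re.compile(r"<[^>]+>")


def retitle(document: str) -> str:
    """Return the document with <title> replaced by the first <h3> text."""
    match = H3_RE.search(document)
    if match is None:
        return document  # no <h3>: pass the page through unchanged
    # Drop markup nested inside the heading and collapse whitespace.
    heading = " ".join(TAG_RE.sub("", match.group(1)).split())
    if not heading:
        return document
    new_title = "<title>" + heading + "</title>"
    if TITLE_RE.search(document):
        # A function replacement avoids backslash handling in the heading text.
        return TITLE_RE.sub(lambda _m: new_title, document, count=1)
    head = HEAD_RE.search(document)
    if head:
        return document[:head.end()] + new_title + document[head.end():]
    return new_title + document


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumption: sys.argv[1] is the temporary file written by htdig.
    with open(sys.argv[1], "rb") as handle:
        raw = handle.read()
    # latin-1 maps every byte to one character, so untouched bytes round-trip.
    sys.stdout.buffer.write(retitle(raw.decode("latin-1")).encode("latin-1"))
```

Saved as, say, /usr/local/bin/retitle.py (a hypothetical path) and made
executable, it would take the place of foobar.pl in the external_parser
line above. The per-document startup cost applies to Python just as it
does to Perl.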
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930