> > This behaviour is odd on two levels:
> >
> >     1) It not only replaces the <, > and " characters with entities
>
> Well, the only explanation I can see is that it's still doing the SGML
> encoding on your meta description.  Are you sure you are using the
> patched htsearch, and not an older one?  If you added the if() statement
> as I suggested, did you perhaps forget the "!"?

Yep.  That was my own response, so I double checked, re-compiled,
re-ran...same result.

My initial attempt, before receiving your email, was to comment the SGML
call out altogether, also with the same result.

The release for Display.cc is:

#if RELEASE
static char RCSid[] = "$Id: Display.cc,v 1.54.2.27 2000/02/17 16:46:25 grdetil Exp $";
#endif

Perhaps it has something to do with how HTML.cc treats META values?  The
changes you've specified below change this behaviour.

> When HTML.cc finds the start of a tag (the "<" character), it searches
> for the next ">" which it takes as the ending.  That next ">" is inside
> the content of your meta tag.  It's a violation of the HTML standard
> to embed a ">" (or a "<" for that matter) inside an HTML tag, so your
> <META NAME="htdig-description" ... > tag above is invalid.  You'd need to
> SGML-encode the embedded "<" and ">" as &lt; and &gt; to get that to work.
> You'll also need to set translate_lt_gt to true in your htdig.conf,
> for them to get translated.
>
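
A minimal sketch of the behaviour described above, assuming a scanner that
simply treats the first '>' after a '<' as the end of the tag.  first_tag()
is a hypothetical helper written for illustration; it is not the code in
HTML.cc.

```cpp
#include <string>

// Hypothetical helper, NOT the actual HTML.cc parser: returns the first tag
// found in `html`, taking the first '>' after the opening '<' as its end.
std::string first_tag(const std::string &html)
{
    std::string::size_type start = html.find('<');
    if (start == std::string::npos)
        return "";                      // no tag at all

    std::string::size_type end = html.find('>', start + 1);
    if (end == std::string::npos)
        return html.substr(start);      // unterminated tag: take the rest

    return html.substr(start, end - start + 1);
}
```

With a raw '>' inside CONTENT, a scanner like this stops at the embedded
'>', so everything after the IMG tag is lost.  With the embedded characters
written as &lt; and &gt;, the first '>' it sees is the real end of the META
tag.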

Closer.  When I do this, the line is no longer truncated.  However, '&lt;'
becomes '&amp;lt;' in the search template.  For example:

<dl><dt><strong><a
href="http://www.greymattermedia.com/algtr001.htm">Suitable for Framing :: -
Respite at Alligator Creek -</a></strong><img src="/htdig/star.gif"
alt="*"><img src="/htdig/star.gif" alt="*"><img src="/htdig/star.gif"
alt="*"><img src="/htdig/star.gif" alt="*">
</dt><dd>&amp;lt;IMG align=left width=192 height=192
SRC=&quot;http://www.greymattermedia.com/images/algtrs01.jpg&quot;
align=TEXTTOP HSPACE=5 VSPACE=0 BORDER=0
ALT=&quot;<strong>Alligator</strong> Creek National Park&quot;&amp;gt;Water
and time collaborate to create a fanciful waterway. ::
<strong>Alligator</strong> Creek National Park, Queensland<b><tt>
...</tt></b><br>
<i><a
href="http://www.greymattermedia.com/algtr001.htm">http://www.greymattermedia.com/algtr001.htm</a></i>
 <font size="-1">02/19/01, 15406 bytes</font>
</dd></dl>
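
The '&amp;lt;' in the output above is what an entity encoder produces when it
is handed text that still contains the literal string '&lt;'.  The sketch
below only illustrates that effect; encode_entities() is a hypothetical
stand-in and is not the ht://Dig encodeSGML() function, whose exact rules may
differ.

```cpp
#include <string>

// Hypothetical stand-in for an SGML/HTML entity encoder.  It is NOT the
// ht://Dig encodeSGML() implementation; it only encodes four characters.
std::string encode_entities(const std::string &in)
{
    std::string out;
    for (char c : in) {
        switch (c) {
        case '&':  out += "&amp;";  break;
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '"':  out += "&quot;"; break;
        default:   out += c;        break;
        }
    }
    return out;
}
```

Under these assumptions, encode_entities("<IMG>") gives "&lt;IMG&gt;", while
encode_entities("&lt;IMG&gt;") gives "&amp;lt;IMG&amp;gt;": the '&' of an
entity that is already present gets encoded again.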

Obviously, I need to find out where this is happening.  Sorry for the C++
tutorial request--too many years since I coded in a Unix environment--but
how do I set the debug level when running 'make', and where does output sent
to cout go?

Thanks,

Patrick.

patrick jennings                      synaptic § grey matter media
    [EMAIL PROTECTED]
    eJournal Travelogue :: http://synaptic.bc.ca
    Fine Art & Travel Photographs :: http://greymattermedia.com
    Prisoners ~ A New Play :: http://greymattermedia.com/prisoners


----- Original Message -----
From: Gilles Detillieux <[EMAIL PROTECTED]>
To: Patrick Jennings <[EMAIL PROTECTED]>
Cc: Gilles Detillieux <[EMAIL PROTECTED]>;
<[EMAIL PROTECTED]>
Sent: Monday, February 19, 2001 1:26 PM
Subject: Re: [htdig] Putting HTML in the META description


> According to Patrick Jennings:
> > Hi Gilles,
> >
> > Thanks, and <smile> I realise the unusual nature of this kludge.
> > Hopefully, the exclusivity of htdig-description will render it benign on
> > all other systems.  I'll perhaps code a more robust solution if I can get
> > this one working first.
> >
> > I spent a good portion of yesterday going through the code and had
> > arrived at pretty much the answer you provide: skip the call to
> > encodeSGML().
> >
> > However, after making the change, recompiling and executing rundig,
> > htsearch doesn't produce the results either of us expected. EG:  Here's
> > a sample htdig-description META from my site
> > (http://www.greymattermedia.com/algtr001.htm)
> >
> > <META NAME="htdig-description"
> >  CONTENT='<IMG align=left width=192 height=192
> > SRC="http://www.greymattermedia.com/images/algtrs01.jpg" align=TEXTTOP
> > HSPACE=5 VSPACE=0 BORDER=0 ALT="Alligator Creek National Park">Water and
> > time collaborate to create a fanciful waterway. :: Alligator Creek
> > National Park, Queensland, Australia; The Natural Order Katrin'>
> >
> > And here's the search result template for the page generated by htsearch
> > (http://www.greymattermedia.com/cgi-bin/htsearch?config=htdig&restrict=&exclude=&words=alligator)
> >
> > <dl><dt><strong><a
> > href="http://www.greymattermedia.com/algtr001.htm">Suitable for Framing :: -
> > Respite at Alligator Creek -</a></strong><img src="/htdig/star.gif"
> > alt="*"><img src="/htdig/star.gif" alt="*"><img src="/htdig/star.gif"
> > alt="*"><img src="/htdig/star.gif" alt="*">
> > </dt><dd>&lt;IMG align=left width=192 height=192
> > SRC=&quot;http://www.greymattermedia.com/images/algtrs01.jpg&quot;
> > align=TEXTTOP HSPACE=5 VSPACE=0 BORDER=0
> > ALT=&quot;<strong>Alligator</strong> Creek National Park&quot;&gt;<br>
> > <i><a
> > href="http://www.greymattermedia.com/algtr001.htm">http://www.greymattermedia.com/algtr001.htm</a></i>
> >  <font size="-1">02/19/01, 15400 bytes</font>
> > </dd></dl>
> >
> > ----- Original Message -----
> > From: Gilles Detillieux <[EMAIL PROTECTED]>
> > To: Patrick Jennings <[EMAIL PROTECTED]>
> > Cc: <[EMAIL PROTECTED]>
> > Sent: Monday, February 19, 2001 9:17 AM
> > Subject: Re: [htdig] Putting HTML in the META description
> >
> >
> > > According to Patrick Jennings:
> > > > I'd like the META description** field to contain an <IMG > tag so
> > > > that the search results page will show a left-aligned image beside
> > > > the description.
> > > >
> > > > htdig's current behaviour is to replace '<' with '&lt;' and '>' with
> > > > '&gt;' which, of course, results in printing out the HTML code as the
> > > > description, rather than generating an inline image, as desired.
> > > >
> > > > The question is: what do I change in the code to stop this
> > > > conversion?
> > >
> > > Well, first of all be aware that this is extremely non-standard
> > > behaviour.  You normally can't embed HTML tags within HTML tags in this
> > > manner.  However, if you want this as a site-specific kludge, you can do
> > > this by changing the following line in htsearch/Display.cc's
> > > Display::excerpt() method (at line 1118 in unpatched 3.1.5 code):
> > >
> > >     encodeSGML(head_string);
> > >
> > > to
> > >
> > >     if (!use_meta_description) encodeSGML(head_string);
> > >
> > > > **Actually, I'd already added a new META called "htdig-description"
> > > > so I can have a different description for htdig search results than
> > > > will be displayed by web search engines, directories, etc.  However,
> > > > this addition only involves 'or'ing "htdig-description" with the test
> > > > for "description" in HTML.cc, so essentially the code remains as
> > > > originally written.
> > >
> > > OK, so there were no corresponding changes to htsearch, as it used the
> > > same database field.  I assumed that in the kludge above.
> > >
> > > --
> > > Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
> > > Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
> > > Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
> > > Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
> > >
> >
>
>
> --
> Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
> Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
> Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
> Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
>
> _______________________________________________
> htdig-general mailing list <[EMAIL PROTECTED]>
> Information: http://lists.sourceforge.net/lists/listinfo/htdig-general
> FAQ: http://htdig.sourceforge.net/FAQ.html
>
>


