According to Emma Jane Hogbin:
> 1. I thought EXCERPT was supposed to mean "the place where the word
> appears in the page." But if you look at my search results:
> 
> http://search.foreign-policy-dialogue.ca/cgi-bin/htsearch?config=fr&restrict=fr&restrict=&words=iraq
> It's not the excerpt that's displayed. I'm not sure what it's actually
> displaying though. (I know what it's showing on the page, but I'm not sure
> why.)

See http://www.htdig.org/attrs.html#no_excerpt_show_top

Also, given the rather gratuitous use of alt text in img tags (which
is really supposed to convey the same message as the image when
the latter can't be displayed), I'd also suggest you have a look at
http://www.htdig.org/attrs.html#ignore_alt_text
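
As a rough sketch of what those two attributes might look like in your
config file (the values are only a suggestion; check the attribute
pages above for the defaults and behaviour in your version):

```
# htsearch: when no excerpt containing the search words is available,
# show the top of the document instead of the "no excerpt" text
no_excerpt_show_top:  true

# htdig: don't index the alt text of img tags
ignore_alt_text:      true
```

The first one only affects how htsearch displays results.  The second
changes what htdig indexes, so I'd expect you to need to rebuild the
database before it has any effect.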

> 2. Is there a way to ignore all <meta> information? I've set
> meta_description_factor: 0, but does that get the keywords as well?

No, see http://www.htdig.org/attrs.html#keywords_factor
as well as http://www.htdig.org/attrs.html#max_keywords
and http://www.htdig.org/attrs.html#max_meta_description_length
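
A minimal sketch, assuming all you want is for meta descriptions and
meta keywords to carry no weight in the scoring:

```
# give no scoring weight to words from <meta name="description">
meta_description_factor:  0
# give no scoring weight to words from <meta name="keywords">
keywords_factor:          0
```

Whether a zero factor keeps those words out of the index entirely, or
only stops them from affecting the ranking, is something I'd confirm
against the documentation for your version.  The two max_* attributes
above limit how much of that meta information is taken in the first
place.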

> 3. I spent yonks of time figuring out how to set up English and French
> databases, but our web site isn't perfect and contains French text on
> English pages. HOWEVER the engine is ok with this and displays accents
> correctly even when using the english config file (with no locale set):
> 
> http://search.foreign-policy-dialogue.ca/cgi-bin/htsearch?config=en&restrict=&words=iraq
> That's cool and all, but how does it know? Just because the locale is
> installed on the system? Do I actually need to maintain two different
> databases? I do htfuzzy accents on the French databases but not on
> English. Is that right? I think it is, but now I'm starting to question
> myself.

That the accents are displayed by htsearch doesn't guarantee that the
accented letters were indexed properly.  (It's actually pretty unusual
for htsearch NOT to display accented letters, as you had reported was
happening on one of your systems.)  On most systems where there are
locale problems that prevent indexing of accented words, the accented
letters still display as they should in htsearch.

However, nothing prevents you from indexing several languages in one
database, provided the documents all use the same encoding (e.g.
ISO-8859-1) and the locale you select recognizes all accented letters
in that encoding.  E.g., you could use a locale of fr_CA to index both
English and French documents together, into one database.  You can
also get by with a single accents database, since the accents algorithm
isn't language-specific, as long as each language uses the ISO-8859-1
encoding.  The only things you'd need separate databases for are the
endings and synonyms fuzzy algorithms - you may still want to offer two
different configs for those if you've gone to the trouble of setting
them up in both languages, but there's no reason you couldn't use the
same bilingual word database for either config.
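
To illustrate, here is a sketch of how the two configs could share one
word database while keeping their own endings and synonyms databases.
All file names and paths are hypothetical, and the exact locale name
depends on what is installed on your system (locale -a will list them):

```
# common.conf (hypothetical name): settings shared by both configs
locale:        fr_CA
database_dir:  /path/to/bilingual/db

# en.conf: English endings and synonyms, shared word database
include:               /path/to/common.conf
endings_root2word_db:  /path/to/en/root2word.db
endings_word2root_db:  /path/to/en/word2root.db
synonym_db:            /path/to/en/synonyms.db

# fr.conf: French endings and synonyms, same shared word database
include:               /path/to/common.conf
endings_root2word_db:  /path/to/fr/root2word.db
endings_word2root_db:  /path/to/fr/word2root.db
synonym_db:            /path/to/fr/synonyms.db
```

With a layout like this you'd run htdig and htmerge once, against the
shared database, and htsearch would pick up the language-specific fuzzy
databases from whichever config is named in the config= parameter.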

> 4. I'd like to put the SESSION=blah back into the search results links.
> Even though the session is in the URL, it's not getting put back into the
> links. See:
> 
> http://search.foreign-policy-dialogue.ca/cgi-bin/htsearch?config=en&SESSION=5672c196e148691c7f305e4d61f501a9&restrict=&words=iraq
> I've added $(SESSION) to the HTML templates. Isn't that what you're
> supposed to do for HTML variables that you want to replace? In the
> wrapper.html file it looks like this:
> <a href="http://blah.com?SESSION=$(SESSION)">nav link</a>. But as you can
> see in the URL above SESSION= is there, but not the actual value...
> 
> From: http://htdig.org/hts_templates.html
> There are many variables that can be substituted into these templates. Not
> all of them make sense for each file, so not all of them will be
> substituted for every file. In addition, all of the standard CGI
> environment variables are available, and listed in the cgi specification.
> Variables will be substituted normally with the format $(VAR), escaped for
> use in a URL with the format $%(VAR), URL-encoding decoded with the format
> $=(VAR), and HTML-escaped with the format $&(VAR).

The description says the CGI _environment_ variables are available.
That's different than the CGI _input parameters_.  Of the latter, only
the predefined ones that htsearch uses are passed to template variables,
and "session" or "SESSION" isn't one of the predefined ones.  However,
you can define your own using...

allow_in_form:  session

(or capitalize "session" if you want it capitalized in the
htsearch URL - either way it will be capitalized in the template).
See http://www.htdig.org/attrs.html#allow_in_form
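
Putting the two pieces together, a sketch of what this might look like,
reusing the placeholder link from your wrapper.html.  In the config
file:

```
allow_in_form:  session
```

and in the template:

```
<a href="http://blah.com?SESSION=$%(SESSION)">nav link</a>
```

I've used the $%(VAR) form here so the value is escaped for use in a
URL, as described in the templates page you quoted.  For a plain hex
session ID, $(SESSION) should give the same result.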

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)

