According to Emma Jane Hogbin:
> 1. I thought EXCERPT was supposed to mean "the place where the word
> appears in the page." But if you look at my search results:
>
> http://search.foreign-policy-dialogue.ca/cgi-bin/htsearch?config=fr&restrict=fr&restrict=&words=iraq
>
> It's not the excerpt that's displayed. I'm not sure what it's actually
> displaying though. (I know what it's showing on the page, but I'm not sure
> why.)

See http://www.htdig.org/attrs.html#no_excerpt_show_top

Also, given the rather gratuitous use of alt text in img tags (which is
really supposed to convey the same message as the image when the latter
can't be displayed), I'd also suggest you have a look at
http://www.htdig.org/attrs.html#ignore_alt_text

> 2. Is there a way to ignore all <meta> information? I've set
> meta_description_factor: 0, but does that get the keywords as well?

No, see http://www.htdig.org/attrs.html#keywords_factor
as well as http://www.htdig.org/attrs.html#max_keywords
and http://www.htdig.org/attrs.html#max_meta_description_length

> 3. I spent yonks of time figuring out how to set up English and French
> databases, but our web site isn't perfect and contains French text on
> English pages. HOWEVER the engine is ok with this and displays accents
> correctly even when using the english config file (with no locale set):
>
> http://search.foreign-policy-dialogue.ca/cgi-bin/htsearch?config=en&restrict=&words=iraq
>
> That's cool and all, but how does it know? Just because the locale is
> installed on the system? Do I actually need to maintain two different
> databases? I do htfuzzy accents on the French databases but not on
> English. Is that right? I think it is, but now I'm starting to question
> myself.

That the accents are displayed by htsearch doesn't guarantee that the
accented letters were indexed properly. (It's actually pretty unusual
for htsearch NOT to display accented letters, as you had reported was
happening on one of your systems.) On most systems where there are
locale problems that prevent indexing of accented words, the accented
letters still display as they should in htsearch.

However, there is nothing preventing you from indexing several languages
in one database, as long as the documents all use the same encoding
(e.g. ISO-8859-1), and as long as the locale you select will recognize
all accented letters in that encoding.
E.g., you could use a locale of fr_CA to index both English and French
documents together, into one database. You can also get by with a single
accents database, as the accents algorithm isn't language-specific, as
long as the language uses the ISO-8859-1 encoding. The only thing you'd
need separate databases for would be the endings and synonyms fuzzy
algorithms - you may still want to offer two different configs for those
if you've gone through the trouble of setting them up in both languages,
but there's no reason you couldn't use the same bilingual word database
for either config.

> 4. I'd like to put the SESSION=blah back into the search results links.
> Even though the session is in the URL, it's not getting put back into the
> links. See:
>
> http://search.foreign-policy-dialogue.ca/cgi-bin/htsearch?config=en&SESSION=5672c196e148691c7f305e4d61f501a9&restrict=&words=iraq
>
> I've added $(SESSION) to the HTML templates. Isn't that what you're
> supposed to do for HTML variables that you want to replace? In the
> wrapper.html file it looks like this:
> <a href="http://blah.com?SESSION=$(SESSION)">nav link</a>. But as you can
> see in the URL above SESSION= is there, but not the actual value...
>
> From: http://htdig.org/hts_templates.html
> There are many variables that can be substituted into these templates. Not
> all of them make sense for each file, so not all of them will be
> substituted for every file. In addition, all of the standard CGI
> environment variables are available, and listed in the cgi specification.
> Variables will be substituted normally with the format $(VAR), escaped for
> use in a URL with the format $%(VAR), URL-encoding decoded with the format
> $=(VAR), and HTML-escaped with the format $&(VAR).

The description says the CGI _environment_ variables are available.
That's different from the CGI _input parameters_.
Of the latter, only the predefined ones that htsearch uses are passed to
template variables, and "session" or "SESSION" isn't one of the
predefined ones. However, you can define your own using...

allow_in_form: session

(or capitalize "session" if you want it capitalized in the htsearch URL -
either way it will be capitalized in the template).

See http://www.htdig.org/attrs.html#allow_in_form

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
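
For reference, the attributes mentioned in this reply could be collected
into a single config fragment along the following lines. This is only a
sketch under stated assumptions, not a tested setup: the true/false and 0
values are illustrative choices, fr_CA assumes that locale is actually
installed on the indexing machine, and max_keywords and
max_meta_description_length are left out because suitable values depend
on the site (see the attribute pages linked above).

```
# Sketch only: values are illustrative assumptions, adjust to taste.

# Question 1: assumed setting. When no excerpt is available, true would
# show the top of the document instead; false is assumed here.
no_excerpt_show_top: false
# Assumed: keep img alt text out of the index and out of excerpts.
ignore_alt_text: true

# Question 2: meta_description_factor alone does not cover keywords,
# so both factors are set here.
meta_description_factor: 0
keywords_factor: 0

# Question 3: one locale that recognizes the accented letters,
# used to index English and French documents into one database.
locale: fr_CA

# Question 4: let the "session" input parameter through to the templates.
allow_in_form: session
```

With allow_in_form set this way, the $(SESSION) reference already present
in wrapper.html should be substituted, provided the parameter is passed
in the htsearch URL.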

