Re: [htdig] 3.1.3 engine on 3.1.5 db

2001-01-12 Thread Gilles Detillieux

According to Dave Salisbury:
  If
  you created your database with htdig 3.1.5, and want to search it with
  htsearch 3.1.3, that's a bad idea.  The most glaring bug in releases
  before 3.1.5 is in htsearch, so you really should upgrade it.
 
 I take it one of the worst things is the security hole which allows
 a user to view any file with read permissions ( ouch! )

That's the one!

 Is there any way to correct for this with a wrapper around htsearch?
 Reading the indices using 3.1.3 that were created by a 3.1.5 engine
 seems to work just fine.

There would be, but it might be a tad tricky.  The idea is to use a
backslash to quote any left quote (`), dollar sign ($) or backslash
(\) in the query string that is part of an input parameter value that
will get added to the config object as an internal attribute setting.
The lines in htsearch/htsearch.cc that do this are (from a grep):

config.Add("match_method", input["method"]);
config.Add("template_name", input["format"]);
config.Add("matches_per_page", input["matchesperpage"]);
config.Add("config", input["config"]);
config.Add("restrict", input["restrict"]);
config.Add("exclude", input["exclude"]);
config.Add("keywords", input["keywords"]);
config.Add("sort", input["sort"]);
config.Add(form_vars[i], input[form_vars[i]]);

The last one above is the tricky one, as it can be any input parameter
name that you use in allow_in_form.  Rather than limiting the backslash
escaping of special characters to only the values of these parameters,
it might be better to do the whole query string, but exclude a few
parameters where this might be undesirable.  I'd recommend NOT doing
this for the "words" input parameter, for instance, but I can't think
of any others right off-hand where you would not want to do this.
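
A rough sketch of what the filtering step of such a wrapper might look
like follows.  It is only an illustration of the idea described above,
not code from htdig: the function names (urlDecode, urlEncode,
escapeConfigSpecials, escapeQueryString) are made up for this example,
and it assumes a GET-style, URL-encoded query string with parameters
separated by ampersands.  A real wrapper would still have to put the
result back into QUERY_STRING, handle POST input, and then run htsearch.

```cpp
#include <cctype>
#include <string>

// Decode '+' and %XX sequences in one URL-encoded component.
std::string urlDecode(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '+') {
            out += ' ';
        } else if (in[i] == '%' && i + 2 < in.size() + 0 &&
                   std::isxdigit(static_cast<unsigned char>(in[i + 1])) &&
                   std::isxdigit(static_cast<unsigned char>(in[i + 2]))) {
            out += static_cast<char>(std::stoi(in.substr(i + 1, 2), nullptr, 16));
            i += 2;
        } else {
            out += in[i];
        }
    }
    return out;
}

// Re-encode a component, leaving only unreserved characters as they are.
std::string urlEncode(const std::string& in) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : in) {
        if (std::isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 15];
        }
    }
    return out;
}

// Put a backslash in front of each left quote, dollar sign or backslash.
std::string escapeConfigSpecials(const std::string& value) {
    std::string out;
    for (char c : value) {
        if (c == '`' || c == '$' || c == '\\') out += '\\';
        out += c;
    }
    return out;
}

// Escape every parameter value in the query string except "words".
std::string escapeQueryString(const std::string& qs) {
    std::string out;
    std::size_t start = 0;
    while (true) {
        std::size_t amp = qs.find('&', start);
        std::string pair = qs.substr(
            start, amp == std::string::npos ? std::string::npos : amp - start);
        std::size_t eq = pair.find('=');
        if (eq == std::string::npos || urlDecode(pair.substr(0, eq)) == "words") {
            out += pair;  // no value to escape, or the excluded parameter
        } else {
            out += pair.substr(0, eq + 1);
            out += urlEncode(escapeConfigSpecials(urlDecode(pair.substr(eq + 1))));
        }
        if (amp == std::string::npos) break;
        out += '&';
        start = amp + 1;
    }
    return out;
}
```

The values are decoded before escaping because the special characters
can arrive either literally or as %60, %24 and %5C, and escaping only
the literal forms would miss the encoded ones.  The "words" parameter
is passed through untouched, as recommended above.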

 Anyone out there want to bash Glimpse before I look into it.  
 I'm hoping to get it at least to compile on an SGI.

I won't do any bashing, but if htdig is your preference, I'd suggest not
giving up on it too quickly.  Did you have a look at David Adams' recent
post about an "IRIX compile fix"?  In it, he forwarded a message from
Bob MacCallum that explains a workaround to some problems on IRIX 6.5,
using cc, not gcc.  If you haven't already, you ought to try that before
abandoning htdig.

  On the other hand, if you have an existing database built with version
  3.1.3, and want to use it with the latest htsearch, that should work
  without any difficulty.  However, you'll lose out on several benefits
  in the latest htdig (better parsing of meta tags, parsing img alt text,
  fixed parsing of URL parameters, etc.), 
 
 Couldn't find what "fixed parsing of URL parameters" means.
 The query string is part of what's indexed??

The query string isn't indexed, but it's part of the URL.  3.1.3 mangled
bare ampersands (&) in the query string in an URL, and versions before
that didn't decode sequences like &eacute; within an URL.  I think the
ChangeLog explains it better than the release notes.

Tue Nov 23 19:52:27 1999  Gilles Detillieux  [EMAIL PROTECTED]

* htdig/HTML.cc(transSGML), htdig/SGMLEntities.cc(translateAndUpdate):
Fix the infamous problem in htdig 3.1.3 of mangling URL parameters that
contain bare ampersands (&), and not converting &amp; entities in URLs.
...
Wed Sep  1 15:39:41 1999  Gilles Detillieux  [EMAIL PROTECTED]

* htdig/HTML.h, htdig/HTML.cc(do_tag, transSGML): Fix the HTML parser
to decode SGML entities within tag attributes.
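
To make the difference concrete, here is a small sketch of the
behaviour those two entries describe: entities such as &amp; and
&eacute; inside a URL get decoded, while a bare ampersand that
separates URL parameters is left alone.  This is not the code from
HTML.cc or SGMLEntities.cc; the function name decodeUrlEntities and
the short entity table are invented for the example, and the
ISO-8859-1 byte for eacute is an assumption about the target
character set.

```cpp
#include <map>
#include <string>

// Decode a few named SGML entities in a URL, leaving bare '&' untouched.
std::string decodeUrlEntities(const std::string& url) {
    static const std::map<std::string, std::string> entities = {
        {"amp", "&"}, {"lt", "<"}, {"gt", ">"}, {"quot", "\""},
        {"eacute", "\xE9"}  // assumed ISO-8859-1 encoding
    };
    std::string out;
    std::size_t i = 0;
    while (i < url.size()) {
        if (url[i] == '&') {
            std::size_t semi = url.find(';', i + 1);
            if (semi != std::string::npos) {
                auto it = entities.find(url.substr(i + 1, semi - i - 1));
                if (it != entities.end()) {
                    out += it->second;  // known entity: emit its character
                    i = semi + 1;
                    continue;
                }
            }
        }
        out += url[i++];  // ordinary character, or a bare ampersand
    }
    return out;
}
```

With this, "a.cgi?x=1&amp;y=2" and "a.cgi?x=1&y=2" both come out as
"a.cgi?x=1&y=2", so the second parameter survives either way.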

  which you'll only get if you
  reindex with htdig 3.1.5.  Maybe none of these matter for your site,
  though.  See the release notes and ChangeLog for details.
 
 I don't think they're essential.

Except for the URL parameter mangling fix, of course.

-- 
Gilles R. Detillieux  E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre   WWW:http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:(204)789-3930


To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED]
You will receive a message to confirm this.
List archives:  http://www.htdig.org/mail/menu.html
FAQ:http://www.htdig.org/FAQ.html




Re: [htdig] 3.1.3 engine on 3.1.5 db

2001-01-11 Thread Dave Salisbury

Thanks for the reply..

 If
 you created your database with htdig 3.1.5, and want to search it with
 htsearch 3.1.3, that's a bad idea.  The most glaring bug in releases
 before 3.1.5 is in htsearch, so you really should upgrade it.

I take it one of the worst things is the security hole which allows
a user to view any file with read permissions ( ouch! )

Is there any way to correct for this with a wrapper around htsearch?
Reading the indices using 3.1.3 that were created by a 3.1.5 engine
seems to work just fine.

Anyone out there want to bash Glimpse before I look into it.  
I'm hoping to get it at least to compile on an SGI.

Thanks for any info.

Dave

 On the other hand, if you have an existing database built with version
 3.1.3, and want to use it with the latest htsearch, that should work
 without any difficulty.  However, you'll lose out on several benefits
 in the latest htdig (better parsing of meta tags, parsing img alt text,
 fixed parsing of URL parameters, etc.), 

Couldn't find what "fixed parsing of URL parameters" means.
The query string is part of what's indexed??

 which you'll only get if you
 reindex with htdig 3.1.5.  Maybe none of these matter for your site,
 though.  See the release notes and ChangeLog for details.

I don't think they're essential.

DS



