Developers,

I found a bug in v3.1.6 that probably affects later versions too. Here it is:
If you enter a "restrict" value in the URL for htsearch (not in the config file), it will be compared UNENCODED to the ENCODED URLs in htdig's database. The %20 in the query string is decoded to a literal space before the comparison, while the stored URLs keep their %20. For example, the following query:

http://www.mvpix.com/cgi-bin/perl/search?words=%2A&restrict=/photos/021/Netherland%20Antilles/Bonaire/Places/Urban/&method=and&sort=date&format=short

will never match:

http://www.mvpix.com/photos/021/Netherland%20Antilles/Bonaire/Places/Urban/Industry/20030511-062204.jpg.html

I've fixed htsearch temporarily with the following code, but some thought should probably be given to how to address this properly. I suspect the solution is to compare both strings in their unencoded form. My snippet:

[EMAIL PROTECTED]:/mnt/lan/src/htdig-3.1.6$ diff htsearch/htsearch.cc-orig htsearch/htsearch.cc
23a24
> #include "URL.h"
169,170c170,174
<     if (input.exists("restrict"))
<         config.Add("restrict", input["restrict"]);
---
>     if (input.exists("restrict")) {
>         String restrict_url = input["restrict"];
>         encodeURL(restrict_url, "-_./");
>         config.Add("restrict", restrict_url);
>     }
[EMAIL PROTECTED]:/mnt/lan/src/htdig-3.1.6$

Another side effect of passing the value un-encoded via 'config.Add("restrict", input["restrict"]);' is that any spaces will later be treated as ORs by this line:

urllist.Create(config["restrict"], "| \t\r\n\001");

BTW, this same bug affects the "exclude" value too.

Thanks,
js.

On Sun, Nov 16, 2003 at 11:15:09PM -0500, Jean-Sebastien Morisset wrote:
> Guys,
>
> Shouldn't the following change to v3.1.6 work?
>
> ---START---
>
> [EMAIL PROTECTED]:/mnt/lan/src/htdig-3.1.6$ diff htsearch/htsearch.cc-orig htsearch/htsearch.cc
> 220c220
> <     urllist.Create(config["restrict"], "| \t\r\n\001");
> ---
> >     urllist.Create(config["restrict"], "|\t\r\n\001");
>
> ---END---
>
> It seems to have fixed the OR problem, but now I'm not getting any
> matches.
> I've added "<!--RESTRICT:$(RESTRICT)-->" to the nomatch.html
> file, and here is what it gives me:
>
> <!--RESTRICT:/photos/021/Netherland Antilles-->
>
> So it appears the space made it in there, but I don't understand why
> htsearch isn't matching the URLs with it.
>
> Any ideas? I've tried a whole bunch of things, but nothing has worked so
> far...
>
> BTW, here's a snippet from rundig showing the URLs it should match:
>
> 307:307:4:http://www.mvpix.com/photos/011/Netherland%20Antilles/Bonaire/Transportation/Flying/:
> **-*-*******-*****-********- size = 6914
> 308:308:4:http://www.mvpix.com/photos/011/Netherland%20Antilles/Bonaire/Transportation/Automobiles/:
> **-*-*******-*****-********- size = 6934
> 309:309:4:http://www.mvpix.com/photos/011/Netherland%20Antilles/Bonaire/Objects/Industrial/:
> **-*-*******-*****-********- size = 6895
> 310:310:4:http://www.mvpix.com/photos/011/Netherland%20Antilles/Bonaire/Objects/Still%20Life/:
> **-*-*******-*****-********- size = 6898
>
> Thanks,
> js.
>
> On Sun, Nov 16, 2003 at 05:10:19PM -0500, Jean-Sebastien Morisset wrote:
>> Hi,
>>
>> I'm trying to use a restrict value with spaces - for example:
>>
>> restrict=/photos/021/Netherland%20Antilles/Bonaire/
>>
>> Unfortunately, htdig v3.1.6 reads this as "/photos/021/Netherland" OR
>> "Antilles/Bonaire/" when I would like it to read it as a single string.
>> Is there a way to have it treat spaces as part of the string?
>
> -------------------------------------------------------
> This SF.net email is sponsored by: GoToMyPC
> GoToMyPC is the fast, easy and secure way to access your computer from
> any Web browser or wireless device. Click here to Try it Free!
> https://www.gotomypc.com/tr/OSDN/AW/Q4_2003/t/g22lp?Target=mm/g22lp.tmpl

-- 
Jean-Sebastien Morisset, Sr. UNIX Administrator <[EMAIL PROTECTED]>
Personal Home Page <http://jsmoriss.mvlan.net/>
JS & Melanie's Homebrewery <http://brewery.mvlan.net/>
Underwater and Travel Photographs <http://www.mvpix.com/>

_______________________________________________
ht://Dig Developer mailing list: [EMAIL PROTECTED]
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-dev
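
P.S. To make the "compare both strings in their unencoded form" idea concrete, here is a minimal standalone sketch. It is not htdig code: percentDecode, splitPatterns and matchesRestrict are hypothetical helpers written against std::string rather than htdig's String class, and I'm assuming restrict patterns are matched as plain substrings of the URL. It also shows how having a space in the separator set splits the restrict value in two.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not part of ht://Dig): decode %XX escapes.
// Malformed or truncated escapes are copied through unchanged.
std::string percentDecode(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()
            && std::isxdigit(static_cast<unsigned char>(in[i + 1]))
            && std::isxdigit(static_cast<unsigned char>(in[i + 2]))) {
            // Convert the two hex digits after '%' into one byte.
            out += static_cast<char>(std::stoi(in.substr(i + 1, 2), nullptr, 16));
            i += 2;
        } else {
            out += in[i];
        }
    }
    return out;
}

// Hypothetical stand-in for the pattern-list split: break `value` at any
// character found in `separators`, dropping empty pieces.
std::vector<std::string> splitPatterns(const std::string& value,
                                       const std::string& separators) {
    std::vector<std::string> parts;
    std::size_t start = value.find_first_not_of(separators);
    while (start != std::string::npos) {
        std::size_t end = value.find_first_of(separators, start);
        parts.push_back(value.substr(start, end - start));
        start = value.find_first_not_of(separators, end);
    }
    return parts;
}

// Assumption: a restrict pattern matches when it occurs as a substring of
// the URL. Both sides are decoded first so %20 and ' ' compare equal.
bool matchesRestrict(const std::string& url, const std::string& pattern) {
    return percentDecode(url).find(percentDecode(pattern)) != std::string::npos;
}
```

With the separator set "| \t\r\n\001", splitPatterns turns "/photos/021/Netherland Antilles/Bonaire/" into two patterns ("/photos/021/Netherland" and "Antilles/Bonaire/"); with "|\t\r\n\001" it stays one pattern. A raw substring search for that one pattern in the %20-encoded URL fails, while matchesRestrict succeeds because both sides are decoded first.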