According to fx:
> it's very strange ...
> I show you my conf
> -------------------------------------------------------------
> database_dir:  /home/web/inerd/htdig/db
> database_base:  ${database_dir}/inerd
> #allow_virtual_hosts: true
> valid_extensions: .html .htm .shtml .php .php3 .asp .php
> start_url:  http://192.168.0.2
> limit_urls_to:  http://192.168.0.2
> exclude_urls:  /cgi-bin/ .cgi
> bad_extensions:  .wav .gz .z .sit .au .zip .tar .hqx .exe .com .gif\
>    .jpg .jpeg .aiff .class .map .ram .tgz .bin .rpm .mpg .mov .avi
> maintainer:  inerd
> max_head_length: 10000
> max_doc_size:  200000
> no_excerpt_show_top: false
> search_algorithm: exact:1 synonyms:0.5 endings:0.1
> search_results_wrapper: /home/web/inerd/www/htdig/wrapper_inerd.html
> nothing_found_file:     /home/web/inerd/www/htdig/nomatch_inerd.html
> ----------------------------------------------------
> the result of the htdig -i -vvv
> 
> ...
>    pushing http://192.168.0.2/index.php3
> +A tag: pos = 2, position = =/news/index.php3?idnews=3 class=news>
> href: http://192.168.0.2/news/index.php3?idnews=3 (La troisième)
> 
>    Rejected: Extension is not valid!

This error, like the one below, means the URL was rejected because its
extension doesn't match any entry in valid_extensions.  The check takes
everything after the last dot as the extension, so here it compares
".php3?idnews=3" rather than ".php3", and the match fails.  I think this
is a bug, which the patch below should fix.

> ...
> 
> ...
> *A tag: pos = 2, position = ="/services" class="navig1">
> href: http://192.168.0.2/services (services)
> 
>    Rejected: Extension is not valid!

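In this case, the URL is rejected because of a bug in the new
valid_extensions attribute handling, as was pointed out by Warren
Jones about a month ago.  The URL has no extension at all, but the last
dot in it is the one in the host address, so ".2/services" gets treated
as the extension.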

> ...
> 
> do you have any suggestion ?
> (I've really tried a lot of things ... a real mystery)
> 
> thanx
> 
> ps : I use 3.1.4
> and my directory index is good :
> DirectoryIndex index.html index.htm index.shtml index.cgi index.php3

Here is a patch which I hope will fix both problems.  Please let me know
if it works.

--- htdig/Retriever.cc.valextbug        Thu Dec  9 18:28:44 1999
+++ htdig/Retriever.cc  Tue Feb  1 09:16:04 2000
@@ -702,9 +702,14 @@ Retriever::IsValidURL(char *u)
     //
     char       *ext = strrchr(url, '.');
     String     lowerext;
+    if (ext && strchr(ext, '/'))       // Ignore a dot if it's not in the
+      ext = NULL;                      // final component of the path.
     if (ext)
       {
        lowerext = ext;
+       int parm = lowerext.indexOf('?');       // chop off URL parameter
+       if (parm >= 0)
+           lowerext.chop(lowerext.length() - parm);
        lowerext.lowercase();
        if (invalids->Exists(lowerext))
          {
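
For anyone who wants to check the logic without rebuilding htdig, here is
a minimal standalone sketch of what the patched code computes as the
extension.  It is not the ht://Dig source: patched_extension is a
hypothetical helper, and it uses std::string where the patch uses strrchr,
strchr and ht://Dig's String class.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not part of ht://Dig): returns the lowercased
// extension, dot included, that the patched check would look up.
// Returns "" when the patch would set ext to NULL.
std::string patched_extension(const std::string &url)
{
    // Equivalent of strrchr(url, '.'): find the last dot in the URL.
    std::string::size_type dot = url.rfind('.');
    if (dot == std::string::npos)
        return "";

    std::string ext = url.substr(dot);

    // Ignore a dot if it's not in the final component of the path.
    if (ext.find('/') != std::string::npos)
        return "";

    // Chop off the URL parameter, if any.
    std::string::size_type parm = ext.find('?');
    if (parm != std::string::npos)
        ext.erase(parm);

    // Lowercase before comparing against the extension lists.
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext;
}
```

With the two URLs from the log above, this yields ".php3" for
http://192.168.0.2/news/index.php3?idnews=3 and an empty string for
http://192.168.0.2/services, which corresponds to ext being NULL in the
patch.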

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
