Hi, all.  Thanks to the developers for working on such an ambitious project!

In testing htdig 3.2.0b2 with just one HTML file, the AND operator is
behaving like OR, as far as I can tell.  Whether I select "method=all" or
"method=boolean" with explicit "and"s in the query string, a query like
"web fluble" incorrectly returns the document (which contains "web" but not
"fluble").  I compiled 3.1.5 to check whether I was doing something really
stupid, but with the same document and an essentially identical config file,
3.1.5 returns the correct results.  (However, I want to use phrase matching,
so 3.1.5 isn't a permanent solution for me.)

I've already changed permissions on the _weakcmpr database as before, and
simple searches work as expected ("web design" matches the document, "design
web" doesn't, "web" matches, "fluble" doesn't).
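
For reference, the expected semantics can be sketched as follows.  This is
only an illustration in Python, not ht://Dig code: the function name
`matches` and the hard-coded word set are assumptions made for the example
(the words are the ones rundig reports for the document, with trailing
punctuation stripped).

```python
# Words that rundig reported for the one indexed document, with the
# trailing punctuation stripped. Illustration only, not htsearch code.
DOC_WORDS = {"hi", "this", "bad", "web", "design"}


def matches(query, method):
    """Return True if the document should match under the given method."""
    terms = query.lower().split()
    if method == "all":   # every term must be present (AND)
        return all(t in DOC_WORDS for t in terms)
    if method == "any":   # at least one term must be present (OR)
        return any(t in DOC_WORDS for t in terms)
    raise ValueError(f"unknown method: {method}")


print(matches("web fluble", "all"))  # False: the expected AND result
print(matches("web fluble", "any"))  # True: the OR-like result
```

With "method=all", 3.1.5 behaves like the first call; 3.2.0b2 appears to
behave like the second.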

Has anyone bumped into this before?  I checked through the archives of this
list and the Changelog from April 12 to May 30, and didn't find anything
similar.  My htdig.conf follows; the sample search page is at
<http://www.aptima.com/~cta/search-3.2.html> (although command line searches
return the same results); the one document indexed is index.html.

Also, I noticed that the attribute list in htdoc gives the version in which
each attribute first appeared, while www.htdig.org doesn't.  Is there a reason
for this?

Thanks for any help with this...

--
Arthur Prokosch, <[EMAIL PROTECTED]>
Usability/Web Intern
Aptima, Inc. <http://www.aptima.com/>
781-935-3966 x26

-- begin htdig.conf (most comments stripped) --
start_url:              http://www.aptima.com/~cta/

# use file access for all URLs indexed
#
local_urls:             http://www.aptima.com/~cta/=/home/cta/public_html/

# don't fall back to HTTP, as www.aptima.com is unreachable from here
#
local_urls_only: true

limit_urls_to:          ${start_url}

exclude_urls:           /cgi-bin/ search.html

bad_extensions:         .cgi .wav .gz .z .sit .au .zip .tar .hqx .exe .com \
   .gif .jpg .jpeg .aiff .class .map .ram .tgz .bin .rpm .mpg .mov .avi

maintainer:             [EMAIL PROTECTED]

#max_head_length:       10000

max_doc_size:           200000

no_excerpt_show_top:    true

#search_algorithm:      exact:1 synonyms:0.5 endings:0.1
search_algorithm:       exact:1

# disable backlink weighting (which is on by default?)
#
backlink_factor:        0

# we could use synonyms (misspellings, really) when we start enabling
# text-box searches?


template_map:   Long long ${common_dir}/long.html \
    Short short ${common_dir}/short.html \
    Custom custom ${common_dir}/custom.html
template_name:  custom


next_page_text:         '[ Next &gt; ]'
no_next_page_text:
prev_page_text:         '[ &lt; Prev ]'
no_prev_page_text:
page_number_text:       1 2 3 4 5 6 7 8 9 10
no_page_number_text:    &gt;1&lt; &gt;2&lt; &gt;3&lt; &gt;4&lt; &gt;5&lt; \
     &gt;6&lt; &gt;7&lt; &gt;8&lt; &gt;9&lt; &gt;10&lt;

# local variables:
# mode: text
# eval: (if (eq window-system 'x) (progn (setq font-lock-keywords (list '("^#.*" . font-lock-keyword-face) '("^[a-zA-Z][^ :]+" . font-lock-function-name-face) '("[+$]*:" . font-lock-comment-face) )) (font-lock-mode)))
# end:

-- end htdig.conf --
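
The local_urls line in the config above is meant to turn a URL prefix into a
local directory.  A minimal sketch of that substitution, in Python, is shown
here to make the intended mapping explicit.  The helper `url_to_local_path`
is hypothetical and is not how ht://Dig implements it; the "index.html"
default for directory URLs is an assumption that happens to agree with the
file rundig reports finding.

```python
# Hypothetical helper: mimics the prefix substitution that local_urls
# describes. It is not ht://Dig's actual implementation.
LOCAL_URLS = {"http://www.aptima.com/~cta/": "/home/cta/public_html/"}
DEFAULT_DOC = "index.html"  # assumed default document for directory URLs


def url_to_local_path(url):
    """Map a URL to a local file path, or None if no prefix applies."""
    for prefix, directory in LOCAL_URLS.items():
        if url.startswith(prefix):
            path = directory + url[len(prefix):]
            # A URL ending in "/" names a directory, so append the default doc.
            if path.endswith("/"):
                path += DEFAULT_DOC
            return path
    return None


print(url_to_local_path("http://www.aptima.com/~cta/"))
# /home/cta/public_html/index.html
```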

-- begin redirected output from rundig -vvvvvv --
ht://dig Start Time: Wed Aug  2 11:32:38 2000
  1:0:http://www.aptima.com/~cta/
New server: www.aptima.com, 80
 - Persistent connections: enabled
 - HEAD before GET: disabled
 - Timeout: 30
 - Connection space: 0
 - Max Documents: -1
 - TCP retries: 1
 - TCP wait time: 5
Trying to retrieve robots.txt file
 pushed
pick: www.aptima.com, # servers = 1
> www.aptima.com supports HTTP persistent connections (infinite)
0:2:0:http://www.aptima.com/~cta/: Trying local files
  found existing file /home/cta/public_html/index.html
Read 43 from document
Read a total of 43 bytes
Tag: blink, matched -1
word: hi.@1
word: this@2
word: bad@3
word: web@4
Tag: /blink, matched -1
word: design.@5
head:  hi. this is bad web design.
 size = 43
pick: www.aptima.com, # servers = 1
> www.aptima.com supports HTTP persistent connections (infinite)
ht://dig End Time: Wed Aug  2 11:32:38 2000
ID: 2 URL: http://www.aptima.com/~cta/
-- end redirect --
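
To make it easy to see which words were indexed, the "word:" lines of the
log can be pulled out with a few lines of Python.  This is a rough sketch:
the function name `parse_words` is made up for the example, and stripping
the trailing punctuation is an assumption about how the words end up in the
index, not something the log itself confirms.

```python
import re

# A few lines copied from the rundig output above.
LOG = """\
word: hi.@1
word: this@2
word: bad@3
word: web@4
Tag: /blink, matched -1
word: design.@5
"""

# Matches lines of the form "word: <token>@<position>".
WORD_LINE = re.compile(r"^word: (\S+)@(\d+)$")


def parse_words(log_text):
    """Return {word: position}, stripping trailing punctuation from words."""
    positions = {}
    for line in log_text.splitlines():
        m = WORD_LINE.match(line)
        if m:
            positions[m.group(1).rstrip(".,;:!?")] = int(m.group(2))
    return positions


words = parse_words(LOG)
print(words)
# {'hi': 1, 'this': 2, 'bad': 3, 'web': 4, 'design': 5}
print("fluble" in words)  # False
```

"web" is at position 4 and "fluble" does not appear at all, so an AND query
for both should not match this document.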

