Normally, I'd be scouring the list archives for this info, but they
appear to be broken right now.

I'm enclosing the relevant htdig.conf lines and a snippet of rundig's
output.

I'm wondering why my index.jsp is getting indexed, but the pages it
links to are not. Specifically, the pages in the output below that I
expected to get indexed are http://www.foobar.org/site/about/about.jsp
and http://www.foobar.org/site/events/events.jsp.

The most frequent error message I'm getting is "Rejected: URL not in the
limits!" 

Is it reading the JavaScript attributes (onMouseOver, onMouseOut, etc.)
and getting confused? Or is it something in my configuration file?
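
To test the configuration theory, here is a rough Python sketch of how I
understand the limit check. It assumes each limit_urls_to entry is
treated as a plain substring pattern that a URL must contain, and that
${start_url} expands to the full start_url value from my htdig.conf. The
function name within_limits is made up for illustration; this is not
ht://Dig's actual code.

```python
# Hypothetical model of the limit_urls_to check (an assumption, not ht://Dig source).
def within_limits(url, patterns):
    """Return True if any limit pattern occurs as a substring of url."""
    return any(pattern in url for pattern in patterns)


# Value taken from the htdig.conf in this message.
start_url = "http://www.foobar.org/site/index.jsp"

# Assumed expansion of "limit_urls_to: ${start_url}".
limits = [start_url]

candidates = (
    "http://www.foobar.org/site/index.jsp",
    "http://www.foobar.org/site/about/about.jsp",
    "http://www.foobar.org/site/events/events.jsp",
)

for url in candidates:
    verdict = "accepted" if within_limits(url, limits) else "Rejected: URL not in the limits!"
    print(url, "->", verdict)
```

If that assumption is right, only index.jsp itself would pass the check,
and a directory prefix such as http://www.foobar.org/site/ in
limit_urls_to might be what I need instead.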

Any ideas?

Thanks.

---- Snipped from rundig -vvv output ---

   Rejected: Extension is invalid!

url rejected: (level 1)http://www.foobar.org/site/sitestyle.css

image: http://www.foobar.org/site/pix/top_left_arc.gif

image: http://www.foobar.org/site/pix/top_blue_vert.gif

A tag: pos = 2, position = ="about/about.jsp"
onMouseOver="changeImage('top_about','top_about_roll')"
onMouseOut="changeImage('top_about','top_about')">

href: http://www.foobar/site/about/about.jsp (About Us )
 
   Rejected: URL not in the limits!

url rejected: (level 1)http://www.foobar.org/site/about/about.jsp

image: http://www.foobar.org/site/pix/shim.gif

A tag: pos = 2, position = ="events/events.jsp"
onMouseOver="changeImage('top_events','top_events_roll')"
onMouseOut="changeImage('top_events','top_events')">

href: http://www.foobar.org/site/events/events.jsp (Events ) 

---- most of my htdig.conf ---
database_dir:   /var/lib/htdig
limit_urls_to:    ${start_url}
exclude_urls:   /cgi-bin/ .cgi
bad_extensions:   .wav .gz .z .sit .au .zip .tar .hqx .exe .com .gif \
    .jpg .jpeg .aiff .class .map .ram .tgz .bin .rpm .mpg .mov .avi \
    .css
maintainer:   [EMAIL PROTECTED]
max_head_length:  50000
max_doc_size:   5000000
no_excerpt_show_top:  true
search_algorithm: exact:1 synonyms:0.5 endings:0.1
no_next_page_text:
no_prev_page_text:
start_url:    http://www.foobar.org/site/index.jsp
local_urls: http://www.foobar.com/=/home/kfish/www/kfish/
local_user_urls:  http://www.foobar.com/=/home/,/public_html/ 
-------------

-- 
Richard Seymour : Anarchy Software, Inc.
- * - - * - - - * -+- * - - - * - - * -
      `°º¤ø,¸             ¸,ø¤º°'
             `°º¤ø,¸¸,ø¤º°

_______________________________________________
htdig-general mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/htdig-general
