How deep should a good intranet crawl be: 10 to 20?
I still can't get all of my site searchable.

Here is my situation...
I want to crawl just one local site for our intranet.  We have just
rolled out an ASP-only website to replace a pure HTML site.  I ran Nutch
on the old site and got great results.  Since moving to the new site I
am having a devil of a time retrieving good information, and I am
missing a ton of info altogether.  I am not sure what settings I need to
change to get good results.  One setting that I have changed does
produce good results, but it seems to crawl other websites and not just
my domain: in the last line of the crawl-urlfilter file I replaced the -
with + so that it does not ignore other information.  Our site is
www.woodward.edu.  I was wondering if someone on this list could crawl
this site, and only this domain (woodward.edu), and see what they come
up with.  I am just stumped as to what to do next.  I am running a
nightly build from January 26th, 2006.
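
As a rough illustration of the filter change described above, here is a
minimal Python sketch. It is not Nutch code. It assumes that each filter
line is a + or - sign followed by a regular expression, that rules are
tried in order, and that the first matching rule decides. The function
name accepts and both rule lists are hypothetical and are not copied
from a real config file.

```python
import re


def accepts(rules, url):
    """Return True if the first rule whose pattern is found in url is a '+' rule."""
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if re.search(pattern, url):
            return sign == "+"
    # Assumption: a URL that matches no rule is rejected.
    return False


# Hypothetical rules: accept hosts under woodward.edu, reject everything else.
DOMAIN_ONLY = [
    r"+^http://([a-z0-9]*\.)*woodward\.edu/",
    r"-.",
]

# Same rules with the last line flipped from - to +, as described in the post.
ACCEPT_ALL = [
    r"+^http://([a-z0-9]*\.)*woodward\.edu/",
    r"+.",
]

if __name__ == "__main__":
    for url in ("http://www.woodward.edu/index.asp", "http://www.example.com/"):
        print(url, accepts(DOMAIN_ONLY, url), accepts(ACCEPT_ALL, url))
```

Under these assumptions, flipping the last line to + lets every URL
through, which would be consistent with the crawl wandering onto other
websites; a final - line keeps it on hosts that match the domain rule.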

My criteria for our local search are to be able to search PDF, images,
doc files, and web content.  You can go here and see what the search
page pulls up: http://search.woodward.edu

Thanks for any help this list can provide.
Andy Morris 

