According to [EMAIL PROTECTED]:
> Geoff Hutchison:
> > > because all the sites to dig are on the local machine. How can I speed
> > > up?
> > 
> > This is easy. See the various local_url attributes...
> > <http://www.htdig.org/attrs.html#local_urls>
> > 
> > If you index directly over the filesystem, you'll avoid HTTP and 
> > proxy issues altogether.
> 
> Hm, this does not work, even if I use local_user_urls. Example:
> I want to dig basis.shacknet.nu . It is located in
> /spool/wwwoffle/http/basis.shacknet.nu
> So, I add
> local_urls: http://basis.shacknet.nu/ /spool/wwwoffle/http/basis.shacknet.nu

You need an "=" sign, not a space, between the URL and the directory name.
I think you also need a trailing slash on the directory name.
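
For example, with the "=" separator and the trailing slash, the line quoted
above would look something like this (a sketch only; it assumes the files
under that directory mirror the site's URL paths one to one):

    local_urls: http://basis.shacknet.nu/=/spool/wwwoffle/http/basis.shacknet.nu/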

> in the htdig config. If I allow digging only via the file system, the site
> can't be found. The same happens if I point to /spool/wwwoffle/http/
> Could this be because I'm using wwwoffle? (if you don't know wwwoffle: I can
> use it as a proxy, I can surf on my local mirrored basis.shacknet.nu like
> I'm online)

As long as the whole web site is mirrored on the local cache, this should
work.

> > > If I run htdig with server_wait_time=0 (standard value), htdig does its job
> > > in about 1 to 1.5 hours. So far so good, but after this, when I'm searching
> > > in my database, a lot of sites are missing. In the log I saw that several
> > > servers were not dug; the log file contained something like "New server:
> > > basis.shacknet.nu  no server running". So I've set
> > > server_wait_time=1, since I read somewhere this could solve the problem.
> > 
> > Certainly indexing can be done at such a rate as to saturate network 
> > connections or bump the limit of the number of server processes 
> > available, etc. There may also be issues between the proxy and the 
> > htdig indexing (which would be good to know so we can fix them).
> > 
> 
> How can I change the thing with the server processes? I didn't find a
> similar config attribute. Is it a system issue?

It's not an htdig config attribute.  It's part of your web server
configuration.  E.g., in Apache, you can set MaxClients in your httpd.conf,
to limit the total number of server processes.  See the Apache docs for
more info.  Of course, this is not an issue if you're indexing via
local_urls.
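
As a rough illustration only (the value 150 is an arbitrary example, not a
recommendation for your system), the relevant line in httpd.conf would look
something like:

    # httpd.conf: upper limit on simultaneous server processes
    MaxClients 150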

> Greg Holmes:
> >> Why this "no server running"-error? Any suggestions?
> 
> >Try this:
> >http://www.htdig.org/mail/2000/10/0215.htm
> >Hope it helps.
> 
> Do you think it solves the problem with locally saved sites too? (before I
> play around with sources :) )

No, this is just when you're indexing a slow server via HTTP.  What it
does is make htdig keep trying the same server over and over again,
even if it timed out on previous URLs from that server.  Without this
hack, htdig normally keeps track of which servers are down and doesn't
try them again in an indexing run.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

