On 6/28/06, Ken Krugler <[EMAIL PROTECTED]> wrote:
Hi Doug,
>Did you ever resolve your 0.8 vs 0.7 crawling performance question? I'm
>running into a similar problem.
We wound up dramatically increasing the number of threads, which
seemed to help solve the bandwidth utilization problem. With Nutch
0.7 we were running about 200 threads per crawler, and with Nutch 0.8
it's more like 2000+ threads, though you have to reduce the thread
stack size in this type of configuration.
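
A rough back-of-the-envelope sketch of why the stack size matters at this thread count: each thread reserves its own stack, so the reserved address space grows linearly with the number of threads. The class and method names are hypothetical, 8192 KB is assumed as a typical Linux default stack limit, and 256 KB is only an illustrative smaller value, not a recommendation from this thread.

```java
// Hypothetical helper: estimates the address space reserved for thread stacks.
final class StackBudget {
    // Total stack reservation in KB for `threads` threads of `stackKb` KB each.
    static long reservedKb(int threads, int stackKb) {
        return (long) threads * stackKb;
    }

    // Same figure in whole megabytes (1 MB = 1024 KB), rounded down.
    static long reservedMb(int threads, int stackKb) {
        return reservedKb(threads, stackKb) / 1024;
    }

    public static void main(String[] args) {
        // 8192 KB is an assumed default; 256 KB is an assumed reduced value.
        System.out.println(reservedMb(200, 8192));   // 200 threads at 8192 KB
        System.out.println(reservedMb(2000, 8192));  // 2000 threads at 8192 KB
        System.out.println(reservedMb(2000, 256));   // 2000 threads at 256 KB
    }
}
```

Under these assumptions, 2000 threads at 8192 KB reserve 16000 MB of stack address space, against 500 MB at 256 KB. This is reserved address space, not necessarily committed memory, so treat it as an upper bound.
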
Hi Ken,
Could you give me some idea of the stack size at which you are seeing
the best bandwidth utilization? I have the following limits:
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
max nice                        (-e) 20
file size               (blocks, -f) unlimited
pending signals                 (-i) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) unlimited
max rt priority                 (-r) unlimited
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
What stack size should I try? The default seems to be 8192 KB. Are
there any other parameters I should tweak? I often hit the "too many
open files" problem, and I have never been able to use my full
bandwidth; I am using about 10% of it. Raising ulimit -n to a very
high number solves the "too many open files" error, but the crawl
still does not use all my bandwidth. Any help would be very much
appreciated.
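
For reference, in Java a stack size can also be requested per thread rather than process-wide. The following is a minimal sketch, assuming a plain Thread-based set of workers; the class name, the 256 KB value, and the trivial worker body are illustrative assumptions and not Nutch's actual fetcher code. The stackSize argument is only a hint that the JVM may round up or ignore, and the JVM-wide default can be set with the -Xss option instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example: start workers with an explicitly requested stack size.
final class SmallStackWorkers {
    // Starts `count` threads, each requesting `stackBytes` of stack, and
    // returns how many of them ran to completion.
    static int runWorkers(int count, long stackBytes) {
        AtomicInteger finished = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // Thread(ThreadGroup, Runnable, String, long): the last argument
            // is the requested stack size in bytes (a hint to the JVM).
            Thread t = new Thread(null, finished::incrementAndGet, "worker-" + i, stackBytes);
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // wait for each worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                break;
            }
        }
        return finished.get();
    }

    public static void main(String[] args) {
        // 256 KB is an assumed value for illustration only.
        System.out.println(runWorkers(50, 256 * 1024L));
    }
}
```
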
Thanks
Zaheed
-- Ken
--
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"Find Code, Find Answers"