Hello all,

I wanted to inquire about the general performance of Nutch. I have seen this page (http://digitalpebble.blogspot.cz/2013/09/nutch-fight-17-vs-221.html), where one iteration takes 78 minutes with 3M URLs, 5K per iteration, and 100 URLs/host.

I have the same setup as in that test, but currently only around 70k URLs in the database. The fetch and parse steps run very quickly, but the generate and update steps both take _forever_. One run takes about 12 hours, and by far the most time is spent in update, followed by generate.

Is there ANYTHING I can do to speed up the process? I have a strong dedicated server with 52GB RAM. One thing I notice is that during generate/update ALL available RAM is used (Mem: 52438M total, 52267M used, 170M free, 191M buffers).

I am thankful for any help/feedback!

Domi

