Hi Byron,

For my own purposes, I wanted to know how many connections my server can handle, so I ran some tests myself. The observations were surprising, but logical in hindsight. I used some old desktop PCs running Linux (Pentium III 800 MHz, ~256 MB RAM, IDE disks), so I didn't have to generate much load to see the bottlenecks :-)

The response time of Nutch depends not only on the number of parallel requests, but also on the number of hits Nutch returns (the hit rate). The reason is simple: for each hit, the summary is loaded from disk. This causes a lot of disk I/O, which slowed my server down far more than the slow CPU or the small RAM did.
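To make the effect concrete, here is a minimal sketch of the cost model I have in mind; the timings are made-up illustration values, not measurements from my machines:

public class CostModel {
    // Response time grows roughly linearly with the hit count,
    // because every hit costs one summary fetch from disk.
    static double estimateResponseMs(double indexLookupMs, int hits, double summaryFetchMs) {
        return indexLookupMs + hits * summaryFetchMs;
    }

    public static void main(String[] args) {
        // assumed: 50 ms index lookup, ~10 ms disk seek per summary
        System.out.println(estimateResponseMs(50, 1, 10));   // -> 60.0 ms
        System.out.println(estimateResponseMs(50, 10, 10));  // -> 150.0 ms
    }
}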

So I would suggest using a static set of queries and an identical set of segments to generate the numbers, so the runs stay comparable. A minimal harness along those lines is sketched below.
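Something like this is what I mean (a rough sketch in Java; the search URL, its query parameter, the query set and the client count are all assumptions about the setup, not anything Nutch-specific):

import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SearchBench {
    // Static query set: identical for every run, so the numbers stay comparable.
    static final List<String> QUERIES = List.of("linux", "nutch", "crawler", "index");
    static final String BASE = "http://localhost:8080/search?query="; // assumed endpoint

    public static void main(String[] args) throws Exception {
        int parallel = 8; // number of concurrent clients to simulate
        HttpClient client = HttpClient.newHttpClient();
        ExecutorService pool = Executors.newFixedThreadPool(parallel);
        List<Future<Long>> clients = new ArrayList<>();
        for (int i = 0; i < parallel; i++) {
            clients.add(pool.submit(() -> {
                long worstMs = 0;
                for (String q : QUERIES) {
                    String url = BASE + URLEncoder.encode(q, StandardCharsets.UTF_8);
                    long t0 = System.nanoTime();
                    client.send(HttpRequest.newBuilder(URI.create(url)).build(),
                                HttpResponse.BodyHandlers.discarding());
                    worstMs = Math.max(worstMs, (System.nanoTime() - t0) / 1_000_000);
                }
                return worstMs; // worst latency this client saw, in ms
            }));
        }
        for (Future<Long> f : clients)
            System.out.println("worst latency: " + f.get() + " ms");
        pool.shutdown();
    }
}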

An interesting number is the response time per hit and per parallel request. I would expect the size of the index to influence both the number of hits returned and the time it takes to locate the summaries on disk.
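For comparing runs, one way to read that metric is to normalize the measured latency by both factors (a sketch with assumed example numbers; the hit count would have to be parsed out of the responses, which I leave out here):

public class Normalize {
    // Mean response time divided by hits and by concurrent clients.
    // Only comparable across runs with a fixed query set and fixed segments.
    static double msPerHitPerClient(double meanResponseMs, long hits, int clients) {
        return meanResponseMs / (hits * clients);
    }

    public static void main(String[] args) {
        // assumed: 150 ms mean latency, 10 hits per query, 8 parallel clients
        System.out.printf("%.2f ms per hit per client%n", msPerHitPerClient(150, 10, 8));
    }
}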

One question I still have: how do the number and size of the segments per search server influence the response time? Which is better: many small segments or one big one? Looking at the servers I use for the tests, you can imagine why running that kind of test is a problem for me :-)


Regards

        Michael

--
Michael Nebel
Internet: http://www.netluchs.de/


