Anyone have any ideas on what a benchmark should contain to set some standards? ANDs/ORs, single machine, memory limits, different OSes and file systems all come into play.
A good, realistic benchmark is to replay a log of queries. Warm things up with one batch of queries, then benchmark with separate batches, making sure not to reuse a batch, so that you don't benefit too much from caching.
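A minimal sketch of that batching, in Python. The names split_batches and plan_run and the batch size are made up for illustration; the only assumption is that the query log has already been loaded as a list of query strings, in log order.

```python
from typing import List, Sequence, Tuple


def split_batches(queries: Sequence[str], batch_size: int) -> List[List[str]]:
    """Cut the query log into consecutive, non-overlapping batches."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [list(queries[i:i + batch_size])
            for i in range(0, len(queries), batch_size)]


def plan_run(queries: Sequence[str],
             batch_size: int) -> Tuple[List[str], List[List[str]]]:
    """Return (warm-up batch, measurement batches).

    The first batch is only used to warm things up; every later batch is
    used for exactly one measurement, so no batch is ever replayed.
    """
    batches = split_batches(queries, batch_size)
    if len(batches) < 2:
        raise ValueError("need at least one warm-up and one measurement batch")
    return batches[0], batches[1:]
```

Note that the batches are disjoint by position in the log only. A popular query can still show up in several batches, which is what a real log looks like anyway.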
You need a load simulator to run the queries, something like http_load, JMeter, Grinder, or some such. Check out http://www.opensourcetesting.org/performance.php for a list of tools. I haven't used any of these, only homegrown things, so I can't make a recommendation.
The methodology we used at Excite was to run batches of queries at increasing rates until latencies and loads started to spike. The rate at which that happened was our limit.
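Roughly, that ramp-up could look like the sketch below. This is not the tooling used at Excite, just an illustration: find_limit, the measure callback, and the spike_factor threshold of 3x are all assumptions. measure(rate) is expected to replay a fresh batch at the given rate (queries per second) and return some latency figure, for example the median in seconds.

```python
from typing import Callable, Iterable, Optional


def find_limit(measure: Callable[[float], float],
               rates: Iterable[float],
               spike_factor: float = 3.0) -> Optional[float]:
    """Try increasing rates and return the last rate before latency spiked.

    The latency at the first (lowest) rate is taken as the baseline. A
    "spike" is any latency above spike_factor times that baseline; the
    factor is an arbitrary choice for this sketch. If no rate spikes, the
    highest rate tried is returned, meaning the limit was not reached.
    Returns None if no rates were given.
    """
    baseline: Optional[float] = None
    last_ok: Optional[float] = None
    for rate in rates:
        latency = measure(rate)
        if baseline is None:
            baseline = latency  # the lowest rate defines "normal"
        elif latency > spike_factor * baseline:
            return last_ok      # the previous rate was the last sane one
        last_ok = rate
    return last_ok
```

The sketch only watches latency. Machine load could be checked the same way by having measure return it as well and testing both against their own thresholds.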
Doug
_______________________________________________
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers
