Jay Pound wrote:
Doug, I also ran into this when I was testing NDFS: the system would have to wait for the namenode to tell the datanodes what data to receive and which data to replicate.
When did you test this? Which version of Nutch? How many nodes? My benchmark results are from just a few days ago. There have been a lot of fixes in the past week, and NDFS now works much better.
I'm currently setting up Lustre to see how it works; it operates at the kernel level. Do you think the namenode would perform better if it were not written in Java? I plan on running a system where the namenode (metadata) server will have to perform thousands of I/Os a second: concurrently updating indexes of multiple segments, updating the db on one machine, and fetching multiple segments on multiple machines, all accessing the same logical filesystem at the same time.
While running the benchmark, the namenode was typically using only 2% of its 1 GHz CPU.
PS: Where can I find out about MapReduce? I read the presentations, but I don't get the core concept of it.
http://labs.google.com/papers/mapreduce.html
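As a rough illustration of the core concept described in that paper, here is a single-process toy sketch. It is not Nutch or Google code, and the names map_fn, reduce_fn and run_mapreduce are made up for this example. A map function turns each input record into (key, value) pairs, the framework groups the values by key, and a reduce function combines each group into a result. A real implementation would spread the map and reduce calls across many machines; this one only shows the data flow.

```python
from collections import defaultdict

def map_fn(doc_id, text):
    # Map step: emit one (word, 1) pair for every word in the document.
    for word in text.split():
        yield word.lower(), 1

def reduce_fn(word, counts):
    # Reduce step: combine all values emitted for one key.
    return word, sum(counts)

def run_mapreduce(inputs, map_fn, reduce_fn):
    # Group ("shuffle") step: collect every emitted value under its key.
    groups = defaultdict(list)
    for key, value in inputs:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    # Apply the reduce function once per distinct key.
    return dict(reduce_fn(k, vs) for k, vs in sorted(groups.items()))

# Hypothetical input: (document id, document text) pairs.
docs = [("a", "the quick fox"), ("b", "the lazy dog the end")]
word_counts = run_mapreduce(docs, map_fn, reduce_fn)
```

Because each map call sees only one record and each reduce call sees only one key, both steps can in principle run in parallel on different machines.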
PPS: Via chips aren't very FPU-powerful. Try an Opteron for your namenode; I bet you will see a huge improvement in speed, even over Xeons, P4s, etc. I was only able to test 5 machines, but I was able to saturate 50-60mb a sec to each (mainly replication throughput running level 1).
Via is not my first choice of CPU, it's simply what the Internet Archive has given me to use. With hundreds of datanodes a Via-based namenode could become a bottleneck. Right now it is not.
Doug
