Dear All: The web interfaces for the namenode and jobtracker display various information about jobs in both pseudo-distributed mode and cluster mode (which I have yet to set up; please see my other post). Which variables would you suggest using to compare the performance and scalability of pseudo-distributed vs. cluster mode? I assume run time, the total number of map and reduce tasks, DFS % used, and heap size are relevant. Are there any other variables that can help determine how well each mode ran its jobs? I want to create graphs comparing both modes for my project. Thank you.
Cheers, A Df
