Hello,
I'm new to Hadoop and have successfully built a fully distributed cluster of 3 nodes (1 master, 2 slaves) as a proof of concept. I have some questions below.

Is there a dashboard to monitor the progress of a MapReduce computation?

1. I'm looking to ensure the computation is allocated, and uses, the correct number of computation nodes.
2. Monitor computation on the nodes (up/down/in-progress/completed).
3. If possible, direct computation to a specific group of nodes (depending on the computation priority).

Similarly for HDFS:

1. Ensure each data file gets replicated to the correct number of nodes.
2. If possible, prioritize data replication (i.e. replicate data files that are accessed frequently to nodes that have better hardware, as a sort of load-balancing distribution).

Many thanks,
Caesar.
