Have you looked at DataNode metrics?

$ sudo -u hdfs bash -c 'echo -e "open `cat $HADOOP_SECURE_DN_PID_DIR/hadoop_secure_dn.pid`\n get -b Hadoop:name=DataNode,service=DataNode *" | /usr/bin/java -jar /home/rajive/w/jmx/jmxterm.jar -n '
Welcome to JMX terminal. Type "help" for available commands.
#Connection to 31177 is opened
#mbean = Hadoop:name=DataNode,service=DataNode:
tag.context = dfs;
tag.sessionId = null;
tag.hostName = phanpy-dn001.hadoop.apache.org;
bytes_written = 3048581049273;
bytes_read = 497034196749;
blocks_written = 326391;
blocks_read = 249262;
blocks_replicated = 401423;
blocks_removed = 29844;
blocks_verified = 858155;
block_verification_failures = 0;
reads_from_local_client = 6671;
reads_from_remote_client = 242591;
writes_from_local_client = 7331;
writes_from_remote_client = 319056;
readBlockOp_num_ops = 249262;
readBlockOp_avg_time = 17.0;
writeBlockOp_num_ops = 326387;
writeBlockOp_avg_time = 61.5;
blockChecksumOp_num_ops = 22431;
blockChecksumOp_avg_time = 59.0;
copyBlockOp_num_ops = 0;
copyBlockOp_avg_time = 0.0;
replaceBlockOp_num_ops = 0;
replaceBlockOp_avg_time = 0.0;
heartBeats_num_ops = 590662;
heartBeats_avg_time = 19.0;
blockReports_num_ops = 571;
blockReports_avg_time = 6205.0;
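For the original question, the reads_from_*/writes_from_* counters above already separate local from remote clients, so you can estimate how much of the block traffic crossed the network. A minimal sketch, assuming the "name = value;" format jmxterm prints (the sample values are copied from the output above into a temp file for illustration):

```shell
# Write a few of the counters from the jmxterm session above into a file,
# standing in for real jmxterm output.
cat > /tmp/dn_metrics.txt <<'EOF'
reads_from_local_client = 6671;
reads_from_remote_client = 242591;
writes_from_local_client = 7331;
writes_from_remote_client = 319056;
EOF

# Sum remote vs. total read/write ops. Field separator strips '=' and ';'
# so $2 is the numeric value on each line.
awk -F'[ =;]+' '
  /reads_from_remote_client/  { rr = $2 }
  /reads_from_local_client/   { rl = $2 }
  /writes_from_remote_client/ { wr = $2 }
  /writes_from_local_client/  { wl = $2 }
  END { printf "remote ops: %d of %d total\n", rr + wr, rr + rl + wr + wl }
' /tmp/dn_metrics.txt
```

On the numbers above this prints "remote ops: 561647 of 575649 total", i.e. nearly all block reads/writes on this node came from remote clients.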
Mark question wrote on 10/21/11 at 17:27:23 -0700:
>Hello,
>
> I wonder if there is a way to measure how many of the data blocks have
>been transferred over the network? Or, more generally, how many times was
>there a connection between different machines?
>
> I thought of checking the NameNode log file, which usually shows blk_....
>from src= to dst ..., but I'm not sure whether it's correct to count those
>lines.
>
>Any ideas are helpful.
>Mark
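Counting log lines as Mark suggests can work as a rough check. A minimal sketch, with the caveat that the log line format here is an assumption and varies across Hadoop versions, so verify the pattern against your own logs before trusting the count:

```shell
# Hypothetical sample of block-transfer log lines, standing in for a real
# NameNode/DataNode log. The format is an assumption for illustration.
cat > /tmp/namenode_sample.log <<'EOF'
2011-10-21 17:01:02 INFO ... ask 10.0.0.1:50010 to replicate blk_123 to datanode(s) 10.0.0.2:50010
2011-10-21 17:01:05 INFO ... Received block blk_123 src: /10.0.0.1:41000 dest: /10.0.0.2:50010
2011-10-21 17:02:11 INFO ... Received block blk_456 src: /10.0.0.3:41002 dest: /10.0.0.4:50010
EOF

# Count only lines that record both a source and a destination, i.e. a
# completed transfer, not a replication *request*.
grep -c 'blk_[0-9]*.*src: .*dest: ' /tmp/namenode_sample.log
```

This prints 2 for the sample above: the "ask ... to replicate" line is a scheduling decision, not a transfer, which is one reason naively counting every blk_ line can overcount.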
