Hi all,

Happy (belated) New Year. I've been a bit quiet lately because I was on vacation for a few weeks and have some other projects going on. But today I wanted to highlight some really cool stuff that's been happening in the HTrace and broader tracing community.

Last weekend, I gave an "intro to htrace" talk at the SCaLE 14x Linux conference. See https://www.socallinuxexpo.org/scale/14x/presentations/introducing-apache-htrace . There is also a video here: https://www.youtube.com/watch?v=t-TwCLwYIGE It starts at 6:50, since the first few minutes of the recording are just me setting up the VGA connection and microphone :P I thought this one went really well (especially the demo), and hopefully it will get the word out to even more people.

In HDFS, we've been exploring adding annotations to certain trace spans to collect even more useful information. For example, Zhe Zhang posted a patch to HDFS-9576 that adds "position" and "length" annotations to spans such as DFSInputStream#byteArrayRead. This should give us data on things like the average and median read length in a set of HDFS requests. Similarly, I posted HDFS-9674, which adds a "maximum write latency" annotation to the OpWriteBlock spans generated by the DataNode. This is very handy when analyzing HDFS write pipelines. I think we will see more of these really helpful annotations, and they will expand our ideas about what HTrace can do.

A few weeks back, I wrote a blog post for my employer, Cloudera. It talks about setting up HTrace and htraced on a CDH5.5 cluster. See http://blog.cloudera.com/blog/2015/12/new-in-cloudera-labs-apache-htrace-incubating/ Hopefully HTrace can "bridge the chasm" between being a developer tool and being a trusted ops tool. We still have a way to go, but having these precompiled packages for CDH5.5 is a big step forward. (OK, I'll shut up about vendor stuff now...)

Another really cool thing is that Sean Busbey and others in the YCSB community are working on integrating HTrace. YCSB is a very popular benchmark for big data / Hadoopy workloads. The GitHub issue is here: https://github.com/brianfrankcooper/YCSB/issues/415

cheers,
Colin
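
P.S. As a rough illustration of the read-length analysis mentioned above, here is a minimal Python sketch. It assumes the spans have already been exported into simple dictionaries; the "description" and "annotations" field names, the sample values, and the read_lengths helper are all hypothetical and do not reflect the actual HTrace span format or the contents of the HDFS-9576 patch. Annotation values are assumed to be strings.

```python
from statistics import mean, median

# Hypothetical, simplified span records. Real HTrace span output is laid
# out differently; these field names and values are for illustration only.
spans = [
    {"description": "DFSInputStream#byteArrayRead",
     "annotations": {"position": "0", "length": "4096"}},
    {"description": "DFSInputStream#byteArrayRead",
     "annotations": {"position": "4096", "length": "1024"}},
    {"description": "DFSInputStream#byteArrayRead",
     "annotations": {"position": "5120", "length": "8192"}},
    {"description": "OpWriteBlock", "annotations": {}},
]


def read_lengths(spans, description="DFSInputStream#byteArrayRead"):
    """Collect the integer "length" annotation from matching spans."""
    lengths = []
    for span in spans:
        if span.get("description") != description:
            continue
        value = span.get("annotations", {}).get("length")
        if value is not None:
            # Annotation values are assumed to be stored as strings.
            lengths.append(int(value))
    return lengths


lengths = read_lengths(spans)
print("reads:", len(lengths))
print("average length:", mean(lengths))
print("median length:", median(lengths))
```

Spans without a "length" annotation (such as the OpWriteBlock entry here) are simply skipped, so the same loop could be pointed at a mixed set of spans.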
