I did some analysis on the performance of Hadoop-based workflows. Some of the results are counter-intuitive, so I thought the community at large would be interested:
http://nathanmarz.com/blog/hadoop-mathematics/

Would love to hear any feedback or comments you have.
