Hadoop noob here, just starting to learn it, as we're planning to start using it heavily in our processing. Just wondering, though, which version of the code I should start learning/working with.
It looks like the Hadoop API changed pretty significantly from 0.19 to 0.20 (e.g., org.apache.hadoop.mapred -> org.apache.hadoop.mapreduce), which leads me to think I should start with the new API. OTOH, since the new release is a ".0" release coming right after some of these major API overhauls, I'm wondering if it's stable enough for us to start using in production. Where would be the best place for me to start learning?
TIA,
DR