It's not clear to me how to chain multiple MapReduce jobs together. Without using Cascading or something else like it, is the basic method to have stage 1 write its output to an intermediate location, and have the next stage read its input from there?
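For reference, here's roughly what I have in mind. This is just a sketch of driver code using the `org.apache.hadoop.mapreduce` API; the paths, job names, and class name are made up, and the mapper/reducer setup is omitted:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ChainedJobs {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path input = new Path(args[0]);
        Path intermediate = new Path("/tmp/stage1-out"); // hypothetical temp location
        Path output = new Path(args[1]);

        // Stage 1: writes its results to the intermediate path
        Job stage1 = Job.getInstance(conf, "stage 1");
        stage1.setJarByClass(ChainedJobs.class);
        // ... set mapper/reducer/key/value classes here ...
        FileInputFormat.addInputPath(stage1, input);
        FileOutputFormat.setOutputPath(stage1, intermediate);
        if (!stage1.waitForCompletion(true)) {
            System.exit(1); // stage 1 failed
        }

        // Stage 2: reads the intermediate path as its input
        Job stage2 = Job.getInstance(conf, "stage 2");
        stage2.setJarByClass(ChainedJobs.class);
        // ... set mapper/reducer/key/value classes here ...
        FileInputFormat.addInputPath(stage2, intermediate);
        FileOutputFormat.setOutputPath(stage2, output);
        boolean ok = stage2.waitForCompletion(true);

        // If the driver dies before reaching this line, who deletes the intermediate data?
        FileSystem.get(conf).delete(intermediate, true);
        System.exit(ok ? 0 : 1);
    }
}
```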
If so, how are jobs responsible for cleaning up the temp/intermediate data they create? What happens if stage 1 completes but stage 2 doesn't? Do the stage 1 files get left around? Does anyone have some insight they could share? Thanks.